将 Dataframe 插入MySQL

flvtvl50  于 2022-10-31  发布在  Mysql
关注(0)|答案(2)|浏览(176)

我有一个如下所示的 Dataframe :

import pandas as pd

student_dict = {
"ID":[101,102,103,104,105],
"Student":["AAA","BBB","CCC","DDD","EEE"],
"Mark":[50,100,99,60,80],
"Address":["St.AAA","St.BBB","St.CCC","St.DDD","St.EEE"],
"PhoneNo":[1111111111,2222222222,3333333333,4444444444,5555555555]
}

df = pd.DataFrame(student_dict)

ID  Student Mark    Address PhoneNo
101 AAA       50    St.AAA  1111111111
102 BBB      100    St.BBB  2222222222
103 CCC       99    St.CCC  3333333333
104 DDD       60    St.DDD  4444444444
105 EEE       80    St.EEE  5555555555

MySql数据表(StudentTable):

CREATE TABLE StudentTable(
    ID int,
    Student varchar(255),
    Mark int,
    Address varchar(255),
    PhoneNo varchar(20),
    primary key (ID)
);

为了将 Dataframe df插入到MySQL表中,我使用了:

import pymysql
pymysql.install_as_MySQLdb()
from sqlalchemy import create_engine

# Create engine

engine = create_engine('mysql://user:pass@127.0.0.1/dbname')

# Create the connection and close it(whether successed of failed)

with engine.begin() as connection:
    df.to_sql(name='StudentTable', con=connection, if_exists='append', index=False)

并且 Dataframe 成功插入

,如果我尝试插入另一个df,如下所示:

student_dict2 = {
"ID":[122,102,123],
"Student":["XXX","YYY","ZZZ"],
"Mark":[60,80,95],
"Address":["St.XXX","St.YYY","St.ZZZ"],
"PhoneNo":[1111111110,2222222222,3333333315]
}

df = pd.DataFrame(student_dict2)

ID  Student Mark    Address PhoneNo
122 XXX       60    St.XXX  1111111110
102 YYY       80    St.YYY  2222222222
123 ZZZ       95    St.ZZZ  3333333315

它将抛出如下错误

IntegrityError: (pymysql.err.IntegrityError) (1062, "Duplicate entry '102' for key 'StudentTable.PRIMARY'")

它不会将不重复的其他行插入到StudentTable中。我需要的是,*它需要将不重复的行插入到表中,并且 * 仅对重复的行抛出错误 。是否有任何方法可以在StudentTable中创建一个Timestamp列,以自动捕获数据插入时间

wko9yo5t

wko9yo5t1#

CREATE TABLE StudentTable(
    ID int,
    Student varchar(255),
    Mark int,
    Address varchar(255),
    PhoneNo varchar(20),
    created_at TIMESTAMP DEFAULT NOW() ON UPDATE CURRENT_TIMESTAMP,
    primary key (ID) 
);

我认为这就是创建时间戳列的方法,据我所知,该列将捕获数据插入时间。
关于只插入那些不重复的元素,我不确定,但我认为你应该迭代student_dict中的每一行,忽略那些有重复id的行。

juzqafwq

juzqafwq2#

不能插入重复的键ID。键值对于数据库中的每条记录必须是唯一的

相关问题