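I need to load a very large Pandas dataset into MS SQL Server. Unfortunately, the to_sql() method is very slow, even with method='multi'. That is why I decided to use the bulk copy approach from the bcp library. Here is what I tried: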
import bcp
conn = bcp.Connection(host='HOST', driver='mssql')
my_bcp = bcp.BCP(conn)
table_name = 'test17'
csv_filename = r'c:\temp\%s.csv' % table_name
df.to_csv(csv_filename, index=False, header=False, sep=';')
file = bcp.DataFile(file_path=csv_filename, delimiter=';')
my_bcp.load(input_file=file, table='QUANT_work..test17')
Here is the error:
AttributeError Traceback (most recent call last)
C:\Users\ABENHA~1\AppData\Local\Temp/ipykernel_5112/883077605.py in <module>
----> 1 my_bcp.load(input_file=file, table='QUANT_work..test17')
C:\Apps\Anaconda3\lib\site-packages\bcp\core.py in load(self, input_file, table)
58 else:
59 raise DriverNotSupportedException
---> 60 load.execute()
61
62 def dump(self, query: str, output_file: 'DataFile'):
C:\Apps\Anaconda3\lib\site-packages\bcp\dialects\mssql.py in execute(self)
80 This will run the instance's command via the BCP utility
81 """
---> 82 subprocess.run(f'bcp {self.command}', check=True)
83
84 @property
C:\Apps\Anaconda3\lib\site-packages\bcp\dialects\mssql.py in command(self)
90 the command that will be passed into the BCP command line utility
91 """
---> 92 return f'{self.table} in "{self.file.path}" {self.connection} {self.config} {self.logging} {self.error}'
93
94 @property
C:\Apps\Anaconda3\lib\site-packages\bcp\files.py in path(self)
54 @property
55 def path(self) -> Path:
---> 56 return self.file.absolute()
57
58
AttributeError: 'str' object has no attribute 'absolute'
Thanks.
1 Answer
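Try passing a Path object to bcp.DataFile instead of a string. The traceback shows that DataFile.path calls self.file.absolute(), a method that pathlib.Path objects have and plain strings do not, which is why a str file_path fails with this AttributeError.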
Source: https://github.com/fivestack/bcp/issues/15
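A minimal sketch of that change, reusing the path and table name from the question. The helper name load_with_bcp is hypothetical (it is not part of the bcp library), and the bcp calls inside it are copied from the question's code rather than checked against the library's documentation. The import of bcp is deferred into the helper so the str-versus-Path comparison at the top runs on its own.

```python
from pathlib import Path

table_name = 'test17'

# What the question passed: a plain string, which has no .absolute() method.
csv_as_str = r'c:\temp\%s.csv' % table_name
print(hasattr(csv_as_str, 'absolute'))  # False

# The same location as a pathlib.Path, which does provide .absolute().
csv_path = Path(r'c:\temp') / f'{table_name}.csv'
print(hasattr(csv_path, 'absolute'))    # True


def load_with_bcp(df, csv_path, table):
    """Hypothetical helper: dump df to csv_path, then bulk-load it via bcp.

    The bcp calls mirror the question's code; only file_path is now a Path.
    """
    import bcp  # deferred so the checks above run without the package

    conn = bcp.Connection(host='HOST', driver='mssql')
    my_bcp = bcp.BCP(conn)
    # pandas accepts a Path for the output file as well as a string.
    df.to_csv(csv_path, index=False, header=False, sep=';')
    data_file = bcp.DataFile(file_path=csv_path, delimiter=';')
    my_bcp.load(input_file=data_file, table=table)
```

With a DataFrame df in scope, the call would be load_with_bcp(df, csv_path, 'QUANT_work..test17'), assuming the HOST placeholder is replaced with a real server name.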