I am trying to load data into a SQL Server instance running in a Docker container. Here is my docker-compose.yml:
version: '3.4'
services:
  sqlserver:
    image: mcr.microsoft.com/mssql/server
    container_name: sqlserver
    environment:
      - ACCEPT_EULA=Y
      - SA_PASSWORD=Password123
    ports:
      - "1433:1433"
    volumes:
      - ./sqldata:/var/opt/mssql/data
      - ./data:/app/scripts
And here is my Python code for loading a CSV file into the database:
import csv
import pyodbc

def connectSQLServer():
    # Set up the connection information
    server = 'localhost'
    database = 'SLQBook'
    username = 'sa'
    password = 'Password123'
    # Connect to the database
    conn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER=' + server +
                          ';DATABASE=' + database + ';UID=' + username + ';PWD=' + password,
                          autocommit=True)
    # Set up a cursor to execute SQL statements
    cursor = conn.cursor()
    return conn, cursor

def loadDataBulk(cursor, mapping, tableName, delim=','):
    dataFileLoc = mapping[tableName.lower()]
    print("dataFileLoc=", dataFileLoc)
    cursor.fast_executemany = True
    with open(dataFileLoc, 'r') as f:
        reader = csv.reader(f, delimiter=delim)
        headers = next(reader)
        # next(reader) already consumed the header row, so no [1:] slice here;
        # slicing again would silently drop the first data row
        data = list(reader)
        query = 'insert into {0} values ({1})'
        query = query.format(tableName, ','.join('?' * len(headers)))
        cursor.executemany(query, data)
        print(f'{len(data)} rows inserted to the {tableName} table')

conn, cursor = connectSQLServer()
mapping = {"subscribers": "data/Subscribers.txt"}
loadDataBulk(cursor, mapping, tableName='subscribers', delim='\t')
When I run this code, I get the following error:
pyodbc.OperationalError: ('08S01', '[08S01] [Microsoft][ODBC Driver 17 for SQL Server]TCP Provider: An existing connection was forcibly closed by the remote host.\r\n (10054) (SQLExecute); [08S01] [Microsoft][ODBC Driver 17 for SQL Server]Communication link failure (10054)')
The data loads successfully when I run the same code without cursor.fast_executemany = True, or when I insert the rows one at a time with cursor.execute(query, data), but then it is very slow. Can anyone help me understand why this error occurs only when I bulk-load the data with cursor.fast_executemany = True?
1 Answer
With fast_executemany=False (the default), pyodbc sends a separate INSERT statement for each row. This is the "safer" approach, but it can be slow. fast_executemany=True tells pyodbc to pack the parameter values for all of the rows into a binary ODBC structure called a "parameter array" and send them (essentially) all at once. That can dramatically speed up bulk inserts, but it can also run into resource limits:

1. on the client side, the Python process can run out of memory, or
2. on the server side, the SQL Server instance may not have enough room to receive all of the parameter values at once.

You appear to be experiencing the latter. The fix is to send the rows in chunks:
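A minimal sketch of that chunking approach, adapted from the loadDataBulk function in the question (the chunk_size of 1000 is an assumption; tune it for your row width and server memory):

```python
import csv

def chunked(seq, size):
    # Yield successive slices of `seq` containing at most `size` rows.
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

def loadDataBulkChunked(cursor, dataFileLoc, tableName, delim=',', chunk_size=1000):
    # Same idea as loadDataBulk, but each executemany() call sends only
    # chunk_size rows, so the server never has to buffer the whole
    # parameter array at once.
    with open(dataFileLoc, 'r') as f:
        reader = csv.reader(f, delimiter=delim)
        headers = next(reader)   # consume the header row
        rows = list(reader)
    query = 'insert into {0} values ({1})'.format(
        tableName, ','.join('?' * len(headers)))
    cursor.fast_executemany = True
    for chunk in chunked(rows, chunk_size):
        cursor.executemany(query, chunk)
    print(f'{len(rows)} rows inserted to the {tableName} table')
```

Each executemany() call still benefits from the parameter-array speedup, but the per-call payload stays bounded, which avoids the 08S01 communication-link failure.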