Python: pyodbc [Error 10054] "TCP Provider: An existing connection was forcibly closed by the remote host" with fast_executemany=True

6tdlim6h posted on 2023-05-16 in Python

I am trying to bulk-load data into a SQL Server instance running in a Docker container. Here is the docker-compose.yml:

version: '3.4'

services:
  sqlserver:
    image: mcr.microsoft.com/mssql/server
    container_name: sqlserver
    environment:
      - ACCEPT_EULA=Y
      - SA_PASSWORD=Password123
    ports:
      - "1433:1433"
    volumes:
      - ./sqldata:/var/opt/mssql/data
      - ./data:/app/scripts
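
The compose file only starts the server, so the SLQBook database referenced in the loading code has to exist before the inserts run. A minimal sketch (assuming a fresh container where the database does not exist yet) that creates it with pyodbc, using the sa credentials from the compose file and the database name from the code below:

# Sketch: create the database used by the loader, if it is missing.
# Assumes the sa / Password123 credentials from docker-compose.yml and the
# database name "SLQBook" used in the loading code below.
import pyodbc

conn = pyodbc.connect(
    'DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;UID=sa;PWD=Password123',
    autocommit=True,  # CREATE DATABASE cannot run inside a transaction
)
conn.execute("IF DB_ID('SLQBook') IS NULL CREATE DATABASE SLQBook")
conn.close()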

Here is my Python code for loading the CSV into the database:

import csv
import pyodbc

def connectSQLServer():

    # Set up the connection information
    server = 'localhost'
    database = 'SLQBook'
    username = 'sa'
    password = 'Password123'
    
    # Connect to the database
    conn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password, autocommit=True)
    
    # Set up a cursor to execute SQL statements
    cursor = conn.cursor()
    
    return conn, cursor

def loadDataBulk(cursor, mapping, tableName, delim = ','):
    
    dataFileLoc = mapping[tableName.lower()]
    
    print("dataFileLoc=", dataFileLoc)
    
    cursor.fast_executemany = True
    
    with open(dataFileLoc, 'r', newline='') as f:
        reader = csv.reader(f, delimiter=delim)
        headers = next(reader)
        data = list(reader)  # next(reader) already consumed the header row
        
        query = 'insert into {0} values ({1})'
        query = query.format(tableName, ','.join('?' * len(headers)))
        
        cursor.executemany(query, data)
        
        print(f'{len(data)} rows inserted to the {tableName} table')

conn, cursor = connectSQLServer()
mapping = {"subscribers":"data/Subscribers.txt"}
loadDataBulk(cursor, mapping, tableName='subscribers', delim = '\t')

When I run this code, I get the following error:

pyodbc.OperationalError: ('08S01', '[08S01] [Microsoft][ODBC Driver 17 for SQL Server]TCP Provider: An existing connection was forcibly closed by the remote host.\r\n (10054) (SQLExecute); [08S01] [Microsoft][ODBC Driver 17 for SQL Server]Communication link failure (10054)')

When I load the data without cursor.fast_executemany = True, or when I load the rows one at a time with cursor.execute(query, data), the data loads successfully but very slowly. Can anyone help me understand why this error only occurs when I bulk-load the data with cursor.fast_executemany = True?

hgc7kmma1#

With fast_executemany=False (the default), pyodbc sends a separate INSERT statement for each row. That is the "safer" approach, but it can be slow.
fast_executemany=True tells pyodbc to pack the parameter values for all of the rows into a binary ODBC structure called a "parameter array" and (essentially) send them all at once. That can speed up bulk inserts dramatically, but it can also run into resource limits:
1. On the client side, the Python process can run out of memory, or
2. On the server side, the SQL Server instance may not have enough room to receive all of the parameter values at once.
You appear to be running into the latter.
The workaround is to send the rows in chunks:

import csv
import io

csv_str = io.StringIO(
    """\
product_id,description,price
1,widget,7.99
2,gadget,11.99
3,pie,3.14
"""
)
reader = csv.reader(csv_str)
headers = next(reader)
data = list(reader)
chunksize = 2
chunk_number = 0
for list_index in range(0, len(data), chunksize):
    chunk_number += 1
    chunk = data[list_index: list_index + chunksize]
    print(f"chunk {chunk_number}: {chunk}")
"""console output:
chunk 1: [['1', 'widget', '7.99'], ['2', 'gadget', '11.99']]
chunk 2: [['3', 'pie', '3.14']]
"""
