使用Node.js将大数据集插入MySQL;报价问题;连接池问题

v6ylcynt  于 2023-06-22  发布在  Node.js
关注(0)|答案(1)|浏览(89)

我使用node将记录插入到一个MySQL表中,使用mysql module
工作代码为:

dataSet.forEach(async (item, index) => {
  setTimeout(db.addData, 130000, item)
})

我有一套455张唱片。无论我做什么,插入件都会在第150项处失败。我以为连接池应该防止这种情况?

Error: Too many connections
    at PromisePool.query (/PROJECTS/project1/node_modules/mysql2/promise.js:341:22)
    at Timeout.addData [as _onTimeout] (/PROJECTS/project1/db/db.js:170:42)
    at listOnTimeout (node:internal/timers:571:11)
    at process.processTimers (node:internal/timers:512:7) {
  code: 'ER_CON_COUNT_ERROR',
  errno: 1040,
  sql: undefined,
  sqlState: '',
  sqlMessage: 'Too many connections'
}

运行插入的函数:

async function addData(data) {
  const connection = mysql.createPool(DB)
  let results;
  const input = [
    data.data1,
    data.data2,
    data.data3
  ]

let SQL = `INSERT INTO destination (col1, col2, col3) VALUES (date_format(?, '%Y-%m-%d %h:%i:%s'),?,?)`

  try {
    results = await connection.promise().query(SQL, input)
  } catch(err) {
    console.log(err)
  }  finally {
    return results
  }
}

module.exports = {
  addData
}

数据库配置为

const DB = {
  host: process.env.DBHOST,
  port: process.env.DBPORT,
  user: process.env.DBUSER,
  password: process.env.DBPW,
  database: process.env.DBTEST,
  charset: 'utf8mb4',
    waitForConnections: true,
    queueLimit: 0, // 0 = unlimited queueing
    connectionLimit: 10 // 0 = unlimited connections  
}

不知道如何使这个工作更好地作为一个单一的记录插入。
因此,我尝试按照w3schools的示例构建一个批处理插入

async function addBatchData(data) {
  const connection = mysql.createConnection(DB)
  let batch = data.map((element, index) => {
      return [`date_format('${element.data1}', '%Y-%m-%d %h:%i:%s')`, ${element.data2}, `${element.data3.replaceAll("‘", "'").replaceAll("’", "'")}`]
  }).filter(Boolean);

  console.log(batch)
  
  let SQL = `INSERT INTO destination (col1, col2, col3) VALUES ?`

  try {
    results = await connection.promise().query(SQL, batch)
  } catch(err) {
    console.log(err)
  }  finally {
    connection.end()
    return results
  }
}

这里的问题是mysql模块转义了所有的单引号和双反斜杠,当一个是正确的,导致插入失败:

[
  [
    "date_format('2020-04-26 19:55:00', '%Y-%m-%d %h:%i:%s')",
    '1',
    "'Blah Blah' blah| Whatever"
  ],
  [
    "date_format('2020-04-26 21:00:00', '%Y-%m-%d %h:%i:%s')",
    '2',
    "'Other Blah' | Whatever"
  ],
  [
    "date_format('2020-04-26 21:00:00', '%Y-%m-%d %h:%i:%s')",
    '3',
    "'More Blah' | Whatever"
  ],
  [
    "date_format('2020-04-28 03:00:00', '%Y-%m-%d %h:%i:%s')",
    '4',
    "'Additional Blah' | Whatever"
  ]
]
Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''date_format(\'2020-04-26 19:55:00\', \'%Y-%m-%d %h:%i:%s\')', '1', '\'Blah Bl' at line 1
    at PromiseConnection.query (/PROJECTS/project1/node_modules/mysql2/promise.js:93:22)
    at Object.addBatchData (/PROJECTS/project1/db/db.js:139:42)
    at processData (/PROJECTS/project1/applePodcast.js:40:8)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  code: 'ER_PARSE_ERROR',
  errno: 1064,
  sql: "INSERT INTO destination (col1, col2, col3) VALUES 'date_format(\\'2020-04-26 19:55:00\\', \\'%Y-%m-%d %h:%i:%s\\')', '1', '\\'Blah Blah\\' blah | Whatever'",
  sqlState: '42000',
  sqlMessage: "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''date_format(\\'2020-04-26 19:55:00\\', \\'%Y-%m-%d %h:%i:%s\\')', '1', '\\'Blah Bl' at line 1"
}

我选择灵活性来处理源数据很大的情况。
如能就这两个问题提出建议和/或见解,将不胜感激。

    • 关于池问题的更新**:

显然我不明白如何正确地实现这一点。
在db模块中我创建了一个函数

async function createPool() {
  let result = mysql.createPool(DB) // DB is a global context in the module
  return result
}

然后在我的主代码中,我创建一个池,并在大型数据集上调用插入语句,如Andrey在可接受的解决方案中所述。
就像我想的那样。
现在批量插入的报价仍然是一个奇怪的问题。

fcy6dtqo

fcy6dtqo1#

您正在为数据集中的每个项创建新池,然后从该池查询连接
试着改变成这样:

const pool = mysql.createPool(DB);
for(const item of dataSet) {
  await insertItem(pool, item)
}
await pool.end();

这应该解决“太多连接”的问题。
重新报价-看看mysql.raw()助手

相关问题