Firebase: How can I update more than 500 documents in Firestore using a batch?

Posted by gijlo24d on 2022-11-17 in Other

I'm trying to update a timestamp field with the Firestore admin server timestamp in a collection that holds more than 500 documents.

const batch = db.batch();
const serverTimestamp = admin.firestore.FieldValue.serverTimestamp();

db
  .collection('My Collection')
  .get()
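  .then((docs) => {
    // Queue one merge-write per document, then commit them all in a single batch
    docs.forEach((doc) => {
      batch.set(doc.ref, {
        serverTimestamp,
      }, {
        merge: true,
      });
    });
    return batch.commit();
  })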
  .then(() => res.send('All docs updated'))
  .catch(console.error);

This throws the following error:

{ Error: 3 INVALID_ARGUMENT: cannot write more than 500 entities in a single call
    at Object.exports.createStatusError (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\common.js:87:15)
    at Object.onReceiveStatus (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\client_interceptors.js:1188:28)
    at InterceptingListener._callNext (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\client_interceptors.js:564:42)
    at InterceptingListener.onReceiveStatus (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\client_interceptors.js:614:8)
    at callback (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\client_interceptors.js:841:24)
  code: 3,
  metadata: Metadata { _internal_repr: {} },
  details: 'cannot write more than 500 entities in a single call' }

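Is there a way to write a recursive method that creates a batch object and updates the documents 500 at a time, one batch after another, until every document has been updated?
From the docs, I know that delete operations can be done with the recursive approach described here:
https://firebase.google.com/docs/firestore/manage-data/delete-data#collections
For updates, however, I don't know how to end the execution, because the documents are not being deleted.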
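
One way to get a terminating condition for updates is to page through the collection with a query cursor instead of re-querying until nothing is left. The following is only a sketch under assumptions: updateInPages is a hypothetical helper name, db is assumed to be an Admin SDK Firestore instance, and documentIdField is assumed to be admin.firestore.FieldPath.documentId(). The recursion stops when a page comes back empty.

```javascript
// Hypothetical helper: update a collection page by page, one batch per page.
// Assumes `db` is a Firestore instance and `documentIdField` is
// admin.firestore.FieldPath.documentId() (both supplied by the caller).
async function updateInPages(db, collectionPath, data, documentIdField, pageSize = 500, lastDoc = null) {
  // Order by document ID so the cursor gives a stable position in the collection
  let query = db.collection(collectionPath).orderBy(documentIdField).limit(pageSize);
  if (lastDoc) {
    query = query.startAfter(lastDoc);
  }

  const snapshot = await query.get();
  if (snapshot.empty) {
    return 0; // no more documents: this is where the recursion ends
  }

  // One batch per page keeps each commit at or below `pageSize` writes
  const batch = db.batch();
  snapshot.docs.forEach((doc) => batch.set(doc.ref, data, { merge: true }));
  await batch.commit();

  // Recurse, starting after the last document of this page
  const last = snapshot.docs[snapshot.docs.length - 1];
  const rest = await updateInPages(db, collectionPath, data, documentIdField, pageSize, last);
  return snapshot.size + rest;
}
```

The function resolves to the number of documents it wrote, which can be compared against the expected collection size.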

yptwkmov #1

I like this simple solution:

const users = await db.collection('users').get()

const batches = _.chunk(users.docs, 500).map(userDocs => {
    const batch = db.batch()
    userDocs.forEach(doc => {
        batch.set(doc.ref, { field: 'myNewValue' }, { merge: true })
    })
    return batch.commit()
})

await Promise.all(batches)

Just remember to add import * as _ from "lodash" at the top. Based on this answer.
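
If you would rather not pull in lodash, the same chunking idea can be sketched with a small helper. This is an illustration rather than part of the answer above: chunk and commitInChunks are hypothetical names, and db is assumed to be a Firestore instance. Unlike Promise.all, this version commits the batches one after another, which is slower but keeps fewer requests in flight at once.

```javascript
// Hypothetical stand-in for _.chunk: split an array into slices of at most `size`.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical helper: merge `data` into every document, one batch per chunk.
// `docs` is assumed to be an array of document snapshots (e.g. snapshot.docs).
async function commitInChunks(db, docs, data, size = 500) {
  for (const group of chunk(docs, size)) {
    const batch = db.batch();
    group.forEach((doc) => batch.set(doc.ref, data, { merge: true }));
    await batch.commit(); // wait for this batch before starting the next one
  }
}
```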

lbsnaicq #2

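You can use the default BulkWriter. It applies the 500/50/5 rule (start at 500 operations per second and raise the rate by 50% every 5 minutes).
Example: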

let bulkWriter = firestore.bulkWriter();

bulkWriter.create(documentRef, {foo: 'bar'});
bulkWriter.update(documentRef2, {foo: 'bar'});
bulkWriter.delete(documentRef3);
// close() flushes all queued writes and resolves once they have finished
await bulkWriter.close().then(() => {
  console.log('Executed all writes');
});
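
Applied to the original question, a BulkWriter version might look like the sketch below. It is a hedged illustration: updateAllWithBulkWriter is a hypothetical name, firestore is assumed to be a server-side Firestore instance that provides bulkWriter(), and the retry cutoff of 3 attempts is an arbitrary choice. Each update() call returns its own promise, so failures are collected instead of being left unhandled.

```javascript
// Hypothetical helper: update every document in a collection through a BulkWriter.
// Assumes `firestore` is a server-side Firestore instance (Admin SDK).
async function updateAllWithBulkWriter(firestore, collectionPath, data) {
  const snapshot = await firestore.collection(collectionPath).get();
  const writer = firestore.bulkWriter();

  // Retry a failed write until it has failed 3 times (arbitrary cutoff for this sketch)
  writer.onWriteError((error) => error.failedAttempts < 3);

  const failures = [];
  snapshot.docs.forEach((doc) => {
    // Each write resolves or rejects on its own; record the ones that give up
    writer.update(doc.ref, data).catch((err) => {
      failures.push({ path: doc.ref.path, err });
    });
  });

  // close() flushes the queued writes and waits for them to finish
  await writer.close();
  return { attempted: snapshot.size, failures };
}
```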

eqqqjvef #3

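As mentioned above, @Sebastian's answer is good and I upvoted it too. However, I ran into a problem when updating more than 25,000 documents in one go. The adjusted logic is below.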

console.log(`Updating documents...`);
let collectionRef = db.collection('cities');
try {
  let batch = db.batch();
  const documentSnapshotArray = await collectionRef.get();
  const records = documentSnapshotArray.docs;
  const index = documentSnapshotArray.size;
  console.log(`TOTAL SIZE=====${index}`);
  for (let i=0; i < index; i++) {
    const docRef = records[i].ref;
    // YOUR UPDATES
    batch.update(docRef, {isDeleted: false});
    if ((i + 1) % 499 === 0) {
      await batch.commit();
      batch = db.batch();
    }
  }
  // For committing final batch
  if (index % 499 !== 0) {
    await batch.commit();
  }
  console.log('write completed');
} catch (error) {
  console.error(`updateWorkers() errored out : ${error.stack}`);
  throw error; // rethrow so the caller can handle the failure
}

yc0p9oo0 #4

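The explanations in the earlier answers already cover the problem.
I'm sharing the final code I built and use myself, because I needed something that works in a more decoupled way than most of the solutions presented above.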

import { FireDb } from "@services/firebase"; // = firebase.firestore();

type TDocRef = FirebaseFirestore.DocumentReference;
type TDocData = FirebaseFirestore.DocumentData;

let fireBatches = [FireDb.batch()];
let batchSizes = [0];
let batchIdxToUse = 0;

export default class FirebaseUtil {
  static addBatchOperation(
    operation: "create",
    ref: TDocRef,
    data: TDocData
  ): void;
  static addBatchOperation(
    operation: "update",
    ref: TDocRef,
    data: TDocData,
    precondition?: FirebaseFirestore.Precondition
  ): void;
  static addBatchOperation(
    operation: "set",
    ref: TDocRef,
    data: TDocData,
    setOpts?: FirebaseFirestore.SetOptions
  ): void;
  static addBatchOperation(
    operation: "create" | "update" | "set",
    ref: TDocRef,
    data: TDocData,
    opts?: FirebaseFirestore.Precondition | FirebaseFirestore.SetOptions
  ): void {
    // Lines below make sure we stay below the limit of 500 writes per
    // batch
    if (batchSizes[batchIdxToUse] === 500) {
      fireBatches.push(FireDb.batch());
      batchSizes.push(0);
      batchIdxToUse++;
    }
    batchSizes[batchIdxToUse]++;

    const batchArgs: [TDocRef, TDocData] = [ref, data];
    if (opts) batchArgs.push(opts);

    switch (operation) {
      // Specific case for "set" is required because of some weird TS
      // glitch that doesn't allow me to use the arg "operation" to
      // call the function
      case "set":
        fireBatches[batchIdxToUse].set(...batchArgs);
        break;
      default:
        fireBatches[batchIdxToUse][operation](...batchArgs);
        break;
    }
  }

  public static async runBatchOperations() {
    // The lines below clear the globally available batches so we
    // don't run them twice if we call this function more than once
    const currentBatches = [...fireBatches];
    fireBatches = [FireDb.batch()];
    batchSizes = [0];
    batchIdxToUse = 0;

    await Promise.all(currentBatches.map((batch) => batch.commit()));
  }
}

fd3cxomn #5

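A simple solution: why not just fire twice? My array is "resultsFinal". I fire the batch once with a limit of 490, and a second time with the limit set to the length of the array (results.length). It works fine for me :) How do you check it? Go to Firebase and delete your collection; Firebase tells you that you deleted XXX documents. Is that the same as the length of your array? OK, then you're good to go.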

async function quickstart(results) {
    // we get results in parameter for get the data inside quickstart function
    const resultsFinal = results;
    // console.log(resultsFinal.length);
    let batch = firestore.batch();
    // Firestore allows at most 500 writes per batch, so the first batch stops at 490
    const firstBatchEnd = Math.min(490, resultsFinal.length);
    for (let i = 0; i < firstBatchEnd; i++) {
        const doc = firestore.collection('testMore490').doc();
        const object = resultsFinal[i];
        batch.set(doc, object);
    }
    await batch.commit();
    // const batchTwo = firestore.batch();
    batch = firestore.batch();

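    // Start at 490 so no element is skipped; this second batch must also stay within 500 writes
    for (let i = 490; i < resultsFinal.length; i++) {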
        const objectPartTwo = resultsFinal[i];
        const doc = firestore.collection('testMore490').doc();
        batch.set(doc, objectPartTwo);
    }
    await batch.commit();

}

8wigbo56 #6

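Based on all the answers above, I put together the following snippets. You can drop them into a module on a JavaScript backend or frontend to use Firestore batch writes easily, without worrying about the limit of 500 writes.

Backend (Node.js)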

// The Firebase Admin SDK to access Firestore.
const admin = require("firebase-admin");
admin.initializeApp();

// Firestore does not accept more than 500 writes in a transaction or batch write.
const MAX_TRANSACTION_WRITES = 499;

const isFirestoreDeadlineError = (err) => {
  console.log({ err });
  const errString = err.toString();
  return (
    errString.includes("Error: 13 INTERNAL: Received RST_STREAM") ||
    errString.includes("Error: 4 DEADLINE_EXCEEDED: Deadline exceeded")
  );
};

const db = admin.firestore();

// How many transactions/batchWrites out of 500 so far.
// I wrote the following functions to easily use batchWrites without worrying about the 500 limit.
let writeCounts = 0;
let batchIndex = 0;
let batchArray = [db.batch()];

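// Commit all pending batches, then reset the batches and the counter.
const makeCommitBatch = async () => {
  console.log("makeCommitBatch");
  await Promise.all(batchArray.map((bch) => bch.commit()));
  // Start over with a fresh batch so committed batches are never committed twice
  batchArray = [db.batch()];
  batchIndex = 0;
  writeCounts = 0;
};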

// Commit the batchWrite; if you got a Firestore Deadline Error try again every 4 seconds until it gets resolved.
const commitBatch = async () => {
  try {
    await makeCommitBatch();
  } catch (err) {
    console.log({ err });
    if (isFirestoreDeadlineError(err)) {
      const theInterval = setInterval(async () => {
        try {
          await makeCommitBatch();
          clearInterval(theInterval);
        } catch (err) {
          console.log({ err });
          if (!isFirestoreDeadlineError(err)) {
            clearInterval(theInterval);
            throw err;
          }
        }
      }, 4000);
    }
  }
};

// If the current batch reaches 499 writes, start a new batch object and reset the counter.
const checkRestartBatchWriteCounts = () => {
  writeCounts += 1;
  if (writeCounts >= MAX_TRANSACTION_WRITES) {
    batchIndex++;
    batchArray.push(db.batch());
    writeCounts = 0;
  }
};

const batchSet = (docRef, docData) => {
  batchArray[batchIndex].set(docRef, docData);
  checkRestartBatchWriteCounts();
};

const batchUpdate = (docRef, docData) => {
  batchArray[batchIndex].update(docRef, docData);
  checkRestartBatchWriteCounts();
};

const batchDelete = (docRef) => {
  batchArray[batchIndex].delete(docRef);
  checkRestartBatchWriteCounts();
};

module.exports = {
  admin,
  db,
  MAX_TRANSACTION_WRITES,
  checkRestartBatchWriteCounts,
  commitBatch,
  isFirestoreDeadlineError,
  batchSet,
  batchUpdate,
  batchDelete,
};
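
The setInterval-based retry above can also be expressed as a small loop with exponential backoff. This is a sketch, not a drop-in replacement: retryWithBackoff and its parameters are hypothetical names, and the default of 5 attempts with a 500 ms base delay is an arbitrary assumption. The isRetryable callback could be the isFirestoreDeadlineError check from this module.

```javascript
// Hypothetical helper: retry an async operation with exponential backoff.
// `isRetryable(err)` decides whether an error is worth another attempt.
async function retryWithBackoff(operation, isRetryable, maxAttempts = 5, baseDelayMs = 500) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Give up on the last attempt or on errors that are not transient
      if (attempt >= maxAttempts || !isRetryable(err)) {
        throw err;
      }
      // Wait baseDelayMs, then 2x, 4x, ... before the next attempt
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Because the error is rethrown to the caller instead of being thrown inside a timer callback, it can be handled with an ordinary try/catch. It is safest for the operation passed in to rebuild whatever it needs on every attempt, for example by filling a fresh batch before committing it.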

Frontend

// Firestore does not accept more than 500 writes in a transaction or batch write.
const MAX_TRANSACTION_WRITES = 499;

const isFirestoreDeadlineError = (err) => {
  return (
    err.message.includes("DEADLINE_EXCEEDED") ||
    err.message.includes("Received RST_STREAM")
  );
};

class Firebase {
  constructor(fireConfig, instanceName) {
    let app = fbApp;
    if (instanceName) {
      app = app.initializeApp(fireConfig, instanceName);
    } else {
      app.initializeApp(fireConfig);
    }
    this.name = app.name;
    this.db = app.firestore();
    this.firestore = app.firestore;
    // How many transactions/batchWrites out of 500 so far.
    // I wrote the following functions to easily use batchWrites without worrying about the 500 limit.
    this.writeCounts = 0;
    this.batch = this.db.batch();
    this.isCommitting = false;
  }

  async makeCommitBatch() {
    console.log("makeCommitBatch");
    if (!this.isCommitting) {
      this.isCommitting = true;
      await this.batch.commit();
      this.writeCounts = 0;
      this.batch = this.db.batch();
      this.isCommitting = false;
    } else {
      const batchWaitInterval = setInterval(async () => {
        if (!this.isCommitting) {
          this.isCommitting = true;
          await this.batch.commit();
          this.writeCounts = 0;
          this.batch = this.db.batch();
          this.isCommitting = false;
          clearInterval(batchWaitInterval);
        }
      }, 400);
    }
  }

  async commitBatch() {
    try {
      await this.makeCommitBatch();
    } catch (err) {
      console.log({ err });
      if (isFirestoreDeadlineError(err)) {
        const theInterval = setInterval(async () => {
          try {
            await this.makeCommitBatch();
            clearInterval(theInterval);
          } catch (err) {
            console.log({ err });
            if (!isFirestoreDeadlineError(err)) {
              clearInterval(theInterval);
              throw err;
            }
          }
        }, 4000);
      }
    }
  }

  async checkRestartBatchWriteCounts() {
    this.writeCounts += 1;
    if (this.writeCounts >= MAX_TRANSACTION_WRITES) {
      await this.commitBatch();
    }
  }

  async batchSet(docRef, docData) {
    if (!this.isCommitting) {
      this.batch.set(docRef, docData);
      await this.checkRestartBatchWriteCounts();
    } else {
      const batchWaitInterval = setInterval(async () => {
        if (!this.isCommitting) {
          this.batch.set(docRef, docData);
          await this.checkRestartBatchWriteCounts();
          clearInterval(batchWaitInterval);
        }
      }, 400);
    }
  }

  async batchUpdate(docRef, docData) {
    if (!this.isCommitting) {
      this.batch.update(docRef, docData);
      await this.checkRestartBatchWriteCounts();
    } else {
      const batchWaitInterval = setInterval(async () => {
        if (!this.isCommitting) {
          this.batch.update(docRef, docData);
          await this.checkRestartBatchWriteCounts();
          clearInterval(batchWaitInterval);
        }
      }, 400);
    }
  }

  async batchDelete(docRef) {
    if (!this.isCommitting) {
      this.batch.delete(docRef);
      await this.checkRestartBatchWriteCounts();
    } else {
      const batchWaitInterval = setInterval(async () => {
        if (!this.isCommitting) {
          this.batch.delete(docRef);
          await this.checkRestartBatchWriteCounts();
          clearInterval(batchWaitInterval);
        }
      }, 400);
    }
  }
}

9rnv2umw #7

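No references or documentation here; I came up with this code myself. It works for me, and it looks clean and is simple to read and use. If anyone likes it, feel free to use it too.
It is best to cover this with automated tests, because the code relies on the private variable _ops, which may change after a package upgrade. In older versions, for example, it may be _mutations.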

async function commitBatch(batch) {
  const MAX_OPERATIONS_PER_COMMIT = 500;

  while (batch._ops.length > MAX_OPERATIONS_PER_COMMIT) {
    const batchPart = admin.firestore().batch();

    batchPart._ops = batch._ops.splice(0, MAX_OPERATIONS_PER_COMMIT - 1);

    await batchPart.commit();
  }

  await batch.commit();
}

Usage:

const batch = admin.firestore().batch();

batch.delete(someRef);
batch.update(someRef);

...

await commitBatch(batch);
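
A variation that avoids private fields such as _ops is to record each write as a function and replay the recorded writes into fresh batches at commit time. The sketch below is only an illustration of that idea: BatchQueue is a hypothetical class, not part of any SDK, and db is assumed to be a Firestore instance whose batches expose set, update, and delete.

```javascript
// Hypothetical class: queue writes as functions, then replay them into
// fresh batches of at most `limit` writes each. Uses only public batch methods.
class BatchQueue {
  constructor(db, limit = 500) {
    this.db = db;
    this.limit = limit;
    this.ops = [];
  }

  set(ref, data, options) {
    this.ops.push((batch) => (options ? batch.set(ref, data, options) : batch.set(ref, data)));
    return this;
  }

  update(ref, data) {
    this.ops.push((batch) => batch.update(ref, data));
    return this;
  }

  delete(ref) {
    this.ops.push((batch) => batch.delete(ref));
    return this;
  }

  // Commits the queued writes sequentially and resolves to the number of batches used
  async commit() {
    const pending = this.ops;
    this.ops = []; // clear first so a second commit() does not replay the same writes
    let batchCount = 0;
    for (let i = 0; i < pending.length; i += this.limit) {
      const batch = this.db.batch();
      pending.slice(i, i + this.limit).forEach((apply) => apply(batch));
      await batch.commit();
      batchCount++;
    }
    return batchCount;
  }
}
```

Usage would mirror the snippet above: create new BatchQueue(admin.firestore()), call delete and update on it as often as needed, then await its commit().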

nszi6y05 #8

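I also ran into this problem when updating more than 500 documents inside a Firestore collection, and I'd like to share how I solved it.
I use Cloud Functions to update my collection in Firestore, but this should also work in client-side code.
The solution counts every operation added to the batch; once the limit is reached, a new batch is created and pushed to batchArray.
After all updates have been queued, the code loops through batchArray and commits every batch in the array.

It is important to count every operation made on a batch, whether set(), update(), or delete(), because they all count toward the limit of 500 operations.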

const documentSnapshotArray = await firestore.collection('my-collection').get();

const batchArray = [];
batchArray.push(firestore.batch());
let operationCounter = 0;
let batchIndex = 0;

documentSnapshotArray.forEach(documentSnapshot => {
    const documentData = documentSnapshot.data();

    // update document data here...

    batchArray[batchIndex].update(documentSnapshot.ref, documentData);
    operationCounter++;

    if (operationCounter === 499) {
      batchArray.push(firestore.batch());
      batchIndex++;
      operationCounter = 0;
    }
});

// forEach does not wait for async callbacks, so wait for every commit explicitly
await Promise.all(batchArray.map((batch) => batch.commit()));

return;
