Node.js child_process#spawn: bypassing the internal stdin/stdout buffers

Posted by pbwdgjma 11 months ago in Node.js

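I use child_process#spawn to run external binaries from node.js. Each binary searches a string for exact words, depending on the language, and produces output based on the input text. They have no internal buffer. Usage examples: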

  • echo "I'm a random input" | ./my-english-binary produces text like The word X is in the sentence
  • cat /dev/urandom | ./my-english-binary produces infinite output

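I want to use each of these binaries as a "server". I want to launch a new binary instance the first time a language is encountered, send data to it with stdin.write() when needed, and get its output directly through the stdout.on('data') event. The problem is that stdout.on('data') is not called until a large amount of data has been sent to stdin.write(). stdout or stdin (or both) may have an internal blocking buffer. But I want the output as soon as possible, because otherwise the program could wait hours for new input to arrive and unblock stdin.write() or stdout.on('data'). How can I change their internal buffer size? Or is there another, non-blocking system I could use?
My code is: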

const spawn = require('child_process').spawn;
const path = require('path');

class Driver {

  constructor() {
    // I have one binary per language
    this.instances = {
      frFR: {
        instance: null,
        path: path.join(__dirname, './my-french-binary')
      },
      enGB: {
        instance: null,
        path: path.join(__dirname, './my-english-binary')
      }
    }
  };

  // this function just checks whether an instance is running for a language
  isRunning(lang) {
    if (this.instances[lang] === undefined)
      throw new Error("Language not supported by TreeTagger: " + lang);
    return this.instances[lang].instance !== null;
  }

  // launch a binary according to a language and attach the function 'onData' to the stdout.on('data') event
  run(lang, onData) {
    const instance = spawn(this.instances[lang].path,{cwd:__dirname});
    instance.stdout.on('data', buf => onData(buf.toString()));
    // if a binary instance is killed, it will be relaunched later
    instance.on('close', () => this.instances[lang].instance = null );
    this.instances[lang].instance = instance;
  }

  /**
   * indefinitely write to instance.stdin()
   * I want to avoid this behavior by just writing one time to stdin
   * But if I write only one time, stdout.on('data') is never called
   * Everything works if I use stdin.end() but I don't want to use it
   */
  write(lang, text) {
    const id = setInterval(() => {
      console.log('setInterval');
      this.instances[lang].instance.stdin.write(text + '\n');
    }, 1000);
  }

};

// simple usage example
const driver = new Driver;
const txt = "This is a random input.";

if (driver.isRunning('enGB') === true)
  driver.write('enGB', txt);
else {
  /** 
   * the arrow function is called once every N stdin.write()
   * While I want it to be called after each write
   */
  driver.run('enGB', data => console.log('Data received!', data));
  driver.write('enGB', txt);
}

I tried:

  • Using cork() and uncork() around stdin.write().
  • Piping child_process.stdout() to a custom Readable and to a Socket.
  • Setting highWaterMark to 1 and 0 on stdin, stdout and the aforementioned Readable.
  • Lots of other things I've forgotten...

Also, I can't use stdin.end(), because I don't want to kill my binary instance every time new text arrives.
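To help narrow down where the buffering happens, here is a minimal sketch that sends a single stdin.write() to a child and waits for the first stdout 'data' event, without ever calling stdin.end(). The child is a hypothetical stand-in (a tiny Node script that upper-cases its input and writes it back immediately), not one of the binaries from the question, and roundTrip is a name made up for this illustration.

```javascript
const { spawn } = require('child_process');

// Hypothetical stand-in for the real binary: a Node child that upper-cases
// each chunk it reads and writes it straight back, with no buffering of its own.
const childScript =
  "process.stdin.on('data', d => process.stdout.write(d.toString().toUpperCase()));";

// Writes `text` once, never calls stdin.end(), and resolves with the first
// chunk the child sends back on stdout.
function roundTrip(text) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', childScript]);
    child.on('error', reject);
    child.stdout.once('data', buf => {
      resolve(buf.toString());
      child.kill(); // only so that this demo terminates
    });
    child.stdin.write(text + '\n');
  });
}

roundTrip('This is a random input.')
  .then(out => console.log('Data received!', out));
```

If a child that flushes after every write answers a single write like this, the delay is probably not in Node's streams but in the binary's own output buffering; C stdio, for example, usually block-buffers stdout when it is a pipe rather than a terminal. Under that assumption, the change would have to happen on the child's side, for instance by flushing after each result or, for programs that use C stdio, by launching them through GNU coreutils' stdbuf -oL.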

Answer 1 (i5desfxk)

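For anyone looking back at this thread 7 years later: cat /dev/urandom ... will always produce infinite output. If you want a specific amount of data (256 bytes here), you need to use something like head -c 256 /dev/urandom > bytefilename.bytes or dd if=/dev/urandom count=1 bs=256 | sha256sum (I recently used the latter as an entropy example in a crypto project).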
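Since the question is about Node.js, a rough in-process counterpart of the dd ... | sha256sum pipeline might look like the sketch below. It assumes that Node's crypto.randomBytes (which draws from the operating system's CSPRNG, not necessarily /dev/urandom itself) is an acceptable source; randomDigest is a name made up for this illustration.

```javascript
const crypto = require('crypto');

// Returns the hex SHA-256 digest of `size` random bytes.
// The read is bounded, unlike `cat /dev/urandom`, which never ends.
function randomDigest(size = 256) {
  const bytes = crypto.randomBytes(size);
  return crypto.createHash('sha256').update(bytes).digest('hex');
}

console.log(randomDigest());
```

Each call prints a different 64-character hex string, because the input bytes are random.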
