Spawn vs Execute

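In an online training video I am watching to learn Node, the narrator says: "spawn is better for longer processes involving huge amounts of data, whereas execute is better for short bits of data."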

Why is this? What is the difference between the child_process spawn and execute functions in Node.js, and how do I know when to use which one?


A good place to start is the NodeJS documentation.

For 'spawn' the documentation states:

The child_process.spawn() method spawns a new process using the given command, with command line arguments in args. If omitted, args defaults to an empty array.

While for 'exec':

Spawns a shell then executes the command within that shell, buffering any generated output. The command string passed to the exec function is processed directly by the shell and special characters (vary based on shell) need to be dealt with accordingly.

The main thing appears to be whether you need to handle the output of the command as it is produced, which I imagine could be the factor impacting performance (I haven't compared). If you care only about process completion, then 'exec' would be your choice. spawn opens streams for stdout and stderr that emit 'data' events; exec just hands your callback the buffered stdout and stderr as strings.

The main difference is that spawn is more suitable for long-running processes with huge output. That's because spawn streams input/output with a child process. exec, on the other hand, buffers output in a small buffer (200 KB by default in older Node.js versions, 1 MB by default since v12). exec first spawns a subshell and then tries to execute your process in it. To cut a long story short, use spawn when you need a lot of data streamed from a child process, and exec when you need shell features such as pipes, redirects, or running more than one program at a time.
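To illustrate the "shell features" point, here is a minimal sketch. It assumes a POSIX-like shell with echo and tr on the PATH; the helper names viaExec and viaSpawn are made up for this example.

```javascript
const { exec, spawn } = require('child_process');

// exec: the whole string is handed to a shell, so the pipe is interpreted.
function viaExec() {
  return new Promise((resolve, reject) => {
    exec('echo hello | tr a-z A-Z', (err, stdout) => {
      if (err) return reject(err);
      resolve(stdout.trim());
    });
  });
}

// spawn: no shell by default, so "|" is just text inside echo's only argument.
function viaSpawn() {
  return new Promise((resolve, reject) => {
    const child = spawn('echo', ['hello | tr a-z A-Z']);
    let out = '';
    child.stdout.on('data', (chunk) => { out += chunk; });
    child.on('error', reject);
    child.on('close', () => resolve(out.trim()));
  });
}

viaExec().then((s) => console.log('exec :', s));
viaSpawn().then((s) => console.log('spawn:', s));
```

On a typical Linux or macOS system the exec call should report HELLO, while the spawn call reports the literal text hello | tr a-z A-Z, since no shell ever saw the pipe.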

Some useful links: DZone, Hacksparrow

  • child process created by spawn()

    • does not spawn a shell
    • streams the data returned by the child process (data flow is constant)
    • has no data transfer size limit
  • child process created by exec()

    • does spawn a shell in which the passed command is executed
    • buffers the data (waits until the process closes and transfers the data in one chunk)
    • the default maximum data transfer was 200 KB before Node.js v12 and has been 1 MB since v12
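The size limit in the last bullet can be reproduced with a small sketch. Assumptions here: a POSIX-like shell for the quoting in the exec command, a deliberately tiny maxBuffer of 1 KiB so the result does not depend on the Node.js version's default, and the helper names runExec and runSpawn, which are invented for illustration.

```javascript
const { exec, spawn } = require('child_process');

// Child script: writes 5000 bytes to stdout, run with the current Node binary.
const script = "process.stdout.write('x'.repeat(5000))";

// exec with maxBuffer set to 1 KiB: the output does not fit, so the
// callback receives an error instead of the full output.
function runExec() {
  return new Promise((resolve) => {
    exec(`"${process.execPath}" -e "${script}"`, { maxBuffer: 1024 }, (err) => {
      resolve({ failed: Boolean(err), message: err ? err.message : null });
    });
  });
}

// spawn hands over chunks as they arrive, so there is no cap to hit.
function runSpawn() {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', script]);
    let bytes = 0;
    child.stdout.on('data', (chunk) => { bytes += chunk.length; });
    child.on('error', reject);
    child.on('close', () => resolve(bytes));
  });
}

runExec().then((r) => console.log('exec failed:', r.failed, '-', r.message));
runSpawn().then((n) => console.log('spawn received bytes:', n));
```

If you really need exec with larger output, raising the maxBuffer option is possible, but at that point spawn is usually the more natural fit.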

-main.js (file)

const { spawn, exec } = require('child_process');

// 'node' is an executable command (can be executed without a shell)
// uses streams to transfer data (child.stdout)
// note: the result is stored under a new name so it does not overwrite spawn()
const spawned = spawn('node', ['module.js']);
spawned.stdout.on('data', function (msg) {
  console.log(msg.toString());
});

// the 'node module.js' command runs in the spawned shell
// transferred data is handled in the callback function
exec('node module.js', function (err, stdout, stderr) {
  if (err) throw err;
  console.log(stdout);
});

-module.js (basically prints a message every second for 5 seconds, then exits)

var count = 0;
var interval = setInterval(function () {
  console.log('module data');
  // stop after five messages (a counter avoids relying on the private _idleStart field)
  if (++count >= 5) clearInterval(interval);
}, 1000);

  • the spawn() child process delivers the message "module data" every second for 5 seconds, because the data is streamed
  • the exec() child process delivers all five "module data" messages at once after 5 seconds (when the process has closed), because the data is buffered

NOTE that neither spawn() nor exec() is designed for running Node modules; this demo only shows the difference. If you want to run Node modules as child processes, use the fork() method instead.
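For completeness, a minimal fork() sketch. To keep it self-contained it writes a tiny worker module to a temporary directory first; the file name worker.js and the helper doubleInChild are hypothetical, chosen only for this illustration.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a small worker module to a temp dir (hypothetical worker.js).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'fork-demo-'));
const workerPath = path.join(dir, 'worker.js');
fs.writeFileSync(
  workerPath,
  "process.on('message', (n) => { process.send(n * 2); });\n"
);

// fork() starts a new Node process and sets up a message channel to it.
function doubleInChild(n) {
  return new Promise((resolve, reject) => {
    const child = fork(workerPath);
    child.on('error', reject);
    child.on('message', (result) => {
      child.disconnect(); // close the channel so the worker can exit
      resolve(result);
    });
    child.send(n);
  });
}

doubleInChild(21).then((result) => console.log('child replied:', result));
```

Unlike spawn() and exec(), the parent and child here exchange JavaScript values through send() and 'message' events rather than through stdout text.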

A quote from the official docs:

For convenience, the child_process module provides a handful of synchronous and asynchronous alternatives to child_process.spawn() and child_process.spawnSync(). Each of these alternatives are implemented on top of child_process.spawn() or child_process.spawnSync().
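To get a feel for what "implemented on top of child_process.spawn()" can mean, here is a rough approximation of an exec-like helper built from spawn with the shell option. This is a sketch only, not Node's actual implementation; execLike is a made-up name and it skips features such as maxBuffer, timeouts and encodings.

```javascript
const { spawn } = require('child_process');

// Rough exec-like helper: run the command in a shell, buffer all output,
// and settle only once the process has closed.
function execLike(command) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, { shell: true });
    let stdout = '';
    let stderr = '';
    child.stdout.on('data', (chunk) => { stdout += chunk; });
    child.stderr.on('data', (chunk) => { stderr += chunk; });
    child.on('error', reject);
    child.on('close', (code) => {
      if (code === 0) resolve({ stdout, stderr });
      else reject(new Error('command exited with code ' + code));
    });
  });
}

execLike('echo one && echo two').then((r) => process.stdout.write(r.stdout));
```

The streaming is still happening underneath; the helper simply collects the chunks and hides them until the end, which is the trade-off described in the answers above.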