find -exec cmd {} + vs | xargs

Which one is more efficient over a very large set of files and should be used?

find . -exec cmd {} +

or

find . | xargs cmd

(Assume that there are no funny characters in the filenames)

find . | xargs cmd

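is more efficient than find . -exec cmd {} \; because it runs cmd as few times as possible, whereas -exec terminated by \; runs cmd once for each match. (The -exec cmd {} + form from the question batches arguments the same way xargs does.) However, you will run into trouble if filenames contain spaces or other unusual characters.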

The recommended form is:

find . -print0 | xargs -0 cmd

This works even if filenames contain unusual characters: -print0 makes find print NUL-terminated matches, and -0 makes xargs expect that format.
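As a quick check of the difference, here is a minimal sketch that counts how many arguments the command receives in each case. The temporary directory and the two file names are made up for illustration, and mktemp -d is assumed to be available and to return a path without whitespace.

```shell
# Throwaway directory with one ordinary name and one name containing a space.
tmp=$(mktemp -d)
touch "$tmp/plain.txt" "$tmp/with space.txt"

# printf prints one line per argument it receives, so wc -l counts arguments.
# Plain xargs splits "with space.txt" on the space into two arguments.
unsafe=$(find "$tmp" -type f | xargs printf '%s\n' | wc -l)

# NUL-delimited input keeps every file name as a single argument.
safe=$(find "$tmp" -type f -print0 | xargs -0 printf '%s\n' | wc -l)

echo "plain xargs: $unsafe arguments, -print0 | xargs -0: $safe arguments"
rm -rf "$tmp"
```

With two files on disk, the plain pipeline should report three arguments and the NUL-delimited one two.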

Speed difference will be insignificant.

But you have to make sure that:

  1. Your script does not assume that file names are free of spaces, tabs, and similar characters; the first version (find . -exec cmd {} +) is safe here, the second (find . | xargs cmd) is not.

  2. Your script does not treat a file name starting with "-" as an option.
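Point 2 can be illustrated with a small sketch. Paths printed by find . begin with ./ and so cannot look like options, which means the risk mostly arises when names come from somewhere else; here a shell glob is used as a stand-in. The file name "-n" and the use of cat are assumptions chosen only for the demonstration.

```shell
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/-n"    # a file literally named "-n"

# The glob expands to the bare name "-n". Without "--", cat would parse it
# as an option and read standard input instead; with "--", everything after
# it is treated as a file name.
out=$(cd "$tmp" && printf '%s\0' * | xargs -0 cat --)

echo "$out"
rm -rf "$tmp"
```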

So your code should look like this:

find . -exec cmd -option1 -option2 -- {} +

or

find . -print0 | xargs -0 cmd -option1 -option2 --

The first version is shorter and easier to write, since you can ignore point 1. The second version is safer to rely on with older systems: "-exec cmd {} +" was added to GNU findutils only in 2005, so many running systems may not have it yet, and it had bugs until recently. Many people also do not know about "-exec cmd {} +", as you can see from the other answers.
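To see how often each form actually runs the command, one option is to have the command print a single line per invocation and count the lines. This is only a sketch: the three file names are invented, and sh -c 'echo run' stands in for cmd.

```shell
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c"

# Each invocation of the inner shell prints exactly one line,
# so wc -l gives the number of times the command was started.
per_file=$(find "$tmp" -type f -exec sh -c 'echo run' sh {} \; | wc -l)
batched=$(find "$tmp" -type f -exec sh -c 'echo run' sh {} + | wc -l)
piped=$(find "$tmp" -type f -print0 | xargs -0 sh -c 'echo run' sh | wc -l)

echo "exec-semicolon: $per_file, exec-plus: $batched, xargs: $piped"
rm -rf "$tmp"
```

For a handful of files, both the + form and xargs should start the command once, while the \; form starts it once per file. With very large file sets both batching forms split the work into several runs to stay under the system's argument-length limit.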

Modern versions of xargs often support running several commands in parallel (for example, the -P option in GNU and BSD xargs).

That can be the deciding factor when choosing between find … -exec and … | xargs.
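A minimal sketch of parallel execution, assuming an xargs that supports -P (GNU and BSD versions do; it is an extension rather than a guaranteed feature). The file names and the echo command are placeholders for real work.

```shell
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c" "$tmp/d"

# -n 1 passes one file per invocation; -P 2 allows up to two invocations
# to run at the same time. Output order is not guaranteed in parallel mode.
count=$(find "$tmp" -type f -print0 |
    xargs -0 -n 1 -P 2 sh -c 'echo "done: $1"' sh |
    wc -l)

echo "processed $count files"
rm -rf "$tmp"
```

find -exec has no equivalent switch, so when the per-file work is expensive and independent, the xargs pipeline is the form that can spread it across cores.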