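How to search a folder and all of its subfolders for files of a certain type

I am trying to search a given folder for all files of a given type and copy them to a new folder.

I need to specify a root folder and search that folder and all of its subfolders for any files that match the given type.

How do I search the root folder's subfolders, and their subfolders in turn? It seems like a recursive method would work, but I have not been able to implement one correctly.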

You want the Find module. Find.find takes one or more paths as strings and yields the starting path itself, followed by the path of every file and subdirectory beneath it, to the accompanying block. Some example code:

require 'find'

pdf_file_paths = []
Find.find('path/to/search') do |path|
  # Keep only the paths that end in .pdf
  pdf_file_paths << path if path =~ /\.pdf\z/
end

That recursively walks the path and stores every path ending in .pdf in an array.
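Since the question also asks about copying the matches to a new folder, here is a minimal sketch of how the Find.find loop could be combined with FileUtils.cp. The helper name copy_matching_files and its parameters are made up for illustration, and the sketch assumes it is acceptable for files with the same base name in different subfolders to overwrite one another in the destination.

```ruby
require 'find'
require 'fileutils'

# Hypothetical helper (not part of any library): copy every regular file under
# `root` whose extension matches `ext` into `dest`, and return the source paths.
def copy_matching_files(root, dest, ext = '.pdf')
  FileUtils.mkdir_p(dest)
  dest_abs = File.expand_path(dest)
  copied = []

  Find.find(root) do |path|
    # Skip the destination itself in case it lives inside `root`.
    Find.prune if File.expand_path(path) == dest_abs
    next unless File.file?(path)
    next unless File.extname(path).downcase == ext.downcase

    FileUtils.cp(path, dest)
    copied << path
  end

  copied
end
```

Calling copy_matching_files('path/to/search', 'path/to/copies') would then walk the tree once and copy as it goes. If name collisions matter, the destination file name would need to be made unique first.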

Try this:

Dir.glob("#{folder}/**/*.pdf")

which is the same as

Dir["#{folder}/**/*.pdf"]

Here, folder is a variable holding the path to the root folder you want to search.
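A glob pattern matches names only, so a directory that happens to be called something.pdf would be returned as well. A small sketch of one way to guard against that follows; find_by_extension is a hypothetical helper name, and sorting is added only because the order of glob results is not guaranteed.

```ruby
# Hypothetical helper: list regular files under `folder` (recursively) whose
# names end in ".<ext>", sorted by path.
def find_by_extension(folder, ext = 'pdf')
  pattern = File.join(folder, '**', "*.#{ext}")
  Dir.glob(pattern)
     .select { |path| File.file?(path) } # drop directories that match the pattern
     .sort
end
```

Dir.glob treats characters such as * and [ in folder as part of the pattern, so this sketch assumes the folder name contains none of them.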

As a small improvement to Jergason and Matt's answer above, here's how you can condense to a single line:

pdf_file_paths = Find.find('path/to/search').select { |p| /.*\.pdf$/ =~ p }

This uses Find.find as above, but takes advantage of the fact that, when called without a block, it returns an enumerator, so we can call select on it and get back an array of the matches.
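Because the result is an enumerator, it can also be made lazy, which may help when only the first few matches are needed and walking the whole tree would be wasteful. This is only a sketch; first_matches and its parameters are hypothetical names chosen for the example.

```ruby
require 'find'

# Hypothetical helper: return at most `limit` regular files under `root`
# whose extension is `ext`, stopping the walk once enough are found.
def first_matches(root, limit, ext = '.pdf')
  Find.find(root)
      .lazy
      .select { |path| File.file?(path) && File.extname(path) == ext }
      .first(limit)
end
```

For example, first_matches('path/to/search', 5) returns an array of up to five paths.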

If speed is a concern, prefer Dir.glob over Find.find.

Warming up --------------------------------------
Find.find   124.000  i/100ms
Dir.glob   515.000  i/100ms
Calculating -------------------------------------
Find.find      1.242k (± 4.7%) i/s -      6.200k in   5.001398s
Dir.glob      5.249k (± 4.5%) i/s -     26.265k in   5.014632s


Comparison:
Dir.glob:     5248.5 i/s
Find.find:     1242.4 i/s - 4.22x slower

 

require 'find'
require 'benchmark/ips'

dir = '.'

Benchmark.ips do |x|
  x.report 'Find.find' do
    # Select the paths that end in .pdf
    Find.find(dir).select { |f| f =~ /\.pdf$/ }
  end

  x.report 'Dir.glob' do
    Dir.glob("#{dir}/**/*.pdf")
  end

  x.compare!
end

Using ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-darwin15]

Another fast way of doing this is delegating the task to the shell command "find" and splitting the output:

pdf_file_paths = `find #{dir} -name "*.pdf"`.split("\n")

This relies on the Unix find command, so it does not work on Windows.
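Interpolating the directory into a backtick command breaks if the path contains spaces or quotes. A hedged sketch of one way around that is to call find through IO.popen with an argument array, which bypasses the shell. The helper name shell_find is hypothetical, and the sketch assumes a Unix-like system whose find supports -print0 (GNU and BSD find do).

```ruby
# Hypothetical helper: run the external `find` command without a shell, so
# spaces or quotes in `dir` are passed through untouched.
# Assumes a Unix-like system whose `find` supports -print0.
def shell_find(dir, pattern = '*.pdf')
  command = ['find', dir, '-type', 'f', '-name', pattern, '-print0']
  output = IO.popen(command, &:read)
  # -print0 separates results with NUL bytes, which cannot appear in a path.
  output.split("\0")
end
```

Usage would look like pdf_file_paths = shell_find(dir). Splitting on NUL rather than on newlines also keeps file names that contain a newline intact.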