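How to diff directories for only files of a specific type?

I have a question about the diff command. If I want a recursive directory diff, but only for a specific file type, how can I do that?

I tried using the exclude option, but I can only use one pattern: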

$ diff /destination/dir/1 /destination/dir/2 -r -x *.xml

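With this command I can only exclude the xml file type, even though the folders also contain image files (png, gif, jpg), txt, php, and so on.

How can I diff only certain file types?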

Taken from (a version of) the man page:

-x PAT  --exclude=PAT
Exclude files that match PAT.


-X FILE    --exclude-from=FILE
Exclude files that match any pattern in FILE.

So it looks like -x only accepts one pattern, as you report, but if you put all the patterns you want to exclude in a file (presumably one per line), you could use the second flag like so:

$ diff /destination/dir/1 /destination/dir/2 -r -X exclude.pats

where exclude.pats is:

*.jpg
*.JPG
*.xml
*.XML
*.png
*.gif

You can specify -x more than once.

diff -x '*.foo' -x '*.bar' -x '*.baz' /destination/dir/1 /destination/dir/2

From the Comparing Directories section of info diff (on my system, I have to do info -f /usr/share/info/diff.info.gz):

To ignore some files while comparing directories, use the '-x PATTERN' or '--exclude=PATTERN' option. This option ignores any files or subdirectories whose base names match the shell pattern PATTERN. Unlike in the shell, a period at the start of the base of a file name matches a wildcard at the start of a pattern. You should enclose PATTERN in quotes so that the shell does not expand it. For example, the option -x '*.[ao]' ignores any file whose name ends with '.a' or '.o'.

This option accumulates if you specify it more than once. For example, using the options -x 'RCS' -x '*,v' ignores any file or subdirectory whose base name is 'RCS' or ends with ',v'.
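The behaviour described in that excerpt can be checked with a small sketch. The directory layout and the `work` and `out` variables are assumptions for illustration; the patterns are the ones from the quoted text, and GNU diff is assumed.

```shell
# Hypothetical sample trees: a source file, two build artifacts and an RCS directory.
work=$(mktemp -d)
mkdir -p "$work/a/RCS" "$work/b/RCS"
echo 1 > "$work/a/main.c";       echo 2 > "$work/b/main.c"
echo 1 > "$work/a/main.o";       echo 2 > "$work/b/main.o"
echo 1 > "$work/a/lib.a";        echo 2 > "$work/b/lib.a"
echo 1 > "$work/a/RCS/main.c,v"; echo 2 > "$work/b/RCS/main.c,v"

# Each -x adds one more pattern. The quotes stop the shell from expanding
# '*.[ao]' against files in the current directory before diff sees it.
out=$(LC_ALL=C diff -rq -x 'RCS' -x '*.[ao]' "$work/a" "$work/b" || true)
echo "$out"
```

Here only main.c should show up: the object file and the archive match '*.[ao]', and the whole RCS subdirectory is skipped because its base name matches 'RCS'.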

In case you find it convenient, you could use the following Makefile. Just run: "make patch"

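# Makefile for patches


# Exclude files with the following suffixes
SUFFIX += o
SUFFIX += so
SUFFIX += exe
SUFFIX += pdf
SUFFIX += swp


# Exclude the following folders (the pattern also matches any base name
# that merely starts with one of these names)
FOLDER += bin
FOLDER += lib
FOLDER += Image
FOLDER += models


OPTIONS = Naur

.PHONY: patch unpatch

# Recipe lines must start with a tab. diff exits with status 1 when it
# finds differences, so statuses 0 and 1 both count as success here.
patch:
	rm -f test.patch
	diff -$(OPTIONS) \
	$(foreach element, $(SUFFIX) , -x '*.$(element)') \
	$(foreach element, $(FOLDER) , -x '$(element)*') \
	org/ new/ > test.patch || [ $$? -eq 1 ]


unpatch:
	rm -f test.unpatch
	diff -$(OPTIONS) \
	$(foreach element, $(SUFFIX) , -x '*.$(element)') \
	$(foreach element, $(FOLDER) , -x '$(element)*') \
	new/ org/ > test.unpatch || [ $$? -eq 1 ]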

The lack of a complementary --include makes it necessary to use such convoluted heuristic patterns as

*.[A-Zb-ik-uw-z]*

to find (mostly) java files!

If you want to diff source files and keep it simple:

diff -rqx "*.a" -x "*.o" -x "*.d" ./PATH1 ./PATH2 | grep "\.cpp " | grep "^Files"

Remove the last grep if you want to get the files which exist in only one of the paths.

The lack of a complementary --include ... .

One workaround is an exclude file that lists every file except the ones we want to include. We build file1 with find, keep only the files whose extensions we do not want to compare, and use sed to strip each path down to its file name. The diff command is then just:

diff --exclude-from=file1  PATH1/ PATH2/

For example:

find PATH1/ -type f | grep --text -vE '\.(php|html)$' | sed 's/.*\///' | sort -u > file1
diff PATH1/ PATH2/ -rq -X file1
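One gap in building the list from PATH1 alone is that unwanted files existing only in PATH2 never make it into the exclude file. A possible variation, sketched here with made-up sample files and hypothetical `dir1`, `dir2` and `kept` variables, scans both trees. It assumes GNU diff and file names without newlines or glob characters.

```shell
# Hypothetical sample trees standing in for PATH1 and PATH2.
work=$(mktemp -d)
dir1="$work/PATH1"; dir2="$work/PATH2"
mkdir -p "$dir1" "$dir2"
echo old > "$dir1/index.php"; echo new > "$dir2/index.php"
echo old > "$dir1/notes.txt"; echo new > "$dir2/notes.txt"
echo new > "$dir2/extra.log"   # exists only in the second tree

# List files from both trees, drop the extensions we want to keep, reduce
# each path to its base name and store the rest as exclude patterns.
{ find "$dir1" -type f; find "$dir2" -type f; } \
  | grep -vE '\.(php|html)$' | sed 's|.*/||' | sort -u > "$work/file1"

kept=$(LC_ALL=C diff -rq -X "$work/file1" "$dir1" "$dir2" || true)
echo "$kept"
```

With these files, only index.php should be reported; notes.txt and the extra.log that exists only in the second tree are both excluded.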

You can also use find with -exec to call diff:

cd /destination/dir/1
find . -name '*.xml' -exec diff {} /destination/dir/2/{} \;
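A slightly more defensive variant, as a sketch only: it hands the file names to a small sh script, so it does not rely on find expanding {} inside a longer argument, and it reports files that have no counterpart instead of letting diff fail on them. The sample trees and the `other` and `out` variables are hypothetical.

```shell
# Hypothetical sample trees standing in for /destination/dir/1 and /destination/dir/2.
work=$(mktemp -d)
mkdir -p "$work/1/sub" "$work/2/sub"
echo '<a>1</a>' > "$work/1/sub/conf.xml"; echo '<a>2</a>' > "$work/2/sub/conf.xml"
echo '<b/>'     > "$work/1/only.xml"
echo 'old'      > "$work/1/readme.txt";   echo 'new' > "$work/2/readme.txt"

other="$work/2"
out=$(
  cd "$work/1" &&
  find . -type f -name '*.xml' -exec sh -c '
    other=$1; shift
    for f in "$@"; do
      if [ -f "$other/$f" ]; then
        diff -u "$f" "$other/$f"          # both trees have the file
      else
        echo "Only in first tree: $f"     # no counterpart to compare against
      fi
    done
  ' sh "$other" {} + || true
)
echo "$out"
```

The quoted '*.xml' reaches find unchanged, and readme.txt is never compared because find only selects the xml files. Files that exist only in the second tree are not reported by this sketch.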

I used the following command to find the diff of all *.tmpl files between DIR1 and DIR2. In my case this didn't yield any false positives, but it may for you, depending on the contents of your DIRS.

diff --brief DIR1 DIR2 | grep tmpl
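If false positives do turn up, anchoring the pattern to the extension may help. This is a sketch under a few assumptions: GNU diff with English messages (forced via LC_ALL=C), made-up sample files, a hypothetical `hits` variable, and -r added so that subdirectories are covered as well.

```shell
# Hypothetical sample trees: one template that differs, one new template and
# a text file whose name merely contains "tmpl".
work=$(mktemp -d)
mkdir -p "$work/DIR1" "$work/DIR2"
echo old > "$work/DIR1/page.tmpl";      echo new > "$work/DIR2/page.tmpl"
echo old > "$work/DIR1/tmpl_notes.txt"; echo new > "$work/DIR2/tmpl_notes.txt"
echo new > "$work/DIR2/new.tmpl"

# Match ".tmpl" only when it ends a file name: followed by a space in
# "Files A and B differ" lines, or by the end of an "Only in DIR: NAME" line.
hits=$(LC_ALL=C diff -rq "$work/DIR1" "$work/DIR2" | grep -E '\.tmpl( |$)' || true)
echo "$hits"
```

A plain grep tmpl would also pick up tmpl_notes.txt here, while the anchored pattern keeps only page.tmpl and new.tmpl. File names containing ".tmpl " followed by more text could still slip through.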

Whilst it does not avoid the actual diff of other files, if your goal is to produce a patch file or similar, then you can use filterdiff from the patchutils package, e.g. to patch only your .py changes:

diff -ruNp /path/1 /path/2 | filterdiff -i "*.py" | tee /path/to/file.patch