Usage:
  dtreetrawl [OPTION...] "/trawl/me" [path2,...]

Help Options:
  -h, --help                Show help options

Application Options:
  -t, --terse               Produce a terse output; parsable.
  -j, --json                Output as JSON
  -d, --delim=:             Character or string delimiter/separator for terse output(default ':')
  -l, --max-level=N         Do not traverse tree beyond N level(s)
  --hash                    Enable hashing(default is MD5).
  -c, --checksum=md5        Valid hashing algorithms: md5, sha1, sha256, sha512.
  -R, --only-root-hash      Output only the root hash. Blank line if --hash is not set
  -N, --no-name-hash        Exclude path name while calculating the root checksum
  -F, --no-content-hash     Do not hash the contents of the file
  -s, --hash-symlink        Include symbolic links' referent name while calculating the root checksum
  -e, --hash-dirent         Include hash of directory entries while calculating root checksum
A snippet of the human-friendly output:
...
... //clipped
...
/home/lab/linux-4.14-rc8/CREDITS
    Base name                    : CREDITS
    Level                        : 1
    Type                         : regular file
    Referent name                :
    File size                    : 98443 bytes
    I-node number                : 290850
    No. directory entries        : 0
    Permission (octal)           : 0644
    Link count                   : 1
    Ownership                    : UID=0, GID=0
    Preferred I/O block size     : 4096 bytes
    Blocks allocated             : 200
    Last status change           : Tue, 21 Nov 17 21:28:18 +0530
    Last file access             : Thu, 28 Dec 17 00:53:27 +0530
    Last file modification       : Tue, 21 Nov 17 21:28:18 +0530
    Hash                         : 9f0312d130016d103aa5fc9d16a2437e

Stats for /home/lab/linux-4.14-rc8:
    Elapsed time     : 1.305767 s
    Start time       : Sun, 07 Jan 18 03:42:39 +0530
    Root hash        : 434e93111ad6f9335bb4954bc8f4eca4
    Hash type        : md5
    Depth            : 8
    Total,
        size           : 66850916 bytes
        entries        : 12484
        directories    : 763
        regular files  : 11715
        symlinks       : 6
        block devices  : 0
        char devices   : 0
        sockets        : 0
        FIFOs/pipes    : 0
import os, hashlib

def hash_for_directory(path, hashfunc=hashlib.sha1):
    filenames = sorted(os.path.join(dp, fn) for dp, _, fns in os.walk(path) for fn in fns)
    index = '\n'.join('{}={}'.format(os.path.relpath(fn, path), hashfunc(open(fn, 'rb').read()).hexdigest()) for fn in filenames)
    return hashfunc(index.encode('utf-8')).hexdigest()
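
The `hash_for_directory` function above reads each file fully into memory and leaves its file handles for the garbage collector to close. A minimal sketch of a chunked variant follows, assuming the same `relpath=digest` index format; the names `file_digest`, `hash_for_directory_streaming`, and `chunk_size` are hypothetical and not part of the original answer.

```python
import hashlib
import os


def file_digest(path, hashfunc=hashlib.sha1, chunk_size=1 << 20):
    """Hash one file in fixed-size chunks so memory use stays bounded."""
    h = hashfunc()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps reading until read() returns b'' (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()


def hash_for_directory_streaming(path, hashfunc=hashlib.sha1):
    """Build the same 'relpath=digest' index, then hash the index itself."""
    filenames = sorted(
        os.path.join(dp, fn) for dp, _, fns in os.walk(path) for fn in fns
    )
    index = '\n'.join(
        '{}={}'.format(os.path.relpath(fn, path), file_digest(fn, hashfunc))
        for fn in filenames
    )
    return hashfunc(index.encode('utf-8')).hexdigest()
```

Because the index format and sort order are unchanged, this should yield the same digest as `hash_for_directory` for the same tree and hash function.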
hashdir:
  A command-line utility to checksum directories and files.

Usage:
  hashdir [options] [<item>...] [command]

Arguments:
  <item>    Directory or file to hash/check

Options:
  -t, --tree                                         Print directory tree
  -s, --save                                         Save the checksum to a file
  -i, --include-hidden-files                         Include hidden files
  -e, --skip-empty-dir                               Skip empty directories
  -a, --algorithm <md5|sha1|sha256|sha384|sha512>    The hash function to use [default: sha1]
  --version                                          Show version information
  -?, -h, --help                                     Show help and usage information

Commands:
  check <item>    Verify that the specified hash file is valid.
# 1. How to get a sha256 hash over all file contents in a folder, including
# hashing over the relative file paths within that folder to check the
# filenames themselves (get this bash function below).
sha256sum_dir "path/to/folder"
# 2. How to quickly compare two folders (get the `diff_dir` bash function below)
diff_dir "path/to/folder1" "path/to/folder2"
# OR:
diff -r -q "path/to/folder1" "path/to/folder2"
# This one works, but don't use it, because its hash output does NOT
# match that of my `sha256sum_dir` function. I recommend you use
# the "1-liner" just below, therefore, instead.
time ( \
starting_dir="$(pwd)" \
&& target_dir="path/to/folder" \
&& cd "$target_dir" \
&& find . -not -type d -print0 | sort -zV \
| xargs -0 sha256sum | sha256sum; \
cd "$starting_dir"
)
# Use this one, as its output matches that of my `sha256sum_dir`
# function exactly.
all_hashes_str="$( \
starting_dir="$(pwd)" \
&& target_dir="path/to/folder" \
&& cd "$target_dir" \
&& find . -not -type d -print0 | sort -zV | xargs -0 sha256sum \
)"; \
cd "$starting_dir"; \
printf "%s" "$all_hashes_str" | sha256sum
# Take the sha256sum of all files in an entire dir, and then sha256sum that
# entire output to obtain a _single_ sha256sum which represents the _entire_
# dir.
# See:
# 1. [my answer] https://stackoverflow.com/a/72070772/4561887
sha256sum_dir() {
    # Fall back to 0/1 if these constants are not defined elsewhere in your
    # shell config
    return_code="${RETURN_CODE_SUCCESS:-0}"
    if [ "$#" -eq 0 ]; then
        echo "ERROR: too few arguments."
        return_code="${RETURN_CODE_ERROR:-1}"
    fi

    # Print help string if requested
    if [ "$#" -eq 0 ] || [ "$1" = "-h" ] || [ "$1" = "--help" ]; then
        # Help string
        echo "Obtain a sha256sum of all files in a directory."
        echo "Usage: ${FUNCNAME[0]} [-h|--help] <dir>"
        return "$return_code"
    fi

    starting_dir="$(pwd)"
    target_dir="$1"
    # Stop here if the dir is unreachable, rather than hashing the current dir
    cd "$target_dir" || return "${RETURN_CODE_ERROR:-1}"

    # See my answer: https://stackoverflow.com/a/72070772/4561887
    filenames="$(find . -not -type d | sort -V)"
    IFS=$'\n' read -r -d '' -a filenames_array <<< "$filenames"
    time all_hashes_str="$(sha256sum "${filenames_array[@]}")"
    cd "$starting_dir"

    echo ""
    echo "Note: you may now call:"
    echo "1. 'printf \"%s\n\" \"\$all_hashes_str\"' to view the individual" \
         "hashes of each file in the dir. Or:"
    echo "2. 'printf \"%s\" \"\$all_hashes_str\" | sha256sum' to see that" \
         "the hash of that output is what we are using as the final hash" \
         "for the entire dir."
    echo ""
    printf "%s" "$all_hashes_str" | sha256sum | awk '{ print $1 }'
    return "$?"
}
# Note: I prefix this with my initials to find my custom functions easier
alias gs_sha256sum_dir="sha256sum_dir"
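
For readers without bash, the same "hash of a hash listing" idea behind `sha256sum_dir` can be sketched in Python. This is only an approximation under stated assumptions: the function name `sha256sum_dir_py` is hypothetical, it covers only what `os.walk` reports as files, and it sorts paths in plain lexicographic order rather than with `sort -V`, so its digest is not guaranteed to match the shell function's output.

```python
import hashlib
import os


def sha256sum_dir_py(root):
    """Hash a sha256sum-style listing ('<digest>  ./rel/path' per line) of root."""
    rel_paths = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            # Mimic the './path' form printed by `find .`.
            rel_paths.append('./' + rel.replace(os.sep, '/'))

    lines = []
    # Assumption: plain lexicographic order, which is NOT the same as `sort -V`.
    for rel in sorted(rel_paths):
        h = hashlib.sha256()
        with open(os.path.join(root, rel), 'rb') as f:
            for chunk in iter(lambda: f.read(1 << 20), b''):
                h.update(chunk)
        # sha256sum separates digest and file name with two spaces.
        lines.append('{}  {}'.format(h.hexdigest(), rel))

    # No trailing newline, mirroring `printf "%s" "$all_hashes_str"`.
    return hashlib.sha256('\n'.join(lines).encode('utf-8')).hexdigest()
```

Since the relative path is part of every listing line, renaming or moving a file changes the final digest even when the file contents stay the same.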
# Compare dir1 against dir2 to see if they are equal or if they differ.
# See:
# 1. How to `diff` two dirs: https://stackoverflow.com/a/16404554/4561887
diff_dir() {
    # Fall back to 0/1 if these constants are not defined elsewhere in your
    # shell config
    return_code="${RETURN_CODE_SUCCESS:-0}"
    if [ "$#" -eq 0 ]; then
        echo "ERROR: too few arguments."
        return_code="${RETURN_CODE_ERROR:-1}"
    fi

    # Print help string if requested
    if [ "$#" -eq 0 ] || [ "$1" = "-h" ] || [ "$1" = "--help" ]; then
        echo "Compare (diff) two directories to see if dir1 contains the same" \
             "content as dir2."
        echo "NB: the output will be **empty** if both directories match!"
        echo "Usage: ${FUNCNAME[0]} [-h|--help] <dir1> <dir2>"
        return "$return_code"
    fi

    dir1="$1"
    dir2="$2"
    time diff -r -q "$dir1" "$dir2"
    return_code="$?"
    if [ "$return_code" -eq 0 ]; then
        echo -e "\nDirectories match!"
    fi

    # echo "$return_code"
    return "$return_code"
}
# Note: I prefix this with my initials to find my custom functions easier
alias gs_diff_dir="diff_dir"
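
Where `diff` is not available, a comparable check can be sketched with Python's standard `filecmp` module. This is a rough equivalent of `diff -r -q`, not a drop-in replacement: the name `dirs_match` is hypothetical, it returns a boolean instead of listing the differing files, and how it treats symlinks and special files may differ from `diff`.

```python
import filecmp
import os


def dirs_match(dir1, dir2):
    """Return True if both trees hold the same names and byte-identical files."""
    # ignore=[] disables filecmp's default ignore list (RCS, CVS, tags, ...).
    dc = filecmp.dircmp(dir1, dir2, ignore=[])
    # Names present on only one side, or entries whose types differ or
    # that could not be stat'ed.
    if dc.left_only or dc.right_only or dc.common_funny:
        return False
    # dircmp compares files shallowly (by stat signature), so recheck the
    # common files byte for byte.
    _, mismatch, errors = filecmp.cmpfiles(
        dir1, dir2, dc.common_files, shallow=False
    )
    if mismatch or errors:
        return False
    # Recurse into subdirectories that exist on both sides.
    return all(
        dirs_match(os.path.join(dir1, name), os.path.join(dir2, name))
        for name in dc.common_dirs
    )
```

Passing `shallow=False` matters here: the default shallow comparison treats two files as equal when their type, size, and modification time match, without reading their contents.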
$ gs_sha256sum_dir ~/temp2
real 0m0.007s
user 0m0.000s
sys 0m0.007s
Note: you may now call:
1. 'printf "%s\n" "$all_hashes_str"' to view the individual hashes of each
file in the dir. Or:
2. 'printf "%s" "$all_hashes_str" | sha256sum' to see that the hash of that
output is what we are using as the final hash for the entire dir.
b86c66bcf2b033f65451e8c225425f315e618be961351992b7c7681c3822f6a3
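Here is the command and output of `diff_dir` for comparing two dirs for equality. This checks that copying an entire directory to my SD card just now worked correctly. I made the output indicate `Directories match!` whenever that is the case: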
$ gs_diff_dir "path/to/sd/card/tempdir" "/home/gabriel/tempdir"
real 0m0.113s
user 0m0.037s
sys 0m0.077s
Directories match!