Create directories and files for testing.
$ mkdir -p src{0,1}/dir{0,1}; touch src{0,1}/dir{0,1}/{different,same,empty}-file.txt
$ find . -type f -name 'different*' -exec sh -ec "echo {} > {};" \;
$ find . -type f -name 'same*' -exec sh -ec "echo same > {};" \;
$ find . | sort
.
./src0
./src0/dir0
./src0/dir0/different-file.txt
./src0/dir0/empty-file.txt
./src0/dir0/same-file.txt
./src0/dir1
./src0/dir1/different-file.txt
./src0/dir1/empty-file.txt
./src0/dir1/same-file.txt
./src1
./src1/dir0
./src1/dir0/different-file.txt
./src1/dir0/empty-file.txt
./src1/dir0/same-file.txt
./src1/dir1
./src1/dir1/different-file.txt
./src1/dir1/empty-file.txt
./src1/dir1/same-file.txt
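From this state, I want to recursively move the contents of the src0 and src1 directories into the dst and dst_bkup directories. For files that have the same relative path under src0 and src1: if the contents are identical, move the file to dst. If the contents differ, move the file under src1 to dst and the file under src0 to dst_bkup. Files that exist only in src0 or only in src1 are moved to dst.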
| Command | diff exit status | Action |
| diff src{0,1}/dir0/different-file.txt | 1 | mv {src1,dst}/dir0/different-file.txt; mv {src0,dst_bkup}/dir0/different-file.txt; |
| diff src{0,1}/dir0/empty-file.txt | 0 | mv {src1,dst}/dir0/empty-file.txt; rm src0/dir0/empty-file.txt; |
| diff src{0,1}/dir0/same-file.txt | 0 | mv {src1,dst}/dir0/same-file.txt; rm src0/dir0/same-file.txt; |
| diff src{0,1}/dir0/only-in-src0-file.txt | 2 | mv {src0,dst}/dir0/only-in-src0-file.txt; |
| diff src{0,1}/dir0/only-in-src1-file.txt | 2 | mv {src1,dst}/dir0/only-in-src1-file.txt; |
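The rules in the table above can also be written down directly with cmp, mv, and rm. The following is only a minimal sketch to make the rules concrete, not the approach this note ends up using. The function name merge_by_rules is hypothetical, and the sketch assumes that dst starts out empty (or missing) and that no file name contains a newline.

```shell
#!/bin/sh
# merge_by_rules SRC0 SRC1 DST DST_BKUP   (hypothetical helper name)
# Applies the table's rules using only find, cmp, mv, and rm.
# Assumptions: DST is empty or missing; file names contain no newlines.
merge_by_rules() (
    src0=$1; src1=$2; dst=$3; bkup=$4

    # Pass 1: every non-directory under src1 is moved to dst.
    (cd "$src1" && find . ! -type d) | while IFS= read -r rel; do
        rel=${rel#./}
        mkdir -p "$dst/$(dirname "$rel")"
        mv "$src1/$rel" "$dst/$rel"
    done

    # Pass 2: each src0 file is moved, deleted, or backed up.
    (cd "$src0" && find . ! -type d) | while IFS= read -r rel; do
        rel=${rel#./}
        if [ ! -e "$dst/$rel" ]; then
            # Only in src0: move to dst.
            mkdir -p "$dst/$(dirname "$rel")"
            mv "$src0/$rel" "$dst/$rel"
        elif cmp -s "$src0/$rel" "$dst/$rel"; then
            # Same content: the copy that came from src1 is already in dst.
            rm "$src0/$rel"
        else
            # Different content: keep the src0 version under dst_bkup.
            mkdir -p "$bkup/$(dirname "$rel")"
            mv "$src0/$rel" "$bkup/$rel"
        fi
    done
)
```

For example, `merge_by_rules ./src0 ./src1 ./dst ./dst_bkup` would process the test tree created earlier. Like diff, cmp has to read both files whenever their sizes match, so this has the same cost problem with huge files as a checksum comparison.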
$ diff -sqr src0/ src1/
Files src0/dir0/different-file.txt and src1/dir0/different-file.txt differ
Files src0/dir0/empty-file.txt and src1/dir0/empty-file.txt are identical
Files src0/dir0/same-file.txt and src1/dir0/same-file.txt are identical
Files src0/dir1/different-file.txt and src1/dir1/different-file.txt differ
Files src0/dir1/empty-file.txt and src1/dir1/empty-file.txt are identical
Files src0/dir1/same-file.txt and src1/dir1/same-file.txt are identical
The usual way to solve this problem is rsync. However, because this relies on --checksum, it becomes quite slow when the trees contain huge files.
$ mkdir dst
$ rsync -avv --checksum --backup --backup-dir=/full/path/to/dst_bkup/ /full/path/to/src0/ /full/path/to/dst
backup_dir is /full/path/to/dst_bkup/
sending incremental file list
delta-transmission disabled for local transfer or --whole-file
dir0/
dir0/different-file.txt
dir0/empty-file.txt
dir0/same-file.txt
dir1/
dir1/different-file.txt
dir1/empty-file.txt
dir1/same-file.txt
total: matches=0 hash_hits=0 false_alarms=0 data=72
sent 647 bytes received 137 bytes 1568.00 bytes/sec
total size is 72 speedup is 0.09
$ rsync -avv --checksum --backup --backup-dir=/full/path/to/dst_bkup/ /full/path/to/src1/ /full/path/to/dst
backup_dir is /full/path/to/dst_bkup/
sending incremental file list
delta-transmission disabled for local transfer or --whole-file
dir0/different-file.txt
dir0/empty-file.txt is uptodate
dir0/same-file.txt is uptodate
dir1/different-file.txt
dir1/empty-file.txt is uptodate
dir1/same-file.txt is uptodate
backed up dir0/different-file.txt to /full/path/to/dst_bkup/dir0/different-file.txt
backed up dir1/different-file.txt to /full/path/to/dst_bkup/dir1/different-file.txt
total: matches=0 hash_hits=0 false_alarms=0 data=62
sent 485 bytes received 73 bytes 1116.00 bytes/sec
total size is 72 speedup is 0.13
$ find . | sort
.
./dst
./dst_bkup
./dst_bkup/dir0
./dst_bkup/dir0/different-file.txt
./dst_bkup/dir1
./dst_bkup/dir1/different-file.txt
./dst/dir0
./dst/dir0/different-file.txt
./dst/dir0/empty-file.txt
./dst/dir0/same-file.txt
./dst/dir1
./dst/dir1/different-file.txt
./dst/dir1/empty-file.txt
./dst/dir1/same-file.txt
./src0
./src0/dir0
./src0/dir0/different-file.txt
./src0/dir0/empty-file.txt
./src0/dir0/same-file.txt
./src0/dir1
./src0/dir1/different-file.txt
./src0/dir1/empty-file.txt
./src0/dir1/same-file.txt
./src1
./src1/dir0
./src1/dir0/different-file.txt
./src1/dir0/empty-file.txt
./src1/dir0/same-file.txt
./src1/dir1
./src1/dir1/different-file.txt
./src1/dir1/empty-file.txt
./src1/dir1/same-file.txt
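The rsync --checksum approach above is a general solution, but it computes checksums for every file in both the source and the destination, and it skips any file that has the same name, the same checksum, and the same size on both sides. In other words, if the source and the destination share no file names at all, every one of those checksum computations is wasted. So if we compare the relative paths of the non-directory entries under src1 and src0 and confirm that none are shared, no checksum computation is needed.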
$ comm -12 <(find ./src1/ ! -type d -printf "%P\n" | sort) <(find ./src0/ ! -type d -printf "%P\n" | sort)
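The comm command above depends on process substitution (a bash, ksh, and zsh feature) and on find's -printf (a GNU extension). As a rough sketch of the same check for a plain POSIX sh, the functions below use temporary files instead. The names list_files, common_paths, and no_overlap are hypothetical, and the sketch assumes mktemp is available and that no file name contains a newline.

```shell
#!/bin/sh
# list_files DIR: print the relative path of every non-directory under DIR, sorted.
list_files() (
    cd "$1" && find . ! -type d | sed 's|^\./||' | LC_ALL=C sort
)

# common_paths DIR_A DIR_B: print relative paths that exist in both trees.
common_paths() (
    tmp_a=$(mktemp) || exit 1
    tmp_b=$(mktemp) || exit 1
    list_files "$1" > "$tmp_a"
    list_files "$2" > "$tmp_b"
    # -12 suppresses lines unique to either input, leaving only shared lines.
    LC_ALL=C comm -12 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
)

# no_overlap DIR_A DIR_B: succeed when the two trees share no relative path,
# which is the case where checksum comparison can be skipped entirely.
no_overlap() {
    [ -z "$(common_paths "$1" "$2")" ]
}
```

A call such as `no_overlap ./src1 ./src0 && echo "no checksums needed"` then expresses the decision described above. Sorting and comparing under LC_ALL=C keeps sort and comm in agreement about line order.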
Move src0 to dst. After this, nothing is left in src0.
$ mkdir -p ./dst
$ mv -v -T ./src0/ ./dst
`./src0/' -> `./dst'
$ find . | sort
.
./dst
./dst/dir0
./dst/dir0/different-file.txt
./dst/dir0/empty-file.txt
./dst/dir0/only-in-src0-file.txt
./dst/dir0/same-file.txt
./dst/dir1
./dst/dir1/different-file.txt
./dst/dir1/empty-file.txt
./dst/dir1/only-in-src0-file.txt
./dst/dir1/same-file.txt
./src1
./src1/dir0
./src1/dir0/different-file.txt
./src1/dir0/empty-file.txt
./src1/dir0/only-in-src1-file.txt
./src1/dir0/same-file.txt
./src1/dir1
./src1/dir1/different-file.txt
./src1/dir1/empty-file.txt
./src1/dir1/only-in-src1-file.txt
./src1/dir1/same-file.txt
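Move only the files that exist in src1 but not in dst into dst. After this, the only files left in src1 are those whose relative path also exists in dst. If no files remain in src1 at this point, we can stop here. Since no checksums are computed in this step, it runs faster.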
$ rsync -avv --remove-source-files --ignore-existing ./src1/ ./dst
sending incremental file list
delta-transmission disabled for local transfer or --whole-file
dir0/different-file.txt exists
dir0/empty-file.txt exists
dir0/only-in-src1-file.txt
dir0/same-file.txt exists
dir1/different-file.txt exists
dir1/empty-file.txt exists
dir1/only-in-src1-file.txt
dir1/same-file.txt exists
sender removed dir0/only-in-src1-file.txt
sender removed dir1/only-in-src1-file.txt
total: matches=0 hash_hits=0 false_alarms=0 data=0
sent 337 bytes received 61 bytes 796.00 bytes/sec
total size is 72 speedup is 0.18
$ find . | sort
.
./dst
./dst/dir0
./dst/dir0/different-file.txt
./dst/dir0/empty-file.txt
./dst/dir0/only-in-src0-file.txt
./dst/dir0/only-in-src1-file.txt
./dst/dir0/same-file.txt
./dst/dir1
./dst/dir1/different-file.txt
./dst/dir1/empty-file.txt
./dst/dir1/only-in-src0-file.txt
./dst/dir1/only-in-src1-file.txt
./dst/dir1/same-file.txt
./src1
./src1/dir0
./src1/dir0/different-file.txt
./src1/dir0/empty-file.txt
./src1/dir0/same-file.txt
./src1/dir1
./src1/dir1/different-file.txt
./src1/dir1/empty-file.txt
./src1/dir1/same-file.txt
Move the files in src1 that share a relative path with a file in dst, checking checksums as we go. If the checksums match, the file in src1 is deleted. If they differ, the file in dst is moved to dst_bkup and the file in src1 is moved to dst.
$ rsync -avv --remove-source-files --existing --checksum --backup --backup-dir=/full/path/to/dst_bkup/ ./src1/ ./dst
backup_dir is /full/path/to/dst_bkup/
sending incremental file list
delta-transmission disabled for local transfer or --whole-file
dir0/
dir0/different-file.txt
dir0/empty-file.txt is uptodate
sender removed dir0/empty-file.txt
dir0/same-file.txt is uptodate
sender removed dir0/same-file.txt
dir1/
dir1/different-file.txt
dir1/empty-file.txt is uptodate
sender removed dir1/empty-file.txt
dir1/same-file.txt is uptodate
sender removed dir1/same-file.txt
backed up dir0/different-file.txt to /full/path/to/dst_bkup/dir0/different-file.txt
sender removed dir0/different-file.txt
backed up dir1/different-file.txt to /full/path/to/dst_bkup/dir1/different-file.txt
sender removed dir1/different-file.txt
total: matches=0 hash_hits=0 false_alarms=0 data=62
sent 469 bytes received 73 bytes 361.33 bytes/sec
total size is 72 speedup is 0.13
$ find . | sort
.
./dst
./dst_bkup
./dst_bkup/dir0
./dst_bkup/dir0/different-file.txt
./dst_bkup/dir1
./dst_bkup/dir1/different-file.txt
./dst/dir0
./dst/dir0/different-file.txt
./dst/dir0/empty-file.txt
./dst/dir0/only-in-src0-file.txt
./dst/dir0/only-in-src1-file.txt
./dst/dir0/same-file.txt
./dst/dir1
./dst/dir1/different-file.txt
./dst/dir1/empty-file.txt
./dst/dir1/only-in-src0-file.txt
./dst/dir1/only-in-src1-file.txt
./dst/dir1/same-file.txt
./src1
./src1/dir0
./src1/dir1
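The three steps above (mv, then rsync --ignore-existing, then rsync --existing --checksum --backup) could be collected into one function, roughly as sketched below. The function name merge_fast is hypothetical. The sketch assumes GNU mv (for -T), an installed rsync, and a dst that is missing or empty. Unlike the transcript, it creates dst_bkup up front so that an absolute path can be passed to --backup-dir.

```shell
#!/bin/sh
# merge_fast SRC0 SRC1 DST DST_BKUP   (hypothetical helper name)
# Assumptions: GNU mv (-T), rsync installed, DST missing or empty.
merge_fast() (
    src0=$1; src1=$2; dst=$3; bkup=$4

    mkdir -p "$dst" "$bkup" &&
    # A relative --backup-dir is resolved against the destination,
    # so pass an absolute path, as the transcript above does.
    bkup_abs=$(cd "$bkup" && pwd) &&

    # Step 1: src0 becomes dst.
    mv -T "$src0" "$dst" &&

    # Step 2: move files that exist only in src1; no checksums involved.
    rsync -a --remove-source-files --ignore-existing "$src1/" "$dst" &&

    # Step 3: for paths that exist on both sides, compare checksums;
    # a differing dst file goes to the backup dir before being replaced.
    rsync -a --remove-source-files --existing --checksum \
        --backup --backup-dir="$bkup_abs" "$src1/" "$dst"
)
```

It would be called as `merge_fast ./src0 ./src1 ./dst ./dst_bkup`. As in the final listing above, the now-empty directories under src1 are left in place.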