R. Ayanokouzi et al.

[rdfind] Turning files with identical content into hard links

I have many directories that contain the same content. Before deleting anything, let's try turning the duplicates into hard links.

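First, create test directories and test files on filesystems mounted at different points. In other words, the scenario is one where many files with identical content exist both within a single filesystem and across separate filesystems.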

$ rm -fr ./mnt/sd{b,d,e,f}1/rdfind_test_dir
$ mkdir -p ./mnt/sd{b,d,e,f}1/rdfind_test_dir
$ df -haT ./mnt/sd{b,d,e,f}1/rdfind_test_dir
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/sdc1      ext4  2.7T  2.1T  487G  82% /*****************/mnt/sdb1
/dev/sdb1      ext4  2.7T  2.0T  780G  72% /*****************/mnt/sdd1
/dev/sdd1      ext4  2.7T  2.1T  607G  78% /*****************/mnt/sde1
/dev/sdf1      ext4  2.7T  2.7T  7.1G 100% /*****************/mnt/sdf1
$ for I in ./mnt/sd{b,d,e,f}1/rdfind_test_dir/rdfind_test_file{0,1}; do echo 1 > ${I}; done;
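
Since a hard link can only be created inside a single filesystem, it helps to check up front whether two paths share one. The following is a minimal sketch, assuming GNU coreutils stat (the `-c '%d'` format prints the device number); the function name same_fs is my own and not part of rdfind.

```shell
# Sketch: succeed if two paths are on the same filesystem.
# Assumes GNU coreutils stat; same_fs is a hypothetical helper name.
same_fs() {
    local dev_a dev_b
    # %d is the device number of the filesystem holding the file.
    dev_a=$(stat -c '%d' -- "$1") || return 2
    dev_b=$(stat -c '%d' -- "$2") || return 2
    [ "$dev_a" = "$dev_b" ]
}
```

For example, `same_fs ./mnt/sdb1/rdfind_test_dir ./mnt/sdd1/rdfind_test_dir && echo same || echo different` would tell the two test directories apart.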

The directories I want rdfind to process look like this.

$ ls -lai ./mnt/sd{b,d,e,f}1/rdfind_test_dir
./mnt/sdb1/rdfind_test_dir:
total 16
55959553 drwxr-xr-x 2 **** **** 4096 Dec 31 16:25 .
       2 drwxr-xr-x 5 **** **** 4096 Dec 31 16:25 ..
55959554 -rw-r--r-- 1 **** ****    2 Dec 31 16:26 rdfind_test_file0
55959555 -rw-r--r-- 1 **** ****    2 Dec 31 16:26 rdfind_test_file1

./mnt/sdd1/rdfind_test_dir:
total 16
169771009 drwxr-xr-x 2 **** **** 4096 Dec 31 16:25 .
        2 drwxr-xr-x 5 **** **** 4096 Dec 31 16:25 ..
169771010 -rw-r--r-- 1 **** ****    2 Dec 31 16:26 rdfind_test_file0
169771011 -rw-r--r-- 1 **** ****    2 Dec 31 16:26 rdfind_test_file1

./mnt/sde1/rdfind_test_dir:
total 16
111935489 drwxr-xr-x 2 **** **** 4096 Dec 31 16:25 .
        2 drwxr-xr-x 5 **** **** 4096 Dec 31 16:25 ..
111935490 -rw-r--r-- 1 **** ****    2 Dec 31 16:26 rdfind_test_file0
111935491 -rw-r--r-- 1 **** ****    2 Dec 31 16:26 rdfind_test_file1

./mnt/sdf1/rdfind_test_dir:
total 16
105906177 drwxr-xr-x 2 **** **** 4096 Dec 31 16:25 .
        2 drwxr-xr-x 5 **** **** 4096 Dec 31 16:25 ..
105906178 -rw-r--r-- 1 **** ****    2 Dec 31 16:26 rdfind_test_file0
105906179 -rw-r--r-- 1 **** ****    2 Dec 31 16:26 rdfind_test_file1
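
In the listing above, the columns to watch are the inode number (first column) and the link count (third column): every file has its own inode and a link count of 1. A small sketch for printing just those values and for checking whether two names are hard links of each other, assuming GNU coreutils stat; the function names are hypothetical.

```shell
# Print "inode link-count path" for each file.
# Assumes GNU coreutils stat; show_links is a hypothetical helper name.
show_links() {
    stat -c '%i %h %n' -- "$@"
}

# Succeed only if both names are the same inode on the same device.
# stat does not follow symbolic links here, so a symlink does not count.
is_hardlinked() {
    local a b
    a=$(stat -c '%d:%i' -- "$1") || return 2
    b=$(stat -c '%d:%i' -- "$2") || return 2
    [ "$a" = "$b" ]
}
```

Running `show_links ./mnt/sdb1/rdfind_test_dir/*` before and after rdfind makes it easy to see whether the inode numbers have converged.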

Test with rdfind -dryrun true whether the files would be hard-linked. rdfind plans to hard-link even the files that sit on different filesystems.

$ rdfind -dryrun true -makehardlinks true ./mnt/sd{b,d,e,f}1/rdfind_test_dir
(DRYRUN MODE) Now scanning "./mnt/sdb1/rdfind_test_dir", found 2 files.
(DRYRUN MODE) Now scanning "./mnt/sdd1/rdfind_test_dir", found 2 files.
(DRYRUN MODE) Now scanning "./mnt/sde1/rdfind_test_dir", found 2 files.
(DRYRUN MODE) Now scanning "./mnt/sdf1/rdfind_test_dir", found 2 files.
(DRYRUN MODE) Now have 8 files in total.
(DRYRUN MODE) Removed 0 files due to nonunique device and inode.
(DRYRUN MODE) Now removing files with zero size from list...removed 0 files
(DRYRUN MODE) Total size is 16 bytes or 16 b
(DRYRUN MODE) Now sorting on size:removed 0 files due to unique sizes from list.8 files left.
(DRYRUN MODE) Now eliminating candidates based on first bytes:removed 0 files from list.8 files left.
(DRYRUN MODE) Now eliminating candidates based on last bytes:removed 0 files from list.8 files left.
(DRYRUN MODE) Now eliminating candidates based on md5 checksum:removed 0 files from list.8 files left.
(DRYRUN MODE) It seems like you have 8 files that are not unique
(DRYRUN MODE) Totally, 14 b can be reduced.
(DRYRUN MODE) Now making results file results.txt
(DRYRUN MODE) Now making hard links.
hardlink ./mnt/sdb1/rdfind_test_dir/rdfind_test_file0 to ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
hardlink ./mnt/sdd1/rdfind_test_dir/rdfind_test_file0 to ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
hardlink ./mnt/sdd1/rdfind_test_dir/rdfind_test_file1 to ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
hardlink ./mnt/sde1/rdfind_test_dir/rdfind_test_file1 to ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
hardlink ./mnt/sde1/rdfind_test_dir/rdfind_test_file0 to ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
hardlink ./mnt/sdf1/rdfind_test_dir/rdfind_test_file1 to ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
hardlink ./mnt/sdf1/rdfind_test_dir/rdfind_test_file0 to ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
Making 7 links.
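
One way to keep rdfind from planning links across filesystems is to run it once per filesystem. The following is only a sketch under a few assumptions: GNU coreutils stat is available, the function name rdfind_per_fs is my own, and RDFIND is a hypothetical variable that lets you substitute another command for rdfind. It keeps -dryrun true so that nothing is modified.

```shell
# Sketch: run rdfind separately for each filesystem among the given
# directories. rdfind_per_fs and RDFIND are hypothetical names.
rdfind_per_fs() {
    local dev d
    # Distinct device numbers of the directories passed in.
    for dev in $(for d in "$@"; do stat -c '%d' -- "$d"; done | sort -un); do
        (
            # Rebuild the argument list, keeping only directories on $dev.
            for d in "$@"; do
                shift
                [ "$(stat -c '%d' -- "$d")" = "$dev" ] && set -- "$@" "$d"
            done
            echo "device $dev: $*"
            # Dry run only; drop "-dryrun true" to actually create links.
            ${RDFIND:-rdfind} -dryrun true -makehardlinks true "$@"
        )
    done
}
```

With the test directories above, `rdfind_per_fs ./mnt/sd{b,d,e,f}1/rdfind_test_dir` would invoke rdfind four times, once for each mounted filesystem, so each run only sees files that can actually be hard-linked to each other.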

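Removing -dryrun true and actually running it gives the following. The files on the same filesystem are correctly hard-linked, but rdfind reports a failure for every hard link it tried to create across filesystems. Note that in the final listing the test files on the other three filesystems are gone.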

$ rdfind -makehardlinks true ./mnt/sd{b,d,e,f}1/rdfind_test_dir
Now scanning "./mnt/sdb1/rdfind_test_dir", found 2 files.
Now scanning "./mnt/sdd1/rdfind_test_dir", found 2 files.
Now scanning "./mnt/sde1/rdfind_test_dir", found 2 files.
Now scanning "./mnt/sdf1/rdfind_test_dir", found 2 files.
Now have 8 files in total.
Removed 0 files due to nonunique device and inode.
Now removing files with zero size from list...removed 0 files
Total size is 16 bytes or 16 b
Now sorting on size:removed 0 files due to unique sizes from list.8 files left.
Now eliminating candidates based on first bytes:removed 0 files from list.8 files left.
Now eliminating candidates based on last bytes:removed 0 files from list.8 files left.
Now eliminating candidates based on md5 checksum:removed 0 files from list.8 files left.
It seems like you have 8 files that are not unique
Totally, 14 b can be reduced.
Now making results file results.txt
Now making hard links.
failed to make hardlink ./mnt/sdd1/rdfind_test_dir/rdfind_test_file0 to ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
Rdutil.cc: Failed to apply function f on it.
failed to make hardlink ./mnt/sdd1/rdfind_test_dir/rdfind_test_file1 to ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
Rdutil.cc: Failed to apply function f on it.
failed to make hardlink ./mnt/sde1/rdfind_test_dir/rdfind_test_file1 to ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
Rdutil.cc: Failed to apply function f on it.
failed to make hardlink ./mnt/sde1/rdfind_test_dir/rdfind_test_file0 to ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
Rdutil.cc: Failed to apply function f on it.
failed to make hardlink ./mnt/sdf1/rdfind_test_dir/rdfind_test_file1 to ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
Rdutil.cc: Failed to apply function f on it.
failed to make hardlink ./mnt/sdf1/rdfind_test_dir/rdfind_test_file0 to ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
Rdutil.cc: Failed to apply function f on it.
Making 1 links.
$ cat results.txt
# Automatically generated
# duptype id depth size device inode priority name
DUPTYPE_FIRST_OCCURRENCE 1 0 2 2081 55959555 1 ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
DUPTYPE_OUTSIDE_TREE -1 0 2 2065 169771010 2 ./mnt/sdd1/rdfind_test_dir/rdfind_test_file0
DUPTYPE_OUTSIDE_TREE -1 0 2 2065 169771011 2 ./mnt/sdd1/rdfind_test_dir/rdfind_test_file1
DUPTYPE_OUTSIDE_TREE -1 0 2 2097 111935491 3 ./mnt/sde1/rdfind_test_dir/rdfind_test_file1
DUPTYPE_OUTSIDE_TREE -1 0 2 2097 111935490 3 ./mnt/sde1/rdfind_test_dir/rdfind_test_file0
DUPTYPE_OUTSIDE_TREE -1 0 2 2129 105906179 4 ./mnt/sdf1/rdfind_test_dir/rdfind_test_file1
DUPTYPE_OUTSIDE_TREE -1 0 2 2129 105906178 4 ./mnt/sdf1/rdfind_test_dir/rdfind_test_file0
# end of file
$ ls -lai ./mnt/sd{b,d,e,f}1/rdfind_test_dir
./mnt/sdb1/rdfind_test_dir:
total 16
55959553 drwxr-xr-x 2 de de 4096 Dec 31 16:27 .
       2 drwxr-xr-x 5 de de 4096 Dec 31 16:25 ..
55959555 -rw-r--r-- 2 de de    2 Dec 31 16:26 rdfind_test_file0
55959555 -rw-r--r-- 2 de de    2 Dec 31 16:26 rdfind_test_file1

./mnt/sdd1/rdfind_test_dir:
total 8
169771009 drwxr-xr-x 2 de de 4096 Dec 31 16:27 .
        2 drwxr-xr-x 5 de de 4096 Dec 31 16:25 ..

./mnt/sde1/rdfind_test_dir:
total 8
111935489 drwxr-xr-x 2 de de 4096 Dec 31 16:27 .
        2 drwxr-xr-x 5 de de 4096 Dec 31 16:25 ..

./mnt/sdf1/rdfind_test_dir:
total 8
105906177 drwxr-xr-x 2 de de 4096 Dec 31 16:27 .
        2 drwxr-xr-x 5 de de 4096 Dec 31 16:25 ..
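
The results.txt file already carries enough information to spot the problematic entries, because its header names a device column. The following sketch lists the duplicates whose device differs from that of the first occurrence in their group. It assumes the column layout shown in the header line above and that each first occurrence precedes its duplicates, as in the output here; the function name is hypothetical.

```shell
# Sketch: list duplicates in an rdfind results file that are on a different
# device than the first occurrence of their group.
# Columns (from the file's header): duptype id depth size device inode priority name
# cross_device_dups is a hypothetical helper name.
cross_device_dups() {
    awk '
        /^#/ { next }                                   # skip comment lines
        $1 == "DUPTYPE_FIRST_OCCURRENCE" { first_dev = $5; next }
        $5 != first_dev {
            # The name is everything from field 8 onward.
            name = $8
            for (i = 9; i <= NF; i++) name = name " " $i
            print name
        }
    ' "${1:-results.txt}"
}
```

Any path printed by `cross_device_dups results.txt` is a candidate that cannot be hard-linked to its first occurrence.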

Using -makesymlinks true instead of -makehardlinks true creates symbolic links, but this does not satisfy the requirement that files within the same filesystem be hard-linked.

$ rm -fr ./mnt/sd{b,d,e,f}1/rdfind_test_dir
$ mkdir -p ./mnt/sd{b,d,e,f}1/rdfind_test_dir
$ for I in ./mnt/sd{b,d,e,f}1/rdfind_test_dir/rdfind_test_file{0,1}; do echo 1 > ${I}; done;
$ rdfind -makesymlinks true ./mnt/sd{b,d,e,f}1/rdfind_test_dir
Now scanning "./mnt/sdb1/rdfind_test_dir", found 2 files.
Now scanning "./mnt/sdd1/rdfind_test_dir", found 2 files.
Now scanning "./mnt/sde1/rdfind_test_dir", found 2 files.
Now scanning "./mnt/sdf1/rdfind_test_dir", found 2 files.
Now have 8 files in total.
Removed 0 files due to nonunique device and inode.
Now removing files with zero size from list...removed 0 files
Total size is 16 bytes or 16 b
Now sorting on size:removed 0 files due to unique sizes from list.8 files left.
Now eliminating candidates based on first bytes:removed 0 files from list.8 files left.
Now eliminating candidates based on last bytes:removed 0 files from list.8 files left.
Now eliminating candidates based on md5 checksum:removed 0 files from list.8 files left.
It seems like you have 8 files that are not unique
Totally, 14 b can be reduced.
Now making results file results.txt
Now making symbolic links. creating
Making 7 links.
$ cat results.txt
# Automatically generated
# duptype id depth size device inode priority name
DUPTYPE_FIRST_OCCURRENCE 1 0 2 2081 55959555 1 ./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
DUPTYPE_WITHIN_SAME_TREE -1 0 2 2081 55959554 1 ./mnt/sdb1/rdfind_test_dir/rdfind_test_file0
DUPTYPE_OUTSIDE_TREE -1 0 2 2065 169771010 2 ./mnt/sdd1/rdfind_test_dir/rdfind_test_file0
DUPTYPE_OUTSIDE_TREE -1 0 2 2065 169771011 2 ./mnt/sdd1/rdfind_test_dir/rdfind_test_file1
DUPTYPE_OUTSIDE_TREE -1 0 2 2097 111935491 3 ./mnt/sde1/rdfind_test_dir/rdfind_test_file1
DUPTYPE_OUTSIDE_TREE -1 0 2 2097 111935490 3 ./mnt/sde1/rdfind_test_dir/rdfind_test_file0
DUPTYPE_OUTSIDE_TREE -1 0 2 2129 105906179 4 ./mnt/sdf1/rdfind_test_dir/rdfind_test_file1
DUPTYPE_OUTSIDE_TREE -1 0 2 2129 105906178 4 ./mnt/sdf1/rdfind_test_dir/rdfind_test_file0
# end of file
$ ls -lai ./mnt/sd{b,d,e,f}1/rdfind_test_dir
./mnt/sdb1/rdfind_test_dir:
total 12
55959553 drwxr-xr-x 2 **** **** 4096 Dec 31 16:49 .
       2 drwxr-xr-x 5 **** **** 4096 Dec 31 16:48 ..
55959554 lrwxrwxrwx 1 **** ****   53 Dec 31 16:49 rdfind_test_file0 -> /*********/./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
55959555 -rw-r--r-- 1 **** ****    2 Dec 31 16:49 rdfind_test_file1

./mnt/sdd1/rdfind_test_dir:
total 8
169771009 drwxr-xr-x 2 **** **** 4096 Dec 31 16:49 .
        2 drwxr-xr-x 5 **** **** 4096 Dec 31 16:48 ..
169771010 lrwxrwxrwx 1 **** ****   53 Dec 31 16:49 rdfind_test_file0 -> /*********/./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
169771011 lrwxrwxrwx 1 **** ****   53 Dec 31 16:49 rdfind_test_file1 -> /*********/./mnt/sdb1/rdfind_test_dir/rdfind_test_file1

./mnt/sde1/rdfind_test_dir:
total 8
111935489 drwxr-xr-x 2 **** **** 4096 Dec 31 16:49 .
        2 drwxr-xr-x 5 **** **** 4096 Dec 31 16:48 ..
111935490 lrwxrwxrwx 1 **** ****   53 Dec 31 16:49 rdfind_test_file0 -> /*********/./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
111935491 lrwxrwxrwx 1 **** ****   53 Dec 31 16:49 rdfind_test_file1 -> /*********/./mnt/sdb1/rdfind_test_dir/rdfind_test_file1

./mnt/sdf1/rdfind_test_dir:
total 8
105906177 drwxr-xr-x 2 **** **** 4096 Dec 31 16:49 .
        2 drwxr-xr-x 5 **** **** 4096 Dec 31 16:48 ..
105906178 lrwxrwxrwx 1 **** ****   53 Dec 31 16:49 rdfind_test_file0 -> /*********/./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
105906179 lrwxrwxrwx 1 **** ****   53 Dec 31 16:49 rdfind_test_file1 -> /*********/./mnt/sdb1/rdfind_test_dir/rdfind_test_file1
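
To get hard links inside a filesystem and symbolic links only across filesystems, the linking could be done by hand with ln instead of rdfind. The following is a rough sketch, not something rdfind provides: link_dup is a hypothetical helper, it assumes GNU coreutils (stat -c, readlink -f), and it replaces the duplicate file, so try it on test data first.

```shell
# Sketch: replace DUP with a link to ORIG. Uses a hard link when both are on
# the same filesystem, otherwise a symbolic link to ORIG's absolute path.
# link_dup is a hypothetical helper name; usage: link_dup ORIG DUP
link_dup() {
    local orig dup
    orig=$(readlink -f -- "$1") || return 1
    dup=$2
    # Refuse to touch anything unless the contents are byte-for-byte identical.
    cmp -s -- "$orig" "$dup" || return 1
    # Nothing to do if DUP is already the same inode as ORIG.
    [ "$(stat -c '%d:%i' -- "$orig")" = "$(stat -c '%d:%i' -- "$dup")" ] && return 0
    if [ "$(stat -c '%d' -- "$orig")" = "$(stat -c '%d' -- "$dup")" ]; then
        ln -f -- "$orig" "$dup"      # same filesystem: hard link
    else
        ln -sf -- "$orig" "$dup"     # different filesystem: symbolic link
    fi
}
```

The paths to feed into it could come from results.txt, taking the DUPTYPE_FIRST_OCCURRENCE entry as ORIG and each following duplicate as DUP.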

ChangeLog

  1. Posted: 2008-05-05T16:12:44+09:00
  2. Modified: 2008-05-05T16:12:44+09:00
  3. Generated: 2021-03-31T07:51:34+09:00