From: Sam Vilain (sam_at_vilain.net)
Date: Sun 08 Aug 2004 - 13:19:18 BST
Jörn Engel wrote:
>>The chances of bits on your hard drive platter randomly losing their
>>magnetism or capacitors in your RAM losing charge and changing are
>>probably higher than two different files having an SHA1 collision :-).
>I used to have the same opinion. Then I read this:
>>relative path and file size. My assumption was that if these all match,
>>the files are probably going to be the same anyway.
> In that case, you can ignore the hashes anyway. Do a direct
> comparison, nothing lost.
I think this is ultimately a matter of faith. Personally my gut feeling
is that the cryptographers know more than the skeptics, and wait with
keen interest for them to show that their birthday paradox actually
happens in real life when applied to SHA-1. By the birthday bound, every
extra hash bit only multiplies the collision search effort by sqrt(2), so
a 160-bit hash gives sqrt(2^160) = 2^80 attempts, which is still a very
large number (compare sqrt(365) ~= 19 for actual birthdays).
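To put rough numbers on the birthday argument, here is a minimal sketch. It assumes hash values are uniformly random and uses the standard approximation p ~= 1 - exp(-n(n-1)/(2N)) for n items in a space of size N. The function names are made up for illustration.

```python
import math

def birthday_bound(space: int) -> int:
    """Rough number of random draws before a collision becomes likely: floor(sqrt(space))."""
    return math.isqrt(space)

def collision_probability(n_items: int, space: int) -> float:
    """Approximate chance of at least one collision among n_items uniform draws.

    Uses 1 - exp(-n(n-1) / (2 * space)); expm1 keeps precision for tiny results.
    """
    return -math.expm1(-n_items * (n_items - 1) / (2 * space))

if __name__ == "__main__":
    print(birthday_bound(365))                      # 19
    print(birthday_bound(2**160) == 2**80)          # True
    # A billion files hashed with an ideal 160-bit function:
    print(collision_probability(10**9, 2**160))
```

This only models accidental collisions under an ideal hash; it says nothing about deliberately constructed ones.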
When I read the original CryptoBytes newsletter about the MD5 hash
function weakness, I was left with the impression that the only thing
they thought possible was *inserted* blocks that do not affect the MD5
value. The actual MD5 hash function has (to my knowledge) no known
flaw. This is one of the things that they fixed with the SHA-* suite.
Each to their own of course! Maybe a full comparison should be the
default behaviour, but personally I'm happy with the digest.
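The two positions can live side by side: compare digests first, then optionally confirm with a byte-for-byte comparison. A minimal sketch of that idea follows. It is not how unify-dirs is written; the names sha1_of, same_content and the paranoid flag are hypothetical.

```python
import filecmp
import hashlib

def sha1_of(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def same_content(path_a, path_b, paranoid=True):
    """Decide whether two files hold identical data.

    Differing digests settle the question immediately. Matching digests
    are either trusted (paranoid=False) or confirmed with a full
    byte-for-byte comparison (paranoid=True).
    """
    if sha1_of(path_a) != sha1_of(path_b):
        return False
    if paranoid:
        return filecmp.cmp(path_a, path_b, shallow=False)
    return True
```

With paranoid=True the digest mostly serves as a cheap way to rule out candidates; the full comparison only runs on files that already look identical.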
>>Failing active monitoring, as a simple compromise there's no reason that
>>unify-dirs couldn't optionally store its internal inode/stat/SHA1 hash
>>cache in a Berkeley database, and run the script every hour or so via
>>cron. It would certainly prevent the copious stat()'ing that the script
>>does, at the expense of not noticing unlikely unification situations
>>until the DB cache entries expire.
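A rough sketch of such a cache, under several assumptions: Python's standard dbm module stands in for a Berkeley database, entries are keyed by device and inode number, and an entry counts as stale when the file size or mtime changes. Unlike the expiry-based scheme described above, this variant still calls stat() once per file and only avoids re-reading file contents. All names are hypothetical.

```python
import dbm
import hashlib
import os

def cached_sha1(db, path):
    """Return the SHA-1 of path, reusing a cached digest when the stat data is unchanged.

    db is any mapping from bytes keys to bytes values (for example a dbm handle).
    """
    st = os.stat(path)
    key = f"{st.st_dev}:{st.st_ino}".encode()
    stamp = f"{st.st_size}:{st.st_mtime_ns}"
    if key in db:
        old_stamp, digest = db[key].decode().rsplit(" ", 1)
        if old_stamp == stamp:
            return digest          # cache hit: file content is not read
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    digest = h.hexdigest()
    db[key] = f"{stamp} {digest}".encode()
    return digest

def hash_tree(root, cache_path):
    """Yield (path, sha1) for every regular file under root, using an on-disk cache."""
    with dbm.open(cache_path, "c") as db:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                if os.path.isfile(path) and not os.path.islink(path):
                    yield path, cached_sha1(db, path)
```

Run from cron, something like hash_tree would only re-hash files whose size or mtime changed since the previous run. The cache file should live outside the tree being scanned.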
>Your problem is simpler, compared to the one I want to solve. Also,
>with final cowlinks, it's perfectly sane to combine two files with
>different owners, permissions, [amc]times, etc. Both will have
>separate inodes, just the data is identical.
Yes, if you can do some kind of kernel side inode -> inode "semi-soft"
(Bagua?) link like COWlinks you get these advantages.
It just makes the unification script have to save a whole lot more state
information, and personally for my purposes I consider that unnecessary;
but then, I'm mostly concerned about saving space for system libraries
and binaries across a whole load of virtually identical vservers.
-- Sam Vilain, sam /\T vilain |><>T net, PGP key ID: 0x05B52F13 (include my PGP key ID in personal replies to avoid spam filtering)