Re: [vserver] Possible Hashify Corruption - Update

From: Gordan Bobic <gordan_at_bobich.net>
Date: Wed 20 Oct 2010 - 13:03:57 BST
Message-ID: <4CBEDAAD.2070708@bobich.net>

I just tried with the latest util-vserver (0.30.216-1.pre2914), and the
effect is the same. As soon as a guest gets hashified, its files are
trashed.

The effect is quite interesting, though. The corruption isn't observable
until after a reboot.

Doing echo 3 > /proc/sys/vm/drop_caches results in the cache memory
going down to about 15MB. After hashifying, dropping the caches again
does not get it below 350MB. At this point, the files still check out as
being the same.
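
The cache measurement above could be scripted roughly as follows. This
is a minimal sketch, not what was actually run: it assumes Python 3 and
that the "cache memory" figure corresponds to the Cached field of
/proc/meminfo. The function names are made up for illustration, and
drop_caches() needs root.

```python
import os


def cached_kb(meminfo_text, field="Cached"):
    """Return the value in kB of one /proc/meminfo field, or None if absent."""
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == field:
            # Lines look like "Cached:   358400 kB"; take the number.
            return int(rest.split()[0])
    return None


def read_cached_kb(path="/proc/meminfo"):
    """Read the current page-cache size in kB from the running kernel."""
    with open(path) as f:
        return cached_kb(f.read())


def drop_caches(path="/proc/sys/vm/drop_caches"):
    """Equivalent of 'echo 3 > /proc/sys/vm/drop_caches' (root only)."""
    os.sync()  # flush dirty pages first so more of the cache can be freed
    with open(path, "w") as f:
        f.write("3\n")
```

Calling drop_caches() and then read_cached_kb() once before and once
after hashifying would give the two numbers being compared.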

Then do a reboot, and all the hashified files will end up being trashed.
It doesn't matter if the files are merged between servers - the
unhashified servers stay healthy, but the hashified one (even if it is
the only hashified server!) is completely corrupted.
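
One way to pin down exactly which files differ across the reboot is to
record checksums before hashifying and compare them afterwards. The
following is only a sketch under assumptions: Python 3, SHA-256 as the
checksum, and hypothetical helper names. It is not part of util-vserver.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each regular file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks, sockets, device nodes and the like.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            manifest[os.path.relpath(full, root)] = file_sha256(full)
    return manifest


def changed_files(before, after):
    """Return sorted paths whose digest differs or that exist on one side only."""
    paths = before.keys() | after.keys()
    return sorted(p for p in paths if before.get(p) != after.get(p))
```

The manifest built from the guest's root directory would need to be
saved somewhere outside the guest (for instance with json.dump) before
hashifying, then compared against a fresh one after the reboot. Since
the corruption is not visible until then, a comparison made before
rebooting is likely to be answered from the page cache and show no
changes.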

That implies that somehow the healthy files get wedged in caches that
cannot be dropped.

This is still with the 2.6.30.10-vs2.3.0.36.14-pre8 kernel. I haven't
tried a newer kernel yet; that is next. I just wanted to rule out the
(extremely unlikely) possibility that the user-space was to blame. Will
report if the problem goes away with 2.6.35.7-vs2.3.0.36.33. If it
doesn't, I'll try with ext2 (rather than ext4 without the journal).

The only good thing about this is that the exercise seems to be 100%
repeatable. If anybody can think of any test cases to run that would
help get to the bottom of the cause of this, please do let me know.

Gordan

Gordan Bobic wrote:
> Ghislain wrote:
>>> For completeness, the userspace packages I use are:
>>>
>>> util-vserver-lib-0.30.215+svn2847-143596525.fc12.x86_64
>>> util-vserver-build-0.30.215+svn2847-143596525.fc12.x86_64
>>> util-vserver-core-0.30.215+svn2847-143596525.fc12.x86_64
>>> util-vserver-0.30.215+svn2847-143596525.fc12.x86_64
>>> util-vserver-sysv-0.30.215+svn2847-143596525.fc12.x86_64
>>>
>>> The kernel is 2.6.30.10-vs2.3.0.36.14-pre8.
>>
>> hi,
>>
>> using vserver utils from stable with the 2.3 experimental kernel patch
>> will lead to trouble. You need a userland that matches your kernel or
>> strange things will happen. With 2.3 you MUST use 0.30.216, the later
>> the better.
>
> I don't see 0.30.216 here yet:
> http://ftp.linux-vserver.org/pub/utils/util-vserver/
>
>> John uses the right combo of utils/kernel but you do not. I think this
>> can explain your weird issues (perhaps)
>
> I would have thought the hashify feature is pretty user-space (check
> hashes, and create hard-links, it's pretty much a job for a bash
> script), but I did find this:
> http://people.linux-vserver.org/~dhozac/t/uv-testing/
> so I'll do a test with the current kernel and the latest bleeding edge
> user-space.
>
> My main suspect at the moment is an ext4 bug, though. If it doesn't go
> away with the latest kernel and userspace I'll try with ext2 and see if
> that makes the problem go away. But even if it does, this still needs to
> be investigated and the root cause identified, since it verifiably leads
> to file corruption.
>
> Gordan
Received on Wed Oct 20 13:04:09 2010
