From: Gebhardt Thomas (gebhardt_at_hrz.uni-marburg.de)
Date: Mon 02 Feb 2004 - 10:05:32 GMT
About once a week we run into trouble with one of our master/vservers:
suddenly all vservers on that host seem to be offline; they do not
reply to ping requests and all TCP connections get disconnected. The
master is fine, however. I can ssh to the master, enter one vserver
from there and initiate a network connection from within that vserver
to the outside world: no problem, and from that moment on this specific
vserver is "online" again. During the failure state I have no time
for a detailed analysis of the flaw; I am just trying to get the services
running again. So far I have not been able to reproduce the situation on
a test server.
Any hints what I could do?
* At least once the failure happened while I was causing substantial
load on the master server (rsyncing a vserver to another host)
* The vservers stay "offline" until I actively initiate a network connection
from within the vserver (or if I reboot the vserver)
* For at least a few minutes after the failure occurs, there are no ARP
entries for the affected vservers in the routers' ARP tables, even if I try
to ping the vserver. So it seems that the master does not reply to
ARP requests for the alias interfaces.
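
The ARP check from the last point could be scripted so that it takes only
seconds during an outage. What follows is a minimal sketch, not a tested
diagnostic. It assumes a Linux host on the same network segment whose ARP
table is readable in the /proc/net/arp format; a router with a different
table format would need a different parser. The function name
missing_arp_entries, the sample table, and the 192.0.2.x addresses are all
hypothetical.

```python
def missing_arp_entries(arp_table_text, vserver_ips):
    """Return the vserver IPs that have no completed ARP entry.

    arp_table_text is expected in the format of Linux's /proc/net/arp.
    """
    resolved = set()
    for line in arp_table_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore blank or malformed lines
        ip, _hw_type, flags, hw_addr = fields[:4]
        # Flag bit 0x2 (ATF_COM) marks a completed entry; incomplete
        # entries carry an all-zero hardware address.
        if int(flags, 16) & 0x2 and hw_addr != "00:00:00:00:00:00":
            resolved.add(ip)
    return [ip for ip in vserver_ips if ip not in resolved]


# Hypothetical sample table (documentation addresses, not real hosts).
SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.0.2.1        0x1         0x2         00:11:22:33:44:55     *        eth0
192.0.2.10       0x1         0x0         00:00:00:00:00:00     *        eth0
"""

if __name__ == "__main__":
    # Hypothetical vserver alias addresses to check.
    vservers = ["192.0.2.1", "192.0.2.10", "192.0.2.11"]
    print(missing_arp_entries(SAMPLE, vservers))
```

On a live system the sample text would be replaced by the contents of
/proc/net/arp, read right after pinging each vserver address. An address
that is still reported as missing would support the suspicion that ARP
requests for the alias interfaces go unanswered.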
I'm running 2.4.24-vs1.22 (There is a HA environment with
heartbeat and drbd on the affected host, but I don't think that this is
Thanks for any hint!
Vserver mailing list