From: Jacques Gelinas (jack_at_solucorp.qc.ca)
Date: Wed 08 Oct 2003 - 06:01:40 BST

On Mon, 6 Oct 2003 23:17:02 -0500, Lyn St George wrote
There was a bug in the vserver script: when it failed to parse the vserver
context number, it claimed the vserver was not running while
it was.

I suggest you upgrade the vserver package.

Now, the security context may be high because you may be probing
non-running vservers or because of the bug explained above. Each time
you "enter" a non-running vserver, a new security context is allocated, always
greater than the last, until it reaches a maximum; then it loops around and
tries to find an unused one. So if the vserver script thinks the vserver is not
running and you call the script often to do some probing, a new security
context will be allocated each time.
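
To illustrate the allocation behaviour described above, here is a rough model
in Python. It is only a sketch, not the kernel's actual code: the class, the
probe() helper and the MIN_CTX/MAX_CTX limits are all made up for the example,
and the real limits are different.

```python
# Illustrative model only; this is NOT the kernel's actual allocator.
# MIN_CTX and MAX_CTX are assumed values picked to keep the demo small.
MIN_CTX = 2
MAX_CTX = 5


class ContextAllocator:
    """Hands out security context numbers, always increasing, then wrapping."""

    def __init__(self, min_ctx=MIN_CTX, max_ctx=MAX_CTX):
        self.min_ctx = min_ctx
        self.max_ctx = max_ctx
        self.next_ctx = min_ctx
        self.in_use = set()

    def allocate(self):
        # Try every number in the range at most once before giving up.
        span = self.max_ctx - self.min_ctx + 1
        for _ in range(span):
            candidate = self.next_ctx
            # Advance the counter; past the maximum it loops back to min_ctx.
            if candidate < self.max_ctx:
                self.next_ctx = candidate + 1
            else:
                self.next_ctx = self.min_ctx
            # After a wrap, numbers still held by running vservers are skipped.
            if candidate not in self.in_use:
                self.in_use.add(candidate)
                return candidate
        raise RuntimeError("no free security context")

    def release(self, ctx):
        self.in_use.discard(ctx)


def probe(alloc):
    """Model a probe that 'enters' a vserver wrongly believed to be stopped."""
    ctx = alloc.allocate()  # a brand new context is created for the probe
    alloc.release(ctx)      # the probe exits, so that context is free again
    return ctx


if __name__ == "__main__":
    alloc = ContextAllocator()
    running = alloc.allocate()  # context of the vserver that really is running
    print("running vserver context:", running)
    print("probe contexts:", [probe(alloc) for _ in range(4)])
```

In this model the probes never reuse the running vserver's context, so the
number keeps climbing with every call until it wraps.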

Please upgrade the utilities and see how it goes.

> Over the last couple of weeks a number of vservers have been
> triggering my monitoring system, because either services inside
> a vserver become invisible to a 'ps ax', or the entire vserver seems
> to have stopped when in fact it hasn't.
> Example: 2 minutes ago I got an alert for a 4th vserver having gone
> down, because a 'ps ax' from the host produced the line "x not running".
> Entering the server and then doing a 'ps ax' produced this:
> Server xxx is not running
> ipv4root is now xxx
> Host name is now xxx
> New security context is 1008
> [root_at_xxx /]ps ax
> 1 ? S 5:37 init [3]
> 12225 pts/21 S 0:00 /bin/bash -login
> 12284 pts/21 R 0:00 ps ax
> Despite this, everything is running, and ports 21, 22, 25, 80, 110, 443 can
> all be connected to.
> What is plain is that the security context number is ridiculously high. On
> a 'good' vserver this number is something like 26 or 32. On the 'bad' ones,
> i.e. where 'ps ax' no longer shows the stuff that is running, this number is over
> 1000.
> What is going on?? Does anybody have any ideas at all, or has anyone
> seen anything even remotely similar?
> (kernel 2.4.18, vserver v12. Uptime 145 days with no such problems ever
> over the last year or more. This host has 1Gb of memory, Athlon, current
> load average: 0.75, 0.81, 0.72)
> Any help will be greatly appreciated.
