[vserver] Weird SIGTERM problem.

From: Robin Lee Powell <rlpowell_at_digitalkingdom.org>
Date: Wed 17 Oct 2007 - 12:19:05 BST
Message-ID: <20071017111905.GA14269@digitalkingdom.org>

I don't know that this is a vserver problem, but I can't find any
other source for it, so...

I'm running mooix (a MUD) under vserver; uname for the outer host
says 2.6.17.8-vs2.0.2-rc29

At some point recently, idle connections to the MUD started getting
dropped for no apparent reason at all. It looks like certain
processes are getting SIGTERMed, and I can't figure out why.

Any help here would be much appreciated.

There's a process called "prompt", which polls a file for changes,
that is often the one TERMd; either that or the process listening on
the other end of a socket from "prompt". Here's some strace output
from "prompt" getting killed:

1192618964.154960 fstat64(9, {st_mode=S_IFREG|0660, st_size=0, ...}) = 0
1192618964.155046 select(9, [8], NULL, NULL, {0, 300000}) = 0 (Timeout)
1192618964.454959 fstat64(9, {st_mode=S_IFREG|0660, st_size=0, ...}) = 0
1192618964.455046 select(9, [8], NULL, NULL, {0, 300000}) = 0 (Timeout)
1192618964.782783 fstat64(9, {st_mode=S_IFREG|0660, st_size=0, ...}) = 0
1192618964.796336 select(9, [8], NULL, NULL, {0, 300000}) = 0 (Timeout)
1192618965.094850 fstat64(9, {st_mode=S_IFREG|0660, st_size=0, ...}) = 0
1192618965.094936 select(9, [8], NULL, NULL, {0, 300000}) = 0 (Timeout)
1192618965.394805 fstat64(9, {st_mode=S_IFREG|0660, st_size=0, ...}) = 0
1192618965.394890 select(9, [8], NULL, NULL, {0, 300000}) = ? ERESTARTNOHAND (To be restarted)
1192618965.590008 --- SIGTERM (Terminated) @ 0 (0) ---
1192618965.590136 ioctl(8, TCXONC, TCOON) = 0
1192618965.590192 rt_sigprocmask(SIG_BLOCK, [INT], [TERM], 8) = 0
1192618965.590252 ioctl(8, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig -icanon -echo ...}) = 0
1192618965.591181 +++ killed by SIGKILL +++

And here's its parent at the same time:

1192618965.591276 --- SIGCHLD (Child exited) @ 0 (0) ---
1192618965.591301 waitpid(8572, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL}], WSTOPPED) = 8572
1192618965.591362 write(4, "\0\0\0\0\t\0\0\0", 8) = -1 EPIPE (Broken pipe)
1192618965.591409 --- SIGPIPE (Broken pipe) @ 0 (0) ---
1192618965.591468 close(4) = 0
1192618965.591513 sigreturn() = ? (mask now [])
1192618965.591578 read(0, "", 20) = 0
1192618965.591616 close(0) = 0
1192618965.591654 close(0) = -1 EBADF (Bad file descriptor)
1192618965.592013 exit_group(0) = ?
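
The waitpid line in the parent's trace reports that the child was
ended by SIGKILL, not by the SIGTERM it had just received. As a
minimal sketch of how a parent can decode that status itself, assuming
nothing about mooix's own code (describe_wait_status and
reap_self_killed_child are hypothetical names used only here):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Format a waitpid() status into buf; returns snprintf's result. */
int describe_wait_status(int status, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    if (WIFEXITED(status))
        return snprintf(buf, len, "exited with code %d",
                        WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return snprintf(buf, len, "killed by signal %d",
                        WTERMSIG(status));
    if (WIFSTOPPED(status))
        return snprintf(buf, len, "stopped by signal %d",
                        WSTOPSIG(status));
    return snprintf(buf, len, "unrecognized status 0x%x",
                    (unsigned)status);
}

/*
 * Hypothetical demo: fork a child that sends itself SIGKILL, reap it,
 * and describe the result. Returns the raw status, or -1 on error.
 */
int reap_self_killed_child(char *buf, size_t len)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        kill(getpid(), SIGKILL);
        _exit(1);                        /* not reached */
    }
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)              /* retry only if interrupted */
            return -1;
    }
    describe_wait_status(status, buf, len);
    return status;
}
```

Logging the decoded status in the parent would show, for each dropped
connection, whether the child exited on its own or was ended by a
signal, and by which one.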

I have no idea where that SIGTERM is coming from. Running strace on
basically everything else on the system hasn't turned up the sender.
Sometimes it comes after ~5 minutes, sometimes after ~25. It's
really weird.
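
One way to narrow down a signal's origin from inside the receiving
process is a handler installed with SA_SIGINFO, which is handed the
sender's PID and UID in siginfo_t when the signal was sent with
kill(2). The following is a sketch only: the variable and function
names are hypothetical, and whether the reported PID is meaningful
when the sender lives outside the guest's context is not something
this example establishes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical globals recording the most recent SIGTERM. */
static volatile sig_atomic_t term_seen = 0;
static volatile sig_atomic_t term_sender_pid = 0;
static volatile sig_atomic_t term_sender_uid = 0;
static volatile sig_atomic_t term_code = 0;

/* Only stores values; formatting and logging belong outside the handler. */
static void on_sigterm(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    term_code = info->si_code;           /* SI_USER means kill(2) */
    term_sender_pid = (sig_atomic_t)info->si_pid;
    term_sender_uid = (sig_atomic_t)info->si_uid;
    term_seen = 1;
}

/* Install the tracing handler; returns 0 on success, -1 on error. */
int install_sigterm_tracer(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigterm;
    sa.sa_flags = SA_SIGINFO;            /* request the siginfo_t argument */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}
```

After install_sigterm_tracer() has been called, the main loop can
check term_seen and log term_sender_pid, term_sender_uid and term_code
before shutting down. An si_code other than SI_USER would suggest the
signal did not come from an ordinary kill(2) call.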

-Robin

-- 
Lojban Reason #17: http://en.wikipedia.org/wiki/Buffalo_buffalo
Proud Supporter of the Singularity Institute - http://singinst.org/
http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/
Received on Wed Oct 17 12:19:45 2007