Re: [vserver] Getting a real pid 1 init in a container

From: Grzegorz Nosek <grzegorz.nosek_at_gmail.com>
Date: Tue 20 Mar 2012 - 10:31:10 GMT
Message-ID: <4F685C6E.5070507@gmail.com>

On 19.03.2012 21:03, Daniel Hokka Zakrisson wrote:
> Grzegorz Nosek wrote:
>> Hi,
>>
>> Continuing my quest to run Ubuntu 12.04 under Linux-VServer (so far only
>> on kernels I have on hand, probably considered ancient around these parts).
>
> Versions are always relevant...

patch-2.6.27.52-vs2.3.0.36.9.diff

and

patch-2.6.35.8-vs2.3.0.36.33.diff

on top of latest matching kernel versions plus a bunch of fixes
(entirely unrelated, mostly various security patch backports from later
versions).

util-vserver is a rather random pre2772 build (probably whatever came
with Debian Lenny, as that's what my testing VM runs).

>> Can anybody please explain to me what is the semantics of the fakeinit
>> vserver flag? I changed /sbin/init to the following script to see what's
>> going on:
>
> I'd go further and also make it start a daemon and attempt to wait for
> it...
>
>> #!/bin/sh
>>
>> echo $$
>> exec /sbin/init.real
>>
>> With initstyle=plain and various combinations of fakeinit and PID
>> namespaces I'm getting:
>>
>> fakeinit, no pidns:
>>
>> pid is 1 and upstart (init.real) apparently starts successfully but does
>> not receive SIGCHLD when a process inside the container dies, thus
>> breaking start/stop/restart tools (and waitpid(-1) returns -ESRCH). I
>> did not instrument the real init to see if the SIGCHLD goes there instead.
>
> ... since unless your kernel is broken, I don't think that is happening.
> But, the kernel debugging should tell you exactly.

I see vx_set_init and vx_set_reaper called with the right xid and pid of
the upstart process.

So we have the victim:

root@vmanager:/# ps -C rsyslogd -o pid,pgid,ppid,comm
   PID PGID PPID COMMAND
  7189 7180 1 rsyslogd
root@vmanager:/# stop rsyslog

Now upstart does a lot of things which end up with:

18:58:26.731193 getpgid(0x1c15) = 7180 <0.000054>

0x1c15 is the pid of the rsyslog daemon.

18:58:26.731354 kill(-7180, SIGTERM) = 0 <0.005931>

rsyslog shuts down cleanly after this.
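
For reference, the getpgid()/kill() pair in the trace can be reproduced outside upstart. This is only a minimal sketch in Python, not upstart's code: the helper names (stop_process_group, demo) and the sleeping child are made up for illustration, and it assumes a POSIX system. A negative pid passed to kill() addresses the whole process group.

```python
import os
import signal
import subprocess
import sys

# 0x1c15 in the trace is 7189 in decimal, i.e. the PID that ps printed.
PID_FROM_TRACE = 0x1c15


def stop_process_group(leader_pid):
    """Look up the process group of leader_pid and send it SIGTERM."""
    pgid = os.getpgid(leader_pid)    # same call as getpgid(0x1c15) in the trace
    os.kill(-pgid, signal.SIGTERM)   # negative pid: signal the whole group
    return pgid


def demo():
    # Hypothetical stand-in for the daemon: a child that just sleeps.
    # start_new_session=True makes it the leader of its own process group.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        start_new_session=True,
    )
    pgid = stop_process_group(child.pid)
    status = child.wait(timeout=10)  # negative value means "killed by signal"
    return pgid == child.pid, status
```

Here the caller is the real parent of the child, so wait() succeeds. That is the part that differs from the container case above.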

18:58:26.737608 clock_gettime(CLOCK_MONOTONIC, {5514, 785320506}) = 0 <0.000000>
18:58:26.737695 close(-1) = -1 EBADF (Bad file descriptor) <0.000028>
18:58:26.737863 clock_gettime(CLOCK_MONOTONIC, {5514, 785604002}) = 0 <0.000053>
18:58:26.737975 select(11, [4 6 7 8 9 10], [], [8 9 10], {5, 0}) = 0 (Timeout) <5.016292>
18:58:31.754680 read(4, 0xffc6e94f, 1) = -1 EAGAIN (Resource temporarily unavailable) <0.000165>
18:58:31.755173 waitid(P_ALL, 0, 0xffc6e6d8, WNOHANG|WEXITED|WSTOPPED|WCONTINUED, NULL) = -1 ECHILD (No child processes) <0.000208>

OK, so it isn't waitpid() but waitid(), and the error is ECHILD rather
than ESRCH. My bad, I hadn't heard of waitid() before.
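
To make the ECHILD result concrete, here is a small Python sketch of the same non-blocking waitid() poll, using the flags from the trace. It is an illustration under assumptions, not what upstart does internally: poll_children and demo are hypothetical names, and it assumes a process with no other children on a POSIX system.

```python
import os

# Same flag set as in the strace output above.
WAIT_FLAGS = os.WNOHANG | os.WEXITED | os.WSTOPPED | os.WCONTINUED


def poll_children():
    """Non-blocking poll for any child state change."""
    try:
        info = os.waitid(os.P_ALL, 0, WAIT_FLAGS)
    except ChildProcessError:
        # errno ECHILD: the caller has no children at all.
        return "no-children"
    if info is None:
        # Children exist, but none has changed state yet.
        return "nothing-to-report"
    return info.si_pid


def demo():
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child exits immediately
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    first = poll_children()   # reaps the exited child
    second = poll_children()  # nothing left to wait for
    return pid, first, second
```

The second poll fails with ECHILD because the caller has no children left. This is the same error init.real gets in the trace, which suggests the kernel does not consider the dead rsyslogd to be its child at that point.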

18:58:31.755598 clock_gettime(CLOCK_MONOTONIC, {5519, 803480047}) = 0 <0.000225>
18:58:31.756161 getpgid(0x1c15) = -1 ESRCH (No such process) <0.000087>
18:58:31.756511 kill(7189, SIGKILL) = -1 ESRCH (No such process) <0.000159>

It doesn't exist? So let's kill it again, for good measure.

18:58:31.757137 select(11, [4 6 7 8 9 10], [], [8 9 10], NULL

Back to the main loop (probably), with confused upstart and a "stop
rsyslog" still hanging. After killing the stop command we do:

root@vmanager:/# status rsyslog
rsyslog stop/killed, process 7189
root@vmanager:/# start rsyslog

Another hang, no rsyslog.

BTW, a "vps auxwf" from the host shows something interesting:

root 7147 948 vmanager 0.0 0.6 3112 1544 ? Ss 19:52 0:00 /sbin/init.real
root 7147 948 vmanager 0.0 0.6 3112 1544 ? Ss 19:52 0:00 /sbin/init.real

i.e. the same process listed twice.

SIGCHLD isn't always missing, as I noticed when I misconfigured ssh and
it kept restarting. So it might actually be an upstart bug, but either
way, shouldn't the SIGCHLD of the dying rsyslogd get delivered to
upstart? (BTW, I love that you can strace the container init.)
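
For comparison, this is the behaviour I would expect in the ordinary parent/child case, sketched in Python. It only covers a direct child, not a daemon reparented to the container's init, and the names (on_sigchld, child_exit_notifies_parent) are made up for the example. It must run in the main thread, since Python only allows installing signal handlers there.

```python
import os
import signal
import time

received = []  # signal numbers seen by the handler


def on_sigchld(signum, frame):
    received.append(signum)


def child_exit_notifies_parent(timeout=5.0):
    """Fork a child that exits at once; report whether SIGCHLD arrived."""
    received.clear()
    old = signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately
        deadline = time.monotonic() + timeout
        while not received and time.monotonic() < deadline:
            time.sleep(0.01)  # the handler runs between these sleeps
        os.waitpid(pid, 0)  # reap the child so no zombie is left behind
        return bool(received)
    finally:
        # Restore the previous disposition (fall back to the default).
        signal.signal(signal.SIGCHLD, old if old is not None else signal.SIG_DFL)
```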

> No, you're telling it not to get pid 1, so it doesn't.
>
> Pid namespaces are not implemented yet.

OK. So the right way is fakeinit, no pidns.

> No, there is no relation. fakeinit sets pid 1, --initpid to vcontext
> sets the reaper.

OK.

Best regards,
  Grzegorz Nosek
Received on Tue Mar 20 10:31:22 2012
