From: Herbert Poetzl (herbert_at_13thfloor.at)
Date: Tue 02 Nov 2004 - 18:17:34 GMT
On Tue, Nov 02, 2004 at 12:52:29PM +0100, Enrico Scholz wrote:
> sam_at_vilain.net (Sam Vilain) writes:
> > The following patch, to vservers.functions in the util-vserver
> > distribution, will do something of a `namespace cleanup' in lieu of
> > the rework to the vserver startup and mount cleanup process that
> > Enrico has planned (I'm told).
> Currently there are two conflicting requirements:
> (a) 'vserver ... enter' and operating from the outside in the vserver, and
> (b) cleaning /proc/mounts
First, I would like to split (a) into
(a1) 'vserver ... enter' and
(a2) operating from the outside in the vserver
> I tend to keep (a), but I was told that people *require* (b).
> There is an experimental 'UV_NAMESPACE_AFTER_CHROOT' branch in CVS which
> enables (b). Basically, it calls vc_set_namespace(2) after chroot(2), and
> the kernel then does some cleanup. As expected, this breaks (a), because with
> | vnamespace --enter ...
> your host-context process will be chrooted into the (hostile) vserver-dir.
Why chrooted? Why not _moved_ into the namespace,
which just knows some vfsmount as / plus whatever
the vserver had mounted inside?
> Some explanations about the current (a) method:
> 1. a new namespace is created
> 2. all extra-mountings will be executed (from the vserver-fstab)
> 3. 'mount --rbind /vservers/... /' will be executed
> 4. the context will be created
> 5. the chroot will be entered and the init-process be executed
why not do it this way:
1. get a new namespace
2. create the vfsmount (for example via --bind)
3. pivot_root (or similar, maybe new cmd?) to the vfsmount
4. cleanup the namespace (remove host stuff)
5. do all required/listed mounts inside that namespace
6. create the context
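A rough sketch of the order above, written as a dry-run planner that only
builds the command lines for steps 2-5 and does not execute anything. It
uses the generic mount(8), pivot_root(8) and umount(8) commands, not the
util-vserver tools. The function name, the '.oldroot' directory (which
would have to exist inside the vserver root) and the 'demo' vserver are
made up for illustration. Step 1 (new namespace) and step 6 (context
creation) are left to whatever runs the plan.

```python
def namespace_setup_plan(vserver_root, extra_mounts, put_old=".oldroot"):
    """Return argv lists for steps 2-5 of the proposed order (dry run only).

    vserver_root: host path of the vserver, e.g. /vservers/<name>
    extra_mounts: iterable of (source, target, fstype), with targets given
                  as seen from inside the new root
    put_old:      hypothetical directory under the new root that receives
                  the old root during pivot_root; it must already exist
    """
    root = vserver_root.rstrip("/")
    plan = [
        # 2. turn the vserver directory into a vfsmount of its own
        ["mount", "--bind", root, root],
        # 3. make that vfsmount the '/' of the namespace
        ["pivot_root", root, f"{root}/{put_old}"],
        # 4. drop the host hierarchy, now reachable only below /<put_old>
        ["umount", "-l", f"/{put_old}"],
    ]
    # 5. the vserver's own mounts, performed inside the cleaned namespace
    for source, target, fstype in extra_mounts:
        plan.append(["mount", "-t", fstype, source, target])
    return plan


if __name__ == "__main__":
    for argv in namespace_setup_plan("/vservers/demo", [("proc", "/proc", "proc")]):
        print(" ".join(argv))
```

Because the host entries are removed in step 4 before the vserver mounts
happen in step 5, nothing from the host should be left in the namespace's
/proc/mounts, which is what (b) asks for.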
Processes from outside can migrate into the namespace
to satisfy (a1), and for (a2) the host administrator can
operate on the files in /vservers/<name> from the host
(there could even be some script which does, on the host,
the same mounts the vserver did, if somebody needs that).
> There is an alternative approach implementing (b): doing 4. before 2.
> which would tag the mountpoints in 2. and would allow a later cleanup of
> all untagged ones. Since scripts are involved, this approach requires
> multiple parallel processes, and bash can synchronize them only in an
> ugly way. Therefore, a binary implementation of vserver.start
> is planned. But this will not solve the conflict between (a) and (b).
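A toy model of the tagging idea, only to show the selection logic. How the
kernel would actually tag mountpoints is not specified in this thread, so
the (mountpoint, tag) pairs, the tag value 42 and the function name are
all assumptions made for the example.

```python
def cleanup_candidates(mounts):
    """Pick the mountpoints a tag-based cleanup would remove.

    mounts: list of (mountpoint, tag) in mount order; tag is None for
            entries mounted before the context existed (hypothetical model)
    Returns the untagged mountpoints in reverse mount order, so that
    nested mounts come before the mounts they sit on.
    """
    untagged = [mountpoint for mountpoint, tag in mounts if tag is None]
    return list(reversed(untagged))


if __name__ == "__main__":
    example = [
        ("/", None),
        ("/usr", None),
        ("/vservers/demo", 42),       # mounted after the context was created
        ("/vservers/demo/proc", 42),
    ]
    print(cleanup_candidates(example))
```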
> Back to (a): After 3. your /proc/mounts looks like
> | / ## 1
> | /dev/pts
> | /usr
> | /var
> | / ## 2
> | /tmp
> | /dev/pts
> The last entries (after #2) are from the '--rbind'. The current process's
> '/' (/proc/self/root) is still the '/' #1 (you never changed it). Therefore,
> every 'cd /' or 'cat /blahblub' will operate in the original filesystem.
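To make the #1/#2 distinction concrete, here is a small sketch that splits
a mount listing at its last '/' entry. It works on the simplified listing
quoted above (mount points only), not on the full /proc/mounts line
format, and the function name is made up for the example.

```python
def split_at_last_root(mountpoints):
    """Split a mount listing at the last '/' entry.

    Returns (before, after): 'before' holds the entries up to but not
    including the last '/', 'after' starts with that '/' and holds the
    entries created by the --rbind. Raises ValueError if there is no '/'.
    """
    last_root = len(mountpoints) - 1 - mountpoints[::-1].index("/")
    return mountpoints[:last_root], mountpoints[last_root:]


if __name__ == "__main__":
    listing = ["/", "/dev/pts", "/usr", "/var", "/", "/tmp", "/dev/pts"]
    before, after = split_at_last_root(listing)
    print("behind '/' #1:", before)
    print("behind '/' #2:", after)
```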
> After executing the chroot('/vservers/...'), the '/' #1 will be
> disassociated (you no longer hold any reference to it), but it
> still exists somewhere. Now you have lost every reference point and operate in
> the most recent filesystem hierarchy, which is the one behind '/' #2.
> Therefore, the 'chroot("..")' escape tricks will not work: '/' #2 is
> your '/' and you cannot go before it. Nevertheless, with help from the
> outside you can exchange FDs and go back into #1. Removing unneeded
> mountpoints from the namespace as suggested would remove some entries
> between #1 and #2 also. But it would not protect against the FD exchange
> trick completely (e.g. /vservers/... filesystems are usually on the same
> device and FD exchanging would work there).
> I really do not know how to solve the conflict between (a) and (b):
> currently you can have one but not both features.
I still do not see _any_ conflict ...
> Vserver mailing list