From: Jacques Gelinas (jack_at_solucorp.qc.ca)
Date: Tue 29 Jan 2002 - 06:21:27 GMT
1.1. /usr/lib/vserver/vdu: New
This is a limited clone of the du command. It skips files with more
than one link. It is used to evaluate the disk usage of a unified
vserver. Using the normal du for this task is misleading, since it
counts all unified files.
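The idea behind vdu can be sketched in a few lines. This is only an illustration, not the actual vdu source: the function name unshared_disk_usage is made up, directories themselves are not counted, and it sums apparent file sizes (st_size) where du reports allocated blocks.

```python
import os


def unshared_disk_usage(root):
    """Return the bytes used under root, ignoring files with more than one link."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            info = os.lstat(os.path.join(dirpath, name))
            # A link count above 1 means the file is shared with another
            # tree (unified), so it is not charged to this vserver.
            if info.st_nlink > 1:
                continue
            total += info.st_size
    return total
```

Calling unshared_disk_usage on the directory of a vserver would then give a rough figure for the space that vserver uses on its own.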
1.2. /usr/sbin/newvserver: more stats
The utility now reports more statistics about the number of files and
1.3. CAP_SYS_CHROOT capability
It is now possible to remove this capability from a vserver (no
process in the vserver, not even root, can then use the chroot system
call).
Just place !CAP_SYS_CHROOT in the S_CAPS variable of the vserver
To support this feature, the /usr/sbin/vserver script had to be
reworked a bit since entering a vserver context involves using chroot.
So we had to kind of enter the context, then kill CAP_SYS_CHROOT
1.4. chroot and security issues: plugged
The new ctx-6 kernel solves the issues with chroot. With previous
kernels, root inside a vserver with the CAP_SYS_CHROOT capability was
able to escape out of the vserver and enter the root server. We solved
this using a single line in fs/namei.c:vfs_permission(). All chroot
escapes involve walking your way toward the real root using relative
chdir (chdir("..")). The trick was to make the /vservers directory
into a "no man's land" by issuing the following command:
chmod 000 /vservers
Setting these permission bits (well, turning them all off) makes the
directory inaccessible to any user other than root. The change in the
ctx-6 kernel makes such a directory unusable even by root in any
security context other than 0.
The /usr/sbin/vserver script will create the /vservers directory with
the appropriate permissions. If the directory already exists, it
checks the permissions and warns the admin if they are not 000.
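The permission check described above can be sketched as follows. This is a minimal illustration, not the code used by /usr/sbin/vserver; the function name barrier_is_closed is hypothetical.

```python
import os
import stat


def barrier_is_closed(path):
    """Return True if path is a directory with all permission bits off (000)."""
    info = os.stat(path)
    # S_IMODE keeps only the permission bits (including setuid, setgid
    # and sticky), so a result of 0 corresponds to "chmod 000".
    return stat.S_ISDIR(info.st_mode) and stat.S_IMODE(info.st_mode) == 0
```

A script could call barrier_is_closed("/vservers") and print a warning when it returns False.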
1.5. New kernel ctx-6
The new kernel patch-2.4.17ctx-6 introduces many enhancements. It is
still binary compatible, so moving to this kernel does not involve any
configuration changes. Updating to vserver 0.10 is recommended, but
there is no special upgrade step.
The features are:
+ /dev/pts is now private in each vserver. One vserver can't see or
interfere with the pseudo-ttys of another server, including those of
the root server.
+ Network device: A vserver can only see the network device
associated with its ipv4 root.
+ system V ipc: The sysv ipc resources are now private per security
+ The fakeinit concept allows usage of a normal /sbin/init in a
+ A signal handling bug was solved. The most noticeable effect is
that Ctrl-C now works when using "vserver name enter". Other
networking issues are probably solved by this as well.
You can get the patch and binaries as usual from
ftp://ftp.solucorp.qc.ca/pub/vserver. The pub/vserver/patches directory
also contains a relative patch from ctx-5 to ctx-6, so you can review
exactly what was done.
This kernel probably plugs most security issues. There are still too
many things visible in /proc as seen from a vserver. A new file system
called vproc will be written to provide a limited view.
While this kernel should prevent a vserver administrator from gaining
access to the root server, there are still ways to cause a denial of
service (DoS) by exhausting resources. The nproc feature works
correctly and controls the number of processes used by a vserver. More
work is needed to address the other resource limits (files, memory,
...).
1.6. No NIS domain in a vserver
A vserver may run with a different NIS domain name than the root
server, or with the same one. To keep the same domain name, one just
has to set the S_DOMAINNAME variable in the vserver configuration file
to nothing.
There was no way to say that you did not want any NIS domain name in a
vserver when one was set in the root server. You can now enter "none"
as the S_DOMAINNAME value to achieve this.
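The three cases for S_DOMAINNAME can be summarized with a small sketch. This is an illustration of the rules above, not the logic of the vserver script itself; the function name effective_nis_domain is hypothetical, and returning an empty string to mean "no NIS domain" is an assumption of this sketch.

```python
def effective_nis_domain(s_domainname, root_domain):
    """Pick the NIS domain of a vserver from its S_DOMAINNAME setting."""
    if s_domainname == "":
        # Nothing set: keep the domain name of the root server.
        return root_domain
    if s_domainname == "none":
        # Explicitly no NIS domain (empty string is this sketch's marker).
        return ""
    # Any other value is used as the domain name of the vserver.
    return s_domainname
```

For example, with a root server in the domain "corp", an empty S_DOMAINNAME yields "corp", "none" yields no domain, and "lab" yields "lab".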
1.7. Per vserver /sbin/init
The ctx-6 kernel supports the fakeinit context flag. This flag is
entered in the S_FLAGS line of the vserver configuration file
(/etc/vservers/xx.conf). Once you set this flag, the vserver is
started and stopped using the /sbin/init found in the vserver
environment. This is a normal /sbin/init as supplied by the
distribution. You should take care of cleaning up /etc/inittab in the
vserver environment. Using this feature, it is possible to use various
run levels in the vserver, switch between them and so on. You can also
use respawn services in /etc/inittab.
The flag makes the current process behave like process number 1.
Using this trick, a normal /sbin/init may be run in a vserver. The
/usr/sbin/vserver command will use /sbin/init to start and stop a
vserver. A properly configured /etc/inittab is needed though.
Here is what fakeinit does in the kernel:
+ Processes losing their parent are re-parented to this process.
+ getppid() called by a child of this process returns 1.
+ getpid() called by this process returns 1.
+ This process is not shown in /proc since process number 1 is always
+ An "initpid" entry is available in /proc/*/status to tell which
process is the fake init process.
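A tool could locate the fake init process by reading the "initpid" entry. The sketch below parses the text of a status file. The exact layout of the entry is not given here, so the "initpid: <number>" form is an assumption, and the function name find_initpid is hypothetical.

```python
def find_initpid(status_text):
    """Return the pid from an "initpid" entry in status_text, or None."""
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        # Assumed layout: "initpid: <number>", like the other status fields.
        if sep and key.strip() == "initpid":
            fields = value.split()
            if fields:
                return int(fields[0])
    return None
```

The text would typically come from reading /proc/self/status, or the status file of any other process in the same vserver.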
One nice thing about this feature is that /usr/sbin/vserver becomes
somewhat distribution independent. It simply runs /sbin/init to start
a vserver and then "/sbin/init 6" to stop it (and then kills the
remaining processes). There are some drawbacks (for now) though, and
input is welcome.
First, the vserver start-up is no longer synchronous. The
/usr/sbin/vserver script used to run "/etc/rc.d/rc 3" and wait until
it ended. Now, it runs /sbin/init, but /sbin/init won't end until the
vserver ends. So /usr/sbin/vserver has to leave /sbin/init running in
the background. This is a little annoying.
When a vserver is started like this, we no longer see each service
being started. Without fakeinit, we see each service getting started
and an OK/FAIL message for each. Now, it goes completely silent. I
have not investigated this behavior. I suspect /sbin/init is opening a
new tty (console) and runs the start-up scripts using that newly
opened console.
Since /sbin/init runs all the start-up code, we don't know when it is
done so we can't run the post-start section of the /etc/vservers/xx.sh
Note that both start-up strategies still work: fakeinit and the
original. So your current vserver installation will work as before
without any fiddling. Once we have ironed out the fakeinit drawbacks,
this will become the default way of doing things.
1.8. Some capabilities missing
The chcontext and reducecap utilities were incomplete. Many
capabilities were not handled. They are now complete.
2. Bug fixes
2.1. /usr/sbin/vserver-stat: some fixes
The vserver-stat utility had various output glitches.
Jacques Gelinas <jack_at_solucorp.qc.ca>
vserver: run general purpose virtual servers on one box, full speed!