
From: Liam Helmer (linuxlists_at_thevenue.org)
Date: Sat 26 Jun 2004 - 04:51:38 BST


I'm getting kernel errors and a system hang using the new tools.

Configuration: kernel 2.6.6 with vserver 1.9.1
util-vserver 0.2.9-214
filesystems: mostly reiserfs and tmpfs. Using lvm-2 and device-mapper.
devfs: no
result: eventual system hang, all CPU being used up by ksoftirqd, reboot
won't work

I think it's related to namespace: it doesn't happen when the legacy
tools are creating the vserver (i.e. with the /etc/vservers/x.conf style),
but it does happen reproducibly with the new tools (/etc/vservers/x/*).
Further, specifying "nonamespace" (touch /etc/vservers/nonamespace) for
the vserver seems to prevent it from happening. I'm not 100% sure of this
yet, but I'm fairly sure. I'm going to leave it starting and stopping
overnight to be SURE that it doesn't happen anymore!
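
The overnight start/stop test could be scripted roughly as follows. This is only a sketch under assumptions: the function names, the `run` hook, the cycle limit and the idea of scanning `dmesg` output are illustrative and not taken from the setup described here; only the `vserver` command and the warning text come from this report.

```python
import subprocess

# Warning text taken from the kernel log in this report.
MARKER = "sleeping function called from invalid context"


def shell(cmd):
    """Run a command and return its stdout as text (a non-zero exit is tolerated)."""
    return subprocess.run(cmd, capture_output=True, text=True).stdout


def cycle_until_warning(name, max_cycles, run=shell):
    """Start and stop vserver `name` repeatedly.

    Returns the 1-based cycle number after which MARKER is first seen in
    the kernel log, or None if it is not seen within max_cycles.
    If the log already contains MARKER beforehand, this returns 1.
    """
    for n in range(1, max_cycles + 1):
        run(["vserver", name, "start"])
        run(["vserver", name, "stop"])
        if MARKER in run(["dmesg"]):
            return n
    return None
```

Calling `cycle_until_warning("x", 500)` as root on the test box, with `x` standing in for the vserver name, would run up to 500 cycles. Since the box eventually hangs, a result of `None` after a long run is the interesting outcome for the "nonamespace" case.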

----

Details:

Generally, the vserver will start and stop fine the first time. This is a simple test setup: no processes are started by init (each runlevel runs /bin/test as a placeholder), and the fakeinit flag is on in the vserver. After the vserver starts, 2 processes (cups and xinetd) are started using vserver <x> exec ... It's when these processes start that the errors occur.

The second time, the kernel will give the following error when the vserver is started:

------

Debug: sleeping function called from invalid context at include/linux/rwsem.h:66
in_atomic():1, irqs_disabled():0
Call Trace:
 [<c011e695>] __might_sleep+0xa5/0xd0
 [<c0181743>] __put_namespace+0x13/0x96
 [<c0135328>] __dealloc_vx_info+0x98/0xa0
 [<c0131f9e>] rcu_do_batch+0x2e/0x40
 [<c0132246>] rcu_process_callbacks+0x176/0x1a0
 [<c0125b71>] tasklet_action+0x61/0xb0
 [<c01258d9>] __do_softirq+0xa9/0xb0
 [<c012590d>] do_softirq+0x2d/0x30
 [<c011707c>] smp_apic_timer_interrupt+0xec/0x160
 [<c0107fca>] apic_timer_interrupt+0x1a/0x20
 [<c010505a>] default_idle+0x2a/0x40
 [<c01050dd>] cpu_idle+0x2d/0x40
 [<c059684b>] start_kernel+0x1ab/0x200

------

The third time the vserver is started, the following errors appear while starting processes in the vserver:

------

Debug: sleeping function called from invalid context at include/linux/rwsem.h:66
in_atomic():1, irqs_disabled():0
Call Trace:
 [<c011e695>] __might_sleep+0xa5/0xd0
 [<c0181743>] __put_namespace+0x13/0x96
 [<c0135328>] __dealloc_vx_info+0x98/0xa0
 [<c0131f9e>] rcu_do_batch+0x2e/0x40
 [<c0132246>] rcu_process_callbacks+0x176/0x1a0
 [<c011b7dd>] wake_up_process+0xd/0x20
 [<c0125b71>] tasklet_action+0x61/0xb0
 [<c01258d9>] __do_softirq+0xa9/0xb0
 [<c012590d>] do_softirq+0x2d/0x30
 [<c011707c>] smp_apic_timer_interrupt+0xec/0x160
 [<c0107fca>] apic_timer_interrupt+0x1a/0x20
 [<c010505a>] default_idle+0x2a/0x40
 [<c01050dd>] cpu_idle+0x2d/0x40
 [<c059684b>] start_kernel+0x1ab/0x200

bad: scheduling while atomic!
Call Trace:
 [<c044adff>] schedule+0x8af/0x8c0
 [<c014e08a>] __pagevec_release+0x1a/0x30
 [<c014e795>] truncate_inode_pages+0xb5/0x230
 [<c01645dd>] invalidate_inode_buffers+0xd/0x90
 [<c017d160>] generic_delete_inode+0x180/0x1a0
 [<c017d388>] iput+0x58/0x70
 [<c0179fcd>] prune_dcache+0x1bd/0x270
 [<c017a536>] shrink_dcache_parent+0x16/0x20
 [<c0168196>] generic_shutdown_super+0x26/0x200
 [<c0168fa6>] kill_anon_super+0x16/0x70
 [<c0167f87>] deactivate_super+0x77/0xf0
 [<c017f899>] umount_tree+0xe9/0x110
 [<c01083d3>] dump_stack+0x13/0x20
 [<c011e695>] __might_sleep+0xa5/0xd0
 [<c018177e>] __put_namespace+0x4e/0x96
 [<c0135328>] __dealloc_vx_info+0x98/0xa0
 [<c0131f9e>] rcu_do_batch+0x2e/0x40
 [<c0132246>] rcu_process_callbacks+0x176/0x1a0
 [<c011b7dd>] wake_up_process+0xd/0x20
 [<c0125b71>] tasklet_action+0x61/0xb0
 [<c01258d9>] __do_softirq+0xa9/0xb0
 [<c012590d>] do_softirq+0x2d/0x30
 [<c011707c>] smp_apic_timer_interrupt+0xec/0x160
 [<c0107fca>] apic_timer_interrupt+0x1a/0x20
 [<c010505a>] default_idle+0x2a/0x40
 [<c01050dd>] cpu_idle+0x2d/0x40
 [<c059684b>] start_kernel+0x1ab/0x200

------------
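
When reading the traces above, it can help to pull out just the function names. The helper below is a rough sketch for doing that: the name `frames` and the regular expression are illustrative assumptions, not part of any kernel tool. It is applied to a shortened copy of the first trace from this report.

```python
import re

# Matches entries such as "[<c0181743>] __put_namespace+0x13/0x96".
FRAME = re.compile(r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(trace):
    """Return the function names in a call trace, in the order they are listed."""
    return [m.group(2) for m in FRAME.finditer(trace)]


# Shortened copy of the first trace in this report.
SAMPLE = (
    "Call Trace: [<c011e695>] __might_sleep+0xa5/0xd0 "
    "[<c0181743>] __put_namespace+0x13/0x96 "
    "[<c0135328>] __dealloc_vx_info+0x98/0xa0 "
    "[<c0131f9e>] rcu_do_batch+0x2e/0x40 "
    "[<c0132246>] rcu_process_callbacks+0x176/0x1a0 "
    "[<c0125b71>] tasklet_action+0x61/0xb0 "
    "[<c01258d9>] __do_softirq+0xa9/0xb0"
)

names = frames(SAMPLE)
# True when the might-sleep check is listed above the softirq entry point,
# i.e. it fired while softirq work was being processed.
in_softirq = names.index("__might_sleep") < names.index("__do_softirq")
```

For the sample, `names` begins with `__might_sleep`, `__put_namespace`, `__dealloc_vx_info`, followed by the RCU and softirq functions, which fits the `in_atomic():1` line in the warning.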

At this point, ksoftirqd will start taking up all the available CPU time. The box will no longer reboot, and many vserver-related processes will hang. I have to hit the reset switch to reboot the box.

Cheers! Liam

_______________________________________________ Vserver mailing list Vserver_at_list.linux-vserver.org http://list.linux-vserver.org/mailman/listinfo/vserver

