Re: [Vserver] Re: [parisc 32bit] first test of 'stable' vs 2.6.16-vs2.0.2-rc14 (-pa10 parisc tree)

From: Herbert Poetzl <herbert_at_13thfloor.at>
Date: Mon 03 Apr 2006 - 18:22:54 BST
Message-ID: <20060403172254.GA4883@MAIL.13thfloor.at>

On Mon, Apr 03, 2006 at 03:44:40PM +0100, Joel Soete wrote:
> > Hello Herbert,
> >
> >
> > Herbert Poetzl wrote:
> > > On Fri, Mar 31, 2006 at 06:27:51PM +0100, Joel Soete wrote:
> > >
> > [snip]
> > >>That said, do you remember how new this option is? (I don't have
> > >>enough space to keep every kernel and config ;<( )
> > >>
> > >>Well, some time ago (as far as I can trace back: around 2.6.12-rc1), I
> > >>already tried 'just disabling' a debug option, in the hope that it would
> > >>work better: the actual effect was only to disable the "BUG ..." printout,
> > >>but the underlying bug was still there and the kernel still misbehaved.
> > >
> > >
> > > not disabling the de'bug' option, but the hang check
> > > and probably submitting something to lkml so that
> > > folks there could look into it ...
> > >
> > > at least I assume this happens with a vanilla kernel
> > > too, if not, please let me know ...
> > >
> > I never noticed this with the parisc cvs tree, even when I use the vps as a
> > simple chroot (with just cron started).
> >

> please read more carefully: with a 32-bit kernel (it seems that 64-bit UP
> and SMP kernels expose some unexpected hangs, but that was already reported
> to the parisc-linux m-l), I never noticed this with the parisc-cvs src
> tree (even when I use the vps fs space as a simple chroot).

> i.e.:
> 1/ I didn't notice hiccups like this with the same kernel without the
> vserver patch

> 2/ and fwiw I never saw the system die like this:
> tons of messages like:

I assume you never had _that_ many processes on your
system before ...

> oom-killer: gfp_mask=0xd0, order=2
>
> Backtrace:
>
> [<10154980>] out_of_memory+0x17c/0x19c
> [<10156b94>] __alloc_pages+0x348/0x3a8
> [<10156fcc>] __get_free_pages+0x2c/0x98
> [<10123efc>] copy_process+0x18c/0x1410
> [<10103db8>] do_fork+0x78/0x204
> [<10109c30>] __kernel_thread+0x30/0x40
> [<10121a58>] try_to_wake_up+0xe4/0x1ec
> [<10120800>] __wake_up_common+0x78/0xc4
> [<1013d2dc>] keventd_create_kthread+0x24/0x74
> [<101393a8>] run_workqueue+0x84/0x128
> [<10139664>] worker_thread+0x124/0x180
> [<1013d4e4>] kthread+0x144/0x14c
> [<10109c5c>] ret_from_kernel_thread+0x1c/0x24

well, that looks like a 'normal' out-of-memory case
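
for reference, a rough way to read "gfp_mask=0xd0, order=2" in those traces:
order 2 means 2^2 = 4 contiguous pages (16 kB with 4 kB pages), and 0xd0
matches the 2.6-era GFP_KERNEL flags, so copy_process() is most likely
failing to get a new task's kernel stack. a tiny sketch of that arithmetic
(the flag values are assumed from the 2.6.16 headers, not copied from this
tree):

    /* Hedged sketch: decode "gfp_mask=0xd0, order=2" from the oom output
     * above. Assumes 4 kB pages and the 2.6.16-era GFP flag values; just
     * arithmetic for illustration, not kernel code. */
    #include <stdio.h>

    #define PAGE_SIZE   4096UL
    #define __GFP_WAIT  0x10u   /* allocator may sleep */
    #define __GFP_IO    0x40u   /* allocator may start disk I/O */
    #define __GFP_FS    0x80u   /* allocator may call into the fs to reclaim */

    int main(void)
    {
        unsigned int gfp_mask = 0xd0;   /* value reported by the oom-killer */
        unsigned int order    = 2;      /* allocation order */

        printf("allocation size: %lu bytes (%u contiguous pages)\n",
               PAGE_SIZE << order, 1u << order);
        printf("looks like GFP_KERNEL? %s\n",
               gfp_mask == (__GFP_WAIT | __GFP_IO | __GFP_FS) ? "yes" : "no");
        return 0;
    }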

> Mem-info:
>
> DMA per-cpu:
> cpu 0 hot: high 90, batch 15 used:11
> cpu 0 cold: high 30, batch 7 used:19
>
> DMA32 per-cpu: empty
> Normal per-cpu: empty
> HighMem per-cpu: empty
>
> Free pages: 11024kB (0kB HighMem)
>
> Active:11675 inactive:11965 dirty:2 writeback:2 unstable:0 free:2756 slab:8258
> mapped:23361 pagetables:14416
> DMA free:11024kB min:2048kB low:2560kB high:3072kB active:46700kB
> inactive:47860kB present:262144kB pages_scanned:50944 all_unreclaimable? no
>
> lowmem_reserve[]: 0 0 0 0
>
> DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB
> pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
>
> Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB
> pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
>
> HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB
> present:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
>
> DMA: 2098*4kB 265*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB
> 0*2048kB 0*4096kB = 11024kB
> DMA32: empty
>
> Normal: empty
> HighMem: empty
>
> Swap cache: add 578297, delete 577842, find 50692/109462, race 98+52
>
> Free swap = 0kB
> Total swap = 517480kB
> Free swap: 0kB

even swap space is exhausted ...
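
if it would help to see how quickly memory and swap drain before the box
locks up, here is a minimal polling sketch, assuming the standard MemFree /
SwapFree fields of /proc/meminfo; redirecting its output to a file over a
few hours is one crude way to spot a leak:

    /* Hedged sketch: print MemFree/SwapFree from /proc/meminfo once per
     * interval, to watch how fast memory and swap run out. Field names
     * assume the standard /proc/meminfo layout. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        char line[128];

        for (;;) {
            FILE *f = fopen("/proc/meminfo", "r");
            if (!f) {
                perror("/proc/meminfo");
                return 1;
            }
            while (fgets(line, sizeof(line), f)) {
                if (!strncmp(line, "MemFree:", 8) ||
                    !strncmp(line, "SwapFree:", 9))
                    fputs(line, stdout);
            }
            fclose(f);
            fflush(stdout);
            sleep(10);      /* poll every 10 seconds */
        }
    }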

> 65536 pages of RAM
> 1725 reserved pages
> 232835 pages shared
> 455 pages swap cached
>
> Out of Memory: Kill process 1859 (sendmail) score 4190 and children.
> Out of memory: Killed process 2708 (exim4).
>
> oom-killer: gfp_mask=0xd0, order=2

again a 'quite normal' OOM kill

> Backtrace:
>
> [<10154980>] out_of_memory+0x17c/0x19c
> [<10156b94>] __alloc_pages+0x348/0x3a8
> [<10156fcc>] __get_free_pages+0x2c/0x98
> [<10123efc>] copy_process+0x18c/0x1410
> [<10103db8>] do_fork+0x78/0x204
> [<10109c30>] __kernel_thread+0x30/0x40
> [<10121a58>] try_to_wake_up+0xe4/0x1ec
> [<10120800>] __wake_up_common+0x78/0xc4
> [<1013d2dc>] keventd_create_kthread+0x24/0x74
> [<101393a8>] run_workqueue+0x84/0x128
> [<10139664>] worker_thread+0x124/0x180
> [<1013d4e4>] kthread+0x144/0x14c
> [<10109c5c>] ret_from_kernel_thread+0x1c/0x24
>
>
> Mem-info:
>
> DMA per-cpu:
>
> cpu 0 hot: high 90, batch 15 used:4
> cpu 0 cold: high 30, batch 7 used:2
>
> DMA32 per-cpu: empty
> Normal per-cpu: empty
> HighMem per-cpu: empty
>
> Free pages: 11072kB (0kB HighMem)
>
> Active:11803 inactive:11887 dirty:2 writeback:2 unstable:0 free:2768 slab:8253
> mapped:23364 pagetables:14416
> DMA free:11072kB min:2048kB low:2560kB high:3072kB active:47212kB
> inactive:47548kB present:262144kB pages_scanned:10295 all_unreclaimable? no
>
> lowmem_reserve[]: 0 0 0 0
>
> DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB
> pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
>
> Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB
> pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
>
> HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB
> present:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
>
> DMA: 2108*4kB 266*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB
> 0*2048kB 0*4096kB = 11072kB
> DMA32: empty
>
> Normal: empty
> HighMem: empty
>
> Swap cache: add 578352, delete 577897, find 50703/109478, race 98+52
>
> Free swap = 0kB
> Total swap = 517480kB
> Free swap: 0kB
>
> 65536 pages of RAM
> 1725 reserved pages
> 232805 pages shared
>
> 455 pages swap cached

you might try to look for some kind of memory
leak and to disable memory overcommitment; that
will at least give ENOMEM instead of invoking
the OOM killer ...
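
for what it's worth, "disable memory overcommitment" here means switching to
strict accounting, usually just `echo 2 > /proc/sys/vm/overcommit_memory`
(plus an overcommit_ratio to taste); below is a minimal sketch of the same
thing in C, assuming the standard /proc/sys/vm knobs, so that allocations
beyond swap + ratio% of RAM fail with ENOMEM instead of the OOM killer
picking victims later:

    /* Hedged sketch: switch to strict overcommit accounting (mode 2) via
     * the standard procfs knobs. Equivalent to:
     *   echo 2  > /proc/sys/vm/overcommit_memory
     *   echo 80 > /proc/sys/vm/overcommit_ratio   (ratio is just an example)
     * Must run as root; paths assume the usual /proc layout. */
    #include <stdio.h>
    #include <stdlib.h>

    static int write_knob(const char *path, const char *value)
    {
        FILE *f = fopen(path, "w");
        if (!f) {
            perror(path);
            return -1;
        }
        fprintf(f, "%s\n", value);
        return fclose(f);
    }

    int main(void)
    {
        /* 2 = strict accounting: commit limit = swap + ratio% of RAM */
        if (write_knob("/proc/sys/vm/overcommit_memory", "2"))
            return EXIT_FAILURE;
        /* example ratio only; tune to the machine */
        if (write_knob("/proc/sys/vm/overcommit_ratio", "80"))
            return EXIT_FAILURE;
        return EXIT_SUCCESS;
    }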

btw, I'm a little confused by the kernel version
as I currently use 2.6.16-pa5-vs2.1.1-rc14 ...

best,
Herbert

> [...]
>
> > >
> > >>But that was a long time ago, and as you're the second one to suggest
> > >>'just disabling this feature', I need to test.
> > >>
> > >>The first thing is that the system behaviour is still erratic:
> > >> 1/ at the console, a return can be answered immediately or about 30s later
> > >>
> > >> 2/ the same in an ssh connection: ls, vserver-stat; sometimes an immediate
> > >>answer, sometimes a wait (even in the middle of typing the command line)
> > >>
> > >> 3/ I managed to enter a vps (after waiting about 20 min), but it only
> > >>responds from time to time
> > >>
> > >> 4/ but the system is still alive .... I'll let it run over the week-end
> > >
> > >
> > > sounds like a major scheduling issue ...
> > >
> > Not sure what happened: when the system was a bit responsive, I launched a
> > top, which at times showed me about 50 cron child processes and/or more than
> > 20 logcheck processes, even before I had started a vps server (so I presume
> > it is a problem related to the kernel, I don't know exactly what, other than
> > the fact that I built the util-vserver tools against glibc, since dietlibc
> > is not available for parisc).
> >
> > I will check on Monday if the system is still alive ;-)
> >
> So the system dies the same way with or without CONFIG_DETECT_SOFTLOCKUP.
> (the same for the 64-bit kernel problem)
>
> I will see if I can find something more helpful.
>
> Thanks again,
> Joel
>
>
_______________________________________________
Vserver mailing list
Vserver@list.linux-vserver.org
http://list.linux-vserver.org/mailman/listinfo/vserver