[vserver] Re: State D secure mount process in Kernel 3.4.4

From: Linux VServer <uranus.lv_at_gmail.com>
Date: Wed 11 Jul 2012 - 08:29:27 BST
Message-ID: <CADR5UH4q1NO=LHu+KvPvU=XvSu4BUOf8qx+BuB19vAc4JBXzEg@mail.gmail.com>

Hi list,

traces now with addr2line

On Tue, Jul 10, 2012 at 1:43 PM, Linux VServer <uranus.lv@gmail.com> wrote:

> Hi List,
>
> I have state D Processes in task list and at this point no vserver
> restarts are possible.
>
> Patch: 3.4.4-vs2.3.3.5-nc
> UV: 0.30.216-pre3034
>
> I tracked that problem down and find the following traces in dmesg:
>
>

[588133.416078] INFO: task secure-mount:32496 blocked for more than 120
seconds.
[588133.423208] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[588133.431398] secure-mount D ffffffff8180c9a0 0 32496 19953
0x00000000

Address: ffffffff8180c9a0
__start_rodata
??:0

[588133.431404] ffff880938245c08 0000000000000082 ffff88110ec52de0
0000000000013880

Address: ffff880938245c08
??
??:0
Address: 0000000000000082
__per_cpu_start
??:0
Address: ffff88110ec52de0
??
??:0
Address: 0000000000013880
__per_cpu_start
??:0

[588133.431411] ffff880938245fd8 ffff880938244010 0000000000013880
0000000000013880

Address: ffff880938245fd8
??
??:0
Address: ffff880938244010
??
??:0
Address: 0000000000013880
__per_cpu_start
??:0
Address: 0000000000013880
__per_cpu_start
??:0

[588133.431418] ffff880938245fd8 0000000000013880 ffff8811fbf12de0
ffff88110ec52de0

Address: ffff880938245fd8
??
??:0
Address: 0000000000013880
__per_cpu_start
??:0
Address: ffff8811fbf12de0
??
??:0
Address: ffff88110ec52de0
??
??:0

[588133.431426] Call Trace:
[588133.431435] [<ffffffff8166b449>] schedule+0x29/0x70

_raw_read_unlock_irqrestore
??:0

[588133.431440] [<ffffffff8166c06d>] rwsem_down_failed_common+0xcd/0x170

get_current
/usr/src/linux-3.4.4/arch/x86/include/asm/current.h:14

[588133.431446] [<ffffffff811ee280>] ? init_once+0x20/0x20

proc_pid_wchan
/usr/src/linux-3.4.4/fs/proc/base.c:270

[588133.431450] [<ffffffff8166c123>] rwsem_down_write_failed+0x13/0x20

do_trap
/usr/src/linux-3.4.4/arch/x86/kernel/traps.c:147

[588133.431458] [<ffffffff8132c053>] call_rwsem_down_write_failed+0x13/0x20

hex_dump_to_buffer
/usr/src/linux-3.4.4/lib/hexdump.c:86

[588133.431462] [<ffffffff8166a302>] ? down_write+0x32/0x40

constant_test_bit
/usr/src/linux-3.4.4/arch/x86/include/asm/bitops.h:312

[588133.431468] [<ffffffff8118d90a>] sget+0x36a/0x500

__unregister_chrdev_region
/usr/src/linux-3.4.4/fs/char_dev.c:176

[588133.431474] [<ffffffff811ee6f0>] ? proc_mount+0x150/0x150

proc_pid_limits
/usr/src/linux-3.4.4/fs/proc/base.c:484

[588133.431480] [<ffffffff811ee5fa>] proc_mount+0x5a/0x150

proc_pid_limits
/usr/src/linux-3.4.4/fs/proc/base.c:467

[588133.431484] [<ffffffff8118d3d3>] mount_fs+0x43/0x210

cdev_purge
/usr/src/linux-3.4.4/fs/char_dev.c:442

[588133.431488] [<ffffffff811aa495>] vfs_kern_mount+0xc5/0x160

sys_pivot_root
/usr/src/linux-3.4.4/fs/namespace.c:2545

[588133.431491] [<ffffffff811aa5b2>] do_kern_mount+0x52/0x110

sys_pivot_root
/usr/src/linux-3.4.4/fs/namespace.c:2545

[588133.431499] [<ffffffff812b10b8>] ? security_capable+0x18/0x20

security_inode_listxattr
/usr/src/linux-3.4.4/security/security.c:584

[588133.431503] [<ffffffff811abd01>] do_mount+0x231/0x960

seq_open
/usr/src/linux-3.4.4/fs/seq_file.c:47

[588133.431509] [<ffffffff811ac4c0>] sys_mount+0x90/0xe0

might_fault
/usr/src/linux-3.4.4/include/linux/kernel.h:196

[588133.431516] [<ffffffff81674429>] system_call_fastpath+0x16/0x1b

call_function_single_interrupt
/usr/src/linux-3.4.4/arch/x86/kernel/entry_64.S:1042

>
> shortly before this trace I saw the following trace:
>
>

[588013.617254] INFO: task apache2:7696 blocked for more than 120 seconds.
[588013.631799] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[588013.641116] apache2 D 000000000000000f 0 7696 9371
0x00000000

Address: 000000000000000f
__per_cpu_start
??:0

[588013.641123] ffff8800c8c294d8 0000000000000082 ffff880ee91e5bc0
0000000000013880

Address: ffff8800c8c294d8
??
??:0
Address: 0000000000000082
__per_cpu_start
??:0
Address: ffff880ee91e5bc0
??
??:0
Address: 0000000000013880
__per_cpu_start
??:0

[588013.641129] ffff8800c8c29fd8 ffff8800c8c28010 0000000000013880
0000000000013880

Address: ffff8800c8c29fd8
??
??:0
Address: ffff8800c8c28010
??
??:0
Address: 0000000000013880
__per_cpu_start
??:0
Address: 0000000000013880
__per_cpu_start
??:0

[588013.641134] ffff8800c8c29fd8 0000000000013880 ffff8810eb132de0
ffff880ee91e5bc0

Address: ffff8800c8c29fd8
??
??:0
Address: 0000000000013880
__per_cpu_start
??:0
Address: ffff8810eb132de0
??
??:0
Address: ffff880ee91e5bc0
??
??:0

[588013.641139] Call Trace:
[588013.641148] [<ffffffff8166b449>] schedule+0x29/0x70

_raw_read_unlock_irqrestore
??:0

[588013.641153] [<ffffffff8166c06d>] rwsem_down_failed_common+0xcd/0x170

get_current
/usr/src/linux-3.4.4/arch/x86/include/asm/current.h:14

[588013.641157] [<ffffffff8166c123>] rwsem_down_write_failed+0x13/0x20

do_trap
/usr/src/linux-3.4.4/arch/x86/kernel/traps.c:147

[588013.641163] [<ffffffff8132c053>] call_rwsem_down_write_failed+0x13/0x20

hex_dump_to_buffer
/usr/src/linux-3.4.4/lib/hexdump.c:86

[588013.641167] [<ffffffff8166a302>] ? down_write+0x32/0x40

constant_test_bit
/usr/src/linux-3.4.4/arch/x86/include/asm/bitops.h:312

[588013.641173] [<ffffffff81139b1d>] unregister_shrinker+0x1d/0x60

__list_add
/usr/src/linux-3.4.4/include/linux/list.h:43

[588013.641178] [<ffffffff8118d159>] deactivate_locked_super+0x49/0x90

iterate_supers_type
/usr/src/linux-3.4.4/fs/super.c:584

[588013.641182] [<ffffffff8118df8a>] deactivate_super+0x4a/0x70

inode_set_bytes
/usr/src/linux-3.4.4/fs/stat.c:469

[588013.641187] [<ffffffff811a984a>] mntput_no_expire+0x11a/0x190

copy_tree
/usr/src/linux-3.4.4/fs/namespace.c:1280

[588013.641191] [<ffffffff811a98e3>] mntput+0x23/0x40

copy_tree
/usr/src/linux-3.4.4/fs/namespace.c:1280

[588013.641195] [<ffffffff811a992c>] kern_unmount+0x2c/0x40

copy_tree
/usr/src/linux-3.4.4/fs/namespace.c:1296

[588013.641201] [<ffffffff812a3405>] mq_put_mnt+0x15/0x20

sys_mq_timedreceive
/usr/src/linux-3.4.4/ipc/mqueue.c:1004

[588013.641204] [<ffffffff812a51af>] put_ipc_ns+0x3f/0x80

key_user_put
/usr/src/linux-3.4.4/security/keys/key.c:132

[588013.641207] [<ffffffff812a524e>] ipcns_put+0xe/0x10

key_user_lookup
/usr/src/linux-3.4.4/security/keys/key.c:70

[588013.641213] [<ffffffff811ed74c>] proc_evict_inode+0x6c/0x80

proc_self_follow_link
/usr/src/linux-3.4.4/fs/proc/base.c:2776

[588013.641216] [<ffffffff811a5813>] evict+0xb3/0x1c0

notify_change
/usr/src/linux-3.4.4/fs/attr.c:262

[588013.641220] [<ffffffff811a5a3c>] iput+0x11c/0x230

setattr_copy
/usr/src/linux-3.4.4/fs/attr.c:154

[588013.641223] [<ffffffff811a3a8c>] shrink_dentry_list+0x52c/0x570

__wait_on_freeing_inode
/usr/src/linux-3.4.4/fs/inode.c:1624

[588013.641227] [<ffffffff811a40fc>] prune_dcache_sb+0x18c/0x1c0

alloc_inode
/usr/src/linux-3.4.4/fs/inode.c:216

[588013.641231] [<ffffffff8118d098>] prune_super+0x158/0x1d0

iterate_supers_type
/usr/src/linux-3.4.4/fs/super.c:573

[588013.641235] [<ffffffff81139d5f>] shrink_slab+0x19f/0x2e0

shrink_inactive_list
/usr/src/linux-3.4.4/mm/vmscan.c:1538

[588013.641239] [<ffffffff8113c590>] do_try_to_free_pages+0x460/0x590

try_to_freeze
/usr/src/linux-3.4.4/include/linux/freezer.h:47

[588013.641243] [<ffffffff8113c94b>] try_to_free_pages+0x9b/0x120

trace_mm_vmscan_kswapd_wake
/usr/src/linux-3.4.4/include/trace/events/vmscan.h:56

[588013.641266] [<ffffffffa003d54d>] ?
megasas_build_and_issue_cmd+0x1cd/0x5c0 [megaraid_sas]

??
??:0

[588013.641272] [<ffffffff81130888>] __alloc_pages_nodemask+0x4e8/0x930

dirty_background_ratio_handler
/usr/src/linux-3.4.4/mm/page-writeback.c:356

[588013.641278] [<ffffffff8108ee7b>] ? dequeue_entity+0x12b/0x310

alloc_fair_sched_group
/usr/src/linux-3.4.4/kernel/sched/fair.c:5409

[588013.641283] [<ffffffff810abc6d>] ? ktime_get_ts+0xad/0xe0

sync_cmos_clock
/usr/src/linux-3.4.4/kernel/time/ntp.c:512

[588013.641290] [<ffffffff8116bbf6>] alloc_pages_vma+0xc6/0x1a0

trace_mm_compaction_isolate_freepages
/usr/src/linux-3.4.4/include/trace/events/compaction.h:41

[588013.641294] [<ffffffff811277b0>] ? sleep_on_page+0x20/0x20

filemap_fault
/usr/src/linux-3.4.4/mm/filemap.c:1703

[588013.641298] [<ffffffff8115ff02>] read_swap_cache_async+0xf2/0x150

setup_swap_map_and_extents
/usr/src/linux-3.4.4/mm/swapfile.c:1994

[588013.641302] [<ffffffff8115ffee>] swapin_readahead+0x8e/0xd0

setup_swap_extents
/usr/src/linux-3.4.4/mm/swapfile.c:1466

[588013.641307] [<ffffffff81150610>] handle_pte_fault+0x6a0/0xaf0

sys_mincore
/usr/src/linux-3.4.4/mm/mincore.c:268

[588013.641312] [<ffffffff811a836e>] ? vfsmount_lock_local_unlock+0x1e/0x30

copy_mount_options
/usr/src/linux-3.4.4/fs/namespace.c:2098

[588013.641317] [<ffffffff8119552d>] ? complete_walk+0x9d/0x120

do_lookup
/usr/src/linux-3.4.4/fs/namei.c:1259

[588013.641322] [<ffffffff81150c35>] handle_mm_fault+0x1d5/0x330

mincore_pmd_range
/usr/src/linux-3.4.4/mm/mincore.c:172

[588013.641327] [<ffffffff8166fdfa>] do_page_fault+0x14a/0x480

kprobe_optimizer
/usr/src/linux-3.4.4/kernel/kprobes.c:593

[588013.641331] [<ffffffff8108abf0>] ? __dequeue_entity+0x30/0x50

variable_test_bit
/usr/src/linux-3.4.4/arch/x86/include/asm/bitops.h:319

[588013.641336] [<ffffffff810127a9>] ? __switch_to+0x1a9/0x450

paravirt_write_msr
/usr/src/linux-3.4.4/arch/x86/include/asm/paravirt.h:138

[588013.641342] [<ffffffff81085153>] ? finish_task_switch+0x53/0x100

spin_unlock_irqrestore
/usr/src/linux-3.4.4/include/linux/spinlock.h:340

[588013.641346] [<ffffffff8166ad49>] ? __schedule+0x3f9/0x7d0

rt_mutex_lock_interruptible
/usr/src/linux-3.4.4/kernel/rtmutex.c:804

[588013.641350] [<ffffffff8166c8e5>] page_fault+0x25/0x30

default_do_nmi
/usr/src/linux-3.4.4/arch/x86/kernel/nmi.c:366

>
> This apache process remain in state d and so i think the state d from
> secure-mount is coming from the hung apache process.
>
> The trace itself looks like that there is a deadlock in the sem handling
> of vanilla or linux vserver.
>
> I got these state D processes in Kernel versions since 3.2.
>
> For a test i disabled CONFIG_CFS_BANDWIDTH because this new function was
> enabled in kernel since 3.2. Maybe that's the missing clue.
>
> CONFIG_BLK_CGROUP was not set, so therefore this seems not be related to
> each other.
>
>
> Are there any other hinds to get behind this error?
>
> Best regadrs
>
> uranus
>

best regards

uranus
Received on Wed Jul 11 08:29:40 2012

[Next/Previous Months] [Main vserver Project Homepage] [Howto Subscribe/Unsubscribe] [Paul Sladen's vserver stuff]
Generated on Wed 11 Jul 2012 - 08:29:40 BST by hypermail 2.1.8