Re: [vserver] opteron server dies with vserver patch.

From: Herbert Poetzl <herbert_at_13thfloor.at>
Date: Mon 08 Aug 2011 - 19:29:37 BST
Message-ID: <20110808182936.GR12671@MAIL.13thfloor.at>

On Mon, Aug 08, 2011 at 08:09:53PM +0200, Paweł Sikora wrote:
> On Monday 08 of August 2011 20:01:18 Herbert Poetzl wrote:
>> On Mon, Aug 08, 2011 at 07:36:50PM +0200, Paweł Sikora wrote:
>>> On Monday 08 of August 2011 19:26:17 Paweł Sikora wrote:
>>>> On Monday 08 of August 2011 18:35:25 Paweł Sikora wrote:
>>>>> On Monday 08 of August 2011 17:57:05 Herbert Poetzl wrote:
>>>>>> On Mon, Aug 08, 2011 at 05:32:21PM +0200, Pawel Sikora wrote:
>>>>>>> On Monday 08 of August 2011 16:28:20 Herbert Poetzl wrote:

>>>>>>>> ah, I forgot, yes there is a way (not the easiest though)
>>>>>>>> to decode the output your ipmi console recorded:

>>>>>>>> - extract the byte sequence listed as Code:
>>>>>>>> - search for that sequence in your vmlinux
>>>>>>>> - add the offset of <xx> relative to the first byte (44)
>>>>>>>> - lookup the resulting address with addr2line or
>>>>>>>> compare it with objdump -t vmlinux
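
[Editor's sketch] The first three steps above could be scripted roughly as follows. This is only a minimal sketch, assuming the oops "Code:" line has been typed in as hex byte tokens with the marked byte written as <xx>; the helper names parse_code_line and find_fault_offsets are made up for illustration and are not part of any kernel tool. It reports offsets into the file, not kernel addresses.

```python
import re

def parse_code_line(line):
    """Return (code_bytes, fault_index) from an oops 'Code:' line.

    fault_index is the 0-based position of the byte marked <xx>,
    or None if no byte is marked.
    """
    tokens = line.split("Code:", 1)[-1].split()
    data = bytearray()
    fault_index = None
    for tok in tokens:
        m = re.fullmatch(r"<?([0-9a-fA-F]{2})>?", tok)
        if m is None:
            continue  # skip anything that is not a hex byte
        if tok.startswith("<"):
            fault_index = len(data)
        data.append(int(m.group(1), 16))
    return bytes(data), fault_index

def find_fault_offsets(image, code, fault_index):
    """Return the file offset of the marked byte for every match of code."""
    offsets = []
    start = image.find(code)
    while start != -1:
        offsets.append(start + fault_index)
        start = image.find(code, start + 1)
    return offsets

if __name__ == "__main__":
    # Short illustrative sequence; a real Code: line is much longer.
    code, idx = parse_code_line("Code: 45 31 c0 <83> ce ff eb c0")
    with open("vmlinux", "rb") as f:  # assumed path to the uncompressed image
        print([hex(o) for o in find_fault_offsets(f.read(), code, idx)])
```

More than one hit means the sequence is ambiguous and more bytes are needed. A file offset still has to be translated into a virtual address (the section table from objdump -h lists both the file offset and the VMA of each section) before it can be handed to addr2line.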

>>>>>>> i've copied v3.0+vserver vmlinuz + more stack traces to web site:

>>>>>>> http://pluto.agmk.net/kernel/vmlinuz-3.0.0-vs2.3.1-pre8-dirty
>>>>>>> http://pluto.agmk.net/kernel/vserver-crash.jpg

>>>>>> good idea, but could you also upload the vmlinux (x not z)
>>>>>> from your build tree?

>>>>> bzipped vmlinux uploaded: http://pluto.agmk.net/kernel/vmlinux.bz2

>>>> update, screenshot with backtraces is for 2.6.39.y+vserver,
>>>> correct vmlinux uploaded.

>>>> i've isolated so far two code points from screenshot:

>>>> trace 1, 3, 5:

>>>> ffffffff81165a19: 90 nop
>>>> ffffffff81165a1a: 90 nop
>>>> ffffffff81165a1b: 90 nop
>>>> ffffffff81165a1c: 90 nop
>>>> ffffffff81165a1d: 90 nop
>>>> ffffffff81165a1e: 90 nop
>>>> ffffffff81165a1f: 90 nop

>>>> ffffffff81165a20 <vfsmount_lock_local_lock>:
>>>> ffffffff81165a20: 55 push %rbp
>>>> ffffffff81165a21: 48 c7 c0 f0 ee 00 00 mov $0xeef0,%rax
>>>> ffffffff81165a28: 48 89 e5 mov %rsp,%rbp
>>>> ffffffff81165a2b: 65 48 03 04 25 30 c5 add %gs:0xc530,%rax
>>>> ffffffff81165a32: 00 00
>>>> ffffffff81165a34: ba 00 01 00 00 mov $0x100,%edx
>>>> ffffffff81165a39: f0 66 0f c1 10 lock xadd %dx,(%rax)
>>>> ffffffff81165a3e: 38 f2 cmp %dh,%dl
>>>> ffffffff81165a40: 74 06 je ffffffff81165a48 <vfsmount_lock_local_lock+0x28>
>>>> ffffffff81165a42: f3 90 pause
>>>> ffffffff81165a44: 8a 10 mov (%rax),%dl
>>>> ffffffff81165a46: eb f6 jmp ffffffff81165a3e <vfsmount_lock_local_lock+0x1e>
>>>> ffffffff81165a48: c9 leaveq
>>>> ffffffff81165a49: c3 retq

>>>> trace 4:

>>>> ffffffff81161310 <d_lookup>:
>>>> ffffffff81161310: 55 push %rbp
>>>> ffffffff81161311: 48 89 e5 mov %rsp,%rbp
>>>> ffffffff81161314: 48 83 ec 20 sub $0x20,%rsp
>>>> ffffffff81161318: 4c 89 65 f0 mov %r12,-0x10(%rbp)
>>>> ffffffff8116131c: 4c 89 6d f8 mov %r13,-0x8(%rbp)
>>>> ffffffff81161320: 49 89 f4 mov %rsi,%r12
>>>> ffffffff81161323: 48 89 5d e8 mov %rbx,-0x18(%rbp)
>>>> ffffffff81161327: 49 89 fd mov %rdi,%r13
>>>> ffffffff8116132a: 8b 1d 50 0f 6a 00 mov 0x6a0f50(%rip),%ebx # ffffffff81802280 <rename_lock>
>>>> ffffffff81161330: f6 c3 01 test $0x1,%bl
>>>> ffffffff81161333: 75 26 jne ffffffff8116135b <d_lookup+0x4b>
>>>> ffffffff81161335: 4c 89 e6 mov %r12,%rsi
>>>> ffffffff81161338: 4c 89 ef mov %r13,%rdi
>>>> ffffffff8116133b: e8 d0 fd ff ff callq ffffffff81161110 <__d_lookup>
>>>> ffffffff81161340: 48 85 c0 test %rax,%rax
>>>> ffffffff81161343: 75 08 jne ffffffff8116134d <d_lookup+0x3d>
>>>> ffffffff81161345: 3b 1d 35 0f 6a 00 cmp 0x6a0f35(%rip),%ebx # ffffffff81802280 <rename_lock>
>>>> ffffffff8116134b: 75 dd jne ffffffff8116132a <d_lookup+0x1a>
>>>> ffffffff8116134d: 48 8b 5d e8 mov -0x18(%rbp),%rbx
>>>> ffffffff81161351: 4c 8b 65 f0 mov -0x10(%rbp),%r12
>>>> ffffffff81161355: 4c 8b 6d f8 mov -0x8(%rbp),%r13
>>>> ffffffff81161359: c9 leaveq
>>>> ffffffff8116135a: c3 retq

>>>> trace 2:

>>>> in progress....

>>> possible points:

>> what do you mean by 'possible'?
>> do all 64 bytes match (from the Code line) for both locations?

> looks like a full pattern matches only for this code:

> ffffffff811612fe: 45 31 c0 xor %r8d,%r8d
> ffffffff81161301: 83 ce ff or $0xffffffff,%esi
> ffffffff81161304: eb c0 jmp ffffffff811612c6 <__d_lookup+0x1b6>
> ffffffff81161306: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
> ffffffff8116130d: 00 00 00

> ffffffff81161310 <d_lookup>:
> ffffffff81161310: 55 push %rbp
> ffffffff81161311: 48 89 e5 mov %rsp,%rbp
> ffffffff81161314: 48 83 ec 20 sub $0x20,%rsp
> ffffffff81161318: 4c 89 65 f0 mov %r12,-0x10(%rbp)
> ffffffff8116131c: 4c 89 6d f8 mov %r13,-0x8(%rbp)
> ffffffff81161320: 49 89 f4 mov %rsi,%r12
> ffffffff81161323: 48 89 5d e8 mov %rbx,-0x18(%rbp)
> ffffffff81161327: 49 89 fd mov %rdi,%r13
> ffffffff8116132a: 8b 1d 50 0f 6a 00 mov 0x6a0f50(%rip),%ebx # ffffffff81802280 <rename_lock>
> ffffffff81161330: f6 c3 01 test $0x1,%bl
> ffffffff81161333: 75 26 jne ffffffff8116135b <d_lookup+0x4b>
> ffffffff81161335: 4c 89 e6 mov %r12,%rsi
> ffffffff81161338: 4c 89 ef mov %r13,%rdi
> ffffffff8116133b: e8 d0 fd ff ff callq ffffffff81161110 <__d_lookup>
> ffffffff81161340: 48 85 c0 test %rax,%rax
> ffffffff81161343: 75 08 jne ffffffff8116134d <d_lookup+0x3d>
> ffffffff81161345: 3b 1d 35 0f 6a 00 cmp 0x6a0f35(%rip),%ebx # ffffffff81802280 <rename_lock>
> ffffffff8116134b: 75 dd jne ffffffff8116132a <d_lookup+0x1a>
> ffffffff8116134d: 48 8b 5d e8 mov -0x18(%rbp),%rbx
> ffffffff81161351: 4c 8b 65 f0 mov -0x10(%rbp),%r12
> ffffffff81161355: 4c 8b 6d f8 mov -0x8(%rbp),%r13
> ffffffff81161359: c9 leaveq
> ffffffff8116135a: c3 retq
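
[Editor's sketch] Once an address is known, naming the enclosing function is a nearest-symbol lookup, which is what comparing against a symbol table amounts to. The sketch below assumes nm-style lines ("address type name"); the three-entry table is a hypothetical excerpt assembled from the addresses quoted in this thread, and load_symbols and resolve are illustrative names only.

```python
import bisect

def load_symbols(nm_text):
    """Parse 'address type name' lines into a sorted list of (addr, name)."""
    symbols = []
    for line in nm_text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols

def resolve(symbols, addr):
    """Return 'name+0xoff' for the closest symbol at or below addr."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # addr lies below every known symbol
    base, name = symbols[i]
    return "%s+0x%x" % (name, addr - base)

# Hypothetical excerpt; the symbol types are assumed.
NM_TEXT = """\
ffffffff81161110 T __d_lookup
ffffffff81161310 T d_lookup
ffffffff81165a20 T vfsmount_lock_local_lock
"""

if __name__ == "__main__":
    syms = load_symbols(NM_TEXT)
    print(resolve(syms, 0xffffffff811612fe))
    print(resolve(syms, 0xffffffff8116134d))
```

With this table the first address in the dump above resolves into the tail of __d_lookup rather than d_lookup, which is why the offset of the marked byte matters when the matched sequence spans two functions.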

could you upload those sections (without line breaks) and
also mark the <xx> bytes in the dumps? I'm having a hard
time determining them from the screenshot ...
(it seems to be cut off on the right side)

btw, how long does it take till those traces show up and
in what way do they affect the system? (total crash,
single cpu blocked, zombie, nothing)

thanks in advance,
Herbert
Received on Mon Aug 8 19:29:48 2011
