Hi,
To rephrase from IRC:
[15:16] <ard> Bertl_zZ : we get a hard lockup. it's a regression between the 3.7.2 and 3.7.10 patches
[15:17] <ard> The difference is that:
[15:17] <ard> < + spin_lock_irqsave(&nxi->addr_lock, irqflags);
[15:17] <ard> was working and:
[15:17] <ard> > + WARN_ON_ONCE(in_irq());
[15:17] <ard> > + spin_lock(&nxi->addr_lock);
[15:18] <ard> gives us a BUG: spinlock recursion on CPU#....
[15:18] <ard> We can only replicate(!) this behaviour on servers with a very specific kind of load...
[15:38] <ard> maybe I should also say that this is in kernel/vserver/inet.c
[15:40] <ard> I think the problem is that the structure gets locked, then an interrupt comes in, which does a tcp_v4_rcv, and that
wants to take the same lock.
[15:55] <ard> If I am reading https://www.kernel.org/doc/htmldocs/kernel-locking.html correctly, and if the documentation is still
valid, we should use spin_lock_bh() since the lowest level is a softirq
[15:56] <ard> we only care about the softirq running on the same CPU, because that is the one that causes the deadlock
[17:28] <ard> changing the spin_(un)lock into spin_(un)lock_bh fixed the problem
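
To illustrate the reasoning above without a kernel at hand, here is a toy single-CPU model in userspace C. It is only a sketch: every toy_* name and run_scenario() are made up for this mail and are not kernel API, the "softirq" is just a function call, and where the real kernel would spin forever (or print "BUG: spinlock recursion") the model merely counts the event.

```c
#include <stdbool.h>

/* Toy model of one CPU. Nothing here is kernel API; all names are hypothetical. */
struct toy_lock { bool held; };

static struct toy_lock addr_lock;  /* stands in for nxi->addr_lock */
static bool bh_disabled;           /* softirqs are masked on this CPU */
static bool softirq_pending;       /* a softirq was raised while masked */
static int recursions;             /* lock attempts made while already held */

/* Returns true if the lock was taken. A real spinlock would spin forever
 * instead of returning false, which is the hard lockup on a single CPU. */
static bool toy_lock_acquire(struct toy_lock *l)
{
    if (l->held) {
        recursions++;
        return false;
    }
    l->held = true;
    return true;
}

static void toy_lock_release(struct toy_lock *l)
{
    l->held = false;
}

/* Models the receive path (tcp_v4_rcv in the log) wanting the same lock. */
static void toy_softirq(void)
{
    if (toy_lock_acquire(&addr_lock))
        toy_lock_release(&addr_lock);
}

/* A packet arrives: run the softirq now, or defer it if softirqs are masked. */
static void toy_raise_softirq(void)
{
    if (bh_disabled) {
        softirq_pending = true;
        return;
    }
    toy_softirq();
}

/* Re-enable softirqs and run whatever was deferred in the meantime. */
static void toy_local_bh_enable(void)
{
    bh_disabled = false;
    if (softirq_pending) {
        softirq_pending = false;
        toy_softirq();
    }
}

/* Process context takes the lock and a packet arrives on the same CPU
 * inside the critical section. use_bh selects the spin_lock_bh()-like
 * ordering: mask softirqs, lock, unlock, unmask.
 * Returns the number of recursive lock attempts observed. */
static int run_scenario(bool use_bh)
{
    recursions = 0;
    softirq_pending = false;
    bh_disabled = false;
    addr_lock.held = false;

    if (use_bh)
        bh_disabled = true;
    toy_lock_acquire(&addr_lock);
    toy_raise_softirq();           /* interrupt in the middle of the section */
    toy_lock_release(&addr_lock);
    if (use_bh)
        toy_local_bh_enable();
    return recursions;
}
```

In this model run_scenario(false) returns 1 (the softirq hits the lock its own CPU already holds) and run_scenario(true) returns 0 (the softirq is deferred until after the unlock). A softirq on another CPU is not modelled, since it would simply spin until the lock is released.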
The enormous patch is attached.
Regards,
Ard