
From: Jonathan Sambrook (jonathan.sambrook_at_dsvr.co.uk)
Date: Mon 15 Dec 2003 - 08:00:36 GMT


At 18:04 on Sat 13/12/03, herbert_at_13thfloor.at masquerading as 'Herbert Poetzl' wrote:
> On Fri, Dec 12, 2003 at 08:10:09PM +0000, Jonathan Sambrook wrote:
>
> Hi Jonathan!
>
> > > I totally agree that uts_sem isn't the appropriate
> > > lock to use in these places, but I don't see how this
> > > could cause a deadlock/panic as there is nothing
> > > to lock/panic if the rw sem is held for write ...
> >
> > Neither do I, but the call from exit.c seems to lock.
> >
> > > please correct me if you see any crash/lock condition
> > > there ...
> >
> > My machines running with these write locks replaced
> > by a spinlock have been up for over two days now,

Over four days of constant context creation and entering testing, still
trundling away quite happily.

> > instead of less than a minute. No other changes!
>
> okay, after Rik opened my eyes (I was too blind
> to look beyond the affected code), I now can confirm
> that this actually fixes a bug (race).
>
> for those interested:
>
> -> vc_new_s_context {tasklist_lock[r]}
> -> vx_assign_info
>
> so vc_new_s_context() is holding the tasklist_lock
> and vx_assign_info() might sleep on the uts_sem.

Doh! You keep staring at something for long enough.... and you _keep_
missing what's right under your nose. I'd been concentrating on
examining the context creation code path rather than context entering.

Okay, I'm a lot happier now I can see _why_ this fix works.
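
For anyone reading this in the archive, here is a rough userspace sketch
of the pattern Herbert describes: a sleeping lock taken while a
non-sleeping lock is already held. It is not the vserver code. Every
toy_* name below is made up for illustration, and the "locks" only count
nesting rather than actually locking anything. The only names taken from
this thread are tasklist_lock, uts_sem, vc_new_s_context and
vx_assign_info, which the toy functions loosely stand in for.

```c
#include <assert.h>
#include <stdio.h>

/* Toy model of one CPU's locking state. Hypothetical, not a kernel type. */
struct toy_cpu {
    int atomic_depth;    /* > 0 while a non-sleeping lock is held */
    int sleep_in_atomic; /* times a sleeping lock was taken while atomic */
};

/* Stand-ins for a read lock that must not be held across a sleep. */
static void toy_read_lock(struct toy_cpu *c)   { c->atomic_depth++; }
static void toy_read_unlock(struct toy_cpu *c) { c->atomic_depth--; }

/* Stand-ins for a spinlock: also non-sleeping, so nesting is allowed. */
static void toy_spin_lock(struct toy_cpu *c)   { c->atomic_depth++; }
static void toy_spin_unlock(struct toy_cpu *c) { c->atomic_depth--; }

/* Stand-in for taking a rw semaphore for write: it may sleep, so record
 * a violation if the caller already holds a non-sleeping lock. */
static void toy_down_write(struct toy_cpu *c)
{
    if (c->atomic_depth > 0)
        c->sleep_in_atomic++;
}
static void toy_up_write(struct toy_cpu *c) { (void)c; }

/* "Before": the assign step protects itself with the semaphore. */
static void toy_assign_info_sem(struct toy_cpu *c)
{
    toy_down_write(c);
    toy_up_write(c);
}

/* "After": the assign step protects itself with a spinlock instead. */
static void toy_assign_info_spin(struct toy_cpu *c)
{
    toy_spin_lock(c);
    toy_spin_unlock(c);
}

/* The caller holds the read lock around the assign step, as in the call
 * chain quoted above. Returns the number of violations recorded. */
static int toy_new_s_context(void (*assign)(struct toy_cpu *))
{
    struct toy_cpu c = { 0, 0 };

    toy_read_lock(&c);
    assign(&c);
    toy_read_unlock(&c);
    return c.sleep_in_atomic;
}
```

Calling toy_new_s_context(toy_assign_info_sem) records one violation,
while toy_new_s_context(toy_assign_info_spin) records none, which is the
shape of the problem and of the fix as I understand them.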

>
> another bugfix release (vs1.22) will be there soon
>
> > > what would be interesting is, how does vs1.3.0
> > > behave in your tests, as it replaced the allocation
> > > scheme completely (so no assign/alloc and no rw
> > > uts_sem to fix ;)
> >
> > Yes. Will examine on Monday.
> > But need to go live on some real machines soon too...
>
> would be still interesting to have that stuff tested
> as the locking is quite different ...

There was (is) no prospect of trying the new code on production
machines, but I will still take a look at the new code and have a play
with it - we are interested in the on-going development of the patch :)

First thing though is to get LKCD playing nicely again. 2.4.20 was fine,
but the support isn't there even for later vanilla kernels. You have to
pull from CVS, apply an extra patch from the mailing list and hope
they've found time in their 2.6 schedules to keep us 2.4 laggards in
mind. And of course you still have to resolve conflicts with your other
patches, e.g. s_context.

> thanks for spotting/fixing this,

You're welcome. I'm just extremely glad it's over :)

Jonathan

-- 
                   
Jonathan Sambrook
Software Developer
Designer Servers


_______________________________________________
Vserver mailing list
Vserver_at_list.linux-vserver.org
http://list.linux-vserver.org/mailman/listinfo/vserver


Generated on Mon 15 Dec 2003 - 08:01:58 GMT by hypermail 2.1.3