> > Do you know what the race is?
> Apparently it's a race between deleting a process and accessing its
> /proc/pid entries. It came out in pidof while it was accessing
> /proc/pid/stat (fs/proc/array.c:do_task_stat crashed on first
> instruction - it was an inline function accessing task->state,
> get_task_state IIRC). oops (with vserver history data - I'm using a
> patch mentioned below) is attached.
> > How does one reproduce it?
> I managed to reproduce it (although not reliably) during high CPU load
> and I/O (parallel kernel compiles) on SMP systems with the vserver
> patch (http://linux-vserver.org, the exact patch is
> but the vserver maintainer pointed out that it probably is a mainline
> issue. We're not using 2.6 systems too much except for the vserver
> test beds so I cannot tell if it happens on vanilla kernels.
> > > The following micro-patch seems to fix it.
> > It might be right, or it might be a workaround..
> I'm not a kernel guru so it's just my proposal. Can it break anything?
> An alternative _might_ be somewhat coarser task_struct locking
> (do_task_stat grabs a spinlock but then it's already too late).
> However, if no "right" solution appears, I'll keep using my two-liner
> because it seems to help, at least in my setup.
Oh well, I got another oops in the very same place with the patch
applied. So I have now wrapped the check in read_lock(&tasklist_lock) /
read_unlock(&tasklist_lock) and added a check to do_task_stat (both
now have a printk). If it builds, boots and doesn't crash, I'll post
Vserver mailing list
Received on Tue Nov 29 13:25:37 2005