From: Sam Vilain (sam_at_vilain.net)
Date: Mon 24 Feb 2003 - 03:01:30 GMT
On Sat, 22 Feb 2003 22:39, DaveC wrote:
> >>EIP; c0123c30 <sys_release_ip_info+20/50>   <=====
>
> Trace; c011a9a8 <session_of_pgrp+48/70>
> Trace; c016f1f6 <tiocspgrp+66/90>
> Trace; c016f5e8 <tty_ioctl+258/370>
> Trace; c0147217 <sys_ioctl+217/230>
> Trace; c010883b <system_call+33/38>
> Code;  c0123c30 <sys_release_ip_info+20/50>
> 00000000 <_EIP>:
> Code;  c0123c30 <sys_release_ip_info+20/50>   <=====
>    0:   8b 01                     mov    (%ecx),%eax   <=====
> Code;  c0123c32 <sys_release_ip_info+22/50>
>    2:   48                        dec    %eax
> Code;  c0123c33 <sys_release_ip_info+23/50>
>    3:   85 c0                     test   %eax,%eax
Blast; I hadn't even tested chbind, just the sched functionality.
However, this is still interesting.  The exception is happening in 
sys_release_ip_info, during the reference count decrement:
        if (ip_info != NULL){
                down_write (&uts_sem);
                ip_info->refcount--;      <==== here
                if (ip_info->refcount == 0){
                        // printk ("vfree s_info %d\n",p->pid);
                        vfree (ip_info);
                }
                up_write (&uts_sem);
        }
It appears that the ip_info structure has already been deallocated, which 
you would not expect until all the processes in that context had exited.  
A counter is being decremented incorrectly somewhere...
However, my patch doesn't come near that code path at all, and this happens 
with or without --flag sched turned on.  IMHO there is a race condition 
here anyway; the semaphore is probably in a bad place, since ip_info is 
tested against NULL before uts_sem is taken.  But I'm having difficulty 
working out how it got called twice in the first place.
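To illustrate the point about where the lock sits, here is a minimal 
user-space sketch, not the actual kernel code.  It assumes a pthread mutex 
standing in for uts_sem, and the names release_ip_info_sketch, struct 
task_sketch and frees are made up for the example.  The idea is that the 
pointer is read, tested and cleared while the lock is held, so a second 
release on the same task finds NULL and does nothing:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; illustration only. */
struct ip_info_sketch {
        int refcount;
};

struct task_sketch {
        struct ip_info_sketch *ip_info;
};

/* A pthread mutex plays the role of uts_sem in this sketch. */
static pthread_mutex_t info_lock = PTHREAD_MUTEX_INITIALIZER;
static int frees;       /* counts how many times the structure was freed */

/* Returns 1 if a reference was dropped, 0 if there was nothing to drop. */
static int release_ip_info_sketch(struct task_sketch *t)
{
        struct ip_info_sketch *info;
        int dropped = 0;

        pthread_mutex_lock(&info_lock);
        info = t->ip_info;              /* read the pointer under the lock */
        if (info != NULL) {
                t->ip_info = NULL;      /* a repeat call now sees NULL */
                assert(info->refcount > 0);     /* catch an underflow early */
                if (--info->refcount == 0) {
                        free(info);
                        frees++;
                }
                dropped = 1;
        }
        pthread_mutex_unlock(&info_lock);
        return dropped;
}
```

With this shape a repeated call on one task cannot drop the count twice.  
It would not help if the count is being decremented wrongly somewhere 
else, which still needs tracking down.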
Thanks for the report; I'll look into it and come back with a more fully 
tested patch.
-- Sam Vilain, sam_at_vilain.net
"God is a comedian playing to an audience too afraid to laugh." - Voltaire