
From: Herbert Pötzl (herbert_at_13thfloor.at)
Date: Wed 20 Aug 2003 - 05:07:34 BST


On Tue, Aug 19, 2003 at 05:47:19PM -0700, Roderick A. Anderson wrote:
> On Wed, 20 Aug 2003, Herbert Pötzl wrote:
>
> > about 90% of the kernel 'crashes' do not need any
> > further explanation (besides a crash report), about
> > what the user was doing or what was going on ...
>
> In this case nothing I can tell. The message file was pretty slim.

this doesn't necessarily mean that your case is
among the other 10%, does it?

> > > I have turned on kernel logging into /var/log/kernel but do not know what
> > > else I need to or should do to get better information for when I do
> > > have a crash. Case in point this morning or rather last night. Suddenly
> > > the system froze up. Would not respond to the keyboard and I had to press
> >
> > freezes and lockups are actually not kernel crashes,
> > but, if you want to get something useful in such a case
> > you have to do some preparations, namely
> >
> > - set up nmi_watchdog (this will cause a kernel oops
> > when the kernel is not responding ...)
> > - configure magic SysRq (you'll be able to activate
> > some kernel task/process/memory info in such a
> > case)
> >
> > - use lkcd, or a serial line to capture the kernel
> > oops (handwritten oopses are as much fun as
> > screenshots done with your webcam :( )
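
The preparations listed above can be verified on a running system before the next freeze. The following is only a minimal sketch in Python, assuming the usual Linux interfaces: the boot parameters are visible in /proc/cmdline, `nmi_watchdog=` and `console=ttyS...` are the relevant options, and /proc/sys/kernel/sysrq holds the magic SysRq setting. The function names `check_cmdline` and `sysrq_enabled` are hypothetical helpers, not part of any tool mentioned in this thread.

```python
import re


def check_cmdline(cmdline: str) -> dict:
    """Report which crash-capture boot parameters appear on a kernel command line."""
    params = cmdline.split()
    return {
        # nmi_watchdog=0 disables the watchdog; any other value requests a mode
        "nmi_watchdog": any(
            p.startswith("nmi_watchdog=") and p != "nmi_watchdog=0" for p in params
        ),
        # console=ttyS0,... sends kernel messages (including an oops) to a serial line
        "serial_console": any(re.match(r"console=ttyS\d+", p) for p in params),
    }


def sysrq_enabled(value: str) -> bool:
    """Interpret the text read from /proc/sys/kernel/sysrq: non-zero means enabled."""
    return int(value.strip()) != 0
```

To use it, pass in the contents of /proc/cmdline and /proc/sys/kernel/sysrq. If the sysrq file does not exist, the kernel was most likely built without magic SysRq support.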
>
> I'll research and get these going.
>
> > huh? what does e100/e1000 have to do with 2.4.21?
> > as far as I remember those were in 2.4.14 and for
> > sure are in 2.4.22-rc2 ...
>
> But not part of the binary ctx distribution. Remember we're dealing with
> a pretty low-on-the-food-chain computer guy. I only recently started
> compiling my own ctx kernels and then only when there was time and not on
> a running system. People that use alpha/beta/pre - me - releases deserve
> any trouble they get.

ahh, sorry, but I guess you'll get used to it,
probably very soon ...

> > > random for my liking to make me think it is a hardware issue. I did run
> > > memtest86 on the system before I put it online.
> >
> > for how long?
>
> Probably a day. I know, I know; sometimes it takes 3 days or longer to
> run.

well, a day is a good start, usually people tend
to start memtest, sit around for 5 to 10 minutes,
until the fascination of increasing numbers and
changing patterns subsides, then abort the test,
only to claim "I did a memtest, everything was fine!"

> > > Where I am trying to take this is: what information is needed to help
> > > determine the cause of the crashes so I (we) can point a finger in the
> > > right direction - hardware, software, wetware, or the ctx kernel and
> > > friends. Not to point fingers as much as to lend a hand to the
> > > developers.
> >
> > first step for any further investigation will be
> > some kind of kernel oops, parsed by ksymoops with
> > the correct kernel System.map ...
> >
> > further useful information (after a captured oops)
> > will be a detailed system description, and some
> > hardware tests ...
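
Why the correct System.map matters can be illustrated with a small sketch. This is not how ksymoops itself is implemented; it only shows the basic idea of mapping a raw address from an oops to the nearest preceding symbol, assuming the usual three-column System.map layout (hex address, type, name). The helper names `parse_system_map` and `resolve` are hypothetical.

```python
import bisect


def parse_system_map(text: str) -> list:
    """Parse System.map lines ('address type name') into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            symbols.append((int(fields[0], 16), fields[2]))
    symbols.sort()
    return symbols


def resolve(symbols: list, address: int) -> str:
    """Return 'name+0xoffset' for the closest symbol at or below the address."""
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return hex(address)  # below the first symbol: leave it unresolved
    base, name = symbols[index]
    return f"{name}+{address - base:#x}"
```

With a System.map from a different kernel build the symbol addresses shift, so the same lookup silently returns the wrong function name, which is why the map has to match the kernel that produced the oops.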
>
> Will get this ready for the next (though now that I am setting this up it
> won't happen) _unexplained_ lock-up/crash/freeze.

you should look at it from a more optimistic
perspective: if you are prepared, and fate decides
not to strike your system down, isn't that a win too?

best,
Herbert

> Rod
> --
> "Open Source Software - Sometimes you get more than you paid for..."


Generated on Wed 20 Aug 2003 - 05:36:25 BST by hypermail 2.1.3