[00:31] Simon (~sgarner@210.54.177.190) joined #vserver.
[00:51] JonB (~jon@129.142.112.33) joined #vserver.
[01:05] Nick change: riel -> unriel
[01:33] Nick change: Bertl -> Bertl_zZ
[01:50] infowolfe (infowolfe@pcp04891550pcs.frnkmd01.md.comcast.net) left irc: Quit: Trillian (http://www.ceruleanstudios.com)
[02:02] surriel (~riel@66.92.77.98) joined #vserver.
[02:13] JonB (~jon@129.142.112.33) left irc: Quit: Client exiting
[06:11] infowolfe (infowolfe@68.33.215.209) joined #vserver.
[08:05] infowolfe (infowolfe@68.33.215.209) left irc: Quit: Trillian (http://www.ceruleanstudios.com)
[08:05] infowolfe (infowolfe@68.33.215.209) joined #vserver.
[08:59] shuri (~ipv6@207.236.226.187) joined #vserver.
[09:08] shuri (~ipv6@207.236.226.187) left irc: Quit: ipv6
[10:15] infowolfe (infowolfe@68.33.215.209) left #vserver.
[10:36] virtuoso (~shisha@ip114-115.adsl.wplus.ru) got netsplit.
[10:36] ensc (~ircensc@ultra.csn.tu-chemnitz.de) got netsplit.
[10:36] Bertl_zZ (~herbert@MAIL.13thfloor.at) got netsplit.
[10:36] maja|ipv6 (maharaja@ipax.tk) got netsplit.
[10:47] maja|ipv6 (maharaja@ipax.tk) got lost in the net-split.
[10:47] Bertl_zZ (~herbert@MAIL.13thfloor.at) got lost in the net-split.
[10:47] ensc (~ircensc@ultra.csn.tu-chemnitz.de) got lost in the net-split.
[10:47] virtuoso (~shisha@ip114-115.adsl.wplus.ru) got lost in the net-split.
[10:47] virtuoso (~shisha@ip114-115.adsl.wplus.ru) joined #vserver.
[10:47] ensc (~ircensc@ultra.csn.tu-chemnitz.de) joined #vserver.
[10:47] Bertl_zZ (~herbert@MAIL.13thfloor.at) joined #vserver.
[10:47] maja|ipv6 (maharaja@ipax.tk) joined #vserver.
[11:24] say (~say@212.86.243.154) joined #vserver.
[11:24] say_ (~say@212.86.243.154) left irc: Read error: Connection reset by peer
[11:36] nick Bertl
[11:36] Nick change: Bertl_zZ -> Bertl
[11:36] good morning ;)
[11:44] Hi Herbert.
[11:44] hi!
[12:58] BobR (~georg@oglgogl.BMTP.AKH-Wien.ac.at) joined #vserver.
[12:58] hi bob!
[13:02] hello there
[13:02] hi!
[13:02] hey herbert, how are you?
[13:02] fine, thanks, how are you?
[13:03] pretty good
[13:03] been in training for the week, so i've had plenty of time to think and plan :)
[13:04] my god, training sucks
[13:04] good ;)
[13:04] what do you do when you're not hacking vserver herbert?
[13:04] I'm currently working at our general hospital ... but usually having fun or doing linux consulting ;)
[13:05] heh
[13:05] are you making a living doing linux consulting?
[13:05] hmm .. well not really .. but I probably could if I wouldn't do other stuff ;)
[13:06] damn that other stuff
[13:06] :)
[13:06] what do you do at the hospital?
[13:06] currently I'm talking to you ;)
[13:06] hahaha :)
[13:06] Action: kestrel_ understands
[13:08] well, actually I'm evaluating/installing software to map functional image data (with focus on the brain) on brain templates for statistical evaluations ... (on linux ;)
[13:09] that sounds like fun
[13:09] are you evaluating commercial software, free, or both?
[13:09] in this case GPL software ...
[13:30] say (~say@212.86.243.154) left irc: Ping timeout: 483 seconds
[13:34] Simon (~sgarner@210.54.177.190) left irc: Quit: so long, and thanks for all the fish
[14:18] serving (~serving@213.186.189.225) left irc: Ping timeout: 493 seconds
[14:24] Nick change: Bertl -> Bertl_oO
[14:29] Nick change: BobR -> BobR_oO
[14:51] beardfactor10 (~jonathan@212.69.216.20) joined #vserver.
[15:21] BobR_oO (~georg@oglgogl.BMTP.AKH-Wien.ac.at) left irc: Ping timeout: 480 seconds
[15:22] say (~say@212.86.243.154) joined #vserver.
[15:47] BobR_oO (~georg@oglgogl.BMTP.AKH-Wien.ac.at) joined #vserver.
[15:47] Nick change: Bertl_oO -> Bertl
[15:47] hi say!
[15:54] so it's linking against syscall.o
[15:54] okay, can we replace this with the following ...
[15:55] #define __NR_vserver 273
[15:55] static inline
[15:55] _syscall3(int, vserver, uint32_t, cmd, uint32_t, id, void *, data);
[15:55] or the appropriate for the stable version?
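Editor's sketch of the suggestion above: making the test program independent of syscall.o means invoking the system call directly. The `_syscall3()` macro is no longer provided by current kernel headers, so this sketch goes through the C library's `syscall()` wrapper instead. The helper names `raw_syscall3` and `vserver_call` are hypothetical and do not appear in the log, and the syscall number is taken from the libc headers (assuming they define `SYS_vserver` for the build architecture) rather than hard-coded as 273.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not from the log): issue a three-argument system call
 * by number through the C library, instead of expanding _syscall3(). */
static inline long raw_syscall3(long nr, uint32_t cmd, uint32_t id, void *data)
{
    /* Widen the 32-bit arguments explicitly; syscall() is variadic.
     * On failure it returns -1 and sets errno. */
    return syscall(nr, (long)cmd, (long)id, data);
}

#ifdef SYS_vserver
/* Same argument list as the _syscall3() declaration quoted in the log.
 * The number comes from the headers, because syscall numbers differ
 * between architectures. */
static inline long vserver_call(uint32_t cmd, uint32_t id, void *data)
{
    return raw_syscall3(SYS_vserver, cmd, id, data);
}
#endif
```

On a kernel without the vserver patch, a call such as `vserver_call(cmd, id, NULL)` would be expected to fail with `ENOSYS`; the command values themselves are version specific and are not shown here.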
[15:55] Nick change: BobR_oO -> BobR
[15:55] yup
[15:56] I hope this doesn't change the effect ;)
[15:56] I'll be delighted if it does... but also very surprised :)
[15:56] what is the 'idea' behind this program?
[15:57] I mean, did you develop it to 'kill' the system, or was it just an accident?
[15:57] it's called "killer.cc" in my tree :)
[15:57] so I assume the latter ...
[15:58] Action: beardfactor10 laughs
[15:59] I was going to develop it into a stress rig to try and reproduce the wild problem
[15:59] but it worked with just the single syscall
[15:59] okay, I'll be back in 1-2 hours, please if possible, make the test program independent from other code .. I'll test it then ... okay?
[15:59] sure
[15:59] will you be there, later?
[16:00] Nick change: BobR -> BobR_oO
[16:00] okay have to leave now ...
[16:00] Nick change: Bertl -> Bertl_oO
[16:01] infowolfe (infowolfe@68.33.215.209) joined #vserver.
[16:03] yes
[16:03] for another four hours at least
[16:08] BobR_oO (~georg@oglgogl.BMTP.AKH-Wien.ac.at) left irc: Ping timeout: 480 seconds
[16:10] serving (~serving@213.186.190.195) joined #vserver.
[16:21] gtkBitchX-1.0c20cvs+ by panasync - Linux 2.4.21-pre5
[16:23] beardfactor10 (~jonathan@212.69.216.20) left #vserver.
[16:25] beardfactor10 (~jonathan@212.69.216.20) joined #vserver.
[17:04] mhepp (~mhepp@213.211.38.19) joined #vserver.
[17:04] mhepp (~mhepp@213.211.38.19) left irc: Client Quit
[17:18] Nick change: Bertl_oO -> Bertl
[17:19] okay, back, Jonathan?
[17:21] working on modifying killer.c for 1.1.5
[17:21] Action: beardfactor10 needs (more) caffeine
[17:22] btw herbert, I like the code tidy for 1.1.5
[17:31] okay, you know the syscall interface changed for 1.1.x ..
[17:31] hence the modifications, yes
[17:32] okay, you can get the 'syscall' stuff on my page ...
[17:32] http://vserver.13thfloor.at/Experimental/vkill-0.01.tar.bz2
[17:32] for an example ...
[17:39] Bertl: I'm sniffing 13thfloor for a test patch for 2.6..
[17:39] hmm, and?
[17:39] And I can't see it. :)
[17:39] http://vserver.13thfloor.at/Experimental/patch-2.6.0-test9-vs0.01.diff
[17:40] Thx.
[17:47] mhepp (~mhepp@213.211.38.19) joined #vserver.
[17:47] mhepp (~mhepp@213.211.38.19) left irc: Remote host closed the connection
[17:54] grrr
[17:54] Action: beardfactor10 reddens with IRC newbness
[17:55] ... same behaviour on 1.1.5 anyhow
[17:58] okay, via email?
[17:58] better yet, I'm adding a vserver "page" to my company vs
[18:01] good idea ;)
[18:10] http://jonathan.dsvr.co.uk/vserver/killer.c
[18:32] any advance on "oh. that's nasty"?
[18:32] well .. yes, I'm currently at ctx: 57757
[18:33] on an SMP box?
[18:33] and I expect it to hang at 65535 ;)
[18:35] it's running in an emulation ;)
[18:35] ah. which sort of emu?
[18:35] QEMU
[18:36] i don't think that that's going to be the basis of a viable workaround :)
[18:37] hmm .. well it didn't hang ... ;)
[18:38] but I have a patch for you anyway ... could you try it?
[18:38] sure
[18:38] http://vserver.13thfloor.at/Experimental/patch-dynamic-fix-02.diff
[18:39] this is on top of vs1.1.5 ...
[18:39] what systems do you use to test, and why/how do you stress them?
[18:42] Bertl: I have 2.6 on my workstation only. I planned to install it at home but didn't manage to yet.
[18:43] well, the wild problem occurs on a range of dual processor boxes, from PIII 450's to Xeon 2.6 GHz
[18:43] in test terms, it's usually lower end systems (same old, same old)
[18:43] Bertl: Oh, that wasn't for me, sorry. :)
[18:43] no problem ;)
[18:44] I'm always interested in information ...
[18:44] useful stressing is hard
[18:45] I've used my wonderful s_context perl module to knock up a script which launches and monitors disk, memory and cpu hogs in various vses
[18:45] but that's not live data
[18:46] @jonathan just to refresh my memory, why are you testing this?
[18:46] don't get me wrong, I'm more than happy that you are testing ;)
[18:46] the wild problem didn't occur on my test machine (indeed some machines have been running _that_ kernel for months in the wild, whereas a range of others locked up within hours/days)
[18:47] hmm, do you have any statistics, which systems are 'especially' prune to this lock?
[18:48] Action: beardfactor10 giggles
[18:48] I kept a grid of this and there was no correlation in hardware
[18:48] Action: Bertl just realized why *smile*
[18:49] well, no native speaker, you know ... u/o ...
[18:50] my guess is that higher load levels of the syscall activity are an indicator
[18:50] that's the main difference between DSVR's usage and vserver userspace utils
[18:50] this suggests a lock/race of the syscall with itself ...
[18:50] yes, but I've not found one
[18:51] more to the point, why is it locking the scheduler
[18:51] that is easy to accomplish ... did you test my patch yet?
[18:51] I've tried adding traces to the locks involved, but obviously didn't find anything
[18:51] no, not yet
[18:51] please do so if possible ...
[18:52] within five minutes
[18:52] this would rule out something, I don't want to analyze further ...
[18:58] you're trying to rule out a wrap around scenario?
[18:58] yes, but I guess I already know the reason for those deadlocks ...
[18:59] and ... :)
[18:59] well, yes, it is obvious ...
[18:59] but I wouldn't have looked there ...
[18:59] vc_new_s_context()
[18:59] Action: beardfactor10 snorts
[18:59] take a look at the if (ctx == -1) case ...
[19:00] here we take the alloc_ctx_lock (spinlock)
[19:00] and then a few lines later ...
[19:01] switch_user_struct ...
[19:01] which does the alloc_uid
[19:01] which probably sleeps ...
[19:03] ... in the calls to spin_lock(&uidhash_lock);
[19:04] okay, that's probably the reason for _a lot of_ different lockups ...
[19:04] patch-dynamic-fix-02.diff still locks up
[19:05] okay ... I'll meditate over a smart solution ... probably available today or tomorrow ...
[19:05] I would appreciate if you could test it then ...
[19:05] cool
[19:05] sure
[19:05] I could be wrong on that, but it looks very suspicious ...
[19:06] hmm, actually we can do a simple test for that ...
[19:07] could you for a test, just comment out the spin_lock(&alloc_ctx_lock); and unlock?
[19:08] line 222 and 259 after the patch, and see if this doesn't lock ... it will behave a little strange, but that should not hurt ...
[19:08] in vcontext.c ;)
[19:09] okay, recompiling
[19:15] without that spinlock, killer does NOT lockup
[19:23] sounds good to me ;)
[19:46] hmmn....
[19:47] okay ... will be back in 3-4 hours ...
[19:47] Nick change: Bertl -> Bertl_oO
[20:46] kloo_ (~kloo@213.84.79.23) left irc: Remote host closed the connection
[23:07] shuri (~ipv6@207.236.226.187) joined #vserver.
[23:10] hello
[23:18] unriel (~riel@66.187.230.200) got netsplit.
[23:18] linas (~linas@67.100.217.179) got netsplit.
[23:18] mcp (~hightower@81.17.110.148) got netsplit.
[23:19] unriel (~riel@66.187.230.200) returned to #vserver.
[23:19] mcp (~hightower@81.17.110.148) returned to #vserver.
[23:19] linas (~linas@67.100.217.179) returned to #vserver.
[23:26] gaertner (~gaertner@212.68.83.129) left #vserver.
[23:33] JonB (~jon@129.142.112.33) joined #vserver.
[00:00] --- Thu Nov 27 2003
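Editor's sketch of the lock-up pattern diagnosed at 19:00-19:03: a spinlock (alloc_ctx_lock) is held across a call that may block or take further locks (switch_user_struct, then alloc_uid). The log does not show the eventual fix, so the following is only a user-space illustration of one common way to avoid the pattern, not the actual vserver patch. A pthread mutex stands in for the kernel spinlock, and every name in it (`alloc_lock`, `next_ctx`, `user_rec`, `slow_alloc_user`, `new_context`) is hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

/* Stand-in for the kernel's alloc_ctx_lock (assumption: a mutex is used
 * here only because this sketch runs in user space). */
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;
static int next_ctx = 2;            /* hypothetical first dynamic context id */

struct user_rec {
    int ctx;                        /* context id assigned under the lock */
};

/* Stand-in for a call that may block, such as a memory allocation. */
static struct user_rec *slow_alloc_user(void)
{
    return calloc(1, sizeof(struct user_rec));
}

/* Do the potentially blocking work first, with no lock held, then take the
 * lock only for the short, non-blocking counter update. */
struct user_rec *new_context(void)
{
    struct user_rec *u = slow_alloc_user();
    if (u == NULL)
        return NULL;

    pthread_mutex_lock(&alloc_lock);
    u->ctx = next_ctx++;
    pthread_mutex_unlock(&alloc_lock);
    return u;
}
```

The design point is ordering: nothing that can sleep or wait on another lock runs inside the critical section, which is the property the 19:07 experiment (removing the spin_lock/unlock pair) was checking for indirectly.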