[00:01] mhepp (~mhepp@r72s22p13.home.nbox.cz) joined #vserver.
[00:01] mhepp (~mhepp@r72s22p13.home.nbox.cz) left irc: Remote host closed the connection
[00:35] JonB (~jon@129.142.112.33) joined #vserver.
[00:44] AGoe (~agoeres@80.184.238.42) joined #vserver.
[00:44] AGoe (~agoeres@80.184.238.42) left irc: Client Quit
[01:47] Nick change: Bertl_zZ -> Bertl
[01:47] hi folks!
[01:48] hey Bertl
[01:48] hi jon!
[01:49] hello
[01:51] hi!
[01:59] JonB (~jon@129.142.112.33) left irc: Quit: zzzzzzzzz
[03:01] kestrel (~athomas@o2rosock0a.optus.net.au) left irc: Ping timeout: 493 seconds
[03:05] /t
[03:21] matta (matta@tektonic.net) left irc: Ping timeout: 493 seconds
[03:22] kestrel (~athomas@o2rosock0a.optus.net.au) joined #vserver.
[03:36] matta (~matta@68.81.235.145) joined #vserver.
[03:36] hello
[03:36] hi matt!
[04:17] kestrel (~athomas@o2rosock0a.optus.net.au) left irc: Ping timeout: 493 seconds
[04:24] okay, bed time for me ...
[04:24] Nick change: Bertl -> Bertl_zZ
[04:39] kestrel (~athomas@202.139.83.4) joined #vserver.
[05:08] Nick change: MrBawb -> hungry
[05:08] heh oops.
[05:08] Nick change: hungry -> MrBawb
[06:20] kestrel (~athomas@202.139.83.4) left irc: Ping timeout: 492 seconds
[06:42] kestrel (~athomas@202.139.83.4) joined #vserver.
[08:08] shadow (~umka@212.86.233.226) joined #vserver.
[08:08] morning...
[08:08] hi
[08:52] kestrel_ (~athomas@192.65.90.92) joined #vserver.
[08:52] hi there
[09:13] shadow (~umka@212.86.233.226) left irc: Read error: Connection reset by peer
[09:18] shadow (~umka@212.86.233.226) joined #vserver.
[11:32] kestrel_ (~athomas@192.65.90.92) left irc: Ping timeout: 492 seconds
[11:36] Nick change: say-out -> say
[11:37] hi all
[14:00] serving (~serving@213.186.189.145) left irc: Read error: Connection reset by peer
[14:01] Nick change: Bertl_zZ -> Bertl
[14:05] hi all!
[14:30] AGoe (~agoeres@80.184.205.57) joined #vserver.
[14:30] hi alexander!
[14:30] hi herbert
[14:30] herbert, have you got time for 1 question?
[14:31] or more..
[14:31] yes, sure ..
[14:31] it's about your cq-tools.. what exactly is the block size mentioned?
[14:32] you mean for the disk limits?
[14:33] yes.. is it the filesystem's blocksize?
[14:33] depends on the tool/patch version ... in the beginning it was the filesystem blocksize, now (IIRC) it is changed to 1k sizes .. the magic is done within the kernel ...
[14:34] i think i have your 0.06 version..
[14:35] so how much space are 1024 blocks?
[14:35] not much, let me have a look ...
[14:36] ; 0.05 - added enforcement for ext3
[14:36] ;      - bugfix in dlimit_transfer
[14:36] ;      - clamped negative values to zero
[14:36] ;      - changed from blocksize to 1k
[14:37] so if you specify 1024 it will be 1024k or 1 meg ...
[14:37] and what if the filesystem's blocksize is 4k?
[14:37] then 1024 will be 1024k, or 1 meg ;)
[14:38] hmm.. so i don't need too much calculation.. :-)
[14:38] that was the idea with the dl0.05 block size fix ;)
[14:38] it gives slightly incorrect calculations sometimes ...
[14:38] but that is the price for not having to calculate this ;)
[14:39] and just one last thing.. how many inodes should be given per 1mb? are there definite values or is it ad lib?
[14:39] well, I would say it depends on how your filesystem is used and configured ...
[14:39] just check the amount of inodes used at the moment, with df -i
[14:40] and then make up some sane values ...
[14:40] i took a look at dumpe2fs and that gave me for 500 mb something about 65000+ inodes..
[14:41] so 100k inodes would be on the safe side then, right?
[14:41] but with a 4k block filesystem..
[14:42] the blocksize will not change .. only the accounting does ;)
[14:42] herbert.. sounds sound
[14:42] oh.. and can a context-limit be removed again?
[14:43] what is a context limit?
[14:43] shadow (~umka@212.86.233.226) left irc: Ping timeout: 493 seconds
[14:43] disk-limits per context, given with cqdlim..
[14:43] yes, you can change it at any time ...
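[editor's note] The unit arithmetic discussed above can be sketched as follows — a minimal illustration assuming, per the dl0.05 changelog quoted in the channel, that disk limits are accounted in 1k units regardless of the filesystem blocksize; the helper names are made up, not part of cq-tools:

```python
# Since dl0.05, per-context disk limits count in 1k units, independent of
# the filesystem blocksize, so a limit of 1024 units is always 1 meg.
# (Function names are illustrative; the real accounting happens in-kernel.)

def dlimit_units_to_bytes(units):
    """Convert a disk-limit value (1k units) to bytes."""
    return units * 1024

def fs_blocks_to_dlimit_units(blocks, fs_blocksize):
    """Convert filesystem blocks (e.g. 4k ext3 blocks) to 1k limit units."""
    return blocks * fs_blocksize // 1024

# A limit of 1024 units is 1 MiB, whether the fs uses 1k or 4k blocks:
assert dlimit_units_to_bytes(1024) == 1024 * 1024
# 256 blocks of a 4k filesystem occupy the same 1024 accounting units:
assert fs_blocks_to_dlimit_units(256, 4096) == 1024
```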
[14:44] you have to set a 'start' value and/or the current usage ...
[14:45] start value/current usage would be the numbers of -S _0_,200,_0_,1000,10?
[14:45] yes ...
[14:46] serving (~serving@213.186.189.25) joined #vserver.
[14:46] hi serving!
[14:48] herbert, then i think i made the disk-limits work.. without fakeinit, but with a working vserver in borders.. great tool, herbert ! :-)
[14:49] thanks ...
[14:50] herbert.. just for understanding.. how can the block size of a per-context disk limit be smaller than the real filesystem's block size?
[14:51] HI Bertl :)
[14:51] well, it isn't, but the accounting actually isn't block based, it is byte based ... so there happens some calculation magic, which tries to compensate/adapt this ...
[14:54] ahh magic.. better i don't get too deep into that or i'll be bitten by a tiger.. :-)
[14:54] or eaten by a snake 8-) *ssss*
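[editor's note] The inode rule of thumb above (df -i / dumpe2fs observed density plus headroom) could be sketched like this — a rough estimate only, using the ~65000 inodes per 500 MB figure quoted in the channel; the function and the 1.5x slack factor are assumptions for illustration:

```python
def inode_budget(size_mb, inodes_per_500mb=65000, slack=1.5):
    """Rough per-context inode limit: scale the observed density, add slack.

    inodes_per_500mb comes from inspecting an existing fs (df -i or
    dumpe2fs); slack is headroom so the limit isn't hit immediately.
    """
    return int(size_mb / 500.0 * inodes_per_500mb * slack)

# For a 500 MB context this lands near the "100k is on the safe side"
# estimate from the discussion:
assert inode_budget(500) == 97500
```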
[19:48] well, surely you have a simple and intuitive explanation for that nick?
[19:48] yes. it just popped up in my mind.
[19:49] so what brought you here, besides the internet?
[19:49] I came here because i started playing around with my vserver.
[19:51] there were some hurdles I had. First of all the ctx patch won't work properly with debian kernel sources. I used a fresh copy from kernel.org and that one works.
[19:51] hmm, what deb kernel?
[19:52] it's 2.4.22. it had problems with the files udp.c in net/ipv4 and route.h in include/linux/route.h
[19:52] http://vserver.13thfloor.at/Experimental/patch-deb0.2-2.4.22-c17f.diff.bz2
[19:53] this is almost equivalent to the stable release ...
[19:53] well. I copied the original source files over the patched ones and it worked.
[19:53] this might be a stupid idea but i couldn't help it otherwise.
[19:53] erh ... lucky you 8-}
[19:54] now I use the kernel.org sources and it patched and compiled well.
[19:54] yeah, this is the better way, until the debian vserver maintainer will care/update ...
[19:55] anyway. it's working fine now. the only problem is that i can't ping domain names.
[19:55] but ips work on the internet.
[19:55] probably name resolving?
[19:55] yes. I think I got it running once but now it doesn't work.
[19:56] check what's in the /etc/resolv.conf file of the vserver?
[19:56] all my nameservers are in. that's the strange part of it.
[19:56] i can also ping those servers.
[19:56] hmm, maybe they won't answer your ip?
[19:56] try with dig ...
[19:57] the vserver is masqueraded by the firewall
[19:58] I'm forgetting something stupid... I know that...
[19:58] dig says the servers can't be reached.
[19:59] brb. I'll check something.
[20:02] _fleshcrawler (~fleshcraw@port-212-202-15-5.reverse.qsc.de) joined #vserver.
[20:03] <_fleshcrawler> by the way. the link to the patch has restricted access
[20:04] in what way?
[20:05] <_fleshcrawler> Error 403
[20:06] hmm, I assume you have more problems than you know ...
[20:07] wget http://vserver.13thfloor.at/Experimental/patch-deb0.2-2.4.22-c17f.diff.bz2
[20:07] <_fleshcrawler> A! wait! yeah. that was my fault.
[20:07] 18:07:43 (238.88 KB/s) - `patch-deb0.2-2.4.22-c17f.diff.bz2' saved [20058/20058]
[20:08] <_fleshcrawler> have it...
[20:08] little security freak, right? ;)
[20:08] fleshcrawler (~fleshcraw@port-212-202-15-5.reverse.qsc.de) left irc: Ping timeout: 493 seconds
[20:08] <_fleshcrawler> yeah. zonealarm...
[20:08] hi
[20:09] hi matt, what's the matt-a ?
[20:09] wanna see what happens when you have 60 vservers and forget to stagger cron.weekly?
[20:09] 6:50pm up 3 days, 1:22, 1 user, load average: 281.02, 140.82, 57.28
[20:09] 1929 processes: 1429 sleeping, 437 running, 61 zombie, 2 stopped
[20:09] CPU states: 6.7% user, 93.2% system, 0.0% nice, 0.0% idle
[20:09] Mem: 3099808K av, 3079032K used, 20776K free, 0K shrd, 844K buff
[20:09] Swap: 4096496K av, 240416K used, 3856080K free, 1157808K cached
[20:09] 7:16pm up 3 days, 1:48, 1 user, load average: 768.41, 728.29, 561.65
[20:09] 144 processes: 48 sleeping, 36 running, 59 zombie, 1 stopped
[20:09] CPU states: 0.0% user, 100.0% system, 0.0% nice, 0.0% idle
[20:09] Mem: 3099808K av, 3093844K used, 5964K free, 0K shrd, 1564K buff
[20:09] Swap: 4096496K av, 240416K used, 3856080K free, 1158580K cached
[20:09] :(
[20:09] cool, I only tried that with 14 .. this was already a nightmare ...
[20:10] yeah
[20:10] i do cron.daily
[20:10] but seemed to have forgotten cron.weekly..
[20:10] do you use a binary split down for that?
[20:10] i always did it by vi /vservers/*/etc/crontab and going through them..
[20:10] but now i'm just writing a perl script to automate this... can't have that happening again
[20:11] i wonder though, in the second... why is it 100% system?
[20:11] you think it just went kaput at that point?
[20:11] nope, I assume the I/O is too much ...
[20:12] don't forget, something like slocate will kill them all ...
[20:12] in this case i believe it's makewhatis
[20:13] same thing, other label ;)
[20:14] alekibango (~john@62.245.97.59) left irc: Ping timeout: 492 seconds
[20:15] <_fleshcrawler> okay. I'm baking the debian kernel with the patch.
[20:15] Nick change: _fleshcrawler -> fleshcrawler
[20:16] btw. I used the 17g2 patch. It's newer.
[20:17] hrm .. that is the experimental/devel branch ...
[20:17] hrm
[20:17] perhaps makewhatis wasn't the problem
[20:17] it just did it again
[20:17] this is insane
[20:18] i thought it was a 2.4.22 problem, but it's still in 2.4.23pre8
[20:18] hmm. this is on an O(1)?
[20:18] kswapd just starts using all system CPU and finally kills the system after a minute
[20:18] no, 2.4.22 wasn't O(1)
[20:18] it was c17e though
[20:18] it seems to be getting worse
[20:18] I mean the 60 vserver kernel?
[20:18] right now it is
[20:18] try to issue sync 2-3 times
[20:19] it's 2.4.23pre8+O(1)
[20:19] can't do anything...
[20:19] I said try ;)
[20:19] it did the same thing under 2.4.22-c17e though
[20:19] perhaps a problem that will arise with a large number of vservers
[20:19] it used to last a week
[20:19] then 3-4 days..
[20:19] then 2 days...
[20:20] now 16 hours
[20:20] well, one issue still there is the thrashing (this is in since the OOM killer was removed/disabled)
[20:20] 12:15pm up 16:05, 1 user, load average: 35.08, 11.50, 5.23
[20:20] 1891 processes: 1773 sleeping, 91 running, 27 zombie, 0 stopped
[20:20] CPU states: 8.3% user, 90.8% system, 0.7% nice, 0.0% idle
[20:20] Mem: 3099808K av, 3075364K used, 24444K free, 0K shrd, 984K buff
[20:20] Swap: 4096496K av, 147464K used, 3949032K free, 1173868K cached
[20:20] PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
[20:20] 5 root 16 0 0 0 0 SW 50.5 0.0 0:46 kswapd
[20:20] this is the last I saw..
[20:20] I can have a look at this particular issue after the next devel release ...
[20:21] it's weird, not sure how to reproduce
[20:21] i don't see anything out of the ordinary except kswapd using all excess CPU
[20:21] heh, that doesn't look very healthy ;)
[20:21] is it usual to have all services from the vserver mixed with the machine running the vserver on an nmap scan of the vserver ip?
[20:21] the question is, is the box swapping in and out? (thrashing?)
[20:21] oh
[20:21] hrm, how do I even tell that? :)
[20:22] i know sar... but that's 5 minute intervals only..
[20:22] @fleshcrawler depends on the config ...
[20:22] aha. okay. then i'll read some more...
[20:22] @matt, hmm if you have someone at site, he could try to listen to the disks ;)
[20:23] s/at/on/
[20:23] har har :)
[20:23] this is a serious nuisance
[20:24] from now on I think I will buy small servers and only put a smaller number of clients on each :)
[20:24] minimize the damage when a server is going bad
[20:24] well, mainly depends on the issue ...
[20:24] just consider, you take 4 smaller servers with less memory and slower disks of course ...
[20:25] of course
[20:25] the issue might arise at the same intervals, only 4 times more often ...
[20:25] hrm, true
[20:26] this is with rmap15k right?
[20:26] alekibango (~john@62.245.97.59) joined #vserver.
[20:26] yes
[20:26] how was the swap usage just before the incident?
[20:26] 150MB
[20:26] not too bad
[20:27] maybe you should activate a vmstat log via serial console or network ... or such ...
[20:27] but you observed the same behaviour without rmap15k too, or not?
[20:28] right
[20:28] i can get you a list of patches for both kernels
[20:28] yes, please ...
[20:28] i think i may boot up 2.4.22-c17e this time as that seemed to last a bit longer..
[20:29] because of the 437 running processes last night I was almost sure it was due to cron
[20:30] but right now everything seemed fine... then it just went boom
[20:30] hmm, rik, do you have any idea?
[20:30] 2.6 kernel ;)
[20:30] i'm all for it :)
[20:30] but that requires that somebody finishes the work ;)
[20:31] riel - 2.6 stable ? :)
[20:31] 2.6 is pretty ok
[20:32] hmm, maybe he meant, 2.6 is the cause for 2.4 going crazy ;)
[20:32] riel - i see many mails about problems and crashes in 2.6 on lkml...
[20:33] i think it's not stable :)
[20:33] @matt I'm sure that those issues do not arise from the stable vserver patches, so if you observed them without rmap15k and O(1), it's likely a kernel issue ...
[20:33] yeah
[20:34] that's why I hoped upgrading to 2.4.23pre8 would fix it
[20:34] hmm, maybe you should try to downgrade to 2.4.20 .. then
[20:34] i searched and found issues with kswapd using all CPU but it seems they were fixed in 2.4.2
[20:34] shadow: I see complaints about problems and crashes in 2.4, here on #vserver ... you think it's stable ?
[20:34] ;)
[20:41] alekibango (~john@62.245.97.59) left irc: Quit: Client killed by consultant
[20:54] okay alex, I have about 20min left, do you want to discuss the capabilities?
[20:54] okay.
[20:55] riel - i think 2.4 is more and more stable than 2.6 :)
[20:55] fleshcrawler (~fleshcraw@port-212-202-15-5.reverse.qsc.de) left irc:
[20:58] i reconnect..
[20:58] fleshcrawler (~fleshcraw@port-212-202-15-5.reverse.qsc.de) joined #vserver.
[20:58] shadow (~umka@212.86.233.226) left irc: Quit: brb
[20:58] okay, I was thinking about 64 bits, maybe one set to permit, one to forbid ...
[21:00] ok
[21:00] so 2.4.22 is:
[21:00] patch-2.4.22-1020-rl2
[21:00] patch-2.4.22-1030-vh
[21:00] patch-2.4.22-c17e.diff
[21:00] patch-2.4.22-c17e-rmap15k.diff
[21:00] patch-2.4.22-c17e-rmap15k-mq0.11.diff
[21:00] patch-2.4.22-c17e-mq0.10-cx0.06.diff
[21:00] patch-2.4.22-c17e-mq0.11-cx0.06-cq0.11.diff
[21:00] patch-2.4.22-c17e-mq0.11-cx0.06-cq0.11-dl0.05.diff
[21:00] patch-2.4.22-ctx17a-vr0.13.diff
[21:00] patch-2.4.22-c17e-rmap15k-ml0.06.diff
[21:00] patch-2.4.22-ctx17a-fakemem.diff
[21:00] patch-c17e-signal-ctx1.diff
[21:00] no-proc-mounts.diff
[21:00] 2.4.23pre8 is:
[21:00] patch-2.4.23-pre8
[21:00] patch-2.4.23-pre8-O1.3.diff
[21:00] patch-2.4.23-pre8-O1.2-rmap15k.diff
[21:00] patch-2.4.23-pre8-O1.2-rmap15k-c17h.diff
[21:00] patch-2.4.23-pre8-O1.3-rmap15k-c17h-ml0.07.diff
[21:00] patch-2.4.22-ctx17a-fakemem.diff
[21:00] patch-2.4.23-pre8-O1.3-rmap15k-c17h-qh0.12.diff
[21:00] patch-2.4.22-c17e-mq0.10-cx0.06.diff
[21:00] patch-2.4.22-c17e-mq0.11-cx0.06-cq0.11.diff
[21:00] patch-2.4.22-c17e-mq0.11-cx0.06-cq0.11-dl0.05.diff
[21:00] patch-2.4.22-ctx17a-vr0.13.diff
[21:00] patch-c17e-signal-ctx1.diff
[21:00] no-proc-mounts.diff
[21:00] no-hostname.diff
[21:00] under both kswapd will eat up all CPU and raise loads to 500+ every few days..
[21:01] perhaps it is a core kernel problem
[21:01] and not related to the patches
[21:02] shadow (~umka@212.86.233.226) joined #vserver.
[21:04] okay, I was thinking about 64 bits, maybe one set to permit, one to forbid ...
[21:04] fleshcrawler (~fleshcraw@port-212-202-15-5.reverse.qsc.de) left irc:
[21:04] @matt so there is no kernel without rmap then?
[21:06] @alex what do you think, is this a good approach?
[21:06] fleshcrawler (~fleshcraw@212.202.15.5) joined #vserver.
[21:10] please repeat approach
[21:10] okay, I was thinking about 64 bits, maybe one set to permit, one to forbid ...
[21:11] for example a bit could mean 'allow device access' and its counterpart would mean 'deny device access'
[21:12] bertl: oh, i guess not :)
[21:13] so i guess rmap COULD be the problem.
[21:13] @matt and it would be my first choice ... sorry rik!
[21:13] i will compile a kernel without it for the next time it crashes
[21:13] so i'll be ready :)
[21:14] this is on the SMP machine, yes?
[21:14] hm. if i understand you right - you propose using a bitmask to control access to devices?.. that way we could only handle 64 devices...
[21:14] bertl: yes
[21:15] @alex nope, not one for each device; like the capability system already there, the device access would only be a few bits or just one bit ...
[21:15] maybe separating network from char and block ...
[21:17] http://lkml.org/lkml/2003/8/12/210
[21:17] hrm...
[21:17] shadow (~umka@212.86.233.226) left irc: Quit: BitchX-1.0c19 -- just do it.
[21:18] seems for this guy rmap helped... a little
[21:19] http://search.luky.org/linux-kernel.2002/msg00780.html
[21:19] similar results seen... hrm
[21:19] wonder what MH is :)
[21:24] fleshcrawler (~fleshcraw@212.202.15.5) left irc:
[21:26] shadow (~umka@212.86.233.226) joined #vserver.
[21:27] sorry. the server where i have my shell was restarted ..
[21:28] herbert, but what do you think about using only virtual network devices in a vps? it removes the problem of controlling access to them.
[21:29] I took 2.4.18-pre2 and 2.4.18-pre2 with the vmscan patch from
[21:29] "M.H.VanLeeuwen" .
[21:29] yeah, couldn't find it
[21:29] ok
[21:29] so when it just crashed now
[21:30] @alex yes, but we need some other capabilities ..
[21:30] 984K buff
[21:30] seems like a buffer problem..
[21:30] in my other two tops
[21:30] 844k buff and 1564k buff
[21:30] @alex for example I would move the quotactl to that capability set ...
[21:30] so definitely the same problem as these other people with kswapd not properly reclaiming memory from cache
[21:31] @alex also changing/setting network/host parameters would not always be desired ...
[21:31] hm.. we're talking about network devices? quotactl - it's in the common capabilities list.
[21:32] @alex we're talking about adding a per context capability feature ;)
[21:32] capabilities are limited to 32 bits, and quotactl is 29 or 30 ... so not much left, eh?
[21:32] but can it be divided into some parts?
[21:33] common - network - device access - ....
[21:33] yes, sure we can take it apart, or just assign bits as we go .. like the original capability set did ...
[21:34] the question is more, how to get the desired functionality ...
[21:34] okay I have to leave now ... we'll talk about it tomorrow ...
[21:34] okay.
[21:35] cu all l8er ...
[21:35] i'll send a mail with my ideas to you..
[21:35] Nick change: Bertl -> Bertl_zZ
[21:38] shadow (~umka@212.86.233.226) left irc: Quit: BitchX-1.0c19 -- just do it.
[21:39] Ivanov23 (~Ivanov@D5779B89.kabel.telenet.be) joined #vserver.
[21:42] shadow (~umka@212.86.233.226) joined #vserver.
[22:04] netrose (~john877@CC3-24.171.21.47.charter-stl.com) left irc: Ping timeout: 493 seconds
[22:24] alekibango (~john@62.245.97.59) joined #vserver.
[23:02] fleshcrawler (~fleshcraw@212.202.15.5) joined #vserver.
[23:02] hello again
[23:04] Ivanov23 (~Ivanov@D5779B89.kabel.telenet.be) left irc:
[23:06] anyone had the problem that your vserver won't use the configured dns and thus you couldn't ping domain names although ip's worked?
[23:08] fleshcraw: ping does not work because it's not allowed to create raw sockets
[23:08] (by default)
[23:08] fleshcrawler: is /etc/resolv.conf set up in the vserver?
[23:09] no. I allowed it to create raw sockets.
[23:09] resolv is alright.
[23:10] I think it has to do with kernel patches.
[23:10] I use the preemptive kernel patch also.
[23:10] do you have a dns tool installed like dig or host on the vserver?
[23:10] after I played around with patching the kernel sources it worked again with pinging domains
[23:12] right now my vserver can look up domains.
[23:12] I'll patch around and try to reproduce that error.
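[editor's note] The per-context capability scheme Bertl sketches in the discussion above (two 64-bit sets, one to permit and one to forbid, with coarse bits like 'device access' rather than one bit per device) could look roughly like this — an illustration of the idea only, with made-up bit names, not the actual vserver patch:

```python
# Two 64-bit masks per context: explicitly permitted rights and explicitly
# forbidden ones. Bit names are invented for illustration; the real
# per-context capability patch may assign bits differently.
CCAP_DEV_ACCESS = 1 << 0   # char/block device access (one coarse bit, not per-device)
CCAP_NET_ADMIN  = 1 << 1   # change network/host parameters
CCAP_QUOTA_CTL  = 1 << 2   # issue quotactl()

MASK64 = (1 << 64) - 1

class ContextCaps:
    def __init__(self, permit=0, forbid=0):
        self.permit = permit & MASK64
        self.forbid = forbid & MASK64

    def allowed(self, cap, default=False):
        """Forbid wins over permit; unset bits fall back to the default policy."""
        if self.forbid & cap:
            return False
        if self.permit & cap:
            return True
        return default

caps = ContextCaps(permit=CCAP_QUOTA_CTL, forbid=CCAP_DEV_ACCESS)
assert caps.allowed(CCAP_QUOTA_CTL)
assert not caps.allowed(CCAP_DEV_ACCESS)
assert not caps.allowed(CCAP_NET_ADMIN)            # not set either way: default deny
assert caps.allowed(CCAP_NET_ADMIN, default=True)  # ... or default allow
```

Having a separate forbid set (rather than only a permit set) lets a bit that is neither permitted nor forbidden fall through to a global default, which is one way to read the "one set for permit, one to forbid" proposal.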
[23:40] fleshcrawler (~fleshcraw@212.202.15.5) left irc:
[23:51] fleshcrawler (~fleshcraw@212.202.15.5) joined #vserver.
[23:53] fleshcrawler (~fleshcraw@212.202.15.5) left irc: Client Quit
[23:56] netrose (~john877@CC3-24.171.21.47.charter-stl.com) joined #vserver.
[00:00] --- Wed Nov 12 2003