[00:00] shuri: it works, i'm installing now
[00:00] ok
[00:01] which NIC does vmware emulate ?
[00:02] http://vserver-beta.electronicbox.net/
[00:03] JonB: amd pcnet
[00:03] MrBawb: supported by 2.2 ?
[00:03] @shuri: could you do cat /proc/cpuinfo and put this in the same dir?
[00:03] yes
[00:03] (inside and outside the vserver)
[00:04] JonB: yeah
[00:04] 2.2.20 has it
[00:04] done
[00:05] @shuri same with /proc/uptime please ;)
[00:06] done
[00:06] oupss
[00:06] hmm, maybe the old logs should get another name ;)
[00:06] yes
[00:07] done
[00:08] tx
[00:10] hmm, could you do the strace uptime >/tmp/uptime.log 2>&1 (inside the vserver, again?)
[00:11] yes
[00:12] done
[00:12] does it still report the HZ message?
[00:13] vserver test1 enter
[00:13] ipv4root is now 0.0.0.0
[00:13] New security context is 2
[00:13] test1:/# uptime
[00:13] Unknown HZ value! (177) Assume 100.
[00:13] 13:28:35 up 16 min, 0 users, load average: 0.00, 0.06, 0.11
[00:13] ahh okay, I need the /proc/stat too ...
[00:13] ok
[00:15] done
[00:16] hehe, it seems your uptime tool tries to calc the HZ setting from /proc/stat and /proc/uptime ...
[00:16] are the values it reports correct or do they look fishy?
[00:16] corrct
[00:16] correct
[00:17] please check the Unknown value should get lower and lower on each call ... is this the case?
[00:18] yes
[00:18] okay, I'll ahve a look what in /proc/stat needs virtualization ...
[00:18] alekibango (~john@b59.brno.mistral.cz) left irc: Remote host closed the connection
[00:19] Unknown HZ value! (165) Assume 100.
[00:19] 13:31:37 up 19 min, 0 users, load average: 0.16, 0.06, 0.09
[00:19] test1:/# uptime
[00:19] Unknown HZ value! (164) Assume 100.
[00:21] is this RH?
[00:22] debian
[00:22] root and vhost
[00:22] hmm, do they have variable HZ usually?
[00:22] running in vmware
[00:22] no
[00:22] make is vmware..
[00:22] maybe
[00:23] do you what acess ?
[00:23] want
[00:23] no, thanks not necessary ...
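The HZ guess discussed above can be illustrated with a small sketch. It assumes the tool divides the total jiffies summed from the cpu line of /proc/stat by the seconds reported in /proc/uptime, and accepts the result only if it lands near a known tick rate. The function names (`estimate_hz`, `match_known_hz`, `guess_hz`), the 5% tolerance, and the list of known rates are hypothetical and not taken from the actual uptime source. If /proc/uptime inside the vserver counts from vserver start while /proc/stat still counts from host boot (an assumption that is consistent with the numbers in this log), the quotient starts too high and shrinks on every call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Estimate the tick rate as jiffies per second per CPU (hypothetical helper). */
static unsigned estimate_hz(unsigned long long total_jiffies,
                            double uptime_seconds, unsigned ncpus)
{
    assert(uptime_seconds > 0.0 && ncpus > 0);
    return (unsigned)((double)total_jiffies / uptime_seconds / ncpus);
}

/* Return the known tick rate the estimate is close to, or 0 if none matches.
 * The list of rates and the 5% tolerance are assumptions for illustration. */
static unsigned match_known_hz(unsigned estimate)
{
    static const unsigned known[] = { 100, 1024 };
    for (size_t i = 0; i < sizeof known / sizeof known[0]; i++) {
        unsigned lo = known[i] - known[i] / 20;
        unsigned hi = known[i] + known[i] / 20;
        if (estimate >= lo && estimate <= hi)
            return known[i];
    }
    return 0;
}

/* Combine both steps; warn and fall back to 100 when nothing matches. */
static unsigned guess_hz(unsigned long long total_jiffies,
                         double uptime_seconds, unsigned ncpus)
{
    unsigned est = estimate_hz(total_jiffies, uptime_seconds, ncpus);
    unsigned hz = match_known_hz(est);
    if (hz == 0) {
        fprintf(stderr, "Unknown HZ value! (%u) Assume %u.\n", est, 100u);
        hz = 100;
    }
    return hz;
}
```

With an illustrative 170000 jiffies accumulated on the host and 960 seconds (16 min) of vserver uptime, `estimate_hz` yields 177, which matches no known rate, so `guess_hz` prints the warning and falls back to 100.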
[00:23] I believe RH exports all jiffie values after converting to HZ=100 [00:24] i can test a redhat base verver [00:33] @shuri could you test with a task running in the background, if the load average reported moves to 1 on a newly started server? [00:34] with a) a task running outside, and b) a task running inside ... (something like while true; do echo "mumble" >/dev/null; done) [00:37] say-out (~say@212.86.243.154) joined #vserver. [00:37] hi say! [00:39] anyway, that should fix the HZ issue ... http://vserver.13thfloor.at/Experimental/patch-2.4.23-pre9-vs1.1.0-uptime-btime.diff [00:48] Bertl: test server is operational [00:48] hey cool ... [00:48] you have a working vserver on that one? [00:48] Bertl: not yet [00:49] Bertl: what do you want ? [00:49] okay start with 2.4.23-pre9-v1.1.0 ... the devel release [00:49] and setup a simple vserver with network and such ... [00:51] Bertl: okay [00:51] look at processes, server messages, try to mess with the network ... [00:51] Bertl: hey hey, not so fast [00:52] okay, let me know when you are there ... [00:54] Bertl: i dont think i well get beyond installing the kernel [00:55] hmm, why so? [00:55] Bertl: a Scent of woman comes on tv soon [00:55] ;) [00:55] Scent of a Woman [00:55] and it is a good movie :) [00:56] I know ... have it on DVD ... [00:56] Bertl: i find the patch under devellopment, right ? [00:56] the default is on linux-vserver.org devel release ... [00:58] Bertl: i dont need pre1-8 to apply pre9 do i ? [00:59] nope, pre9 is on vanilla ... [01:02] dammit i have too much cruft under /usr/src [01:03] mugwump (~sv@stc.surreytech.co.uk) joined #vserver. [01:03] hi sam! [01:07] Hi there Herb [01:07] How's tricks? [01:08] fine, thanks, how are you? [01:09] a bit stressed. brb [01:13] alekibango (~john@b59.brno.mistral.cz) joined #vserver. [01:33] Got a big presentation tomorrow... gotta show the investor he hasn't been wasting his money :-) [01:33] cool what do you show? 
[01:34] It's a system for the Hotel Industry ... for communication between reservation agents and hoteliers [01:36] The scary bit is that most of the really kinky stuff is happening client side, in JavaScript :-) [01:38] So what's your Bread and Butter then? You seem to be online most of the time... [01:39] hmm, IT consulting ... but currently only bread no butter :( [01:40] not exactly an IT hotbed, Argentina? [01:40] well it's Austria, but it really depends ;) [01:41] Oh, Austria. Why on earth did I think .at was Argentina [01:42] I guess you must specialise in server consolidation then :-) [01:43] I used to work at Sun, they always were pushing server consolidation to anyone willing [02:57] Hurga (ident@pD9E7988D.dip.t-dialin.net) left irc: Quit: Leaving [03:00] Bertl: can the development patch that you gave me be used by the stable util-vserver? [03:00] the 0.24 version should work with it ... [03:02] Bertl: well, there is no rpm/deb version :( [03:02] Bertl: what about the old vserver from jacq? [03:02] nope [03:03] I'll make you an rpm if you want? [03:05] Bertl: why thank you, it is easier to keep my system in a known state using packages [03:07] there you go ... [03:07] http://vserver.13thfloor.at/Stuff/util-vserver-0.24-0.src.rpm [03:07] http://vserver.13thfloor.at/Stuff/util-vserver-0.24-0.i586.rpm [03:07] http://vserver.13thfloor.at/Stuff/util-vserver-linuxconf-0.24-0.i586.rpm [03:08] Bertl: linuxconf ? [03:08] that is the newvserver stuff enrico included from jack, to be compatible ... [03:22] Bertl: do i still need to do the chmod 0000 ? [03:23] yup, no chsaferoot in vserver yet [03:24] Bertl: what is required of chsaferoot ? [03:25] the chsaferoot is a good idea jack had to make chroot safe ... unfortunately this is completely untested and I had no time to do some tests on my own yet ... 
[03:25] the requirement itself is simple, once chroot, you should get no way to escape that 'jail' [03:26] i wonder if one could make a small university project of testing that it is safe [03:27] do you have a small university ad hand? [03:27] heh [03:28] having experienced people look it over is good [03:28] one way to do that is to document how it works [03:29] MrBawb: i'm not experienced [03:29] Bertl: the small was related to project size [03:29] Bertl: i dont particular think that the university of copenhagen is small [03:30] largely depends on what compared to ;) [03:31] Bertl: i know [03:31] Bertl: but anyway, i was wondering if the subject of researching this is academic enough [03:32] well, basically I could live with a BSD jail implementation for linux ... [03:32] Bertl: oh, so there is some implementation in it as well ? [03:41] making chroot safe is all well and good, but then there's the syscalls, /proc interfaces ... [03:42] please ask me tomorrow if i have added this to my project list [03:42] http://docs.freebsd.org/44doc/papers/jail/jail.ascii.gz 6.2 Fortification of the chroot ... [03:45] http://www.linuxorbit.com/modules.php?op=modload&name=Sections&file=index&req=printpage&artid=538 [03:45] http://www.gsyc.inf.uc3m.es/~assman/jail/index.html [03:46] Bertl: and jacqs idea ? [03:46] is in ctx-18pre1 ... as I said, had no time to look into it ... [03:47] Bertl: okay, i thought he might have an explination somewhere [03:47] anyway, it's getting late [03:47] me too, but no luck so far ... [03:49] Bertl: well, i will succede [03:49] JonB (~jon@129.142.112.33) left irc: Quit: zzzzzzz [03:55] Hurga (ident@pD9E7988D.dip.t-dialin.net) joined #vserver. [04:00] My vserver behaves a bit different WRT DNS than the old machine... [04:02] define different ... [04:03] I have a local bind running and 127.0.0.1 in my resolv.conf [04:03] e.g. ping just hangs, never gets an answer from the named. [04:03] If I give the vserver's IP in the resolv.conf, it works fine. 
[04:04] you added 127.0.0.1 to the allowed IPs ? [04:04] the named answers 'host' and 'dig' just fine. [04:04] (on 127.0.0.1) [04:04] allowed IPs - where? [04:04] forget it for another moment ... [04:05] dig works fine? [04:05] yes. [04:05] what is the command you specify for dig? [04:06] dig @127.0.0.1 zxq.de [04:06] and ping wont produce an ip? [04:06] or host zxq.de 127.0.0.1 [04:07] what does dig zxq.de do? [04:07] ping zxq.de just sits there [04:07] dig zxq.de does the same as dig @127.0.0.1 zxq.de [04:08] works fine. [04:08] no sitting no waiting? [04:08] yes. [04:08] okay strace there? [04:09] gah, I seem to have deleted the static version [04:09] hangon [04:11] ok. [04:11] make sure you use an address your dns can resolve ... [04:15] http://www.tigress.com/hurga/ping.trace [04:15] could you name it ping.trace.txt please ;) [04:16] done [04:16] tx [04:16] hehe .. nis is your friend, not always ... [04:17] look in /etc/nsswitch.conf and remove all nis/nisplus stuff ... [04:17] gah [04:19] damn nis doesn't work with nfsroot, which really sucks [04:19] rpc is so shit, but all the alternatives are an administrative nightmare [04:19] Bertl: Still on it. But why is the vhost different than the original machine here? [04:20] hmm, had netinfo running for some time, works well ... [04:21] @Hurga probably because the nis connect can't reach the host itself ... [04:22] Bertl: removed all nis/nisplus stuff, no change. [04:22] hmm strange ... [04:23] maybe the source address for the ping is different somehow ... [04:23] could you tcpdump this from outside? [04:24] what exactly? [04:24] the connect to the nameserver ping does ... [04:24] so simply start tcpdump -s 1000 -vvnei lo0 on the host [04:25] s/lo0/lo/ [04:25] ok. [04:25] the start the ping inside and look at the packages ... [04:26] try the same if configured on the host ... [04:31] http://www.tigress.com/hurga/ping.dump.txt [04:32] 127.0.0.1.43233 > 217.24.218.165.domain hmmm [04:32] noticed that... 
[04:32] 217.24.218.165 is the real ip of that vserver? [04:32] mv resolv.conf has nameserver 127.0.0.1 [04:32] 217.24.218.165 is the vserver, yes. [04:33] what does dig localhost return? [04:34] ;; QUESTION SECTION: [04:34] ;localhost. IN A [04:34] that's all? [04:34] the usual stuff. No result. [04:34] ;; SERVER: 217.24.218.165#53(127.0.0.1) [04:34] okay ... [04:35] uups .. server is what? [04:35] bind-9.2.1-0.7x [04:35] obviously bind doesn't know that it should reply vial local ... [04:36] but that is another issue ... [04:38] Action: Hurga straces again with 217.24.218.165 in the resolv.conf [04:38] no, please do the old trace again with -v ... [04:40] erf, sorry, I just noticed that i didn't use the new strace. [04:40] ahh that explains the fragmented data ... [04:42] I do the trace in a moment. - But I wonder... 217.24.218.165.43670 > 217.24.218.165.domain <--- that's when I have 217.24.218.165 in the resolv.conf. [04:42] monako (~monako@194.186.248.101) joined #vserver. [04:43] I always thought the resolv.conf tells the machine where to ask, and not which source address to use for the query?? [04:43] mugwump (~sv@stc.surreytech.co.uk) left irc: Quit: hmm, 2am, early night today [04:43] it does tell so, but linux adds some interface magic and the vserver patch does some too .. [04:43] uh oh. [04:44] if a host has 2 addresses, and you have one route to the outside, default kernel will use this interface address as source ... [04:44] but this isn't required behaviour of the stack ;) [04:44] i see... [04:44] good nignt all [04:44] so you now want ./strace -v ping zxq.de ? [04:45] good nigh all [04:45] good night alex! [04:45] br [04:45] night shadow [04:45] thanx [04:45] @Hurga yep! [04:46] Herbert tomorow i try fill a command matrix.. [04:46] shadow (~umka@212.86.233.226) left irc: Quit: to bed [04:46] ping.trace2.txt [04:47] nodename="hydrogen.yatho.de" what does this resolve to? 
[04:47] hydrogen[root]:~> host hydrogen.yatho.de [04:47] hydrogen.yatho.de has address 217.24.218.165 [04:48] okay ... [04:50] add -s 1000 and do it again ... [04:51] ping.trace3.txt [04:54] okay, seems like the address is converted on the way ... [04:55] I don't know where exactly this happens, but I suspect the vserver code rewrites it somewhere ... maybe because 127.0.0.1 isn't part of the vserver definition ... [04:56] Any suggestions? put the vserver's IP in the resolv.conf and forget about it? [04:57] I can look into that, but not today, you can try the following things: (and it would be useful to document them): [04:57] - add lo:127.0.0.1/255.255.255.0 to the vserver description [04:57] same with a mask of 255.0.0.0 and 255.255.255.255 [04:58] - set localhost in /etc/hosts to the outside (real) ip ... [04:58] well that's it ... [04:59] how do I give that in the vserver description? [04:59] hmm, what? [04:59] lo:127.0.0.1/255.255.255.0 [04:59] just add that line? [05:00] exactly this way, you should have a similar line with eth0:217.24.218.165/netmask [05:00] just add this to it ... [05:01] I have IPROOT=217.24.218.165 and that's it for IP... [05:01] well have a look at the documentation jack or enrico ... you'll find how to specify them ... [05:02] ok. [05:19] kestrel_ (~athomas@192.65.90.92) left irc: Quit: radeon! [05:41] serving (~serving@213.186.189.236) left irc: Read error: Connection reset by peer [05:45] monako (~monako@194.186.248.101) left irc: [06:03] hmm. [06:04] I start screen inside vserver, open some shells. When I quit them, I get [06:04] Utmp slot not found -> not removed [06:05] probably utmp isn't configured, doesn't exist ... [06:06] I had some permission problems, but they *should* be fixed. [06:08] gah. Forget it, sgid from utempter went away somehow. [06:09] I wonder why that rsync did so many weird stuff. [06:09] hmm, probably because it isn't dump/restore ;) [06:10] mrf. Usually it works fine... 
[06:10] should have been doing a rpm -Va already, that should have caught it. [06:11] yup ... [06:33] something breaks the permissions of /var/run/utmp when I restart the vserver. weird. [06:34] you are using tagctx ? [06:34] um? [06:34] okay, forget it ... [06:55] ok, that could be an issue. [06:55] from /usr/sbin/vserver: [06:55] rm -f `find var/run -type f` [06:55] touch var/run/utmp [06:56] from the vserver's rc.sysinit: [06:56] chgrp utmp /var/run/utmp [06:56] chmod 0664 /var/run/utmp [06:56] so fine so good... [06:56] but? [06:56] just rc.sysinit doesn't seem to be run. [06:57] which is good, because it wold clean the mtab, too. :) [06:57] yup ... [06:57] so add the chgrp utmp /var/run/utmp and chmod 0664 /var/run/utmp to the .sh script [06:58] in post-start: [07:01] post-start studd is executed inside the vserver's contect, right? [07:01] stuff [07:01] yup [07:11] hm, doesn't work. [07:13] you might do a chroot() before you change the uid ... [07:35] finally. [07:35] damn shell stuff. [07:45] kestrel_ (~athomas@192.65.90.92) joined #vserver. [07:49] serving (~serving@213.186.189.236) joined #vserver. [07:50] Action: Hurga needs to sleep... [07:50] Action: Bertl too ... [07:51] good night everyone ... [07:51] Thanks again for all :) Sleep well! [07:51] Nick change: Bertl -> Bertl_zZ [07:51] Hurga (ident@pD9E7988D.dip.t-dialin.net) left irc: Quit: Leaving [07:53] kestrel_ (~athomas@192.65.90.92) left irc: Quit: goodnight [08:31] exit [09:42] blueshoe (~blueshoe@phylogenomics.Berkeley.EDU) joined #vserver. [09:43] what is the chrootsafe call? and does it matter that vserver is using plain chroot instead? [10:08] blueshoe (~blueshoe@phylogenomics.Berkeley.EDU) left irc: Ping timeout: 493 seconds [11:15] shadow (~umka@212.86.233.226) joined #vserver. [11:15] morning... [12:00] blueshoe (~blueshoe@phylogenomics.Berkeley.EDU) joined #vserver. [12:26] kestrel_ (~athomas@192.65.90.92) joined #vserver. 
[13:19] alekibango (~john@b59.brno.mistral.cz) left irc: Ping timeout: 493 seconds [13:30] alekibango (~john@b59.brno.mistral.cz) joined #vserver. [13:36] ace (~ace@213.225.74.103) left irc: Ping timeout: 492 seconds [14:00] serving (~serving@213.186.189.236) left irc: Ping timeout: 493 seconds [14:40] JonB (~jon@129.142.112.33) joined #vserver. [14:50] kestrel_ (~athomas@192.65.90.92) left irc: Quit: blah [14:56] serving (~serving@213.186.190.186) joined #vserver. [15:21] mhepp (~mhepp@r72s22p13.home.nbox.cz) joined #vserver. [16:26] mhepp (~mhepp@r72s22p13.home.nbox.cz) left irc: Remote host closed the connection [17:25] kestrel_ (~athomas@dialup28.optus.net.au) joined #vserver. [17:32] monako (~monako@ts1-a126.Perm.dial.rol.ru) joined #vserver. [17:45] monako (~monako@ts1-a126.Perm.dial.rol.ru) left #vserver (всем пока ... good bye all ...). [18:02] kestrel_ (~athomas@dialup28.optus.net.au) left irc: Quit: Hey! Where'd my controlling terminal go? [18:25] Medivh (ck@62.93.217.199) joined #vserver. [18:49] vserver xx stop hangs at killall. Can I a vserver stop be forced ? [19:09] what show "vps ax" about program states ? [19:22] that's a major issue.. [19:22] i had quotaon hang in a vserver [19:22] and I had to change the context id, re-tag the files and then it would finally start [19:22] Action: matta hates disk sleep/wait :( [19:46] Nick change: Bertl_zZ -> Bertl [19:48] hi [19:48] hey Bertl the MHZ error disape [19:48] hi all! [19:48] so the patch works? [19:48] yes [19:48] no more MHZ warning [19:48] you added the small modification I did, right? [19:49] dont know why [19:49] no [19:49] okay then if you stop/start a vserver, it will be back instantly ... ;) [19:49] maybe a reboot of the host is required, don't know the logic of this calculation ... [19:50] test1:/# uptime [19:50] Unknown HZ value! (511889) Assume 100. [19:50] @serving the only thing you can do, is change into the context and killall5 everything ... 
[19:50] after reboot of the vs [19:51] @shuri but if you add the small patch I did to fix this, it should be just fine ... [19:51] where are the patch [19:51] experimental? [19:51] probably, let me look ... [19:52] http://vserver.13thfloor.at/Experimental/patch-2.4.23-pre9-vs1.1.0-uptime-btime.diff [19:52] goes ontop of the http://vserver.13thfloor.at/Experimental/patch-2.4.23-pre9-vs1.1.0-uptime.diff [19:52] Nov 9 04:03:02 vps4 kernel: inode: da6add00 [#76] != dqh: d589f400 [50,#69] [19:52] Nov 9 04:03:02 vps4 kernel: c4727e70 c014a270 c0276680 da6add00 0000004c d589f400 00000032 00000045 [19:52] Nov 9 04:03:02 vps4 kernel: c8f86d40 00000007 c8f86d40 00000000 c8f86d40 c4727f84 c016aa8f 00000000 [19:52] Nov 9 04:03:03 vps4 kernel: c8f86d40 00000000 c0127f09 00000246 ed156640 ed1fd660 00000000 ffff63e4 [19:52] Nov 9 04:03:03 vps4 kernel: Call Trace: [] [] [] [] [] [19:52] Nov 9 04:03:03 vps4 kernel: [] [] [] [] [19:52] whoa.. [19:53] hmm, could you run it through ksymoops ... and provide it somewhere ... [19:53] and try to find the first oops .. this looks like a followup ... [19:54] hi matt, by the way ;) [19:55] evening Herbert [19:55] hi alex! [19:55] http://66.103.140.20/inode-oops.txt [19:55] had a look at the xfs quota code, this is verry confusing ... [19:56] @matt Object not found! [19:56] he requested URL was not found on this server. If you entered the URL manually please check your spelling and try again. [19:56] 8-) [19:56] no oops in my messages.. [19:56] searched all [19:57] @matt so this is your way telling that there is no oops? ;) [19:57] http://66.103.140.20/inode-oops.txt [19:57] Bertl: yes [19:57] okay brb .. 10min [19:57] Bertl> i try to find races in ext3.. i found what is wait list locked.. but not fixed it.. [19:58] [root@vps4 root]# grep -i oops /var/log/messages* [19:58] [root@vps4 root]# [19:58] that should search all the way back to Oct 19th.. 
[20:10] Bertl: I also have a process in a vserver that is in disk sleep state [20:10] "quotaoff -u /" [20:10] @matt we do not get any page on that url ... is this intentional? [20:10] so I don't know if it is a vroot problem or what [20:10] Bertl: no page? [20:10] it's a text file [20:11] http://66.103.140.20/inode_oops.txt [20:11] as we tried to explain, earlier, there is only a 'Object not found!' reply ... [20:11] I had a - before [20:11] it's _ [20:11] ahh ... okay ... [20:12] hmm, what patches are those? [20:13] it looks like one of my debug traces, so no oops actually ... [20:13] herbert what a point who write "inode: da6add00 [#76] != dqh: d589f400 [50,#69]" [20:14] that is my debug output ... and IIRC I do a stack dump then ... [20:14] matta> not not oops who make quotaoff... [20:16] @matt do you have a list of the used patches? [20:16] quotaoff can be in diskwaite state only not all diskquotas flueshed. and quota_off do invalidate_dquots [20:16] quotaoff can be in diskwaite state only if not all diskquotas flushed. and quota_off do call invalidate_dquots [20:17] with quotaoff, it's actually a little more complicated ... if you have a process which 'holds' a dquot for whatever reason, then quotaoff will block, and automatically resume if the last dquot is released ... [20:19] patch-2.4.23-pre8 [20:19] patch-2.4.23-pre8-O1.3.diff [20:19] patch-2.4.23-pre8-O1.2-rmap15k.diff [20:19] patch-2.4.23-pre8-O1.2-rmap15k-c17h.diff [20:19] patch-2.4.23-pre8-O1.3-rmap15k-c17h-ml0.07.diff [20:19] patch-2.4.22-ctx17a-fakemem.diff [20:19] but if error in count dq_count or have inodes who not remove quotas hashes.. [20:19] patch-2.4.23-pre8-O1.3-rmap15k-c17h-qh0.12.diff [20:19] patch-2.4.22-c17e-mq0.10-cx0.06.diff [20:19] patch-2.4.22-c17e-mq0.11-cx0.06-cq0.11.diff [20:19] patch-2.4.22-c17e-mq0.11-cx0.06-cq0.11-dl0.05.diff [20:19] patch-2.4.22-ctx17a-vr0.13.diff [20:31] @matt do you have a context 76 and 69 ? [20:35] one sec.. 
[20:35] ah, yes [20:35] 69 = old context [20:35] quotaoff ids stuck in there [20:35] but since it exists the context can't be started [20:35] so I moved the vserver to context 76 [20:36] so it would start :) [20:36] ahh okay ... so you started the existing context over the old, right? [20:36] Bertl: hmmm... [20:36] i'm not sure what you are askingh [20:37] the quotaoff in disk sleep state is the only process is 69 now [20:37] well the old context is still running #69 ... [20:37] and since it won't die the context won't stop, correct [20:37] and you didn't copy the files away, and start the new context, you just started it on the old files, right? [20:38] okay, let me explain the message you got ... [20:38] Bertl: correct [20:38] i think i understand.. [20:38] inode: da6add00 [#76] != dqh: d589f400 [50,#69] [20:39] I did a chctx -R 76 /vservers/... [20:39] this says, that context 76 encountered an inode, which has a quota hash in context 69 ... [20:39] i guess an inode is "locked" somewherE? [20:39] yes probably that is the inode quota off was/is waiting on ... [20:39] the interesting part now is, can we pinpoint this inode, because this would be interesting ... [20:41] but I think we cannot, alex, any ideas? [20:42] i was thinking lsctx may show the file as context 69 still [20:42] but no luck [20:42] oh wait, found it [20:42] bunch of files [20:42] hey cool ... [20:42] many in /var/tmp [20:42] don't touch them ... [20:43] and ./var/qmail/queue/lock/trigger [20:43] have a look at them via lsof ... [20:43] which i believe is a pipe [20:43] hmm, pipe sounds good ... [20:43] syntax? [20:43] sec [20:43] wait, they are all pipe's [20:44] [root@vps4 triosade]# ls -al ./var/tmp/_NetProtect_Plugins_AV_7_4102 [20:44] prw-rw-rw- 1 10003 10003 0 Nov 7 17:31 ./var/tmp/_NetProtect_Plugins_AV_7_4102 [20:45] try from ctx 1 lsof +d and +D [20:45] lsof: WARNING: not a directory: ./var/tmp/_temp_live_cupdate_0_10380 [20:46] specify the /tmp only ... 
[20:46] oh, got it [20:47] lsof: no pwd entry for UID 10003 [20:47] bdregd 21256 10003 8u FIFO 22,2 641474 ./var/tmp/_NetProtect_Plugins_AV_2_21256 [20:47] .. more of the same.. [20:47] okay what is bdregd and is it still running? [20:48] bdregd is bitdefender [20:48] it's running under the new context [20:48] check in context 1 for all ips and look at their context number ... [20:49] ... [20:49] (ips of bdregd of course ;) [20:50] just running in context 76 [20:50] with the given ip= 21256 but the pipe ./var/tmp/_NetProtect_Plugins_AV_2_21256 is still 69 ? [20:50] #10003 21250 76 triosade 0.0 0.0 11560 760 ? SN Nov07 0:00 /opt/BitDefender/bin/bdregd start /opt/BitDefender/etc/bdsettings.xml [20:50] #10003 21255 76 triosade 0.0 0.0 11560 760 ? SN Nov07 0:00 /opt/BitDefender/bin/bdregd start /opt/BitDefender/etc/bdsettings.xml [20:50] #10003 21256 76 triosade 0.0 0.0 11560 760 ? SN Nov07 0:07 /opt/BitDefender/bin/bdregd start /opt/BitDefender/etc/bdsettings.xml [20:50] yep [20:51] could you stop that service and remove the pipes? [20:52] ok [20:52] did that [20:52] didn't re-create them... [20:53] quotaoff still running [20:53] there is still the qmail trigger though [20:54] okay try to remove this one too ... [20:56] did that [20:56] quotaoff didn't exit yet [20:56] hrm... [20:56] issue a sync ... [20:57] and send a signal to the quotaoff ... [20:57] did both.. [20:57] i sent -9 [20:57] didn't change anything, I guess ... [20:57] [root@vps4 lock]# chcontext --ctx 69 kill -9 28865 [20:57] New security context is 69 [20:57] Segmentation fault [20:57] hrm [20:58] hmm, maybe that is the real problem here ... [20:58] but I have signal from ctx 1 [20:58] and it can't be killed from ctx 1.. [20:59] does chcontext --ctx 69 cat /proc/self/status work? [21:01] segfault [21:01] okay, guess we hit a OOM issue there, and now cannot enter the context ... [21:01] yeah [21:01] try to raise/disable the memory limit ... [21:01] i can't :) [21:01] need to be able to run ulimit... 
[21:02] ooops sorry, forgot, this isn't implemented in your patches ;) [21:02] that's why I liked alex's method :) [21:02] you have the syscall method working now? [21:02] will be there in the next version ... [21:02] ah [21:02] well, not a big deal [21:02] it's running, but no harm... [21:02] i don't know if it would die if we could enter the context though [21:03] yes, the message (stack dump) issued is explained by that ... only the cause for the dangling inodes/pipes isn't [21:04] next version will have a syscall to destroy a context also ? [21:05] this isn't possible in the current implementation ... [21:05] oh [21:05] but it will have a send signal to context ... [21:07] I'm not sure that a destroy context is possible at all ... it's the same thing like a forced unmount ... doesn't really make sense ... [21:08] destroy as kill all tasks in context. [21:08] this will work with the send signal to context ... but when a task cannot be killed, you lose ... [21:12] but we change task status to terminate or call code simmular do_exit.. [21:12] well, a blocked task won't do that, you only can ignore it ... [21:14] but task can have unrelease resouces... [21:15] well, if you think you have a clean solution, please post it on lkml, people will be really happy ... because hanging processes are a real problem ... [21:17] @alex where do we put a signal to context syscall? [21:17] (on the syscall matrix I mean) [21:18] 10? [21:18] process magagment.. [21:19] control is 12. [21:20] okay, so 12 .. what do we need to pass beside the context, and the signal? [21:22] yes. in parametres ctx_id and signal_number [21:23] do we have to mark the context as 'not scheduled' for the time sending the signals ... [21:26] yes. [21:27] or add flag to stop creating processes in context.. [21:27] what about a schedule/unschedule syscall maybe in combination with the scheduling priority of one context ... 
[21:28] in currecnt i have second way - but if create per context scheduler - first way better.
[21:29] sounds good. or other named - change scheduler flags for context.
[21:32] what you about this ?
[21:33] I think we can separate this in two syscalls, one which just sends the signal, and another, which allows to schedule/unschedule (maybe configure scheduling parameters) the context ...
[21:33] this way we don't have to care about the scheduling/forking in the signal syscall ...
[21:34] yes. it course.
[21:34] EINVAL An invalid signal was specified.
[21:34] i say about change schedule/unschedule to change scheduler flags for context.
[21:35] ESRCH The context does not exist ...
[21:35] agree.
[21:35] EPERM your context isn't priviledged to send to ...
[21:35] sounds reasonable?
[21:36] mhepp (~mhepp@r72s22p13.home.nbox.cz) joined #vserver.
[21:36] yes.
[21:38] suggestion:
[21:38] struct vcmd_ctx_signal_v0 {
[21:38] uint32_t pid;
[21:38] uint32_t signal;
[21:38] };
[21:38] context id is passed in the id argument of the syscall switch ...
[21:39] pid = -1; means signal all processes ...
[21:39] everything else is like kill(2)
[21:39] -uint32_t pid
[21:39] +pid_t pid;
[21:39] it more corrctly.
[21:40] hmm, is this uniform for 32/64 bit architectures?
[21:40] i see it used in sched.c
[21:40] example:
[21:40] static int setscheduler(pid_t pid, int policy, struct sched_param *param)
[21:40] yes, I know, in kernel it's no question, but from userspace to kernel space, I don't know
[21:40] JonB (~jon@129.142.112.33) left irc: Ping timeout: 492 seconds
[21:40] 1 sec.
[21:42] typedef int __kernel_pid_t;
[21:42] that can be 32 or 64 bit ...
[21:42] yes. in linux/types.h
[21:43] so I would use int32_t actually ... and convert to pid_t in the syscall ...
[21:43] but it not principial - you want run 32bit aplication on 64bit plaforms ?
[21:43] i think me _must_ recompile.
[21:44] we have this issues on sparc and X86_64 for example ...
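The three error cases proposed earlier in this exchange for the signal-to-context call (EINVAL, ESRCH, EPERM) can be modeled in userspace as a plain decision function. This is only a sketch: `ctx_signal_check`, its boolean parameters, and `HYP_MAX_SIGNAL` are hypothetical names, and the order of the checks is an assumption, not something settled in the discussion.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical upper bound on valid signal numbers, for illustration only. */
#define HYP_MAX_SIGNAL 64

/* Return 0 if the request would be accepted, otherwise a negative errno. */
static int ctx_signal_check(bool ctx_exists, bool caller_privileged, int sig)
{
    if (sig < 0 || sig > HYP_MAX_SIGNAL)
        return -EINVAL;   /* an invalid signal was specified */
    if (!ctx_exists)
        return -ESRCH;    /* the target context does not exist */
    if (!caller_privileged)
        return -EPERM;    /* the caller's context may not signal the target */
    return 0;
}
```

Keeping the checks in one small function makes it easy to see that a bad signal number is rejected before the context is looked up at all, which is the ordering assumed here.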
[21:44] struct vcmd_ctx_kill_v0 {
[21:44] int32_t pid;
[21:44] uint32_t signal;
[21:44] };
[21:44] what about that?
[21:44] it need compile tools on target platforms.
[21:45] i think if defined special type for it - me must use it.
[21:46] not difficlult recompile tools on target plantform.
[21:46] you will get a non working syscall, if you compile a 32bit userspace tool on a 64bit kernel ...
[21:46] this has nothing to do with the target platform ...
[21:47] the only way you could avoid this, is using two syscalls, one for 32 and one for 64 bit tools ...
[21:47] which actually would do the conversion in the kernel ...
[21:48] hm..
[21:48] 1 sec - i try to find how defined on 64bit platform
[21:50] see in include/asm-ia64/ia32.h
[21:50] if defined #ifdef CONFIG_IA32_SUPPORT
[21:51] then all types redifined to 32bits sizes.
[21:51] if not - it not do.
[21:51] but our syscall would change the size ... I don't like that ...
[21:52] size of ?
[21:52] int
[21:52] typedef int __kernel_pid_t32;
[21:53] and without it is 64 bit ..
[21:53] i see all syscalls have pid_t as parameter..
[21:54] sys_kill(int pid, int sig)
[21:54] sys_tkill(int pid, int sig)
[21:54] ifdef __KERNEL_SYSCALLS__
[21:54] static inline _syscall0(int,sync)
[21:54] static inline _syscall0(pid_t,setsid)
[21:55] asmlinkage long sys_wait4(pid_t pid,unsigned int * stat_addr, int options, struct rusage * ru)
[21:56] okay, but what reason do you see for using a pid_t?
[21:57] if 32bit arent enough, we can use a int64_t ...
[21:57] it is predefined type for process indentifyer
[21:57] but I don't want this to change depending on the userspace compile ...
[21:58] if you can show me that a 32 and 64 bit userspace compile always produces the same size for pid_t I have no problem with that ...
[21:58] not same
[21:58] but same with kernel
[21:59] that is bad, because this will produce a wrong syscall, right?
[21:59] not.
[21:59] wrong if size in userspace != size in kernel.
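The argument for fixed-width fields can be made concrete with a short sketch built on the `struct vcmd_ctx_kill_v0` layout proposed above. The helper names `make_ctx_kill` and `ctx_kill_pid` are hypothetical; the point is only that the struct occupies the same 8 bytes however userspace is compiled, while the conversion to and from `pid_t` happens at the edges.

```c
#include <stdint.h>
#include <sys/types.h>   /* pid_t */

/* Layout from the discussion: fixed-width fields only. */
struct vcmd_ctx_kill_v0 {
    int32_t  pid;
    uint32_t signal;
};

/* C11 compile-time check that the layout does not depend on the ABI. */
_Static_assert(sizeof(struct vcmd_ctx_kill_v0) == 8,
               "vcmd_ctx_kill_v0 must be exactly 8 bytes");

/* Fill the request from native types (hypothetical helper). */
static struct vcmd_ctx_kill_v0 make_ctx_kill(pid_t pid, int sig)
{
    struct vcmd_ctx_kill_v0 req;
    req.pid = (int32_t)pid;       /* pid == -1 is meant to address all processes */
    req.signal = (uint32_t)sig;
    return req;
}

/* Read the pid back as the native type (hypothetical helper). */
static pid_t ctx_kill_pid(const struct vcmd_ctx_kill_v0 *req)
{
    return (pid_t)req->pid;
}
```

A caller would build the request with something like `make_ctx_kill(7, 15)` and hand the struct to the syscall; the explicit casts are there to mark where a wider or narrower native `pid_t` would be converted.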
[22:00] but me can check it while compile tools in configure script.
[22:00] yes, and you can compile the userspace tools on a 64bit kernel in 32 bit _and_ 64 bit mode ...
[22:00] what would you check in the configure script? how will you respond, by arbitrarily redifining the pid_t ?
[22:02] me can use for compile tools actual kernel souces.
[22:02] which would not help you in any way, as the size changes if you compile 32 or 64 bit user space apps ...
[22:03] who ?
[22:03] look at sys_wait4 and the ugly sys32_wait4 pendant ...
[22:04] I don't want to go this way, as I find it really ugnly ...
[22:04] i don`t quckly find in kernel/ 2.6.0-test5 tree..
[22:05] hm.. it in mips architecture ?
[22:05] arch/ia64/ia32/sys_ia32.c
[22:06] although I am looking at 2.4 ...
[22:07] i don`t find on it.. but it can be renamed.. i try to locate..
[22:08] just search for sys32_wait4 ...
[22:08] hmm
[22:08] extern asmlinkage long
[22:08] compat_sys_wait4(compat_pid_t pid, compat_uint_t * stat_addr, int options,
[22:08] struct compat_rusage *ru);
[22:08] may be use it way ?
[22:08] compat_pid_t
[22:08] well as I said, I don't want to go down this road ...
[22:09] if you think pid should be 64 bit .. we use int64_t and be happy ion all platforms ..
[22:09] inside and outside the kernel, this can be converted by the compiler from/to pid_t whatever it might be ...
[22:11] eh..
[22:11] other way.. how many bites used for actual pid in kernel ?
[22:12] gues this depends ... on a 32bit architecture it will be 32 on a 64 maybe 64?
[22:13] i don`t see processes with pid larger 65535
[22:13] I don't know, havent looked at the pid allocation yet ...
[22:13] :-\
[22:14] but it might be that it is limitted to 32bit ...
[22:14] or even to 16 bit ...
[22:14] pid = (map - pidmap_array) * BITS_PER_PAGE + offset;
[22:16] and defined
[22:16] PID_MAX_DEFAULT
[22:16] hmm, so this might be changed via sysctl then?
[22:17] * This controls the default maximum pid allocated to a process
[22:17] */
[22:17] #define PID_MAX_DEFAULT 0x8000
[22:17] * A maximum of 4 million PIDs should be enough for a while:
[22:17] */
[22:17] #define PID_MAX_LIMIT (4*1024*1024)
[22:17] okay 32bit ...
[22:17] so we go for int32_t then?
[22:19] agreed. but i'll try to find out how a platform independent pid_t is used...:)
[22:20] well I believe you can simply assign/retrieve a pid_t to/from the struct ... there should be no problem there ...
[22:20] pid_t mypid = 7;
[22:20] struct vcmd_ctx_kill_v0 xy;
[22:20] xy.pid = mypid;
[22:20] should work without any casting issues ...
[22:21] mypid = xy.pid; also ..
[22:21] in the kernel it's int anyway ... so we don't care there ...
[22:21] #define VC_CAT_PROCTRL 12
[22:22] #define VCMD_ctx_kill VC_CMD(PROCTRL, 1, 1)
[22:22] okay.
[22:22] #define VCMD_ctx_kill VC_CMD(PROCTRL, 1, 0) sorry ...
[22:24] :)
[22:24] okay.
[22:24] by the way, did you already define your compatibility calls yet?
[22:25] or do you want to abandon them with the syscall switch?
[22:25] but it's similar, with uint32_t signal; changed to uint32_t sched_flags;
[22:26] and field 14.
[22:26] what do you think about it ?
[22:26] you are talking about the scheduler control ... I was talking about the compatibility calls you have now ...
[22:27] looks good at first glance ... but we should talk with Sam ...
[22:27] maybe we can merge this with the scheduler tuning stuff he did ...
[22:27] Bertl even with the new patch i get the HZ error again
[22:28] @shuri and what values are there? some examples
[22:28] Unknown HZ value! (811) Assume 100.
[22:28] 01:20:56 up 0 min, 0 users, load average: 0.05, 0.04, 0.01
[22:28] test1:/# uptime
[22:28] Unknown HZ value! (119) Assume 100.
[22:28] 01:31:09 up 10 min, 0 users, load average: 0.00, 0.00, 0.00
[22:28] test1:/# uptime
[22:28] okay, can you get the source for this uptime tool?
[22:29] it's debian
[22:29] 3.0
[22:29] i will try it on a REAL box first
[22:29] maybe it's vmware..
[22:29] just try to get it, and provide a .tar.gz then ... I'll have a look at it ...
[22:30] ok
[22:30] it might be that the vmware does strange things there ...
[22:56] hm.. herbert.. i found a very strange problem..
[22:56] handle_t *current_handle = ext3_journal_current_handle()
[22:56] returns NULL..
[22:56] (inode.c, 2587): ext3_dirty_inode: marking dirty. outer handle=00000000
[22:56] jbd_debug(5, "marking dirty. outer handle=%p\n",
[22:56] current_handle);
[22:56] hmm ...
[22:56] so the journal vanished, or what?
[22:57] i don't know.. but it created a very small log ~300k..
[22:58] maybe we should try with an external journal ...
[22:58] (inode.c, 2587): ext3_dirty_inode: marking dirty. outer handle=c6f30680
[22:58] hm..
[22:58] the testbox has only one fs :(
[22:58] you know sfdisk 8-)
[23:00] but that needs space :)
[23:01] hm..
[23:01] static inline handle_t *journal_current_handle(void)
[23:01] {
[23:01] return current->journal_info;
[23:01] }
[23:01] not initialized :-\
[23:02] hmm ... funny ...
[23:02] but in create context we must have cloned this field..
[23:04] serving (~serving@213.186.190.186) left irc: Read error: Connection reset by peer
[23:05] @alex do you know, is honza's fix in pre9 already?
[23:05] I think not, yes?
[23:08] hm.. i can't find an announcement of the next kernel version after honza's fix.
[23:08] probably it will be in pre10 ...
[23:09] hm.. i found where journal_info gets assigned.. it's journal_start..
[23:09] hm..
[23:10] in some situations we call mark_dirty_inode without journal_start.. :-\
[23:10] hmm ...
[23:12] i can't find any variants..
[23:14] hm.. i'm wrong..
[23:15] or not :-\
[23:18] who is blueshoe? does anybody know?
[23:19] hi, i'm back
[23:19] i can't find it in dictionaries :)
[23:20] oh.
[23:20] hi blueshoe? do we know you?
[23:20] i don't think so
[23:20] i've been on this channel only once before
[23:20] do you want to remain unknown?
[23:20] i'm just another vserver user :)
[23:21] i'm on this time because i was hoping someone could tell me the difference between chrootsafe and chroot
[23:21] hmm, maybe I can ...
[23:22] and does it matter that my kernel can only do chroot and chrootsafe?
[23:22] err.. and not chrootsafe?
[23:22] well depends ...
[23:22] the vserver chroot is a little modified, as it honors barriers like the 000 chmod on /vservers ...
[23:22] ahh
[23:23] so it is safe(r) compared to the normal chroot ...
[23:23] so the regular chroot ignores 000 on /vservers?
[23:23] i thought 000 was done so that you couldn't "cd .." or something... isn't vserver safe against that sort of hack?
[23:23] yes ... you could walk over it ...
[23:23] (see escaping chroot environments)
[23:24] i mean vserver without chrootsafe
[23:24] no, the chrootsafe/chsaferoot call is something jack developed to allow a safe chroot environment without this hack ...
[23:25] this is required for vservers inside of vservers ...
[23:25] if you use recent tools util-vserver 0.24 or vserver 0.26 the message is gone ...
[23:26] yeah, i think i'm using 0.23
[23:26] jack added this feature to the tools but didn't release the kernel for it ...
[23:26] i see
[23:26] now we have the kernel patches, but don't use them yet ...
[23:27] (still needs some testing ;)
[23:27] so by "without this hack" do you mean without making /vserver 000 ?
[23:29] yes, without requiring the 000 barrier ...
[23:29] ok
[23:29] got it
[23:29] so chroot() now with 000 is as safe as chrootsafe() without ...
[23:30] so how about this one... why am i all of a sudden getting a "No directory for this vserver: /var/lib/vservers/foo", when there's clearly a /var/lib/vservers/foo directory
[23:32] hmm, which patches do you use and what tools?
[23:32] ctx-17
[23:32] 0.23
[23:33] hmm, they should not use the /var/lib/vservers dir at all ...
[23:33] no?
[23:34] no!
[23:34] ahhh... wait... i think i might see the problem... i have a /var/lib/vserver and it's complaining about /var/lib/vservers
[23:35] no /var/lib/vserver either ...
[23:35] /var/run/vservers/ yes
[23:35] well, i had it working with /var/lib/vserver before... same patches and tool version
[23:35] and /usr/lib/vserver too ...
[23:36] what should be in /var/lib/vserver ?
[23:36] i haven't looked at this vserver installation in a while
[23:36] in my /var/lib/vserver i have the directory structure for each vserver
[23:37] usually it chroots in to /var/lib/vserver/foo for the foo vserver
[23:37] hmm, this isn't the default, at least not for jack's tools ...
[23:37] maybe it's a debian thing
[23:37] probably ...
[23:37] ahh... yes
[23:37] "In this debian package the vserver root is changed from /vservers to /var/lib/vservers."
[23:38] "This is because the /vserver directory is not LSB compliant."
[23:38] which is funny, but has no deeper reason, as /var/lib/vservers is just wrong for that ;)
[23:38] yeah
[23:38] it shouldn't be in any "lib" directory, imo
[23:38] otherwise your home would be in /var/lib/home/blueshoe ... ;)
[23:39] right
[23:39] anyway, apart from debian weirdness, why would it be looking in /var/lib/vservers all of a sudden when it should be looking in /var/lib/vserver?
[23:40] hmm you said it is changed to /var/lib/vservers on debian, why do you put it into /var/lib/vserver then?
[23:40] yeah, i did, didn't i
[23:40] hmmm
[23:40] let me see... maybe i can change it to /var/lib/vservers and it'll work
[23:41] oh, yeah, that's better
[23:41] ;)
[23:41] i wonder what happened there?
[23:41] did you upgrade any of this stuff recently?
[23:41] i **swear** it was working before, and i certainly wouldn't have moved it from vservers to vserver on a whim
[23:42] i don't think i upgraded it... but maybe i did
[23:42] it's been some months
[23:42] well maybe cosmic radiation did kill off the 's' then *grin*
[23:43] yeah, i must have upgraded it, because i wasn't getting the "chrootsafe" warning before, i'm sure of that... and now i am getting it here too
[23:43] well probably the debian guys lost the 's' in the first version, and added it again later ...
[23:44] yeah, that must have been it
[23:44] good call
[23:44] which kernel version is this?
[23:44] 2.4.20
[23:44] there is a 2.4.22-3 debian kernel with the latest stable release somewhere ...
[23:45] yeah, i might upgrade to that
[23:45] there're just a million things i gotta do, though... i wish everything just magically worked :)
[23:45] or if you prefer to compile it yourself, or stay with 2.4.20 there is a release for that too ...
[23:46] you've been a great help, though
[23:46] thanks for helping me figure this out
[23:46] my pleasure ...
[23:47] mhepp (~mhepp@r72s22p13.home.nbox.cz) left irc: Remote host closed the connection
[23:48] ahh... now it works
[23:49] just had to update my own scripts to look in /var/lib/vservers
[23:49] unix is just such a delicate little ecosystem
[23:59] matta (matta@69.10.150.254) left irc: Remote host closed the connection
[00:00] --- Mon Nov 10 2003