[00:27] re
[00:49] hi!
[01:26] Nick change: noel- -> noel
[01:27] hi noel!
[01:27] hello together.
[01:28] hi noel
[01:31] everybody survived christmas?:) in germany we had a very christmassy tv program. "terminator 2" on the first day and "die hard" on the second.;)
[01:32] hmm, what about 'nightmare on elmstreet?' ;)
[01:33] noel thats what some ppl like :)
[01:56] tanjix (ViRu_@pD9049C6B.dip.t-dialin.net) left irc:
[02:36] nathan (~nathan@29.sub-166-156-241.myvzw.com) joined #vserver.
[02:42] hi nathan!
[02:44] hey bert
[02:44] Bertl, i think there may still be a race condition in 1.3.1
[02:45] hmm, possible, where?
[02:45] the put that releases locks on iplist_lock
[02:45] but where in the get is that lock obtained?
[02:46] it just decs refcount after obtaining vxlist_lock but vxlist_lock is only obtained in the create
[02:46] yup, it isn't (yet)
[02:46] oh, so this still isnt smp safe?
[02:47] well, the list lock is/will be used on allocation ...
[02:47] and again, the task lock will be used to prohibit races with the task->ip_info ...
[02:47] wait a second ...
[02:49] http://vserver.13thfloor.at/Experimental/split-2.4.23-vs1.3.2/
[02:50] I'm currently finalizing this version ;)
[02:50] looking
[02:53] ok so in the 1.3.2 it seems that proc_pid_status needs to obtain the tasklist_lock
[02:53] based on the current implementation
[02:53] unless i am overlooking something
[02:54] Action: Bertl is searching the code ... sec
[02:54] proc pid status does: vxi = task_get_vx_info(task);
[02:55] which takes the task_lock() ;)
[02:55] right, but the task_lock only happens in the scope of task_get_vx_info, once it returns the task becomes unlocked
[02:55] so the code following the vxi=task_get.. is a critical section
[02:56] yup, but the refcount is hold until put_vx_info() ;)
[02:56] s/hold/held/
[02:57] keep looking/searching/asking, it's good to have a second opinion ...
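A minimal user-space sketch of the pattern described at 02:54 to 02:56, not the actual vserver code: the lookup holds a per-task lock only long enough to read the pointer and bump a reference count, and the caller then works without the lock until it drops the reference. The struct layouts, `alloc_vx_info()`, `task_clear_vx_info()` and `freed_count` are hypothetical; a pthread mutex stands in for `task_lock()`, and the list lock mentioned at 02:47 is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures named in the log. */
struct vx_info {
    atomic_int refcount;       /* number of outstanding references */
    int xid;                   /* context id */
};

struct task {
    pthread_mutex_t lock;      /* stands in for task_lock() */
    struct vx_info *vx_info;   /* may be reset to NULL when the task exits */
};

atomic_int freed_count;        /* demo only: counts freed vx_info objects */

/* Allocate a vx_info that starts with one reference (owned by the task). */
struct vx_info *alloc_vx_info(int xid)
{
    struct vx_info *vxi = malloc(sizeof(*vxi));
    assert(vxi != NULL);
    atomic_init(&vxi->refcount, 1);
    vxi->xid = xid;
    return vxi;
}

/* Drop one reference; whoever drops the last one frees the object. */
void put_vx_info(struct vx_info *vxi)
{
    if (vxi == NULL)
        return;
    int old = atomic_fetch_sub(&vxi->refcount, 1);
    assert(old > 0);
    if (old == 1) {
        free(vxi);
        atomic_fetch_add(&freed_count, 1);
    }
}

/* Read the pointer and take a reference while holding the task lock. */
struct vx_info *task_get_vx_info(struct task *t)
{
    pthread_mutex_lock(&t->lock);
    struct vx_info *vxi = t->vx_info;
    if (vxi != NULL)
        atomic_fetch_add(&vxi->refcount, 1);
    pthread_mutex_unlock(&t->lock);
    return vxi;                /* stays valid until the caller's put_vx_info() */
}

/* Detach the info under the same lock, then drop the task's own reference. */
void task_clear_vx_info(struct task *t)
{
    pthread_mutex_lock(&t->lock);
    struct vx_info *old = t->vx_info;
    t->vx_info = NULL;
    pthread_mutex_unlock(&t->lock);
    put_vx_info(old);
}
```

A reader in the style of the `proc_pid_status` path would call `task_get_vx_info()`, use the result, and finish with `put_vx_info()`. If the task detaches its pointer in between, the object survives until that final put.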
[02:58] ok i see what you are saying, i see what you are saying
[02:58] yea i gotcha makes sense
[02:59] I'm more worried about the ip_info and 'required' locking I missed .. one place I marked with /* locking ... required? */
[03:00] or something similar .. should be easy to find in the net split ...
[03:01] wgeting all the splits on this slow connection
[03:02] sorry should have bzip2ed them ...
[03:03] and I added the vx_verify_info()/ip_verify_info() checks, just to catch wrong assumptions ;)
[03:04] ill need to give this a run when i get back to boston tomorrow/monday
[03:06] Is it safe to use 1.3.1?
[03:06] Or are unexpected kernel panics to be expected?
[03:06] 1.22 appears to work here ;)
[03:06] hmm. if I'm right, I guess it's more stable than vs1.22 ...
[03:06] <0>seth[1002]:~# vserver sendmail enter
[03:06] SIOCSIFBRDADDR: Cannot assign requested address
[03:06] SIOCSIFFLAGS: Cannot assign requested address
[03:06] vs1.3.2 I mean ...
[03:06] Is that right, by the way?
[03:07] well, it depends on your config ... but probably you try to assign an already existing address ...
[03:07] and do it the _wrong_ way ...
[03:07] Where could the problem be?
[03:07] This box has one IP only
[03:08] and you want to use it in and outside the vserver?
[03:08] right.
[03:08] show me your vserver config ...
[03:09] which one?
[03:09] the one which reports those errors ...
[03:09] only the uncommented lines ...
[03:09] both vserver configs report this error
[03:11] if you want more details, http://vserver.13thfloor.at/Stuff/VServer-IP-Setup-0.1.txt
[03:12] short explanation, your start script tries to create an alias, where the ip already exists ...
[03:13] without IPROOTDEV, the existing one is used ...
[03:14] Bertl, so you are thinking the locking with ip_info may not be adequate?
[03:14] I'm not sure ... I have to look at all cases where current isn't the reference ...
[03:15] and especially the place I commented, looked racy to me ...
[03:15] im seeing it in 07_2.4.23_net.diff
[03:15] 118 + /* check for locking here */$
[03:16] well, we are talking about a one in a million race there ... but ...
[03:17] 0 out of 0 is a better odd :)
[03:17] yup, agreed ...
[03:17] so what is the race condition you see there?
[03:17] if I had _seen_ a race there, I would have fixed it .. I just _suspect_ a race there ...
[03:18] or a possibility for a race ...
[03:19] actually back to my original lock thing i thought i saw
[03:19] what is to stop a get while mid-put?
[03:19] huh? please elaborate ...
[03:20] by the way, the comment should be changed to: +/* reordering required? */
[03:20] what is to stop a get_ip_info() while put_ip_info() in its critical section obtaining tasklist_lock
[03:20] or vx_info for that matter
[03:21] probably a better example is the vx_info with the proc calls
[03:21] simple, put() acquires the list lock when zero is reached ...
[03:21] Bertl: You are from Austria?
[03:21] okay?
[03:21] Madkiss: yes!
[03:22] Bertl: Hello neighbour -waves-
[03:22] Action: Bertl waves ...
[03:23] nathan: okay?
[03:23] Bertl, lets say count goes to 0 and it gets tasklist_lock and while that is happening another processor enters at proc_pid_status and goes for a task_get_vx_info() while tasklist_lock is still obtained in the put
[03:23] there is no tasklist_lock ...
[03:23] sorry, tasklist_lock=vxlist_lock
[03:24] okay, so how could this reach zero ...
[03:24] simple: it could be the 'last' release where the task_struct is freed ...
[03:25] in release_task() right?
[03:25] hmm ok so in that case the process will be locked by another piece of code in the kernel
[03:25] right
[03:26] so its not a problem
[03:26] so if it happens by the last releasing of an out of task reference, it's no problem either ...
[03:26] the actual race was never between get/put ...
[03:26] it was between task->vx_info = NULL and vxi=task->vx_info ;)
[03:26] and this is now protected by the task_lock()
[03:28] yep ok
[03:29] and if you are interested in actually testing it, you can enable preemption, to give races a better chance ;)
[03:32] Madkiss: where in Austria are you from?
[03:39] Nick change: doener_aw -> Doener
[03:39] hi
[03:39] hi Doener
[03:41] Bertl: I'm not from Austria, I'm from Germany. We are, so to say, Country-Neighbours
[03:42] ahh, okay ... I assumed that you are from austria too ...
[03:43] i'm not.
[03:44] well, we won't use it against you, at least not yet ;)
[03:44] monako (~monako@ts1-a50.Perm.dial.rol.ru) joined #vserver.
[03:44] Well, at least you have a proper chancellor. We only have Gerhard Schröder.
[03:44] :\
[03:45] hi monako!
[03:46] nathan? still here? or are you busy reading the patches ...
[03:54] do i remember correctly that there was an issue with an earlier version of vserver that could cause processes to get stuck in D state?
[03:54] if so, what could trigger this?
[03:55] hmm, every interaction with the I/O system, can trigger such races (which are often the cause for such 'hanging' processes)
[03:56] one issue, fixed in vs1.22 could be triggered by accessing the proc fs while a vserver task dies ...
[03:56] hmm, no... doesn't apply here... for some reason a lvm partition got mounted ro and now some rm processes are stuck
[03:57] lvm has some races too ...
[03:57] (known races) if you are not updated to the latest version ...
[03:57] have you tried to mount the partition rw?
[03:58] remount, I mean?
[03:58] yeah, no success
[03:58] do you have magic sysrq enabled?
[03:58] which kernel version are we talking about?
[03:59] 2.4.22-c17f guess it will get its update when we reboot to get rid of the rm processes
[04:00] nathan (~nathan@29.sub-166-156-241.myvzw.com) left irc: Ping timeout: 512 seconds
[04:00] hmm, well there are some issues fixed since then ...
local root exploit, races in vserver, ...
[04:01] I would also suggest to have a look at http://vserver.13thfloor.at/Stuff/patches-2.4.23vs1.22/
[04:04] for the partition being mounted read-only, are there situations/reasons apart from mount starting fsck that could cause that? never saw that before
[04:05] well, usually any filesystem falls back to ro, if there is severe fs corruption or disk failure ...
[04:07] e2fsck reports no errors
[04:07] check the logs ...
[04:46] nothing unusual there... thanks anyways...
[04:47] i'll have some sleep now... g'night all
[04:47] Nick change: Doener -> doener_zZz
[04:47] good night ...
[04:52] monako (~monako@ts1-a50.Perm.dial.rol.ru) left irc: Ping timeout: 480 seconds
[05:23] monako (~monako@ts1-a2.Perm.dial.rol.ru) joined #vserver.
[05:58] tanjix (ViRu_@c-180-204-230.n.dial.de.ignite.net) joined #vserver.
[05:59] back again?
[05:59] sure :)
[05:59] and fixed you issues?
[05:59] s/you/your/
[05:59] no
[05:59] didn't try, or didn't succeed?
[05:59] was not successful
[06:00] trying since hours
[06:01] what's the problem, still ipac-ng?
[06:04] yep
[06:04] maybe ask on the mailing list, maybe somebody made it work yet ...
[06:04] when that runs fine i'm happy but that tool costs my life :)
[06:05] another question: how can i see what vroot devices are set
[06:05] hmm, not possible at the moment ... you have to do some bookkeeping ...
[06:06] you can put that on the todo list if you like :)
[06:06] do they have to be set again on each server reboot ?
[06:07] nope the vserver reboot has no effect on that, the host reboot will clear them
[06:07] i meant that :)
[06:08] so i could put that into a startup script
[06:08] sure ...
[06:08] again for better understanding:
[06:08] mknod /dev/vroot0 b 4 0
[06:08] mknod /dev/vroot1 b 4 1
[06:08] ...
[06:08] ?
[06:09] yup?, that will create the device nodes ...
[06:09] then setting them up with the vrsetup tool
[06:09] this is only required once, and only if you do not use devfs
[06:09] (the node creation)
[06:09] so i dont need to put the mknod command into the startup
[06:09] ?
[06:10] nope, this is like your /dev/zero, it won't go away, unless you delete it ...
[06:10] done once and it's over
[06:10] ok
[06:10] the same with the vrsetup ?
[06:10] nope, vrsetup is like the losetup for loopback devices ...
[06:11] so this has to be done on each reboot of the host
[06:11] actually it's pretty much like the loop device setup
[06:11] and now that I think about it, how do you tell what loop device isn't configured yet?
[06:12] eh sorry ?
[06:15] well, I don't know a way to find out if a loop device is used or not, do you?
[06:15] (except for trying to configure it, and failing to do so)
[06:15] i would look into "ifconfig" ?
[06:15] if it is present or not ?
[06:16] not the network loopback , the loop device /dev/loop0 ...
[06:16] oh, sry
[06:16] don't know
[06:17] me neither, just because you mentioned that I should put that on my todo list for the vroot device .)
[06:17] mh ok understood what you mean :)
[07:37] monako (~monako@ts1-a2.Perm.dial.rol.ru) left irc: Ping timeout: 512 seconds
[08:00] okay, enough work for me for today ...
[08:00] wish you a good whatever, cu 2morrow ...
[08:01] Nick change: Bertl -> Bertl_zZ
[08:05] tanjix (ViRu_@c-180-204-230.n.dial.de.ignite.net) left irc:
[09:53] Doener` (~doener@pD9E12F48.dip.t-dialin.net) joined #vserver.
[10:00] doener_zZz (~doener@pD9588874.dip.t-dialin.net) left irc: Ping timeout: 512 seconds
[10:33] noel- (~noel@pD9E09745.dip.t-dialin.net) joined #vserver.
[10:40] noel (~noel@pD9E09741.dip.t-dialin.net) left irc: Ping timeout: 493 seconds
[13:15] serving (~serving@213.186.189.100) left irc: Ping timeout: 480 seconds
[15:07] serving (~serving@213.186.189.88) joined #vserver.
[15:37] Hunk #3 succeeded at 101 with fuzz 2.
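On the open question from 06:11 to 06:15: for ordinary loop devices, the `LOOP_GET_STATUS` ioctl from `<linux/loop.h>` fails with `ENXIO` when nothing is attached, so a probe along the following lines can tell a free device from a used one without trying to configure it. This is a sketch for `/dev/loopN` only. Whether the vroot device offers a comparable query is not stated in the log, and `loop_device_state()` and its enum are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/loop.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical result type for the probe below. */
enum loopdev_state {
    LOOPDEV_UNKNOWN = -1,  /* could not open or query the device */
    LOOPDEV_FREE    = 0,   /* no backing file attached */
    LOOPDEV_IN_USE  = 1    /* a backing file is attached */
};

/* Ask the loop driver for the device status instead of trying to set it up. */
enum loopdev_state loop_device_state(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return LOOPDEV_UNKNOWN;

    struct loop_info info;
    int rc = ioctl(fd, LOOP_GET_STATUS, &info);
    int saved_errno = errno;   /* close() may overwrite errno */
    close(fd);

    if (rc == 0)
        return LOOPDEV_IN_USE;
    return saved_errno == ENXIO ? LOOPDEV_FREE : LOOPDEV_UNKNOWN;
}
```

A host startup script that needs the same bookkeeping for vroot devices would still have to record its own `vrsetup` calls, as suggested at 06:05.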
[15:37] hm
[15:44] Nick change: noel- -> noel
[17:05] chctx skips 'dev' entries
[17:05] guess touch will do
[18:52] Nick change: Bertl_zZ -> Bertl
[18:52] hi everyone!
[19:08] hello Bertl. you are sleeping very long.;)
[19:08] hmm, well, I'm working very hard ;)
[19:09] ok, this is true and an excuse.;)
[19:09] thank you! I'm glad to hear! ;)
[19:15] :)
[19:15] Hi.
[19:16] hi virtuoso!
[19:16] maharaja (maharaja@ipax.tk) left irc: Ping timeout: 485 seconds
[19:16] Bertl: you asked one or 2 days ago if anybody wants to test memory limits. Now I have a dedicated testsystem and would test it. just if you still need one.
[19:18] I always need people who do some testing, what do you have in mind?
[19:18] what is this 'dedicated' test system?
[19:22] noel: hmm you had those fixed context id issues?
[19:23] I setup quota and the only missing for me is limiting memory of a vserver. this testsystem is just a dedicated i386 just for this tests and sometimes I will test the stuff I do on sparc64.
[19:23] Bertl: yes. its me.
[19:23] hmm, and it doesn't take the S_CONTEXT value?
[19:23] for me not. it works for you?
[19:24] I assume so, let me test ...
[19:25] S_CONTEXT isn't in the manpage in 0.29 so I thought it got removed. but its possible the problem is on my side.
[19:26] # uname -a
[19:26] Linux (none) 2.4.23-vs1.22 #1 SMP Sat Dec 13 19:02:02 CET 2003 i686 unknown
[19:26] # vserver XXXX start
[19:26] Starting the virtual server XXXX
[19:26] Server XXXX is not running
[19:26] ipv4root is now 192.168.0.2
[19:26] Host name is now XXXX.test.org
[19:26] Domain name is now
[19:26] New security context is 1001
[19:26] Kernel do not support chrootsafe(), using chroot()
[19:26] Starting system logger: [ OK ]
[19:27] with 'chcontext version 0.24'
[19:27] S_CONTEXT=1001
[19:28] I'll test with the newer tools ... sec
[19:28] hmm. ok. I have "chcontext version 0.26" here.
[19:29] is this util-vserver or vserver?
[19:29] vserver.
[19:30] maharaja (maharaja@ipax.tk) joined #vserver.
[19:30] could you test http://vserver.13thfloor.at/Stuff/testme.sh and make the result available/ see if everything works ...
[19:31] sure. one moment.
[19:32] is it ok to paste it here?
[19:32] hmm, dcc or web or email ...
[19:32] or send it in private ...
[19:32] ok.
[19:34] ahh, you are using util-vserver (at least my script says so ;)
[19:35] anyway, that should have no effect on that ...
[19:35] hmm. right, sorry I forgot. I installed it after I found the problem into /usr/local. one moment. will remove it and try again.
[19:35] JonB (~Jon@0x503e03ba.kjnxx7.adsl.tele.dk) joined #vserver.
[19:35] hi Jon!
[19:35] hey Bertl
[19:35] hi maharaja! almost missed you!
[19:37] does anyone know of an OS X11 irc/icq/yahoo client program in one?
[19:41] later
[19:41] JonB (~Jon@0x503e03ba.kjnxx7.adsl.tele.dk) left irc: Quit: ChatZilla 0.9.35 [Mozilla rv:1.5.1/20031120]
[20:36] netrose (~john877@CC3-24.171.21.47.charter-stl.com) joined #vserver.
[20:37] JonB (~Jon@0x503e03ba.kjnxx7.adsl.tele.dk) joined #vserver.
[21:30] JonB (~Jon@0x503e03ba.kjnxx7.adsl.tele.dk) left irc: Quit: ChatZilla 0.9.35 [Mozilla rv:1.5.1/20031120]
[21:31] hm
[21:31] hmhm!
[21:32] root@ns1:/etc# named
[21:32] named: capset failed: Operation not permitted
[21:33] sounds like your named wants to play with the capabilities ;)
[21:33] i gave it S_CAPS="CAP_QUOTACTL CAP_NET_RAW"
[21:34] ah
[21:34] strace helps :)
[21:34] is this bind?
[21:34] capset(0x19980330, 0, {CAP_DAC_READ_SEARCH|CAP_SETGID|CAP_SETUID|CAP_NET_BIND_SERVICE|CAP_SYS_CHROOT|CAP_SYS_RESOURCE, CAP_DAC_READ_SEARCH|CAP_SETGID|CAP_SETUID|CAP_NET_BIND_SERVICE|CAP_SYS_CHROOT|CAP_SYS_RESOURCE, CAP_DAC_READ_SEARCH|CAP_SETGID|CAP_SETUID|CAP_NET_BIND_SERVICE|CAP_SYS_CHROOT|CAP_SYS_RESOURCE}) = -1 EPERM (Operation not permitted)
[21:34] yep
[21:34] you should 'recompile' without linux capabilities ...
[21:35] hrm
[21:35] for security reasons, but you can 'just' give CAP_SYS_RESOURCE ...
[21:35] --disable-linux-caps disable linux capabilities
[21:35] (this will disable resource control for this vserver)
[21:35] hm what exactly does this do? linux-caps that is
[21:36] well, the bind people 'think' it is a good idea, to raise/modify the limits themselves ... and then drop that capability ...
[21:36] hrm
[21:37] of course, this doesn't work in a vserver, where the capabilities are already dropped ...
[21:37] well a recompile is no biggy :)
[21:37] i guess it does not need RAW then either
[21:38] nope, should not require that ...
[21:38] and i just found out i dont need quotas either, doh
[21:39] for a nameserver, not very likely ;)
[21:39] oh well, good exercise :)
[21:39] nah, web/ftp
[21:39] hmm, why then use bind?
[21:39] its a diff vserver :)
[21:39] ah okay ...
[21:39] what tools do you use?
[21:39] vserver tools, I mean ...
[21:39] hardly any
[21:40] hmm, how do you start vservers?
[21:40] vserver web start :)
[21:40] well, there is a tool involved in that, right?
[21:41] where does vserver come from?
[21:41] i guess, but i did not say i did not use any :)
[21:41] just that vserver is the only thing really being used heh
[21:41] basically, I'm only interested in _what_ version this is ;)
[21:42] the thing doesnt tell
[21:43] ;)
[21:43] hmm, I know ... try chcontext --ctx 100 false
[21:44] 0.26
[21:44] okay, probably this is enricos util-vserver 0.26 ....
[21:44] the one that is linked to on the 1.22 stable page
[21:45] yeah, okay, because I just discovered, that jacques tools (0.29) are more broken, than I thought ...
[21:45] oh?
[21:46] b0rked beyond recognition? heh
[21:46] hmm, almost, problem is, he changed too much, without extensive testing ...
[21:46] right...
[21:47] that usually results in 'crap'
[21:47] heh
[21:47] righto named now works :)
[21:56] so, what parts are already known not to work in 0.29 then?
[21:56] it ignores S_CONTEXT for example ...
[21:56] guess he didnt use them
[21:57] possible ...
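Why the `capset()` call at 21:34 returned EPERM can be illustrated with a small user-space model of the kernel's checks for a process changing its own capability sets: the new permitted set may not contain anything the old one lacked, and the new effective set must fit inside the new permitted set. This is a simplified sketch, not kernel or BIND code. `struct caps`, `CAP_BIT()`, `NAMED_CAPS` and `model_capset()` are hypothetical names, the bit numbers follow mainline `<linux/capability.h>`, and the context's starting sets are an assumption because the log does not list them.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Bit numbers as in mainline <linux/capability.h>. */
enum {
    CAP_DAC_READ_SEARCH  = 2,
    CAP_SETGID           = 6,
    CAP_SETUID           = 7,
    CAP_NET_BIND_SERVICE = 10,
    CAP_SYS_CHROOT       = 18,
    CAP_SYS_RESOURCE     = 24
};

#define CAP_BIT(n) (UINT32_C(1) << (n))

/* The mask named asked for in the strace line (same mask for all three sets). */
#define NAMED_CAPS (CAP_BIT(CAP_DAC_READ_SEARCH) | CAP_BIT(CAP_SETGID) | \
                    CAP_BIT(CAP_SETUID) | CAP_BIT(CAP_NET_BIND_SERVICE) | \
                    CAP_BIT(CAP_SYS_CHROOT) | CAP_BIT(CAP_SYS_RESOURCE))

struct caps {
    uint32_t effective;
    uint32_t permitted;
    uint32_t inheritable;
};

/* Model of capset() on the calling process: returns 0, or -1 with errno = EPERM. */
int model_capset(struct caps *cur, const struct caps *want)
{
    assert(cur != NULL && want != NULL);

    /* New permitted must be a subset of the old permitted set. */
    if (want->permitted & ~cur->permitted)
        goto denied;
    /* New effective must be a subset of the new permitted set. */
    if (want->effective & ~want->permitted)
        goto denied;
    /* New inheritable must come from old inheritable or old permitted. */
    if (want->inheritable & ~(cur->inheritable | cur->permitted))
        goto denied;

    *cur = *want;
    return 0;

denied:
    errno = EPERM;
    return -1;
}
```

With a context whose permitted set lacks `CAP_SYS_RESOURCE`, a request for `NAMED_CAPS` is denied; adding that one capability to the context, as suggested at 21:35, lets the same request through.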
[21:57] Linux_Lord (~dr@pD9507EDD.dip0.t-ipconnect.de) joined #vserver.
[21:58] Linux_Lord: hi!
[22:02] hmm, I hate those auto joins ...
[22:05] LL0rd (~dr@pD9507ECB.dip0.t-ipconnect.de) left irc: Ping timeout: 480 seconds
[22:13] heh
[22:34] Bertl: is vserver-0.29-fix01 the answer of my problem?:)
[22:34] yup, I hope so ;)
[22:35] great. testing...
[22:47] Tamama (~Tamama@a62-216-20-152.adsl.cistron.nl) left irc: Read error: Connection reset by peer
[23:40] Fuff.
[23:40] Action: virtuoso is going to compile a kernel for production server.
[23:40] good idea ...
[23:40] What to take: 1.22 or 1.3.1?
[23:41] hmm, I would take 1.3.2, but that isn't an option for you, yet ;)
[23:41] but vs1.22 is stable, and production quality ...
[23:42] Ok, will take it.
[23:42] Action: virtuoso also wonders if there is new rmap patch.
[23:42] nope, rik is pining ... ;)
[23:43] anybody had this error: "Can't chroot to directory . (Permission denied)" the dir is ok and a manual chroot works. the second vserver works. (/me has always errors:))
[23:44] hey, we'll fix em ;)
[23:44] Are you uid 0? :)
[23:44] virtuoso: sure.
[23:44] noel: first some questions ...
[23:44] context tagging enabled?
[23:45] (if you do not know what this means, the answer is probably no)
[23:45] ok: no.:)
[23:46] the command is (set -x in /usr/sbin/vserver): /usr/sbin/chbind --ip 192.168.74.101 --bcast 192.168.74.255 /usr/sbin/chcontext --cap CAP_NET_RAW --flag lock --flag nproc --ctx 101 --hostname v101 --secure /usr/lib/vserver/save_s_context /var/run/vservers/v101.ctx /usr/lib/vserver/capchroot . /etc/init.d/rc 2
[23:46] next, check the permissions of _all_ directories in the path to that directory/vserver ...
[23:47] ok. will check permission diffs between v101 and v102 (this one is working).
[00:00] --- Mon Dec 29 2003