
From: Herbert Poetzl (herbert_at_13thfloor.at)
Date: Wed 31 Mar 2004 - 17:09:23 BST

Hi Community!

this is the last part of the (IMHO necessary) explanations
to actually start discussing the options we have for future
linux-vserver development (and their implications, advantages
and disadvantages).

this time we take a brief look at netfilter (iptables) and
how packets are handled in the kernel.

here is a simple overview of how packets from the network
traverse the chains (the network is on the left side, and on the
right is the local process receiving and sending packets)

      +------------+     .-------.      +-------+     +-----+
  --->| PREROUTING |---->| route |----->| INPUT |---->|     |
      +------------+     '-------'      +-------+     |   P |
                             |                        | L R |
                             V                        | O O |
                        +---------+                   | C C |
                        | FORWARD |                   | A E |
                        +---------+                   | L S |
      +-------------+        |         +--------+     |   S |
  <---| POSTROUTING |<-------+---------| OUTPUT |<----|     |
      +-------------+                  +--------+     +-----+

important things to mention here are:

 - packets destined for a local process (routing decision)
   do not pass the FORWARD, OUTPUT, or POSTROUTING chain
 - packets originating from a local process do not pass
   the PREROUTING, INPUT, or FORWARD chain
 - packets 'routed' through the host will not pass the
   INPUT or OUTPUT chain
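
the three cases above can be written down as a small lookup. this
is only a sketch of the diagram, not of the kernel code: the names
Origin and chains_traversed are made up for illustration, and the
model ignores special cases such as locally generated packets that
are addressed to the host itself (loopback traffic).

```python
from enum import Enum
from typing import List


class Origin(Enum):
    NETWORK = "network"  # packet arrived on a network interface
    LOCAL = "local"      # packet was generated by a local process


def chains_traversed(origin: Origin, destined_for_host: bool) -> List[str]:
    """Return the chains a packet passes, in order, following the diagram.

    destined_for_host is the outcome of the routing decision and is
    only consulted for packets that arrived from the network.
    """
    if origin is Origin.LOCAL:
        # Locally generated packets start at OUTPUT and leave via POSTROUTING.
        return ["OUTPUT", "POSTROUTING"]
    if destined_for_host:
        # Routing decided: deliver to a local process.
        return ["PREROUTING", "INPUT"]
    # Routing decided: hand the packet on to another interface.
    return ["PREROUTING", "FORWARD", "POSTROUTING"]


if __name__ == "__main__":
    print("terminating:", chains_traversed(Origin.NETWORK, True))
    print("originating:", chains_traversed(Origin.LOCAL, False))
    print("routed:     ", chains_traversed(Origin.NETWORK, False))
```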

here is a simple example illustrating the above (again with QEMU)

  on the host:
    ifconfig tun0 netmask
    route add -net gw

  on the (QEMU) client

    ifconfig eth0 netmask
    ifconfig dummy0
    iptables -A INPUT -j LOG --log-prefix INPUT:
    iptables -A FORWARD -j LOG --log-prefix FORWARD:
    iptables -A OUTPUT -j LOG --log-prefix OUTPUT:

  after this setup, we can simulate terminating, originating,
  and routed packets, with two simple pings ...
    H# ping -c 1

            INPUT: IN=eth0 OUT= MAC=.. SRC= DST=
                    LEN=84 PROTO=ICMP ID=5665
            OUTPUT: IN= OUT=eth0 SRC= DST=
                    LEN=84 PROTO=ICMP ID=5665

    we see that the INPUT chain is consulted when the echo
    request arrives, and the OUTPUT chain when the echo reply
    is sent, but there is no forwarding involved

    C# echo 1 >/proc/sys/net/ipv4/ip_forward

    H# ping -c 1
            FORWARD:IN=eth0 OUT=dummy0 SRC= DST=
                    LEN=84 PROTO=ICMP ID=5921

    this on the other hand is a ping request forwarded by the
    kernel from one interface (eth0) to the other (dummy0)
    (it doesn't matter that this request is never answered)

and now for the last piece of information, how packets
and sockets are related (very simplified version)

applications like apache, sendmail, ssh and others communicate
via sockets. the server has to bind its socket(s) to a specific
ip and port, and the client uses another socket to send to this
ip/port. depending on the type of protocol, a connection is
established or 'just' a message is sent.
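
to make the bind/send relationship concrete, here is a minimal
sketch of the 'just a message' case using UDP. the loopback
address 127.0.0.1, port 0 (let the kernel pick a free port) and the
function name udp_roundtrip are assumptions chosen so that the
example runs without privileges; they are not taken from the setup
above.

```python
import socket
from typing import Tuple


def udp_roundtrip(message: bytes) -> Tuple[bytes, Tuple[str, int]]:
    """Send one UDP datagram from a client socket to a bound server socket."""
    # Server side: bind the socket to a specific ip and port.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))       # port 0: kernel chooses a free port
    server.settimeout(2.0)              # do not hang forever if nothing arrives
    server_addr = server.getsockname()  # the (ip, port) actually bound

    # Client side: a second socket sends a message to that ip/port.
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        client.sendto(message, server_addr)
        # The kernel delivers the datagram to the socket bound to server_addr.
        data, peer = server.recvfrom(4096)
    finally:
        client.close()
        server.close()
    return data, peer


if __name__ == "__main__":
    data, peer = udp_roundtrip(b"hello")
    print("received", data, "from", peer)
```

with a connection oriented protocol (TCP) the server would call
listen() and accept() after bind(), and the client would call
connect() before sending.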

basically, when a packet is received by the host's network card,
the nic driver allocates a buffer (skb) and puts the data into
this buffer, then the packet is passed on to the network stack.
after some routing decisions, firewalling etc, when the stack
decides that the packet is destined for the host, the kernel
starts checking each bound socket (qualifying for the ip) and
sends a copy of the buffer to that socket, which, according to
the protocol, unwraps the packet and delivers the message to
the application waiting on that socket. a similar but simpler
process works in the other direction.

so interesting things to keep in mind are:

 - sockets are checked for each packet destined for the host
 - sockets are created and bound by userspace applications
 - there is a buffer (skb) travelling the kernel's ip stack
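
the socket check described above can be pictured with a toy model.
this follows the simplified description in this mail, not the real
kernel code (which uses hash tables and protocol specific lookup
rules); ToySocket, deliver and the addresses used in the example
are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

ANY = "0.0.0.0"  # wildcard address: the socket accepts any local ip


@dataclass
class ToySocket:
    ip: str    # address the application bound to (or ANY)
    port: int  # port the application bound to
    queue: List[bytes] = field(default_factory=list)  # per-socket receive queue

    def qualifies(self, dst_ip: str, dst_port: int) -> bool:
        """True if a packet for dst_ip:dst_port may be delivered here."""
        return self.port == dst_port and self.ip in (ANY, dst_ip)


def deliver(sockets: List[ToySocket], dst_ip: str, dst_port: int,
            payload: bytes) -> int:
    """Check every bound socket and queue a copy of the payload on each
    qualifying one. Returns the number of sockets that received a copy."""
    delivered = 0
    for sock in sockets:
        if sock.qualifies(dst_ip, dst_port):
            sock.queue.append(bytes(payload))  # each socket gets its own copy
            delivered += 1
    return delivered


if __name__ == "__main__":
    bound = [
        ToySocket("10.0.0.1", 80),  # bound to one specific ip
        ToySocket("10.0.0.2", 80),  # same port, different ip
        ToySocket(ANY, 22),         # bound to all ips
    ]
    print(deliver(bound, "10.0.0.1", 80, b"GET /"))  # only the first socket
    print(deliver(bound, "10.0.0.2", 22, b"SSH-2"))  # only the wildcard socket
    print(deliver(bound, "10.0.0.1", 25, b"HELO"))   # nobody is listening
```

the point of the model is that the ip a socket is bound to decides
which packets it can see, so two sockets can share a port as long
as they are bound to different addresses.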


additional documentation:



Vserver mailing list

Generated on Wed 31 Mar 2004 - 17:10:11 BST by hypermail 2.1.3