[vserver] High Systemload and many Processes in D state

From: Urban Loesch <bind_at_enas.net>
Date: Fri 18 May 2012 - 10:36:03 BST
Message-ID: <4FB61803.5010008@enas.net>

Hi,

I have a Dell PowerEdge R610 (32 GB RAM, 2x six-core CPUs and about 1.4 TB of RAID 10 storage)
running a vServer with about 20,000 mail accounts (Dovecot 2).

The server ran for about a year without any problems. The 15-minute load average was between 0.5
and at most 8, with no high iowait and CPU idle time at about 98%.

But since yesterday morning the system load on the server has risen to over 500, which I think is
very high. The strange thing is that there is no iowait and the CPU idle time stays at about 98%
when this happens. Stopping Dovecot, waiting a couple of minutes and starting it again resolves the
issue, but after a few hours the problem comes back.

Information:
OS (server): Debian Squeeze
Kernel: 3.0.28-vs2.3.2.3-rol-em64t
Util-vServer: 0.30.216-pre3034-squeeze0.2-1
1 vServer (Debian Lenny) with a total of 700 IMAP sessions.

vmstat and iostat output during high load:

# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd     free     buff    cache   si   so    bi    bo   in   cs us sy  id wa
 1  0      0 27040576   635460  3495600    0    0    14    15    2   31  0  0  99  0
 0  0      0 27040320   635468  3496064    0    0   804   455 1383 1281  0  0  98  1
 0  0      0 27047016   634964  3489312    0    0   216   156 1841 1292  1  0  98  1
 0  0      0 27047140   635028  3489012    0    0   240   619 1629 1658  0  0  96  3
 0  0      0 27047264   635120  3489172    0    0    92     0 1069  881  0  0 100  0
 0  0      0 27047388   635120  3489256    0    0     0    46 1404 1265  0  0 100  0
 0  0      0 27047512   635136  3489312    0    0   128   471 1539 1354  0  0  99  1
 0  0      0 27047388   635156  3489384    0    0    12   360 1108  952  0  0  99  0
 0  0      0 27047516   635160  3489408    0    0   104    12  893  677  0  0  99  0
^C

# iostat -k
Linux 3.0.28-vs2.3.2.3-rol-em64t 16.05.2012 _x86_64_ (24 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0,08    0,00    0,09    0,34    0,00   99,49

Device:          tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda            67,35       636,28      1080,48   31337690   53215361
dm-0           73,53       591,55       893,93   29134837   44027496
dm-1           15,06        42,40       181,79    2088493    8953661

The only strange thing I can see when it happens is this:

# vps -ostat,pid,time,wchan='WCHAN-xxxxxxxxxxxxxxxxxxxx',cmd ax |grep D
STAT PID CONTEXT TIME WCHAN-xxxxxxxxxxxxxxxxxxxx CMD
D 93 0 MAIN 00:00:10 synchronize_sched [fsnotify_mark]
D 18713 00:00:00 synchronize_srcu dovecot/imap
D 18736 00:00:00 synchronize_srcu dovecot/imap
D 18775 00:00:05 synchronize_srcu dovecot/imap
D 20330 00:00:00 synchronize_srcu dovecot/imap
D 20357 00:00:00 synchronize_srcu dovecot/imap
D 20422 00:00:00 synchronize_srcu dovecot/imap
D 20687 00:00:00 synchronize_srcu dovecot/imap
S+ 20913 00:00:00 pipe_wait grep D

When the load goes high there are many imap processes in D state.
I think they are delayed and are waiting for some event,
but I have no idea which event they are waiting for.

The "fsnotify_mark" thread is always in D state, too. I think this could be the problem.
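
Since the Linux load average counts tasks in uninterruptible sleep (D state) as well as
runnable ones, grouping the D-state tasks by wait channel might show where they pile up.
A minimal sketch, assuming a procps `ps` that supports `wchan:32` and is run where the
processes are visible (for example inside the guest); the function name `d_by_wchan` is
made up for illustration:

```shell
# Hypothetical helper: reads "STAT WCHAN COMMAND" lines on stdin and prints
# how many D-state tasks are blocked in each wait channel, busiest first.
d_by_wchan() {
    # Match on the STAT column only, so the header and S/R tasks are skipped.
    awk '$1 ~ /^D/ { n[$2]++ } END { for (w in n) print n[w], w }' |
        sort -k1,1nr -k2,2
}

# Example call on a live system:
#   ps -eo stat,wchan:32,comm | d_by_wchan
```

Matching on the STAT column rather than piping through `grep D` avoids picking up the
header line and the grep process itself.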

I'm not sure whether this is vServer related or not, and I have no idea how to resolve it.
Do you have any idea how I can troubleshoot this problem?

Additional info:
- On the server where the problem occurs, top shows about 1100 tasks.
- My second server runs 3 vServers with Dovecot, but not under heavy load. top shows only about 400
tasks there, and I don't have this problem.

Could there be a problem when too many tasks are active?
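
One way to check whether the load follows the task count would be to log both on each
server. A minimal sketch that parses a line in `/proc/loadavg` format, whose fourth field
is runnable/total scheduling entities (this counts threads, so it may not equal the task
number shown by top); the function name `load_and_tasks` is made up for illustration:

```shell
# Hypothetical helper: reads one line in /proc/loadavg format on stdin and
# prints the 1- and 15-minute load plus the runnable and total task counts.
load_and_tasks() {
    awk '{
        split($4, t, "/")   # fourth field looks like "runnable/total"
        printf "load1=%s load15=%s runnable=%s tasks=%s\n", $1, $3, t[1], t[2]
    }'
}

# Example: print one sample per minute on each server.
#   while sleep 60; do date; load_and_tasks < /proc/loadavg; done
```

Comparing the two logs should show whether the load only climbs once the task count
passes a certain level.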

Many thanks and regards
Urban
Received on Fri May 18 10:36:09 2012
