Discussion:
High ksoftirqd CPU load, high latency
Roberto Suarez Soto
2012-03-02 09:51:26 UTC
Hi,

we've got a load problem on a firewall we administer, and we believe it could
be related to iptables. But we don't know for sure, and we don't know how to
confirm or rule it out either. I'm quite at a loss, and would like to get the
list's opinion/ideas/voodoo magic/hints on the issue. Thanks in advance!

The symptoms are:

- High ksoftirqd load on one CPU (the one assigned to the LAN ethernet's IRQ)
- High network latency on the LAN, even from another box on the same switch
- System load growing (up to 6) until the ksoftirqd load decreases

About the system:

- Two-node cluster, active/passive (the problem reproduces on whichever box
is active at the time)
- Running Debian Squeeze (kernel 2.6.32-5-686-bigmem)
- Two Intel Xeon CPUs, 8GB RAM, no other services but firewalling
- Four network interfaces: WAN, LAN and cluster sync (a bit more about this
later)
- Two Intel 82571EB NICs (e1000e driver, PCI card) and Broadcom BCM5708
(bnx2, on board), connected to 100Mbps switches
- NICs are: Intel for WAN, Broadcom for LAN and cluster sync
- All NICs appear as "PCI-MSI-Edge" in /proc/interrupts

netfilter stats:

- nf_conntrack_buckets is 16384
- nf_conntrack_max is 1048576
- nf_conntrack_count is usually around 25k-30k
- About 2200 iptables rules, 800 of them for NAT (I read somewhere that NAT
is an expensive process, which is why I mention it)
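
As a rough sanity check on the conntrack figures above, the average hash-chain length is simply entries divided by buckets. The sketch below does that arithmetic with the numbers quoted in the list and, if the sysctl files are readable, with the live values. The helper names are made up for illustration, and a short average chain only suggests (it does not prove) that the conntrack hash table is not where the time goes.

```python
#!/usr/bin/env python3
"""Estimate the average conntrack hash-chain length (entries per bucket)."""
from pathlib import Path

SYSCTL_DIR = Path("/proc/sys/net/netfilter")


def chain_length(entries: int, buckets: int) -> float:
    """Average number of conntrack entries per hash bucket."""
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    return entries / buckets


def read_sysctl(name: str) -> int:
    """Read one integer sysctl, e.g. nf_conntrack_count."""
    return int((SYSCTL_DIR / name).read_text().strip())


if __name__ == "__main__":
    # Figures quoted in the post: 16384 buckets, 25k-30k entries, max 1048576.
    print("typical load:        %.2f per bucket" % chain_length(30000, 16384))
    print("at nf_conntrack_max: %.0f per bucket" % chain_length(1048576, 16384))
    try:
        live = chain_length(read_sysctl("nf_conntrack_count"),
                            read_sysctl("nf_conntrack_buckets"))
        print("this machine:        %.2f per bucket" % live)
    except (OSError, ValueError):
        print("conntrack sysctls not readable on this machine")
```

With 25k-30k entries in 16384 buckets the average chain is under two entries; it would only reach 64 if the table filled up to nf_conntrack_max.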

As the problem seems to arise in the CPU assigned to the LAN NIC (according
to /proc/interrupts), we tried two things:

- Changed coalescing values with ethtool, to send more frames per IRQ (didn't
hurt, but didn't solve the problem either)
- Created a "bond0" in balance-rr mode, aggregating the previous LAN NIC and
one of the Intels that wasn't in use (fewer IRQs in /proc/interrupts, but it
didn't fix the problem; also, at peak times there's now an "events" process
using CPU that I think wasn't there before, but I'm a bit paranoid and could
be wrong)

A few statistics when the problem appears:

- Packet count is around 3k-4k per second (counting both NICs), both incoming
and outgoing (i.e., 3k-4k incoming, 3k-4k outgoing)
- Traffic is around 3MBytes/s (counting both NICs), both incoming and outgoing
- Packet drops rise to 200-300 per second (counting both NICs); normally it is 0
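
Per-second figures like the ones above can be derived from two samples of /proc/net/dev taken a known interval apart. The following is a minimal sketch of that calculation; the function names are illustrative, and it assumes the usual 16-column layout of /proc/net/dev (8 receive counters followed by 8 transmit counters).

```python
#!/usr/bin/env python3
"""Per-interface packet and drop rates from two /proc/net/dev samples."""
import time


def parse_net_dev(text):
    """Return {iface: (rx_packets, rx_drop, tx_packets, tx_drop)}."""
    counters = {}
    for line in text.splitlines()[2:]:        # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = [int(value) for value in data.split()]
        # Columns: rx bytes, packets, errs, drop, ... then tx bytes, packets,
        # errs, drop, ... (tx block starts at index 8).
        counters[name.strip()] = (fields[1], fields[3], fields[9], fields[11])
    return counters


def rates(before, after, seconds):
    """Per-second deltas between two parse_net_dev() samples."""
    result = {}
    for iface, new in after.items():
        old = before.get(iface)
        if old is not None:
            result[iface] = tuple((n - o) / seconds for n, o in zip(new, old))
    return result


if __name__ == "__main__":
    try:
        with open("/proc/net/dev") as fh:
            first = parse_net_dev(fh.read())
        time.sleep(1.0)
        with open("/proc/net/dev") as fh:
            second = parse_net_dev(fh.read())
    except OSError:
        first = second = {}
    for iface, (rx, rx_drop, tx, tx_drop) in sorted(rates(first, second, 1.0).items()):
        print("%-8s rx %8.0f pps (drop %6.0f/s)  tx %8.0f pps (drop %6.0f/s)"
              % (iface, rx, rx_drop, tx, tx_drop))
```

Logging this alongside the ksoftirqd CPU figure would show whether the drops line up with the load spikes or with the raw packet rate.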

Strange things that I've seen:

- There are higher packet and traffic counts at other times (4k-5k pps; 5-8
MBytes/s), and the problem doesn't appear
- Using "sar -I 98,99" (those being the interrupts for the NICs in the bond),
I've seen that the IRQ count per second seems to be 0 most of the time, only
rising from time to time; I suppose it's because of the coalescing values, but
I found it strange anyway
- Though now there's a bond, ksoftirqd seems to be heavy only in one CPU; I
haven't confirmed it 100%, but it seems it's in the one assigned to the
Broadcom NIC; /proc/interrupts doesn't report a high IRQ usage, so I don't
know what can be causing that
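
One way to check which CPU is actually servicing a given IRQ is to read the per-CPU columns of /proc/interrupts directly. The sketch below parses that file and reports, for each IRQ of interest, the CPU with the largest count and its share. The IRQ numbers 98 and 99 are the ones mentioned above; the function names are illustrative. Note that the counters are cumulative since boot, so comparing two readings taken during a spike is more telling than a single one.

```python
#!/usr/bin/env python3
"""Show how each IRQ's count is spread across CPUs, from /proc/interrupts."""


def parse_interrupts(text):
    """Return {irq_label: (per_cpu_counts, description)}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())              # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)
        fields = rest.split()
        counts = []
        for token in fields[:ncpu]:           # some rows have fewer columns
            if not token.isdigit():
                break
            counts.append(int(token))
        table[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return table


def busiest_cpu(counts):
    """Index of the CPU that handled most of this IRQ, and its share (0..1)."""
    total = sum(counts)
    if total == 0:
        return None, 0.0
    cpu = max(range(len(counts)), key=counts.__getitem__)
    return cpu, counts[cpu] / total


if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as fh:
            table = parse_interrupts(fh.read())
    except OSError:
        table = {}
    for label in ("98", "99"):                # IRQ numbers quoted in the post
        if label in table:
            counts, desc = table[label]
            cpu, share = busiest_cpu(counts)
            print(label, desc, "busiest CPU:", cpu, "share: %.0f%%" % (share * 100))
```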

Could this problem be related to iptables? I've trimmed a few rules
(previously there were almost 3000), but it keeps happening. And I don't
believe those are too many rules, anyway.

Also, is there anything else I can do to debug the problem? Unfortunately,
this is a production firewall and I can't do anything that causes an outage
without serious justification. But maybe there's something that would give us
a better idea of where the bottleneck is.

Thanks,
--
Roberto Suarez Soto Allenta Consulting
***@allenta.com www.allenta.com
+34 881 922 600
--
To unsubscribe from this list: send the line "unsubscribe netfilter" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Kerin Millar
2012-03-02 11:52:16 UTC
Hello,
Post by Roberto Suarez Soto
we've got a load problem in a firewall we administer, and we
believe it could be related to iptables. But we don't know for sure, and
don't know either how to confirm or deny it. I'm quite at a loss, and
would like to get the list's opinion/ideas/voodoo magic/hints on the
issue. Thanks in advance!
- High ksoftirqd load in one CPU (the one assigned to the LAN ethernet's IRQ)
For this issue, I've found that irqbalance works wonders. I recommend
using version 1.0 or greater and running it as a daemon. To get the best
out of it, I'd also recommend building a kernel with the following patch.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=da8d1c8

The patch should apply cleanly to the 2.6.32 series. Here's an
explanation as to why the patch is useful:-

https://code.google.com/p/irqbalance/source/detail?r=32a7757a0314

You can look at /proc/interrupts to determine whether your card supports
multiqueue and is exposing distinct interrupts per tx/rx queue.
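
A minimal sketch of that check: list the interrupt names in /proc/interrupts that belong to one interface. It assumes the driver names its per-queue vectors "<iface>-<suffix>" (for example "eth0-rx-0" or "eth0-TxRx-1"), which is a common convention but driver-specific; the interface names "eth0" and "eth1" and the function name are placeholders.

```python
#!/usr/bin/env python3
"""List the interrupt vectors that /proc/interrupts shows for an interface."""


def vectors_for(text, iface):
    """Names in /proc/interrupts equal to iface or starting with 'iface-'."""
    names = []
    for line in text.splitlines()[1:]:        # skip the CPU header row
        for token in line.split():
            token = token.rstrip(",")         # shared IRQs list "eth0, eth1"
            if token == iface or token.startswith(iface + "-"):
                names.append(token)
    return names


if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as fh:
            text = fh.read()
    except OSError:
        text = ""
    for iface in ("eth0", "eth1"):            # placeholder interface names
        names = vectors_for(text, iface)
        kind = "several vectors (multiqueue likely)" if len(names) > 1 else "single vector"
        print(iface, names, kind)
```

More than one vector for an interface suggests distinct tx/rx queues that can be spread across CPUs; a single entry means all of that NIC's interrupts land on one CPU at a time.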

Cheers,

--Kerin

Jan Engelhardt
2012-03-02 12:03:29 UTC
Post by Roberto Suarez Soto
we've got a load problem in a firewall we administer, and we
believe it could be related to iptables. But we don't know for sure, and
don't know either how to confirm or deny it. I'm quite at a loss, and
would like to get the list's opinion/ideas/voodoo magic/hints on the
issue. Thanks in advance!
- High ksoftirqd load in one CPU (the one assigned to the LAN ethernet's IRQ)
For this issue, I've found that irqbalance works wonders. I recommend using
version 1.0 or greater and running it as a daemon.
Better is irqd, which claims to have specific understanding of RPS/RFS/MQ.
https://github.com/vaesoo/irqd
Roberto Suarez Soto
2012-03-02 12:20:26 UTC
Post by Kerin Millar
For this issue, I've found that irqbalance works wonders. I recommend using
version 1.0 or greater and running it as a daemon. To get the best out of it,
I'd also recommend building a kernel with the following patch.
Sorry, I forgot to say that we're already using it. Version 0.56, though,
which is the one that comes in Debian Squeeze. I'll see if it's easy to make a
backport.

Thanks,

--
Roberto Suarez Soto Allenta Consulting
***@allenta.com www.allenta.com
+34 881 922 600

This e-mail contains strictly confidential information and is intended solely
for the addressee; disclosure, copying, distribution, or any action relating
to its contents by any other person is prohibited. If you have received this
message in error, please reply to the sender by e-mail and delete it from
your system. Please let us know immediately of any inconvenience that
receiving this kind of e-mail may cause you.

Your personal data will be treated confidentially and will not be disclosed
to third parties outside ALLENTA CONSULTING, S.L. In any case, you may
exercise your rights of objection, access, rectification and cancellation, as
established in Organic Law 15/99 of 13 December on the Protection of Personal
Data (Ley Orgánica 15/99, de Protección de Datos de Carácter Personal), by
writing to ALLENTA CONSULTING, S.L. at C/Enrique Mariñas 36, 2º piso,
oficina 8, 15009 A Coruña, or to the e-mail address ***@allenta.com

Roberto Suarez Soto
2012-03-12 09:30:56 UTC
Hi,

I think I have this nailed down, and it turns out not to be related to
iptables at all. My fault: one of the main jobs of the problematic boxes is
acting as IPsec gateways, which I shamefully forgot to mention when reporting
this problem. And it seems this is the cause.

Using "perf top", I can see that when ksoftirqd is at 100% CPU the kernel
functions being called are __xfrm4_find_bundle (taking most of the CPU),
des3_ede_encrypt and des3_ede_decrypt. Last Friday I thought the problem was
just the latter two, and after some googling and finding that 3DES is actually
less efficient than AES, I started to replace 3DES with AES-128. But today
it's clear that, though 3DES may be part of the problem, the gist of it is
__xfrm4_find_bundle.
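
On the 3DES-versus-AES side of this, one thing that can be checked without downtime is which implementations the kernel has registered for each cipher, since /proc/crypto lists every algorithm with its driver and priority and the crypto API prefers the highest priority for a given name. The sketch below parses that file; the function names are illustrative, and it says nothing about __xfrm4_find_bundle, only about the cipher part.

```python
#!/usr/bin/env python3
"""List the kernel crypto drivers registered for a cipher, from /proc/crypto."""


def parse_proc_crypto(text):
    """Return one dict per algorithm entry (entries are blank-line separated)."""
    entries, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            if current:
                entries.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        entries.append(current)
    return entries


def implementations(entries, name):
    """(driver, priority) pairs for algorithm `name`, highest priority first."""
    matches = [e for e in entries if e.get("name") == name]
    matches.sort(key=lambda e: int(e.get("priority", "0")), reverse=True)
    return [(e.get("driver", "?"), int(e.get("priority", "0"))) for e in matches]


if __name__ == "__main__":
    try:
        with open("/proc/crypto") as fh:
            entries = parse_proc_crypto(fh.read())
    except OSError:
        entries = []
    # Base cipher names; mode wrappers such as "cbc(aes)" show up as
    # separate entries once they have been instantiated.
    for name in ("des3_ede", "aes"):
        print(name, implementations(entries, name))
```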

I've been searching for an explanation of what this function does, and
found a message in a forum
(http://www.linuxforums.org/forum/kernel/184652-high-softirq-cpu-usage-while-ipsec-active.html)
hinting at xfrm settings as the culprit. But anyway, I'm going to ask on
openswan's list, which seems the proper place to do it.

Thanks,

--
Roberto Suarez Soto Allenta Consulting
***@allenta.com www.allenta.com
+34 881 922 600

Vairavan
2012-05-18 14:51:14 UTC
Hi Robe
Your problem looks interesting. I would like to recreate your problem and
test further.

Do you have a method for reproducing it?

