Discussion:
clearing dont-fragment bit
Abraham van der Merwe
2003-10-09 13:43:11 UTC
Hi!

Are there any iptables extensions out there that allow you to clear the DF
(Don't Fragment) bit in IP headers?
--
Regards
Abraham

The best diplomat I know is a fully activated phaser bank.
-- Scotty

___________________________________________________
Abraham vd Merwe - Frogfoot Networks CC
9 Kinnaird Court, 33 Main Street, Newlands, 7700
Phone: +27 21 686 1665 Cell: +27 82 565 4451
Http: http://www.frogfoot.net/ Email: ***@frogfoot.net
Maciej Soltysiak
2003-10-09 14:03:26 UTC
Hi
Post by Abraham van der Merwe
Are there any iptables extensions out there that allow you to clear the DF
(Don't Fragment) bit in IP headers?
AFAIK no. Why would you want to do that?
I think I might write a module that would do that.

Regards,
Maciej
Abraham van der Merwe
2003-10-09 14:08:19 UTC
Post by Maciej Soltysiak
AFAIK no. Why would you want to do that?
I think I might write a module that would do that.
I need it for tunnels. In a perfect world that wouldn't be necessary at all,
but the reality is that there are many brain-dead admins who filter ICMP, which
breaks path MTU discovery, so if you build a tunnel over the big bad internet,
you're screwed.

You can use the TCPMSS target, which solves it for TCP, but you still have
the same problem with UDP packets, so IMHO the only way to solve this
properly is to fragment packets even if DF=1.
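
To make the TCP half of this concrete: clamping the MSS helps because a TCP
segment sized to fit the path MTU never needs fragmenting, whatever the DF bit
says. Below is a minimal C sketch of that arithmetic only, assuming IPv4 and
TCP headers without options (20 bytes each). The function names are made up
for illustration and are not part of iptables.

```c
#include <assert.h>
#include <stdint.h>

enum { IPV4_HDR_LEN = 20, TCP_HDR_LEN = 20, IPV4_MIN_MTU = 68 };

/* Largest TCP payload that fits in one unfragmented IPv4 packet of
 * path_mtu bytes. Assumes no IP or TCP options (an assumption of this
 * sketch). RFC 791 requires every link to carry at least 68 bytes. */
uint16_t clamped_mss(uint16_t path_mtu)
{
    assert(path_mtu >= IPV4_MIN_MTU);
    return (uint16_t)(path_mtu - IPV4_HDR_LEN - TCP_HDR_LEN);
}

/* An MSS announced in a SYN is only ever lowered, never raised. */
uint16_t clamp_announced_mss(uint16_t announced, uint16_t path_mtu)
{
    uint16_t limit = clamped_mss(path_mtu);
    return announced < limit ? announced : limit;
}
```

With the 1476-byte GRE tunnel MTU that comes up later in this thread, an
announced MSS of 1460 would be lowered to 1436. UDP has no equivalent
negotiation, which is why the problem remains there.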
--
Regards
Abraham

Ask not for whom the telephone bell tolls...
if thou art in the bathtub, it tolls for thee.

Ramin Dousti
2003-10-09 14:43:34 UTC
Post by Abraham van der Merwe
I need it for tunnels. In a perfect world that wouldn't be necessary at all,
but the reality is that there are many brain-dead admins who filter ICMP, which
breaks path MTU discovery, so if you build a tunnel over the big bad internet,
you're screwed.
You can use the TCPMSS target, which solves it for TCP, but you still have
the same problem with UDP packets, so IMHO the only way to solve this
properly is to fragment packets even if DF=1.
Applications that set the DF bit do so for a reason, not just for fun.
Sometimes (well, actually most of the time) it's for performance reasons, in
which case turning it off and getting poor performance is preferable to it
not working at all. On the other hand, the DF bit is also set by application
probes trying to figure out the PMTU. Clearing it on the firewall would
defeat that purpose.

Can you come up with a list of the non-TCP-based application protocols that
would use the PMTU (DF bit)?

Ramin
Abraham van der Merwe
2003-10-09 14:52:48 UTC
Post by Ramin Dousti
Applications that set the DF bit do so for a reason, not just for fun.
Sometimes (well, actually most of the time) it's for performance reasons, in
which case turning it off and getting poor performance is preferable to it
not working at all. On the other hand, the DF bit is also set by application
probes trying to figure out the PMTU. Clearing it on the firewall would
defeat that purpose.
Ideally one would want to leave DF untouched unless a packet with DF=1 is
resent, in which case you clear it. That way PMTU probes still work, but I
suspect this would be overly complicated / resource intensive.

Even better would be a tunnelling protocol that would just take packets on
side A (incl. IP headers and all), chop them up, and reassemble them on the
other side. Unfortunately there is no such thing :P
Post by Ramin Dousti
Can you come up with a list of the non-TCP-based application protocols that
would use the PMTU (DF bit)?
Basically any UDP application that sends packets bigger than the maximum
allowed MTU. I would assume TFTP, SNMP, etc. would all get into trouble. I
know that some protocols such as DNS try to stay below 512 bytes of payload,
but there are probably a gazillion protocols out there that don't.
--
Regards
Abraham

The meek shall inherit the earth; the rest of us will go to the stars.

Ramin Dousti
2003-10-09 15:49:06 UTC
Post by Abraham van der Merwe
Ideally one would want to leave DF untouched unless a packet with DF=1 is
resent in which case you clear it - that way you solve PMTU probes, but I
suspect this would be overly complicated / resource intensive.
Even better would be if there was a tunnelling protocol that would just take
packets on side A (incl ip headers, galore), chop it up, and reassemble it
on the other side. Unfortunately there is no such thing :P
Use conntrack on both sides at the entrance. It'll ensure the reassembly of
the fragments...
Post by Abraham van der Merwe
Post by Ramin Dousti
Can you come up with a list of the non-TCP-based application protocols that
would use the PMTU (DF bit)?
Basically any UDP application that sends packets bigger than the maximum
allowed mtu. I would assume TFTP, SNMP, etc. would all get into trouble. I
know that some protocols such as DNS try to stay below 512 bytes payload,
but there is probably a gazillion protocols out there that don't.
Neither TFTP nor SNMP sets the DF bit and, as you said, DNS enforces the
packet size itself. NFS might do that though (not sure), but one would
think that NFS should not span the Internet. So, what other UDP-based
applications use the DF bit?

Ramin
Abraham van der Merwe
2003-10-09 16:13:44 UTC
Post by Ramin Dousti
Post by Abraham van der Merwe
Ideally one would want to leave DF untouched unless a packet with DF=1 is
resent in which case you clear it - that way you solve PMTU probes, but I
suspect this would be overly complicated / resource intensive.
Even better would be if there was a tunnelling protocol that would just take
packets on side A (incl ip headers, galore), chop it up, and reassemble it
on the other side. Unfortunately there is no such thing :P
Use conntrack on both sides at the entrance. It'll ensure the reassembly of
the fragments...
I'm not sure I understand? You're saying that given the following scenario:

+---+
| A |
+---+
| eth0 (mtu=1500)
|
|
| eth0 (mtu=1500)
+---+
| B |
+---+
| eth1 (mtu=1500), gre-tunnel-side-a (mtu=1476)
|
|
| eth1 (mtu=1500), gre-tunnel-side-b (mtu=1476)
+---+
| C |
+---+
| eth0 (mtu=1500)
|
|
| eth0 (mtu=1500)
+---+
| D |
+---+

Given that B and C have conntrack enabled, if A sends a 1500 byte packet to
D with DF=1 then B will fragment the packet, send it to C which will then
assemble it (in such a way that the packet that arrived at B will be
identical to the one at C with just the ttl updated) and send it to D?

If not, then please explain. The above behaviour is what I meant.
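
For reference, this is roughly what the fragmentation step at B would produce
in that scenario if DF were clear. It is a minimal C sketch of standard IPv4
fragmentation arithmetic, not any kernel's code; `plan_fragments` and
`struct fragment` are names invented here, and a 20-byte header without
options is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct fragment {
    uint16_t offset_units;   /* fragment offset field, in 8-byte units */
    uint16_t payload_len;    /* bytes of the original payload carried  */
    int      more_fragments; /* value of the MF flag                   */
};

/* Split an IPv4 packet of total_len bytes (hdr_len of them header) so
 * that each piece fits in mtu. Writes at most max_out entries to out
 * and returns how many were written (0 if the MTU is unusable). */
size_t plan_fragments(size_t total_len, size_t hdr_len, size_t mtu,
                      struct fragment *out, size_t max_out)
{
    assert(out != NULL);
    if (total_len < hdr_len || mtu <= hdr_len)
        return 0;

    /* Every fragment except the last must carry a multiple of 8 bytes. */
    size_t max_payload = (mtu - hdr_len) & ~(size_t)7;
    if (max_payload == 0)
        return 0;

    size_t remaining = total_len - hdr_len;
    size_t offset = 0, n = 0;
    while (remaining > 0 && n < max_out) {
        size_t chunk = remaining < max_payload ? remaining : max_payload;
        out[n].offset_units = (uint16_t)(offset / 8);
        out[n].payload_len = (uint16_t)chunk;
        out[n].more_fragments = remaining > chunk;
        offset += chunk;
        remaining -= chunk;
        n++;
    }
    return n;
}
```

For the 1500-byte packet and the 1476-byte tunnel MTU above, this yields two
fragments: 1456 payload bytes at offset 0 with MF set, then 24 bytes at
offset 182 (in 8-byte units) with MF clear.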
Post by Ramin Dousti
Post by Abraham van der Merwe
Post by Ramin Dousti
Can you come up with a list of the non-TCP-based application protocols that
would use the PMTU (DF bit)?
Basically any UDP application that sends packets bigger than the maximum
allowed mtu. I would assume TFTP, SNMP, etc. would all get into trouble. I
know that some protocols such as DNS try to stay below 512 bytes payload,
but there is probably a gazillion protocols out there that don't.
Neither TFTP nor SNMP set the DF bit and as you said DNS enforces the
packet size itself. NFS might do that though (not sure) but one would
think that NFS should not span over the Internet. So, what other UDP-based
applications use the DF bit?
Unless I'm missing something setting/clearing the DF bit is up to the
kernel, not the application. So if I do

fd = socket(PF_INET, SOCK_DGRAM, 0);
sendto(fd, buf, 1500, 0, ...);

from my shiny snmp server and buf contains a 1500 byte PDU, then it is up to
the kernel to decide whether to set DF or not...
--
Regards
Abraham

The 'A' is for content, the 'minus' is for not typing it. Don't ever do
this to my eyes again.
-- Professor Ronald Brady, Philosophy, Ramapo State College

Ramin Dousti
2003-10-09 19:44:49 UTC
Post by Abraham van der Merwe
Post by Ramin Dousti
Use conntrack on both sides at the entrance. It'll ensure the reassembly of
the fragments...
Given that B and C have conntrack enabled, if A sends a 1500 byte packet to
D with DF=1 then B will fragment the packet, send it to C which will then
assemble it (in such a way that the packet that arrived at B will be
identical to the one at C with just the ttl updated) and send it to D?
If not, then please explain. The above behaviour is what I meant.
No, what I meant was that if conntrack is enabled on, say, C, then the
reassembly takes place on C.

But for the above situation, I'd suggest (following the Cisco page you sent
in another email) that you increase the MTU on the GRE interface and have it
just fragment the encapsulating packets on B and reassemble them on C, without
any involvement of the tunneled packets. Give it a go and see if it works.
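
The numbers behind that suggestion can be sketched as follows. This is only
the arithmetic, in C, assuming a plain GRE header with no key, checksum or
sequence options (4 bytes) and a 20-byte outer IPv4 header. The function
names are invented for the sketch, and whether a particular tunnel
implementation will actually fragment the outer packet is not something this
code can tell you.

```c
#include <assert.h>
#include <stddef.h>

enum { IPV4_HDR_LEN = 20, GRE_BASE_HDR_LEN = 4 };

/* Inner MTU that lets every encapsulated packet fit the link unfragmented. */
size_t gre_tunnel_mtu(size_t link_mtu)
{
    assert(link_mtu > (size_t)(IPV4_HDR_LEN + GRE_BASE_HDR_LEN));
    return link_mtu - IPV4_HDR_LEN - GRE_BASE_HDR_LEN;
}

/* Number of outer IPv4 fragments needed to carry one inner packet when
 * the tunnel endpoint fragments after encapsulation. */
size_t outer_fragment_count(size_t inner_len, size_t link_mtu)
{
    assert(link_mtu >= (size_t)(IPV4_HDR_LEN + 8));
    size_t outer_payload = GRE_BASE_HDR_LEN + inner_len;
    /* Payload per outer fragment, rounded down to a multiple of 8. */
    size_t per_fragment = (link_mtu - IPV4_HDR_LEN) & ~(size_t)7;
    return (outer_payload + per_fragment - 1) / per_fragment;
}
```

A 1500-byte link gives the 1476-byte tunnel MTU shown in the diagram. If the
tunnel MTU is raised to 1500 instead, a full-size inner packet becomes two
outer fragments, and the inner header (including its DF bit) is never touched.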
Post by Abraham van der Merwe
Post by Ramin Dousti
Neither TFTP nor SNMP set the DF bit and as you said DNS enforces the
packet size itself. NFS might do that though (not sure) but one would
think that NFS should not span over the Internet. So, what other UDP-based
applications use the DF bit?
Unless I'm missing something setting/clearing the DF bit is up to the
kernel, not the application. So if I do
fd = socket(PF_INET, SOCK_DGRAM, 0);
sendto(fd, buf, 1500, 0, ...);
from my shiny snmp server and buf contains a 1500 byte PDU, then it is up to
the kernel to decide whether to set DF or not...
I think we're confusing two things here:

1) It is up to the IP stack to fragment based on the MTU of the interface,
size of the packet and the DF bit.

2) It is up to the application to ask for the setting of the DF bit.
I believe you need to work on a SOCK_RAW socket to (un)set this.
But I leave this discussion to the programming gurus. A good source
is Van Jacobson's traceroute.
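
A side note on point 2: on Linux, ip(7) documents a per-socket option,
IP_MTU_DISCOVER, that lets an ordinary SOCK_DGRAM socket choose its DF policy,
so SOCK_RAW should not be needed just for this. A minimal, Linux-only C
sketch follows; the helper names `udp_socket_without_df` and `pmtu_mode` are
hypothetical, and other systems use different options.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket that asks the kernel never to set DF on its
 * datagrams (IP_PMTUDISC_DONT). Returns the descriptor, or -1 on error. */
int udp_socket_without_df(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    int mode = IP_PMTUDISC_DONT;
    if (setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, sizeof mode) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Read back the current path MTU discovery mode, or -1 on error. */
int pmtu_mode(int fd)
{
    assert(fd >= 0);
    int mode = -1;
    socklen_t len = sizeof mode;
    if (getsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, &len) < 0)
        return -1;
    return mode;
}
```

An application that wants DF always set would pass IP_PMTUDISC_DO instead.
So both statements hold in a sense: the kernel picks a default, and the
application can override it per socket.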


Ramin
Ralf Spenneberg
2003-10-09 16:23:06 UTC
Post by Abraham van der Merwe
Hi!
Are there any iptables extensions out there that allow you to clear the DF
(Don't Fragment) bit in IP headers?
If you clear the DF-Bit and use Linux on either side of the tunnel where
the packets are fragmented you are in deep trouble, because Linux 2.4
(when using PMTU) not only sets the DF-Bit but also clears the IP-ID
which is needed to defragment the packets again. So, when clearing the
DF-Bit you have to ensure unique numbers in the IP-ID field, too.

Cheers,

Ralf
--
Ralf Spenneberg
RHCE, RHCX

Book: Intrusion Detection für Linux Server http://www.spenneberg.com
IPsec-Howto http://www.ipsec-howto.org
Honeynet Project Mirror: http://honeynet.spenneberg.org
Abraham van der Merwe
2003-10-09 16:50:49 UTC
Post by Ralf Spenneberg
If you clear the DF-Bit and use Linux on either side of the tunnel where
the packets are fragmented you are in deep trouble, because Linux 2.4
(when using PMTU) not only sets the DF-Bit but also clears the IP-ID
which is needed to defragment the packets again. So, when clearing the
DF-Bit you have to ensure unique numbers in the IP-ID field, too.
Surely if I clear the DF-bit in the mangle table then the IP stack should
only fragment the packet later on, once it has made a routing decision and
decided over which interface to send the packet(s), and set the IP-ID fields
and MF-bit accordingly?

Are there any other side-effects when clearing the DF-bit?
--
Regards
Abraham

Who loves me will also love my dog.
-- John Donne

Ralf Spenneberg
2003-10-09 17:12:51 UTC
Post by Abraham van der Merwe
Surely if I clear the DF-bit in the mangle table then the IP stack should
only fragment the packet later on, once it has made a routing decision and
decided over which interface to send the packet(s), and set the IP-ID fields
and MF-bit accordingly?
Usually the IP-ID field is set by the sender and not by the router
fragmenting the packet. You have to set the IP-ID field and clear the
DF-Bit at the same time.
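
As an illustration of what "set the IP-ID and clear the DF-Bit at the same
time" would mean at the byte level, here is a minimal C sketch operating on a
raw IPv4 header. It is not an existing iptables target. The function names
are invented, and choosing `new_id` (for example from a per-host counter) is
left to the caller as an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IPV4_FLAG_DF 0x40u  /* DF bit within byte 6 of the header */

/* RFC 1071 Internet checksum over the header, checksum field taken as
 * stored. Over a header with a valid checksum this returns 0. */
static uint16_t ipv4_header_checksum(const uint8_t *hdr, size_t hdr_len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < hdr_len; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Clear DF, store new_id in the Identification field and fix up the
 * header checksum. Returns 0 on success, -1 if hdr is not a plausible
 * IPv4 header. */
int clear_df_and_set_id(uint8_t *hdr, size_t buf_len, uint16_t new_id)
{
    assert(hdr != NULL);
    if (buf_len < 20 || (hdr[0] >> 4) != 4)
        return -1;
    size_t hdr_len = (size_t)(hdr[0] & 0x0f) * 4;
    if (hdr_len < 20 || hdr_len > buf_len)
        return -1;

    hdr[4] = (uint8_t)(new_id >> 8);
    hdr[5] = (uint8_t)(new_id & 0xff);
    hdr[6] &= (uint8_t)~IPV4_FLAG_DF;

    hdr[10] = hdr[11] = 0;  /* checksum is computed with the field zeroed */
    uint16_t csum = ipv4_header_checksum(hdr, hdr_len);
    hdr[10] = (uint8_t)(csum >> 8);
    hdr[11] = (uint8_t)(csum & 0xff);
    return 0;
}
```

Because the ID is written before any fragmentation happens, every fragment
later cut from the packet inherits the same value, which is what reassembly
needs.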
Post by Abraham van der Merwe
Are there any other side-effects when clearing the DF-bit?
Only maybe the overhead when a fragment is lost.

Cheers,

Ralf
Abraham van der Merwe
2003-10-09 18:11:23 UTC
Post by Ralf Spenneberg
Usually the IP-ID field is set by the sender and not by the router
fragmenting the packet. You have to set the IP-ID field and clear the
DF-Bit at the same time.
Yes, I know, but as long as all the fragments have unique ids it shouldn't
matter. Also, if the packet is fragmented along the way under normal
circumstances (i.e. DF=0), then the IP-ID field would have to be incremented
by the router fragmenting the packet.

Have a look at this: http://www.cisco.com/warp/public/105/56.html

On IOS you can clear the DF-bit and Cisco actually recommends it for this
particular problem so as long as IP-ID is unique for the fragments (which
should be the case) I don't see any problems doing it on Linux other than
degraded performance.
--
Regards
Abraham

Why is it taking so long for her to bring out all the good in you?

Ralf Spenneberg
2003-10-10 05:13:06 UTC
Post by Abraham van der Merwe
Yes, I know, but as long as all the fragments have unique ids it shouldn't
matter. Also, if the packet is fragmented along the way under normal
circumstances (i.e. DF=0), then the IP-ID field would have to be incremented
by the router fragmenting the packet.
True but Linux 2.4 clears the IP-ID field when sending a packet with the
DF-Bit set. You have to manually recreate a unique IP-ID field when
clearing the DF-Bit on the firewall. Even when the router increments
this field all packets will have the ID of 1. When defragmenting the
receiver does not know which fragment belongs to which packet.

Linux 2.4 is the only operating system I know of that shows this
behavior.
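
To spell out why a shared ID is a problem: under RFC 791 a receiver groups
fragments by source, destination, protocol and Identification, and by nothing
else. A minimal C sketch of that matching rule, with `struct frag_key` and
`same_datagram` as names made up for the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The fields a receiver uses to decide which fragments belong together. */
struct frag_key {
    uint32_t src;
    uint32_t dst;
    uint8_t  protocol;
    uint16_t id;
};

/* True if two fragments would be placed in the same reassembly buffer. */
bool same_datagram(const struct frag_key *a, const struct frag_key *b)
{
    assert(a != NULL && b != NULL);
    return a->src == b->src && a->dst == b->dst &&
           a->protocol == b->protocol && a->id == b->id;
}
```

Two different UDP datagrams between the same pair of hosts that both carry
the same ID produce identical keys, so once they are fragmented the receiver
cannot tell their pieces apart.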

Cheers,

Ralf
Abraham van der Merwe
2003-10-10 08:17:41 UTC
Post by Ralf Spenneberg
Post by Abraham van der Merwe
Yes, I know, but as long as all the fragments have unique ids it shouldn't
matter. Also, if the packet is fragmented along the way under normal
circumstances (i.e. DF=0), then the IP-ID field would have to be incremented
by the router fragmenting the packet.
True but Linux 2.4 clears the IP-ID field when sending a packet with the
DF-Bit set. You have to manually recreate a unique IP-ID field when
clearing the DF-Bit on the firewall. Even when the router increments
this field all packets will have the ID of 1. When defragmenting the
receiver does not know which fragment belongs to which packet.
Linux 2.4 is the only operating system I know of that shows this
behavior.
Ok, I see what you're getting at. That brings us back to my original
suggestion. If the tunnel could do the fragmentation _and_ reassembly then
this would not be a problem. *sigh*

--

Regards
Abraham

I hate it when my foot falls asleep during the day cause that means
it's going to be up all night.
-- Steven Wright

