[OpenAFS-devel] rx mtu

chas williams chas@locutus.cmf.nrl.navy.mil
Mon, 27 Jan 2003 14:01:29 -0500


In message <20030127.190056.85401938.haba@pdc.kth.se>,Harald Barth writes:
>nothing on today's CPUs. I do not know why fragmenting in the IP layer
>is cheaper than in the rx layer, but on a 40MHz SPARC, it is
>much faster. On an 800MHz PC it is not. I do not think we should make
>design decisions that care about 40MHz SPARCs any more.

because some operating systems don't properly support fragments.  i
have mentioned this before.  the rx 'jumbo packet' is handed to the
ip layer as a cluster of smaller packets.  some ip stacks copy this
into a linear buffer and then fragment and transmit.  so you can see
why it would be beneficial to have a larger 'jumbo packet' size.

further, the reads and writes are done for each rx packet, not for the
jumbogram as a whole (since it is really a set of individual rx packets).
this uses more cpu/interrupts/memory than a single read/write for the
entire packet.
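
to put rough numbers on the per-packet cost, here is a minimal sketch
that counts read/write calls for one jumbogram.  the sizes and the
names (io_calls, RX_PACKET_DATA, JUMBO_PACKETS) are assumptions made up
for illustration, not values taken from the rx source.

```python
def io_calls(payload_bytes, bytes_per_call):
    """count the read/write calls needed to move payload_bytes."""
    if payload_bytes < 0 or bytes_per_call <= 0:
        raise ValueError("sizes must be non-negative / positive")
    # integer ceiling division, avoids float rounding
    return -(-payload_bytes // bytes_per_call)

# hypothetical sizes, for illustration only
RX_PACKET_DATA = 1400   # assumed data bytes carried by one rx packet
JUMBO_PACKETS = 4       # assumed rx packets bundled into one jumbogram

jumbo_data = RX_PACKET_DATA * JUMBO_PACKETS

# one call per rx packet, which is how the jumbogram is handled today
per_packet_calls = io_calls(jumbo_data, RX_PACKET_DATA)

# one call for the whole jumbogram
single_calls = io_calls(jumbo_data, jumbo_data)
```

with these assumed sizes the per-packet path makes four calls where a
single-buffer path would make one.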

another reason for larger-than-mtu packets would be kernel overhead.  it's
really a performance killer to only move an mtu's worth per read/write
in the kernel.  the extra overhead of fragmenting and reassembling in the
kernel is small in comparison.
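
as a rough illustration of what the kernel does with one large write,
this sketch counts the ipv4 fragments produced for a single udp
datagram.  it assumes a 20 byte ipv4 header with no options and an
ethernet-style mtu of 1500; the function name ip_fragments is made up
here.

```python
IP_HEADER = 20    # ipv4 header without options (assumed)
UDP_HEADER = 8

def ip_fragments(udp_payload, mtu=1500):
    """number of ipv4 fragments for one udp datagram of udp_payload bytes."""
    if udp_payload < 0 or mtu <= IP_HEADER + 8:
        raise ValueError("bad payload size or mtu")
    datagram = UDP_HEADER + udp_payload          # bytes ip has to carry
    # every fragment except the last carries a multiple of 8 data bytes
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return max(1, -(-datagram // per_fragment))

fits_in_mtu = ip_fragments(1472)     # 1472 + 8 = 1480, exactly one packet
large_write = ip_fragments(8192)     # a single 8k write
```

so one 8k read/write costs the kernel six fragments on the wire, but
only one trip through the socket layer.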

i am in favor of making a new 'jumbogram' type that isn't composed of
rx fragments but is a single buffer.  the existing (relatively new, actually)
rx congestion control could be used to adjust the size of the buffer.
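
to show the kind of adjustment i mean, here is a sketch that grows the
buffer while transfers succeed and halves it on loss.  this is generic
additive-increase/multiplicative-decrease, not the actual rx congestion
control code, and the name next_buffer_size and all of the constants
are hypothetical.

```python
def next_buffer_size(current, lost, step=1024, floor=1024, ceiling=16384):
    """pick the next jumbogram buffer size (hypothetical policy).

    halve on loss, otherwise grow by a fixed step, clamped to
    [floor, ceiling].
    """
    if lost:
        return max(floor, current // 2)
    return min(ceiling, current + step)

# made-up sequence of outcomes: two good sends, one loss, one good send
size = 4096
history = []
for lost in [False, False, True, False]:
    size = next_buffer_size(size, lost)
    history.append(size)
```

the floor keeps the buffer from collapsing below a useful size, and the
ceiling would be whatever the peer and the ip stack can actually take.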