Re: Linux networking and disk IO issues

Alan Cox (alan@lxorguk.ukuu.org.uk)
Mon, 4 Jun 2001 21:02:51 +0100 (BST)


> * The Linux networking stack requires all skbuff buffers to be
> contiguous. As far as I can tell, this makes it impossible to
> write high-bandwidth UDP applications on Linux. For instance, the
> kernel will drop a fragmented 8KB message if it cannot allocate 8KB
> of contiguous memory to reassemble it into. I have found that it
> is relatively easy to enter regimes where this can cause massive
> packet loss.

If you are fragmenting messages then you want to optimise the protocol a bit
more. IP fragmentation increases processing overheads and reduces performance
badly in the presence of link congestion and error.

Most modern file sharing protocols are TCP based for good reason

> * readv()/writev(). Linux serializes scatter/gather IO operations
> into an operation for each iovec entry. This is the relevent code
> from a 2.4-series kernel:

Not on a socket. On a file it makes very little difference. Socket readv/writev
behaviour varies by protocol family.

> * For writes, it forces read-modify-write when the individual
> iovecs are not block-aligned.

> * There is no preadv(), pwritev(). (The pread/pwrite() system calls
> combine a llseek with a read/write system call.) This means that

True. The single unix specification does not include preadv(). Really you want
to take it up with the Opengroup. That said Linux does add syscalls that are
not in SuS sometimes.

> * The requirement that everything about operations to raw character
> device files (length, offset in file, *and* address in memory) has
> to be 512-byte aligned is a real hassle.

Welcome to PC hardware. Large amounts of PC hardware genuinely has limitations
of this nature. Most disk controllers can only write whole sectors on a sector
alignment. Many network controllers can only handle burst or 32bit alignment
policies

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/