[pcap-ng-format] [tcpdump-workers] sane maximum length for block_length

Guy Harris gharris at sonic.net
Tue Sep 24 09:17:11 UTC 2019


On Sep 20, 2019, at 2:02 PM, Michael Richardson <mcr at sandelman.ca> wrote:

> Guy Harris <gharris at sonic.net> wrote:
>> Currently, Wireshark's pcapng reading code imposes a block size limit
>> for all blocks:
> 
>> enough that
>>  * the resulting block size would be less than the previous 16 MiB limit.
>>  */
>> #define MAX_BLOCK_SIZE (MIN_EPB_SIZE + WTAP_MAX_PACKET_SIZE_DBUS + 131072)
> 
>> WTAP_MAX_PACKET_SIZE_DBUS is 16 MiB.
> 
> So, MIN_EPB_SIZE (28?) + 16MiB + 128KiB.
> I think that this is a fine maximum for quite a number of block types.
> I propose to introduce sane maximums for each block type, on a block type basis.

Your recent checkin has an SHB maximum size of 1 MiB.

An SHB is 24 bytes of fixed data plus options, so that allows almost 1 MiB of options.

The size of an EPB is 28 + packet data size (padded to a multiple of 4 bytes) plus options, so your Wireshark-derived maximum size for an EPB is pretty much based on a maximum 128 KiB of options.

Is there a reason to have different maximum-bytes-of-options values for different blocks?  If not, I'm OK with a maximum of either 128 KiB or 1 MiB (or other reasonable values) for the maximum number of option bytes.  The maximum size of an option is 4 plus 65536 (maximum option value size, rounded up to a multiple of 4), so 128 KiB is slightly under 2 maximum-sized options.  1 MiB wouldn't be enough to store all of *War and Peace* in a sequence of comment options (storing it in an English translation; storing it in the original Russian would be worse, as that's two bytes per letter in UTF-8), but *The Great Gatsby* would fit. :-)
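
To make the arithmetic above concrete, here is a minimal sketch in C.  It assumes the sizes quoted in this message (24 bytes of fixed SHB data, 28 bytes of fixed EPB data, a 16 MiB maximum packet size) and a single option-byte budget shared by all block types.  All macro and function names are hypothetical; none of them come from libpcap or Wireshark.

```c
#include <assert.h>
#include <stdint.h>

#define KiB 1024u
#define MiB (1024u * KiB)

/* Sizes as quoted in this message; assumptions for illustration only. */
#define SHB_FIXED_SIZE  24u
#define EPB_FIXED_SIZE  28u
#define MAX_PACKET_SIZE (16u * MiB)

/* Round n up to the next multiple of 4. */
static uint32_t pad4(uint32_t n)
{
    assert(n <= UINT32_MAX - 3u);   /* the rounding must not overflow */
    return (n + 3u) & ~3u;
}

/* Size of one option: 4 bytes of code and length, plus the padded value. */
static uint32_t option_size(uint16_t value_len)
{
    return 4u + pad4(value_len);
}

/* Number of maximum-sized options that fit completely in option_budget. */
static uint32_t whole_max_options(uint32_t option_budget)
{
    return option_budget / option_size(UINT16_MAX);
}

/* Largest SHB under the shared budget: fixed data plus options. */
static uint32_t shb_max_size(uint32_t option_budget)
{
    return SHB_FIXED_SIZE + option_budget;
}

/* Largest EPB under the shared budget: fixed data, padded packet, options. */
static uint32_t epb_max_size(uint32_t option_budget)
{
    return EPB_FIXED_SIZE + pad4(MAX_PACKET_SIZE) + option_budget;
}
```

With a 128 KiB budget, whole_max_options() yields 1, since two maximum-sized options need 2 * 65540 = 131080 bytes and 128 KiB is 131072.  With a 1 MiB budget it yields 15.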

> I can live with 16MiB as the *maximum* that we will allocate.
> I'd like to put this in the draft: every block should have a *reasonable*
> maximum.  I plan to work on a mmap() based reading API,

Note that memory-mapping means that, on a read error, the program will probably die with a signal (UN*X) or exception (Windows).  Disks are pretty reliable, so you probably won't get many EIOs from the disk (I *did* get them at Sun when some SMD disk was failing, but that was the mid-to-late 1980's).  However:

	if the drive is removable, the user unplugging the drive could cause an error;

	if the "drive" is a share mounted from a file server, unless it's an uninterruptible NFS hard mount, either ^Cing a hard mount or getting a timeout on other mounts could cause an error.

At Apple, at least some software only used mmap() for files on a local, non-removable drive.  fstatfs() might be able to tell you whether the file is on a local drive (the MNT_LOCAL flag on at least some BSD-flavored OSes; checking the file system type field against known non-local file systems on Linux, although the latter is less robust).  I don't remember offhand how you distinguish volumes on removable vs. non-removable media.
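
A rough sketch of that fstatfs() check follows.  It assumes either a BSD-flavored system whose struct statfs has f_flags and MNT_LOCAL (macOS, FreeBSD, OpenBSD) or Linux, where it compares f_type against a short and certainly incomplete list of network file system magic numbers.  The function names are hypothetical, and it does not attempt the removable-media question.

```c
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#if defined(__linux__)
#include <sys/vfs.h>      /* fstatfs(), struct statfs with f_type */
#else
#include <sys/param.h>
#include <sys/mount.h>    /* fstatfs(), struct statfs with f_flags, MNT_LOCAL */
#endif

/*
 * Returns 1 if fd appears to be on a local file system, 0 if it appears
 * to be on a remote one, and -1 if that cannot be determined.
 */
static int fd_is_on_local_fs(int fd)
{
    struct statfs sfs;

    if (fstatfs(fd, &sfs) != 0)
        return -1;
#if defined(MNT_LOCAL)
    return (sfs.f_flags & MNT_LOCAL) ? 1 : 0;
#elif defined(__linux__)
    /* Less robust: any remote file system not listed here looks local. */
    switch ((unsigned long)sfs.f_type & 0xFFFFFFFFUL) {
    case 0x6969UL:        /* NFS */
    case 0x517BUL:        /* SMB */
    case 0xFF534D42UL:    /* CIFS */
    case 0xFE534D42UL:    /* SMB2 */
        return 0;
    default:
        return 1;
    }
#else
    return -1;
#endif
}

/* Convenience wrapper: same result codes, taking a path name. */
static int path_is_on_local_fs(const char *path)
{
    int fd = open(path, O_RDONLY);
    int local;

    if (fd < 0)
        return -1;
    local = fd_is_on_local_fs(fd);
    close(fd);
    return local;
}
```

A reader could fall back to plain read() calls whenever the result is not 1.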

> and that shouldn't
> have a problem with block size on 64-bit systems.  But maybe on 32-bit
> systems, it should use mmap() in some more creative way.

Map in a region of the file and, if you need something outside that region, unmap the old region and map the new region.
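
As a sketch of that approach (not an existing API; the struct and function names are made up for illustration), a reader could keep one bounded mapping and move it on demand.  It assumes a regular file of known size and a window limit, max_len, that is at least one page larger than the largest block to be read.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

struct map_window {
    int    fd;         /* file being read */
    off_t  file_size;  /* total size of the file */
    size_t max_len;    /* most bytes to keep mapped at once */
    void  *base;       /* current mapping, or NULL if none */
    off_t  start;      /* file offset at which base is mapped */
    size_t len;        /* length of the current mapping */
};

static void window_init(struct map_window *w, int fd, off_t file_size,
                        size_t max_len)
{
    w->fd = fd;
    w->file_size = file_size;
    w->max_len = max_len;
    w->base = NULL;
    w->start = 0;
    w->len = 0;
}

static void window_close(struct map_window *w)
{
    if (w->base != NULL)
        munmap(w->base, w->len);
    w->base = NULL;
    w->len = 0;
}

/*
 * Return a pointer to len bytes at file offset off, unmapping the old
 * region and mapping a new one if the request falls outside the current
 * region.  Returns NULL if the range is outside the file, does not fit
 * in one window, or cannot be mapped.
 */
static const void *window_get(struct map_window *w, off_t off, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    off_t start;
    size_t lead, maplen;
    void *p;

    if (page <= 0 || off < 0 || len == 0 || off > w->file_size ||
        (uint64_t)len > (uint64_t)(w->file_size - off))
        return NULL;

    /* Already inside the current region? */
    if (w->base != NULL && off >= w->start &&
        (uint64_t)(off - w->start) <= (uint64_t)w->len &&
        len <= w->len - (size_t)(off - w->start))
        return (const char *)w->base + (size_t)(off - w->start);

    window_close(w);                 /* unmap the old region */

    start = off - (off % page);      /* mmap offsets must be page-aligned */
    lead = (size_t)(off - start);
    if (len > w->max_len || lead > w->max_len - len)
        return NULL;                 /* request does not fit in one window */

    maplen = w->max_len;
    if ((uint64_t)(w->file_size - start) < (uint64_t)maplen)
        maplen = (size_t)(w->file_size - start);

    p = mmap(NULL, maplen, PROT_READ, MAP_PRIVATE, w->fd, start);
    if (p == MAP_FAILED)
        return NULL;

    w->base = p;
    w->start = start;
    w->len = maplen;
    return (const char *)p + lead;
}
```

A pointer returned by window_get() is only valid until the next call that remaps, so a caller that needs data from two distant regions at once has to copy one of them.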

> I'm not sure here. Are there any good libraries to outsource this problem?

I don't know of any offhand.

> I'd like to do an AIO (libuio)

libuio or libaio:

	https://pagure.io/libaio

?

Using POSIX aio_ routines would allow it to work on at least some other UN*Xes as well.  I guess the Windows equivalent is $QIOW^Woverlapped I/O.
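
For reference, a minimal sketch of reading one block with the POSIX aio_ routines.  The function name is hypothetical, and this version simply waits for completion; a real reader would presumably do other work between the aio_read() call and the wait.  On older glibc versions this needs -lrt at link time.

```c
#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>

/*
 * Read up to len bytes at file offset off into buf using POSIX AIO.
 * Returns the number of bytes read (possibly short at end of file),
 * or -1 with errno set on failure.
 */
static ssize_t read_block_aio(int fd, void *buf, size_t len, off_t off)
{
    struct aiocb cb;
    const struct aiocb *list[1];
    ssize_t n;
    int err;

    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = off;
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;  /* no signal on completion */

    if (aio_read(&cb) != 0)
        return -1;                  /* request could not be queued */

    /* Other work could be done here while the read is in flight. */

    list[0] = &cb;
    while ((err = aio_error(&cb)) == EINPROGRESS)
        (void)aio_suspend(list, 1, NULL);   /* may be interrupted; retry */

    n = aio_return(&cb);            /* must be called once per request */
    if (err != 0) {
        errno = err;
        return -1;
    }
    return n;
}
```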

