[ntar-workers] Re: [Ethereal-dev] Re: NTAR - PCAP next generation dump file format implementation

Wed Jun 29 02:24:37 GMT 2005

Hi LEGO.

----- Original Message ----- 
From: "LEGO" <luis.ontanon at gmail.com>
To: "Gianluca Varenni" <gianluca.varenni at cacetech.com>
Cc: "Ethereal development" <ethereal-dev at ethereal.com>;
<ntar-workers at winpcap.org>
Sent: Tuesday, June 28, 2005 4:26 AM
Subject: Re: [ntar-workers] Re: [Ethereal-dev] Re: NTAR - PCAP
next generation dump file format implementation

> Comments marked with --LEGO--
>
> On 6/28/05, Gianluca Varenni <gianluca.varenni at cacetech.com> wrote:
> > Comments marked with --GV--
> >
> > Have a nice day
> > GV
> >
> > ----- Original Message -----
> > From: "LEGO" <luis.ontanon at gmail.com>
> > To: "Gianluca Varenni" <gianluca.varenni at gmail.com>
> > Cc: "ronnie sahlberg" <ronniesahlberg at gmail.com>; "Ethereal development"
> > <ethereal-dev at ethereal.com>; "ntar-workers" <ntar-workers at winpcap.org>
> > Sent: Monday, June 27, 2005 4:21 AM
> > Subject: [ntar-workers] Re: [Ethereal-dev] Re: NTAR - PCAP next
> > generation dump file format implementation
> >
> >
> > I was not clear at all in my last mail, so let me restate my point.
> >
> > Please note that I'll use "chunks" for segments of data containing
> > several records, and "records" for the atoms. I find the term "block"
> > confusing (in my forma mentis a block is something that contains
> > several records).
> >
> > The essential point I tried to make is to have the marker-records at
> > predictable locations in the file.
> > That would be N*chunk_size bytes.
> >
> > --GV--
> > That's clear.
> > --GV--
> >
> >
> > To achieve that we need:
> > A padding record to fill the space between the last packet and the
> > location of the next marker record.
> > And a marker record that tells us what we have so far.
> >
> > --GV--
> > Well, I don't see why you cannot use the sections for the same purpose.
> > You can jump between sections easily, and we can easily add some
> > information (either in the interface description block or by means of a
> > sort of marker block) that tells the number of packets/bytes,
> > timestamps... In my opinion, this is one of the main uses of sections.
> > Any other opinion about that?
> --LEGO--
> You might use hsbs as "chunk markers"; I find that more than OK, as long
> as the hsb contains the offset up to the next one (if any) [see my
> note about padding below].

Yes, it does. It contains the length of the section (or 0xFF...., if 
unknown) to allow easy jumps between sections.
Quoting from the pcap-ng spec:

- Section Length: 64-bit value specifying the length in bytes of the 
following section, excluding the Section Header Block itself. This field can 
be used to skip the section, for faster navigation inside large files.
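
To make the "skip the section" idea concrete, here is a minimal sketch in C. 
It is not ntar code: next_section_offset() and the rd32le()/rd64le() helpers 
are hypothetical names, the trace is assumed to be fully loaded in memory and 
written little-endian, and the SHB layout (type, total length, byte-order 
magic, major/minor version, Section Length at byte offset 16) is my reading 
of the draft, so please check it against the spec before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

#define SHB_TYPE            0x0A0D0D0Au
#define SHB_MIN_LEN         28u          /* SHB without options (assumed layout) */
#define SECTION_LEN_UNKNOWN UINT64_MAX   /* 0xFFFFFFFFFFFFFFFF */

/* Read little-endian integers byte by byte, independent of host order. */
static uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static uint64_t rd64le(const uint8_t *p)
{
    return (uint64_t)rd32le(p) | (uint64_t)rd32le(p + 4) << 32;
}

/*
 * Given the offset of an SHB inside buf, store in *next the offset of the
 * first byte after the section (the next SHB, or the end of the data).
 * Returns 0 on success, -1 if the block is not a valid SHB, the section
 * length is unknown, or the lengths do not fit in the buffer.
 */
int next_section_offset(const uint8_t *buf, size_t len,
                        size_t shb_off, size_t *next)
{
    if (shb_off > len || len - shb_off < SHB_MIN_LEN)
        return -1;

    const uint8_t *shb = buf + shb_off;
    if (rd32le(shb) != SHB_TYPE)
        return -1;

    uint32_t block_len   = rd32le(shb + 4);   /* Block Total Length */
    uint64_t section_len = rd64le(shb + 16);  /* Section Length */

    if (section_len == SECTION_LEN_UNKNOWN)
        return -1;  /* cannot skip, caller must walk block by block */
    if (block_len < SHB_MIN_LEN || block_len > len - shb_off)
        return -1;
    if (section_len > len - shb_off - block_len)
        return -1;

    /* Section Length excludes the SHB itself, so add both. */
    *next = shb_off + block_len + (size_t)section_len;
    return 0;
}
```

A reader that gets -1 because the length is unknown still has to fall back 
to walking the section one block at a time.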

>
> That means that we will not need a fixed chunk size for the whole
> file, but we could still skip from one hsb to the next. As long as
> enough info to know where we are and where to go is conveyed in the
> hsb, that's OK.
>
> The problem I did not consider up to now is: while skipping, how do I
> find where the interface descriptions are?
>
> Once we have sought forward several chunks, we will most probably have
> skipped some. This needs a solution.
>
> We could replicate all the known interface descriptions after every
> HSB.  I don't like it... any other ideas?

The scope of an IDB is the section; the reason is quite simple: a section 
represents the minimal trace file (i.e. a file is formed by at least one 
section), and a trace file can contain multiple sections, either all written 
by the same tool, or possibly obtained by *physically* cat'ting multiple 
trace files together (i.e. "cat trace1.ntar trace2.ntar > trace3.ntar"). 
Moreover the IDB is mandatory in every section (or better, at least one IDB 
should be present in the section *before* any packet that refers to this 
IDB), but the same section can contain multiple IDBs.

This is documented in the spec here:

https://www.winpcap.org/ntar/draft/PCAP-DumpFileFormat.html#sectionidb
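
As a rough illustration of what "the scope of an IDB is the section" means 
for a reader, here is a small C sketch. The names section_state, on_block() 
and interface_known() are hypothetical (they are not ntar APIs), and I am 
assuming that a packet refers to an interface by the zero-based position of 
its IDB inside the current section.

```c
#include <stdbool.h>
#include <stdint.h>

#define BT_SHB 0x0A0D0D0Au  /* Section Header Block */
#define BT_IDB 0x00000001u  /* Interface Description Block */

/* Per-section reader state: how many IDBs this section has shown so far. */
typedef struct {
    uint32_t idb_count;
} section_state;

/* Feed every block type to the state as the reader walks the file. */
void on_block(section_state *s, uint32_t block_type)
{
    if (block_type == BT_SHB)
        s->idb_count = 0;   /* new section: earlier IDBs are out of scope */
    else if (block_type == BT_IDB)
        s->idb_count++;     /* one more interface valid from here on */
}

/* A packet may only refer to an IDB already seen in the same section. */
bool interface_known(const section_state *s, uint32_t interface_id)
{
    return interface_id < s->idb_count;
}
```

With this model a reader that jumps straight to a later SHB only needs the 
IDBs that follow that SHB, never the ones from the sections it skipped.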

This makes me think: what happens if you create a section that does NOT 
contain packets?

>
> --LEGO--
>
> > Moreover, the idea of having a padding block is troublesome: by
> > definition, the smallest block in pcap-ng is 12 bytes. If we create a
> > padding block because we want to support these markers at fixed
> > locations, we need to be *very* careful about where to put these padding
> > blocks (i.e. we need to take care of having either 0 bytes left before a
> > new marker, or >= 12 bytes, otherwise we are screwed).
> --LEGO--
>
> That's just a little more waste, the solution is somewhat trivial.
>
> if (packet_len + 12 > remaining && packet_len != remaining)  {
>     write_padding(remaining);
>     write_hsb();
> }
>
> write_packet();
>
> BTW. I believe that zero padding would be a better solution.
>
> --LEGO--
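
The test in the snippet above can be written as a small, overflow-safe C 
predicate. This is only a sketch of the decision, not of the writer: 
must_start_new_chunk() and MIN_BLOCK_LEN are hypothetical names, "remaining" 
is the number of bytes left before the next marker position, and packets 
larger than a whole chunk are not handled.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIN_BLOCK_LEN 12u  /* smallest legal pcap-ng block, per the thread */

/*
 * Returns true when the packet block must go into the next chunk, i.e. the
 * writer has to pad the rest of this chunk and emit a new marker first.
 * Same condition as:
 *   packet_len + 12 > remaining && packet_len != remaining
 * but written without the addition, so it cannot wrap around.
 */
bool must_start_new_chunk(uint64_t remaining, uint64_t packet_len)
{
    if (packet_len == remaining)
        return false;               /* exact fit: 0 bytes left to pad */
    if (packet_len > remaining)
        return true;                /* does not fit at all */
    /* Fits, but the leftover must still be able to hold a padding block. */
    return remaining - packet_len < MIN_BLOCK_LEN;
}
```

If the chunk payload is at least 12 bytes, every accepted write leaves either 
0 or >= 12 bytes, so the padding written when the predicate is true is always 
a legal block (or empty).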
> > Finally, as I said before, having these markers at fixed locations is a
> > real pain in the neck when you have a file with multiple sections *and*
> > the SHB does not contain the length of the section.
> ---LEGO---
>  The SHB is the right place for the chunk length option.
>
> It's just a matter of whether it could, should or MUST be there.

>
> I believe it must be there. Even more, I would say its value MUST be
> between 2Mb and 16Mb.

Well, it's there by spec :-).
Regarding the "could/should/must be there" problem, we do have the section 
length, *but* its value can be "unknown" (0xFFF...), and this is needed in 
order to support pipes (i.e. when you cannot seek backwards and write the 
section length when you close a section). So I vote for "it should be there; 
if we create the trace file through a pipe, it's *strongly* suggested to 
reprocess the trace file and write the correct section lengths in the SHBs".
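
For a seekable file, the "write the correct section length afterwards" step 
could look roughly like the C sketch below. It is not ntar code: 
patch_section_length() is a hypothetical helper, the file is assumed to be 
little-endian and opened for update, and the Section Length field is assumed 
to sit at byte offset 16 inside the SHB. Note that plain fseek() takes a 
long, so where long is 32 bits this cannot address large files.

```c
#include <stdint.h>
#include <stdio.h>

#define SHB_SECTION_LEN_FIELD 16  /* assumed offset of Section Length in the SHB */

/*
 * Overwrite the 64-bit Section Length of the SHB located at shb_off, then
 * restore the previous file position so the caller can keep appending.
 * Returns 0 on success, -1 on any I/O error.
 */
int patch_section_length(FILE *f, long shb_off, uint64_t section_len)
{
    uint8_t le[8];
    for (int i = 0; i < 8; i++)
        le[i] = (uint8_t)(section_len >> (8 * i));  /* little-endian bytes */

    long saved = ftell(f);
    if (saved < 0)
        return -1;
    if (fseek(f, shb_off + SHB_SECTION_LEN_FIELD, SEEK_SET) != 0)
        return -1;
    if (fwrite(le, 1, sizeof le, f) != sizeof le)
        return -1;
    if (fseek(f, saved, SEEK_SET) != 0)
        return -1;
    return 0;
}
```

On a pipe the fseek() call simply fails, which is exactly the case where the 
length has to stay "unknown" until the file is reprocessed.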

Have a nice day
GV

> ---LEGO---
>
> > --GV--
> >
> > Another point is not to have to seek backwards to fill in fields while
> > dumping. Nor do I like the idea of having to keep a whole chunk in
> > memory, or of having to keep more than a few very essential state
> > variables.
> >
> > --GV--
> > I totally agree with you.
> > --GV--
> >
> > I think either the file header or the HSB should tell which chunk-size
> > we are using.
> > The whole dump-file should use one and only one chunk-size.
> >
> > --GV--
> > No. A file contains multiple sections (possibly coming from multiple
> > cat'ted files). So *if* we really want to use this chunks approach, the
> > chunk size should be saved in the SHB.
> > --GV--
> --LEGO--
>
>    I volunteer to write an ntar-cat that either takes the chunk_size of
> the first hsb or allows the user to set it.

Uhm... what do you mean? Two trace files can be binary cat'ted, and the 
result is still a valid trace file. Do you mean a utility able to 
backpatch/rewrite the trace file and organize packets in sections of equal 
chunk_size? This could be interesting, and *will* be easily done with the 
ntar APIs (I mean file rewriting; in-place patching is not possible at the 
moment, it's on my todo list). "will" because this feature needs backwards 
seeking in ntar, which is currently disabled (due to the hell of having a 
portable fseek working with large files); I'm working on this stuff, and I 
hope to have a new version of ntar on the website this weekend.

>
> --LEGO--
> >
> > On 6/27/05, Gianluca Varenni <gianluca.varenni at gmail.com> wrote:
> > > In general, I prefer your solution over Ronnie's one: the basic idea
> > > of pcap-ng was to avoid too much nesting in the file, so at the
> > > moment we basically have two levels, sections and blocks. In the
> > > future it's possible/probable that we will have some nesting,
> > > especially to support compressed and ciphered blocks.
> >
> > I like it that way. Nesting stuff would mean either to seek back to
> > write the length or to keep in memory a whole meta-record before
> > writing. I do not like either solution.
> >
> > [snip]
> > > In my opinion the marker should contain the following information:
> > > - number of packets up to the next "marker block"
> > This is a forward reference
> > > - number of bytes up to the next "marker block"
> > my point is we should know this before even reading the first
> > marker-record.
> > > - timestamp of the first and last packet up to the next "marker block"
> > I would put those before the marker record.
> > > - offset to the next "marker block"
> > In my scenario we know it beforehand
> > > - other??
> >