[ntar-workers] On Markers and Bookmarks

Loris Degioanni loris.degioanni at gmail.com
Fri Jul 8 05:42:11 GMT 2005


Christian Kreibich wrote:

> Hey Gianluca,
> 
> On Mon, 2005-07-04 at 19:21 -0700, Gianluca Varenni wrote:
> 
>>>So what's the verdict on the PacketsBlock/GoPBlock idea?
>>
>>My verdict would be to use markers and sections, and not make use of 
>>GoPBlocks. Actually, I'm still thinking of what is the best approach. In 
>>general GoPBlocks introduce a real hierarchy of blocks in the file format, 
>>and a hierarchical file format was discarded last year when the pcap-ng was 
>>first discussed.
> 
> 
> mhmm unfortunately I cannot recall that discussion any more. Note though
> that I don't mean arbitrary recursive nesting -- just packet blocks
> contained in a grouping block. This was on the assumption that it might
> be useful to have a grouping mechanism for a bunch of packets on a finer
> granularity than sections. So in a sense a GoPBlock is similar to a
> marker block that is valid until the next marker, with GoPBlocks having
> the added benefit that you can potentially have end-of-GoP content more
> naturally.
> 
> 
>>I think that the idea of the indexing block is worth more discussion, maybe 
>>before trying to implement it; moreover, I'm still working quite heavily on 
>>the library to clean it and offer some missing features that are needed for 
>>these new block types (in particular, something to allow random access to 
>>the blocks of a file).
> 
> 
> Isn't that just the indexing we're discussing here (including Chema's
> navigation blocks proposal)?

No, I think what Gianluca is talking about is really going back and 
forth in the file: fseek and friends. Implementing them in a portable 
and 64-bit-friendly way (since the specification requires 64-bit 
support) is really complicated. Moreover, fseek seems to be very slow 
(slower than reading everything) on several platforms.

> 
>>Some random issues/ideas/thoughts that came to my mind in the last days:
>>- do we really need markers? Can the indexing block point to packet 
>>blocks?
> 
> 
> I believe yes, as long as the indexing structure in the end just points
> to file offsets (relative to section starts), it can point to pretty
> much anything.
> 
> 
>>- we should remember that two ntar files can be concatenated using "cat". As 
>>a consequence, the file offsets should be relative.
> 
> 
> Yep.
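
The concatenation argument above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the 4-byte `SECTION_MAGIC` tag, the length-prefixed payload layout, and the helper names are hypothetical and are not taken from the ntar or pcap-ng specifications. It only shows that offsets stored relative to the section start stay valid after two files are joined.

```python
SECTION_MAGIC = b"SECT"  # hypothetical section start tag, not the real block type


def build_section(payloads):
    """Return the section bytes and the section-relative offset of each payload."""
    body = bytearray(SECTION_MAGIC)
    offsets = []
    for payload in payloads:
        offsets.append(len(body))  # offset counted from the section start
        body += len(payload).to_bytes(4, "big") + payload
    return bytes(body), offsets


def read_at(data, section_start, rel_offset):
    """Read one length-prefixed payload at section_start + rel_offset."""
    pos = section_start + rel_offset
    length = int.from_bytes(data[pos:pos + 4], "big")
    return data[pos + 4:pos + 4 + length]


# Two independently written "files", joined the way `cat` would join them.
first, first_offsets = build_section([b"one", b"two"])
second, second_offsets = build_section([b"three"])
joined = first + second
```

After the join, the second section starts at `len(first)`, and `read_at(joined, len(first), second_offsets[0])` still finds its payload. An absolute file offset recorded when `second` was written would point into the wrong place.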
> 
> 
>>- can the indexing block index blocks in multiple sections? There can be byte 
>>order and interface id issues.
> 
> 
> I'd really really really keep sections self-contained.
> 
> 
>>- Is the timestamp information to seek in the file enough for the majority 
>>of apps?
> 
> 
> Imho, yes. If the trace gets big, time becomes the primary mechanism for
> selections. Even if you're indexing semantically different items (say
> markers identifying the beginning of malicious flows, etc) you'll likely
> still use time to navigate among multiple instances. I'd love to hear
> counterexamples though.

I like the idea of using time to navigate inside traces; it could have 
many cool applications. The other important indexing method, in my 
opinion, is the number of packets: you probably want to know how many 
packets you skip if you jump to the next GoPBlock. For example, a tool 
like Ethereal could reserve the right amount of space in its packet 
list with a quick scan, without even reading the packets.
By the way, would it make sense to report the number of packets of a 
GoPBlock in the following GoPBlock marker? This would save us from 
having to go back in the file.
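
A rough sketch of that last idea follows, under stated assumptions: the `Marker` and `Packet` classes, their fields, and both helper functions are hypothetical and do not come from the specification. The in-memory list stands in for the file, so the sketch does not model how a reader would skip over packet bytes. It only shows that a writer working strictly front to back can report each group's size in the marker that follows it, and that a reader can collect the sizes from the markers alone.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Packet:
    timestamp: int
    data: bytes


@dataclass
class Marker:
    """Hypothetical group marker; not a block type from the spec."""
    timestamp: int          # start time of the group that follows
    prev_packet_count: int  # number of packets in the group that just ended


Block = Union[Marker, Packet]


def write_groups(groups: List[List[Packet]]) -> List[Block]:
    """Emit blocks in order; no earlier block is ever revisited."""
    blocks: List[Block] = []
    prev_count = 0
    last_ts = 0
    for group in groups:
        start_ts = group[0].timestamp if group else last_ts
        blocks.append(Marker(start_ts, prev_count))
        blocks.extend(group)
        prev_count = len(group)
        if group:
            last_ts = group[-1].timestamp
    # A closing marker carries the count for the final group.
    blocks.append(Marker(last_ts, prev_count))
    return blocks


def group_sizes(blocks: List[Block]) -> List[int]:
    """Collect group sizes from markers only, ignoring packet contents."""
    counts = [b.prev_packet_count for b in blocks if isinstance(b, Marker)]
    return counts[1:]  # the first marker has no preceding group
```

One consequence of this layout is that the last group needs a closing marker, otherwise its count is never written.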

Christian, if you are still interested in working on this, we could come 
up with a more precise definition, and then write it down in the specs. 
At that point, we would be ready to start the implementation. What do 
you think?

Loris


> 
>>- how is the timestamp represented? pcap-ng does not have a fixed 
>>representation for the timestamp precision: the interface description block 
>>contains a specific option that tells the precision of the timestamps for 
>>the packets captured on that interface. By default it's microseconds.
> 
> 
> I think it doesn't matter too much as long as comparison operations work
> universally.
> 
> Cheers,
> Christian.
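
One possible way to make comparisons work across interfaces with different timestamp precisions is to convert every timestamp to exact rational seconds before comparing. The sketch below assumes that the precision is available as a ticks-per-second value (for example 10**6 for the microsecond default mentioned above); the function names are hypothetical and this is not how the library represents timestamps.

```python
from fractions import Fraction


def to_seconds(ticks: int, ticks_per_second: int) -> Fraction:
    """Convert a raw tick count to exact seconds; no floating-point rounding."""
    return Fraction(ticks, ticks_per_second)


def is_earlier(a_ticks: int, a_res: int, b_ticks: int, b_res: int) -> bool:
    """Compare two timestamps that may use different precisions."""
    return to_seconds(a_ticks, a_res) < to_seconds(b_ticks, b_res)
```

With this, 1500000 ticks at microsecond precision and 1500 ticks at millisecond precision compare as equal, since both are exactly 3/2 seconds.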

