WinDump is the Win32 port of TcpDump, one of the most widely used UNIX network capture and analysis programs. It lets the user capture and view the packets that are transmitted onto the network and received by the network interface card. It runs on Windows 95/98 and on Windows NT. WinDump is written to work primarily with Ethernet networks, although under Windows 95 it can be used without problems on PPP connections. This limitation is mainly due to the underlying layers, i.e. libpcap and the packet driver, whose behavior is not completely network-independent and has been tuned mainly for Ethernet networks.
The goal was to make a clean port that maintains the structure of the original version where possible, while taking care of both capture performance and the cleanliness of the code. WinDump uses the packet capture library (libpcap) for the capture process. This library, originally written for UNIX, implements a set of high-level capture functions and has been ported to Windows. It uses the NDIS packet capture driver, a device driver that interacts with NDIS to capture packets from the network. This driver is system dependent, so there are two versions: vpacket.vxd for Windows 95 and packet.sys for Windows NT. The pcap library and the packet capture driver can be a very powerful base for writing network capture and monitoring applications for Windows. They are also very useful for porting to Windows applications already written for UNIX. For this reason a developer's pack is provided, with all the files and instructions needed to use libpcap and the driver. Furthermore, the project has didactic value: the source code of the entire project is freely available under the GNU General Public License and can be used to learn how a capture application for Windows works.
Note: we will use the term packet even though frame is more accurate, since the capture process is done at the data-link layer and the data-link header is included in the frames received.
To capture the packets transferred on a network, a capture application needs to interact directly with the network hardware. For such a program to run, the operating system must therefore offer a set of capture primitives to communicate with the network adapter and receive data directly from it. The task of these primitives is primarily to capture the packets from the network (hiding the interaction with the network adapter) and to transfer them to the calling programs. This part is heavily system dependent, so it differs considerably between operating systems. The packet capture section of the kernel should be quick and efficient, because it must be able to capture packets on high-speed LANs, possibly under heavy traffic, while limiting packet loss. It should also be general and flexible so that it can be used by different types of applications (analyzers, network monitors, network test applications...).
The capture application receives the packets from the system, interprets and processes them, and presents them to the user in a comprehensible and productive way. It should be easy to use, system independent, modular and expandable, so as to support the largest possible number of protocols and make it easy to add a new one. These features are essential because of the high number of network protocols currently available and the speed at which they change, so completeness and expandability are important in a capture program.
WinDump.exe is only the upper part of a packet capture stack, which is made of a module that runs at kernel level and one that runs at user level. These two modules have different purposes and are independent and isolated from one another. The first runs at ring 0 on Intel-based machines, while the second runs at ring 3 like a normal Windows program. The kernel part is Windows specific and differs considerably between the various Windows flavors. The user-level part is very similar to the UNIX implementation and is the same under Win95 and WinNT. The structure of the capture stack from the network adapter to WinDump is shown in the next figure.
At the lowest level of the structure there is the network adapter. It is used to capture the packets that circulate in the network. During a capture the network adapter usually works in a particular mode (promiscuous mode) that forces it to accept all the packets, not only the ones directed to it.
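The effect of promiscuous mode can be illustrated with a small model of the adapter's address filter. This is only a sketch under simplifying assumptions: the function name nic_accepts is hypothetical, frames are assumed to be Ethernet frames with the destination address in the first 6 bytes, and multicast handling is left out.

```python
BROADCAST = bytes.fromhex("ffffffffffff")
ETHERNET_HEADER_LEN = 14  # destination (6) + source (6) + type (2)


def nic_accepts(frame: bytes, own_mac: bytes, promiscuous: bool) -> bool:
    """Simplified model of the address filter of an Ethernet adapter."""
    if len(frame) < ETHERNET_HEADER_LEN:
        return False  # too short to carry an Ethernet header
    if promiscuous:
        return True  # promiscuous mode: accept every frame seen on the wire
    destination = frame[:6]
    # Normal mode: only frames addressed to this adapter or to everybody.
    return destination == own_mac or destination == BROADCAST
```

In normal mode a frame addressed to another station is dropped by the adapter itself, so no capture software above it can ever see that frame; this is why a capture usually switches the adapter to promiscuous mode.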
The Packet Capture Driver is the lowest-level software module of the capture stack. It is the part that works at kernel level and interacts with the network adapter to obtain the packets. It supplies applications with a set of functions for reading and writing data from the network at the data-link level.
Packet.dll works at user level, but is separate from the WinDump program. It is a dynamic link library that isolates capture programs from the driver, providing a system-independent capture interface. It allows WinDump to run on different Windows operating systems without being recompiled.
The pcap library, or libpcap, is a static library used by the packet capture part of the WinDump program. It uses the services exported by packet.dll and provides WinDump with a higher-level, more powerful capture interface. Notice that it is statically linked with WinDump, i.e. it is part of the WinDump.exe executable file.
The user interface is the topmost part of the WinDump program. It manages the interaction with the user and displays the result of a capture.
We will now describe these modules, their behavior and the architectural choices behind them.
The WinDump program is the topmost part of the capture stack and manages the interaction between the user and the system. It gets the user's input (for example, which packets must be captured and how they must be shown to the user) from the command line and prints the results on the screen. From the user's viewpoint, the WinDump executable running on a Windows platform is identical to the TcpDump executable running on a UNIX workstation. This means that the two programs have almost the same input parameters and the same output format.
TcpDump on UNIX relies on a library for the packet capture process, called the Packet Capture library, pcap library, or libpcap, which offers a system-independent interface for user-level packet capture. Libpcap provides a set of functions, independent of the hardware and the operating system, that an application can use to capture packets from a network. It is important to remember that libpcap is a general-purpose capture library: in the UNIX world it is used not only by TcpDump but also by other network tools and applications. TcpDump does not interact directly with the hardware; it uses the functions exported by libpcap to capture packets, set packet filters and communicate with the network adapter. It is therefore essentially system-independent and can easily be ported to any system on which libpcap works. For this reason the WinDump project includes a full port of libpcap to the Win32 platform. Furthermore, the pcap library is not limited to use with WinDump: we think it can be very useful to people who want to port network monitors and analyzers from the UNIX world, and it can also be a powerful base for creating new network tools for the Win32 environment. Therefore, we tried to maintain the structure of the original version in our implementation. We modified only the section of libpcap that communicates with the kernel, implementing the interaction with the NDIS packet capture driver. An important characteristic of libpcap for Win32 is that a single version works on both Windows 95 and Windows NT. This is achieved by placing a dynamic link library, called packet.dll, between libpcap and the capture driver, so that the system calls to the driver are the same in the various Windows environments. In this way, a network tool that uses libpcap works on Windows 95 and Windows NT without any modification.
Once a working implementation of libpcap for Win32 was ready, porting TcpDump was not too difficult. Only a few differences were introduced in WinDump, with the addition of a pair of new switches:
The basic role of the kernel part of the capture stack is to take the link-layer packets from the network and transfer them unmodified to the application level. We implemented it as a kernel driver (packet.sys) under Windows NT and as a Virtual Device Driver (vpacket.vxd) under Windows 95. Applications access the capture driver through read and write primitives and can treat the network adapter much like a normal file, reading from it the data that comes from the network or writing data to it. The capture driver can be used by any Win32 application that needs to capture packets; it is not limited to use with WinDump or Analyzer. The interaction with the packet capture driver usually passes through the packet.dll dynamic link library. The DLL implements a set of functions that simplify communication with the driver, so that applications do not have to use system calls or IOCTLs directly.
The packet capture driver interacts with the network adapters' device drivers through NDIS, which is part of the Win32 network code. NDIS is responsible for managing the various network adapters and for the communication between the adapters and the software components that implement the protocols.
A basic network capture driver can be quite simple. It only needs to read the packets from the network driver and copy them to the application. However, to obtain acceptable performance, substantial improvements have to be made to this basic structure. The most important are:
The implementation of these features and the architecture of the driver were inspired by the BSD Packet Filter (BPF) of the UNIX kernel, which is briefly described in the next paragraph.
BPF, or BSD Packet Filter, described in [McCanne and Jacobson 1993], is a kernel architecture for packet capture proposed by Steven McCanne and Van Jacobson. It was created to work in the UNIX kernel, and it exports services used by TcpDump and many other UNIX network tools.
BPF is essentially a device driver that UNIX applications can use to read packets from the network through the network adapter in a highly optimized way. It is an anomalous driver because it does not have direct control of the network adapter: the adapter's device driver itself calls BPF and hands it the packets.
It has two main components: the network tap and the packet filter.
The network tap is a callback function that is part of the BPF code but is not executed directly by BPF. It is invoked by the network adapter's device driver when a new packet arrives. The adapter's driver MUST call the network tap for every packet, otherwise BPF will not work on that adapter.
The network tap collects copies of packets from the network device drivers and delivers them to listening applications. The incoming packets, if accepted by the filter, are put in a buffer and are passed to the application when the buffer is full.
The filter decides whether a packet should be accepted and copied to the listening application. Most applications using BPF reject far more packets than they accept, so good performance of the packet filter is critical to good overall performance. The simplest packet filter is a function with a boolean output that is applied to a packet. If the value of the function is true, the kernel copies the packet to the application; if it is false, the packet is ignored. The BPF packet filter is somewhat more complex, because it determines not only whether the packet should be kept, but also the number of bytes to keep. This feature is very useful if the capture application does not need the whole packet. For example, a capture application is often interested only in the headers and not in the data of a packet. Such an application can set a filter that accepts only the first bytes of the packet, which contain the headers. This speeds up the capture process (because it decreases the number of bytes to copy from the driver to the application) and decreases the probability of packet loss (because it leaves more free space in the buffers that hold the packets).
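The difference between a boolean filter and a filter that also returns a byte count can be sketched in a few lines. This is an illustration, not the BPF implementation: the names boolean_filter, snap_filter and capture are hypothetical, and HEADER_BYTES assumes Ethernet, IP and TCP headers without options.

```python
ETHERTYPE_IP = 0x0800
HEADER_BYTES = 54  # assumption: 14 (Ethernet) + 20 (IP) + 20 (TCP), no options


def boolean_filter(packet: bytes) -> bool:
    """Simplest filter: true if the Ethernet type field says 'IP'."""
    return len(packet) >= 14 and int.from_bytes(packet[12:14], "big") == ETHERTYPE_IP


def snap_filter(packet: bytes) -> int:
    """BPF-style filter: number of bytes to keep, 0 meaning 'reject'."""
    if not boolean_filter(packet):
        return 0
    return min(len(packet), HEADER_BYTES)


def capture(packets):
    """Copy only the accepted part of each accepted packet."""
    kept = []
    for packet in packets:
        keep = snap_filter(packet)
        if keep:
            kept.append(packet[:keep])
    return kept
```

With a filter of this kind, a 1514-byte IP frame costs a 54-byte copy instead of a 1514-byte one, and a rejected frame costs no copy at all.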
Historically there have been two approaches to the filter abstraction: a boolean expression tree and a directed acyclic control flow graph, or CFG. More details about them can be found in [McCanne and Jacobson 1993]. These two models of filtering are computationally equivalent, i.e., any filter that can be expressed in one can be expressed in the other. However, their implementations are very different: the tree model maps naturally into code for a stack machine, while the CFG model maps naturally into code for a register machine. The BPF creators chose the CFG model because it can be implemented more efficiently on modern computers, which are register based. The BPF pseudo-machine is a virtual processor, and its abstraction consists of an accumulator, an index register (x), a scratch memory store, and an implicit program counter. It can execute load and store instructions, branches, arithmetic instructions and so on. Therefore, a UNIX application like TcpDump that wants to set a filter on the incoming packets builds a program for the pseudo-machine and sends it to BPF with an IOCTL call. BPF executes the filter program on every packet and discards the ones that do not satisfy the filter. The BPF pseudo-machine has some nice features:
The next figure illustrates BPF's interface with the rest of the system.
The tap feeds the packet to each participating application's filter. This user-defined filter decides whether a packet is to be accepted and how many bytes of each packet should be saved. Notice that the filter is applied to the packet while it is still in the link-level driver's memory, without copying it. This greatly improves performance and memory usage, because a packet that is not accepted is discarded before any copy is made. If the filter accepts the packet, the tap copies the number of bytes specified by the filter from the link-level driver's memory to the store buffer associated with that filter. At this point the interface's device driver regains control and normal protocol processing proceeds.
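The way a filter program for the pseudo-machine drives the tap can be sketched with a toy interpreter. This is a deliberately reduced model, not the real BPF instruction encoding: it has only an accumulator and an implicit program counter (the index register and the scratch memory are left out), and the mnemonic tuples, run_filter and tap are hypothetical names chosen for this illustration.

```python
def run_filter(program, packet: bytes) -> int:
    """Run a toy filter program; return the number of bytes to keep (0 = reject)."""
    accumulator = 0
    pc = 0  # implicit program counter
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "ldh":  # load a 16-bit big-endian value at absolute offset k
            k = args[0]
            if k + 2 > len(packet):
                return 0  # reading past the end of the packet rejects it
            accumulator = int.from_bytes(packet[k:k + 2], "big")
        elif op == "ldb":  # load one byte at absolute offset k
            k = args[0]
            if k + 1 > len(packet):
                return 0
            accumulator = packet[k]
        elif op == "jeq":  # skip jt instructions if equal, jf otherwise
            k, jt, jf = args
            pc += jt if accumulator == k else jf
        elif op == "ret":  # end of program: k is the byte count to keep
            return args[0]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return 0


# Accept TCP over IP on Ethernet and keep at most 54 bytes of each packet.
TCP_OVER_IP = [
    ("ldh", 12),            # Ethernet type field
    ("jeq", 0x0800, 0, 3),  # not IP: jump to the final 'ret 0'
    ("ldb", 23),            # IP protocol field (14 + 9)
    ("jeq", 6, 0, 1),       # not TCP: jump to the final 'ret 0'
    ("ret", 54),
    ("ret", 0),
]


def tap(packet: bytes, program, store_buffer: list) -> None:
    """Filter the packet in place and copy only the accepted bytes."""
    keep = run_filter(program, packet)
    if keep:
        store_buffer.append(packet[:keep])
```

Because every jump is a forward skip, a program of this kind always terminates, and a rejected packet never reaches the copy in tap.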
The process performs a read system call to receive packets from BPF. When the hold buffer is full, BPF copies it to the process's memory and wakes up the process. Since the process might want to look at every packet on a network, and the time between packets can be only a few microseconds, it is not possible to do one read system call per packet: BPF must collect the data from several packets and return it as a unit when the monitoring application does a read. To maintain packet boundaries, BPF encapsulates the captured data from each packet with a header that includes a time stamp, length, and offsets for data alignment.
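The packing of several packets into one buffer, each preceded by its own header, can be sketched as follows. The header layout used here (seconds, microseconds, captured length, original length, header length, padded to 20 bytes) and the 4-byte alignment are assumptions made for the example, and the function names are hypothetical; the real layout depends on the system.

```python
import struct

# Assumed header: sec, usec, captured length, original length, header length.
HEADER = struct.Struct("<IIIIH2x")
ALIGNMENT = 4


def word_align(n: int) -> int:
    """Round n up to the next multiple of ALIGNMENT."""
    return (n + ALIGNMENT - 1) & ~(ALIGNMENT - 1)


def append_packet(buffer: bytearray, data: bytes, snaplen: int, sec: int, usec: int) -> None:
    """Kernel side: add one header-prefixed, padded record to the buffer."""
    captured = data[:snaplen]
    buffer += HEADER.pack(sec, usec, len(captured), len(data), HEADER.size)
    buffer += captured
    buffer += bytes(word_align(len(buffer)) - len(buffer))  # alignment padding


def read_packets(buffer: bytes):
    """Application side: split the result of one read into packets."""
    packets = []
    offset = 0
    while offset + HEADER.size <= len(buffer):
        sec, usec, caplen, datalen, hdrlen = HEADER.unpack_from(buffer, offset)
        start = offset + hdrlen
        packets.append((sec, usec, datalen, bytes(buffer[start:start + caplen])))
        offset += word_align(hdrlen + caplen)  # jump to the next record
    return packets
```

A single read then returns many packets, and the application walks the buffer record by record using the lengths stored in each header.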
Notice that not all UNIX versions have BPF (i.e. filtering and buffering capabilities in the kernel), but the pcap library can compensate for this. It is in fact able to work without a kernel buffer, and to filter the packets in a BPF-compatible way at user level when used on a system without the BPF filter machine in the kernel. This solution was adopted by the first release of the NDIS packet driver. It is easier to implement, but has limited performance, mainly for two reasons:
These reasons led us to implement the filtering process and the buffering of the packets in the packet capture driver rather than in WinDump.
NDIS (Network Driver Interface Specification) is a set of specifications that defines the communication between a network adapter (or, more precisely, the driver that manages it) and the software that implements the various protocols. The main purpose of NDIS is to form a wrapper that allows network drivers to send and receive packets on a LAN or WAN without having to care about the particular adapter or the particular operating system.
NDIS supports three types of network drivers:
The next figure shows a sample NDIS structure with two capture stacks on the same network adapter: one with the NIC driver and a protocol driver, the other with the NIC driver, an intermediate driver and a protocol driver.
The packet capture driver needs to communicate both with the network drivers (to get data from the net) and with user-level applications (to provide them with the packets), so it is implemented in the NDIS structure as a protocol driver. This makes it independent of the network hardware, so in principle it can work with all the network interfaces supported by Windows. Notice however that at the moment the packet capture driver works only on Ethernet adapters and on some WAN connections, due to limits imposed by the architecture of the driver and of the filter. Notice also that a WAN connection is usually seen by the protocol drivers as an Ethernet NIC, and every received packet has a fake Ethernet header created by NDIS. This allows protocol drivers written for Ethernet to work on WAN connections without any change, but it also implies that specific packets like PPP NCP-LCP are not seen by the protocol drivers, because the PPP connection is virtualized. This means that the packet driver cannot capture this kind of packet.
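Since a protocol driver always sees an Ethernet header, whether real or created by NDIS for a WAN connection, the same parsing code can be applied to every received frame. The following is a minimal sketch of that parsing; parse_ethernet is a hypothetical helper, and the standard 14-byte Ethernet II layout is assumed.

```python
import struct

ETHERNET_HEADER = struct.Struct("!6s6sH")  # destination, source, type


def format_mac(address: bytes) -> str:
    """Render a 6-byte address as colon-separated hexadecimal."""
    return ":".join(f"{byte:02x}" for byte in address)


def parse_ethernet(frame: bytes):
    """Split a frame into (destination, source, ethertype, payload)."""
    if len(frame) < ETHERNET_HEADER.size:
        raise ValueError("frame shorter than an Ethernet header")
    destination, source, ethertype = ETHERNET_HEADER.unpack_from(frame)
    return (format_mac(destination), format_mac(source),
            ethertype, frame[ETHERNET_HEADER.size:])
```

Code of this kind cannot tell a real header from a fake one, which is also why PPP-specific traffic that never receives such a header stays invisible to it.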
Notice that the various Win32 operating systems have different versions of NDIS: Windows 95 has NDIS 3.0, while Windows NT has NDIS 4.0. NDIS 4 is a superset of NDIS 3, so a driver written for NDIS 3 usually also works with NDIS 4. The packet capture driver is written for NDIS 3, but also works with the more recent versions of NDIS. This means that the interaction between the driver and NDIS is the same under Windows 95/98 and under Windows 2000.
The next figure shows the position of the packet capture driver in the Win32 architecture.
A protocol driver that communicates with lower level NDIS drivers uses NDIS-provided functions. For instance, a protocol driver must call NdisSend or NdisSendPackets to send a packet or packets to a lower level NDIS driver.
The lower level drivers, on the other hand, communicate with the protocol driver in an asynchronous way. They call particular functions of the protocol driver to pass the data from the network, indicate events, and so on.
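This asynchronous style, in which the lower-level driver calls into the protocol driver rather than being polled by it, can be modelled with plain callbacks. The classes and method names below are hypothetical and do not reproduce the NDIS interface; the handler is assumed to receive the first bytes of a frame together with the frame's total size.

```python
class LowerDriver:
    """Toy stand-in for a lower-level driver; not the NDIS API."""

    def __init__(self, lookahead_size: int = 64):
        self.lookahead_size = lookahead_size
        self.receive_handlers = []

    def bind(self, receive_handler) -> None:
        """Register a protocol driver's receive callback."""
        self.receive_handlers.append(receive_handler)

    def frame_arrived(self, frame: bytes) -> None:
        """Indicate a new frame to every bound protocol driver."""
        lookahead = frame[:self.lookahead_size]
        for handler in self.receive_handlers:
            handler(lookahead, len(frame))


class CaptureProtocol:
    """Toy protocol driver that records what it was told about each frame."""

    def __init__(self):
        self.seen = []

    def on_receive(self, lookahead: bytes, total_size: int) -> None:
        self.seen.append((len(lookahead), total_size))
```

The protocol driver never asks for data; it only reacts when the lower driver invokes the callback it registered, and several protocol drivers can be bound to the same adapter.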
The lower-level driver indicates the arrival of a new packet by calling a callback function of the protocol driver, passing it a pointer to a lookahead buffer, the size of that buffer, and the total size of the received packet. The name of this callback function in the packet capture driver is Packet_tap. The behavior of the packet driver is however quite different from that of a standard protocol driver. In fact:
Note that in the UNIX implementation BPF is called before the protocol stack, directly by the network interface's driver. This is not possible for the packet capture driver, which is itself part of the protocol stack. For this reason, for example, the packet capture driver is not able to sniff PPP-specific packets, because it does not work at hardware level but on top of NDIS. Taking precedence over NDIS would require changes in the kernel or in the NIC drivers, which is not possible in Windows.