[Winpcap-bugs] Potential waste of NonPaged kernel buffer space on Hyper-Threading machines in the NPF Driver?

Gianluca Varenni gianluca.varenni at cacetech.com
Tue May 26 13:32:26 PDT 2009

  ----- Original Message ----- 
  From: Gadi Elishayov 
  To: winpcap-bugs at winpcap.org 
  Sent: Saturday, May 23, 2009 12:33 PM
  Subject: [Winpcap-bugs] Potential waste of NonPaged kernel buffer space on Hyper-Threading machines in the NPF Driver?

  I've conducted some tests on a Hyper-Threading enabled machine with 2 physical CPUs, and I found out that NdisSystemProcessorCount() returns 4 (that is due to the virtual CPUs). Also, to my understanding, with HT technology only one CPU out of each pair of virtual CPUs will ever handle interrupts. The code in the NPF driver uses this value to divide its per-CPU buffer, but given that only half of the processors handle interrupts, the buffer space is wasted on the CPUs that don't handle interrupts. Am I wrong?

Well, it depends a lot on where the ProtocolReceive(Packet) handlers run. In most cases such routines run in the context of the DPC routine of the miniport driver. In any case they do not run in the interrupt handler. But unless the ISR of the miniport decides to schedule the DPC on a different processor (and I don't even know if that is possible), the DPC runs on the same CPU where the ISR was executed. So, yes, in practice only 2 out of 4 buffers are used.

  Is there a way to fix this besides making a single buffer for the entire machine?

I don't think there is any way to know such information. The idea would be to get the number of physical cores on the machine and use that as the number of buffers. Then, given the virtual CPU#, get the physical CPU# (for example, vCPU#0 and 1 are pCPU#0, and vCPU#2 and 3 are pCPU#1). I don't remember having seen any API for that.

  Maybe some good way to know which CPUs will handle interrupts, drain the DPC queue?

Not really, it depends a lot on how the miniport is written. Also, the miniport behind WinPcap could be a virtual one or the upper half of an intermediate NDIS driver, so the concept of DPC in that case is no longer that clear.

In reality, I'm not even sure that such a per-CPU mechanism offers any benefit. The problem is that even if at the moment we use multiple kernel buffers, in the end the bottleneck is bringing the packets to user mode, because the packets from the various per-CPU buffers are aggregated into one single "stream" to the applications.

Does it make any sense to you?

Have a nice day
GV
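
For illustration only, the arithmetic discussed in this message can be sketched in plain C. This is not NPF driver code. The helper names (physical_core_of, slice_size, idle_bytes) are hypothetical, and the mapping assumes that sibling virtual CPUs are numbered adjacently, as in the vCPU#0/1 -> pCPU#0 example, which is not guaranteed on a real machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: map a virtual (logical) CPU number to a physical
 * CPU index. ASSUMPTION: siblings are numbered adjacently, so with two
 * siblings per physical CPU, vCPU 0 and 1 map to pCPU 0, and vCPU 2 and 3
 * map to pCPU 1. Real systems may enumerate processors differently. */
unsigned physical_core_of(unsigned logical_cpu, unsigned siblings_per_core)
{
    assert(siblings_per_core > 0);
    return logical_cpu / siblings_per_core;
}

/* Size of each slice when one buffer is split evenly into n_slices. */
size_t slice_size(size_t total_bytes, unsigned n_slices)
{
    assert(n_slices > 0);
    return total_bytes / n_slices;
}

/* Bytes that never receive packets when only `active` of the `n_slices`
 * per-CPU slices are ever written to. */
size_t idle_bytes(size_t total_bytes, unsigned n_slices, unsigned active)
{
    assert(active <= n_slices);
    return slice_size(total_bytes, n_slices) * (n_slices - active);
}
```

Under these assumptions, splitting a buffer four ways when only two CPUs ever run the receive path leaves half of the buffer idle, while indexing the slices by physical_core_of(cpu, 2) and splitting two ways leaves none idle.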

