[ntar-workers] Some perf test results on NTAR.

Gianluca Varenni gianluca.varenni at gmail.com
Tue Jul 12 06:31:47 GMT 2005


Hello everyone.

Over the weekend I ran some tests to get a rough idea of the performance
of ntar on windows and linux, and to check the issue of "slow" seeks that
Loris and I were discussing in some other threads on the mailing list.

Basically, I used two tests (test017 and test018, included in the latest
NTAR release I've put on the website) that respectively create synthetic
traces and read traces (the read test only reads the type of each block;
the block data are *not* retrieved).

As you can see from the data below, I used a fairly powerful DELL server
with fast SCSI disks. All the traces were saved to/read from a freshly
formatted HD (with NTFS or ext2), in order to avoid fragmentation issues.
Moreover, I used some tricks before performing each read test, in order to
empty all the OS and disk caches (the results change a lot if you don't pay
attention to this detail).

The read tests (test018) were run using the standard (vanilla) NTAR library,
and a modified one that does *not* use seeks to jump from one block to the
next (instead, I read all the data of each block into a fake buffer).
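
The difference between the two read variants can be sketched roughly as
follows. This is only an illustration in Python, not NTAR code: the block
layout (a 4-byte type followed by a 4-byte body length) and the names
write_blocks, read_types_with_seek and read_types_without_seek are
assumptions made up for the example.

```python
import io
import struct

HEADER = struct.Struct("<II")  # hypothetical layout: block type, body length


def write_blocks(stream, blocks):
    """Write (type, body) pairs using the hypothetical header layout."""
    for block_type, body in blocks:
        stream.write(HEADER.pack(block_type, len(body)))
        stream.write(body)


def read_types_with_seek(stream):
    """Return every block type, skipping each body with a relative seek."""
    types = []
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of trace
        block_type, length = HEADER.unpack(header)
        types.append(block_type)
        stream.seek(length, io.SEEK_CUR)  # jump over the body
    return types


def read_types_without_seek(stream):
    """Return every block type, skipping each body by reading it."""
    types = []
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of trace
        block_type, length = HEADER.unpack(header)
        types.append(block_type)
        stream.read(length)  # body is read and immediately discarded
    return types


# Small in-memory trace with three blocks (one of them empty).
sample = io.BytesIO()
write_blocks(sample, [(1, b"a" * 64), (2, b"b" * 1518), (1, b"")])
sample.seek(0)
types_seek = read_types_with_seek(sample)
sample.seek(0)
types_read = read_types_without_seek(sample)
```

Both functions visit the same blocks; they differ only in how the body is
skipped, which is the variable the two test018 variants isolate.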

The results are shown below.

Some thoughts about the tests:
- the tests are rather incomplete; it would be interesting to repeat them
  + with other linux filesystems
  + tuning the cluster size of the NTFS filesystems
  + trying RAID 0
  + using another read app, that actually reads the block contents.
- the performance of test017 (dump) is pretty much the same on linux and
  windows.
- considering an ethernet link type, the write performance is about 400Mbps
  (+/-20Mbps) for packets between 64 and 1518 bytes.
- on windows test018 (read) is heavily (badly) affected by seeks, especially
  if the seeks are short. The same tests with seeks disabled show an
  impressive improvement. I think this is mainly due to the buffering used by
  the FILE calls (fwrite/fread/fseek), which are currently used by NTAR. It
  would be very interesting (and it's quite easy to do) to implement the
  read/write/seek NTAR callbacks using the native windows
  ReadFile/WriteFile/???seek???.
- on linux test018 (read) is somewhat affected by short seeks (but much less
  than on windows). Again, I think this is mainly due to the buffering used
  by the FILE calls (fwrite/fread/fseek), which are currently used by NTAR.
  It would be very interesting (and it's quite easy to do) to implement the
  read/write/seek NTAR callbacks using the native posix read/write/??seek??.
- in general, read operations seem to be comparable between windows and
  linux with big packets and using the "no-seek optimization" under windows.
  When small packets are involved, linux seems to be better. I would like to
  investigate the problem by reimplementing the read/write/seek callbacks
  with some lower-level calls, and see what happens.
- CPU load: arghh, it's quite a pain to measure it. Basically, the CPU load
  tends to be quite variable, going from 0% to some peak value, and I was
  not able to compute a mean value for that in most cases. Under Windows, in
  any case, the CPU load seems to be concentrated on one single CPU. Under
  linux, I tried using "top" to measure the cpu load, and I had some strange
  results (maybe strange for me because I'm not so used to it). Basically,
  during some of the tests the cpu load for the test processes (test017 or
  test018) was very high (85%), but the total load of every single CPU in
  the system (hitting "1" while top runs) was much lower. I think I'm
  missing something...
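
Regarding the native calls mentioned above: on POSIX systems the
descriptor-level counterparts of fread/fwrite/fseek are read/write/lseek,
and on Windows the Win32 call that pairs with ReadFile/WriteFile for seeking
is SetFilePointer (or SetFilePointerEx). A rough sketch of what read and
seek callbacks without stdio buffering might look like is given below. It is
written in Python, whose os.read and os.lseek wrap the descriptor-level calls
on POSIX systems. The RawFile class and its methods are hypothetical and do
not reflect the actual NTAR callback signatures.

```python
import os
import tempfile

O_BINARY = getattr(os, "O_BINARY", 0)  # flag only exists on Windows


class RawFile:
    """Hypothetical read/seek callbacks on top of unbuffered descriptor I/O."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDONLY | O_BINARY)

    def read(self, size):
        # os.read may return fewer bytes than requested, so loop until
        # the request is satisfied or end of file is reached.
        chunks = []
        while size > 0:
            chunk = os.read(self.fd, size)
            if not chunk:
                break
            chunks.append(chunk)
            size -= len(chunk)
        return b"".join(chunks)

    def seek(self, offset):
        # Relative seek; returns the new absolute file position.
        return os.lseek(self.fd, offset, os.SEEK_CUR)

    def close(self):
        os.close(self.fd)


# Tiny demonstration file: 4-byte head, 100 filler bytes, 4-byte tail.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"HEAD" + b"x" * 100 + b"TAIL")
    path = tmp.name

raw = RawFile(path)
first = raw.read(4)
position = raw.seek(100)  # skip the filler without reading it
last = raw.read(4)
raw.close()
os.remove(path)
```

Because no user-space buffer sits between the caller and the descriptor,
every call here goes to the OS, so whether this helps or hurts for very
short seeks would have to be measured.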

Any comments or ideas for new tests (or volunteers for tests) are very
welcome.

Have a nice day
GV

PS. Sorry for the *extremely* long mail...


------------------------------------------------------------------

Platform: DELL PowerEdge 2850
Dual XEON 3GHz Hyperthreading (=4 virtual CPUs)
1 GB RAM
1 x Seagate Cheetah 15k rpm 36GB SCSI (OS)
1 x Seagate Cheetah 15k rpm 36GB SCSI (trace files)


==============================Windows==============================

OS: Windows Server 2003
Trace file disk formatted with NTFS, standard cluster size.

Test 017-A (Dump of packets to disk)
------------------------------------

Vanilla NTAR 1.1.0.190 (static VC7 CRT)

PacketSize  NumPackets  Time(runA,runB)  1 CPU load estimate
64           10M         17,  14         0-100% (variable)
64          100M        147, 161         0-100% (variable)
1518          1M         29,  25         0-40% (variable)
1518         10M        293, 263         0-40% (variable)


Test 017-B (Dump of packets to disk)
------------------------------------

NTAR 1.1.0.190 compiled with DLL VC7 CRT (msvcrt71.dll)

PacketSize  NumPackets  Time(runA,runB)  1 CPU load estimate
64           10M         11,  11         0-100% (variable)
64          100M        151, 146         0-100% (variable)
1518          1M         21,  18         0-40% (variable)
1518         10M        300, 257         0-40% (variable)

Test 018-A (Read of blocks from disk)
------------------------------------

Vanilla NTAR 1.1.0.190 (static VC7 CRT, all seeks enabled)

PacketSize  NumPackets  Time(runA,runB)  1 CPU load estimate
64           10M         77,  79         80%
64          100M        788, 793         80%
1518          1M         41,  44         30%
1518         10M        387, 395         30%

Test 018-B (Read of blocks from disk)
------------------------------------

NTAR 1.1.0.190 (static VC7 CRT, seeks to jump from one block to the
                                next disabled)

PacketSize  NumPackets  Time(runA,runB)  1 CPU load estimate
64           10M          24,  24        100%
64          100M         241, 242        100%
1518          1M          22,  22        30%
1518         10M         231, 231        30%


===============================Linux===============================
OS: Linux Fedora CORE 3, kernel 2.6.11.something
Trace file disk formatted with ext2 on top of a Linux LVM volume.

Test 017 (Dump of packets to disk)
------------------------------------

Vanilla NTAR 1.1.0.190

PacketSize  NumPackets  Time(runA,runB)  "top" CPU load estimate
64           10M         10,   7, 13     ????
64          100M        142, 147         ????
1518          1M         14,  24         ????
1518         10M        302, 296         ????


Test 018-A (Read of blocks from disk)
------------------------------------

Vanilla NTAR 1.1.0.190 (all seeks enabled)

PacketSize  NumPackets  Time(runA,runB)  "top" CPU load estimate
64           10M         19,  18         85%
64          100M        176, 168         85%
1518          1M         27,  26         ????
1518         10M        227, 227         ????

Test 018-B (Read of blocks from disk)
------------------------------------

NTAR 1.1.0.190 (seeks to jump from one block to the next disabled)

PacketSize  NumPackets  Time(runA,runB)  "top" CPU load estimate
64           10M         15,  15         85%
64          100M        129, 141         85%
1518          1M         27,  26         ????
1518         10M        223, 227         ????
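
To make the tables easier to compare, the small Python sketch below derives
two figures from them: the read speedup obtained by disabling seeks, and a
payload-only write throughput. It assumes that the times are in seconds and
that "M" means 10**6 packets (neither is stated explicitly in the tables),
and it ignores any per-block overhead, so the throughput it yields need not
coincide with the rough 400Mbps estimate quoted earlier. The function names
are made up for the example.

```python
# Times (runA, runB) copied from the Test 018 tables above,
# 64-byte packets, 100M packets; assumed to be seconds.
READ_TIMES = {
    "windows": {"seeks": (788, 793), "no_seeks": (241, 242)},
    "linux": {"seeks": (176, 168), "no_seeks": (129, 141)},
}


def mean(values):
    return sum(values) / len(values)


def speedup(platform):
    """Ratio of mean read time with seeks to mean read time without."""
    times = READ_TIMES[platform]
    return mean(times["seeks"]) / mean(times["no_seeks"])


def payload_mbps(packet_size, num_packets, seconds):
    """Payload-only throughput in Mbit/s (block overhead is ignored)."""
    return packet_size * num_packets * 8 / seconds / 1e6


windows_speedup = speedup("windows")  # roughly 3.3x
linux_speedup = speedup("linux")      # roughly 1.3x

# Windows, Test 017-A, 1518-byte packets, 10M packets, run A (293).
write_mbps = payload_mbps(1518, 10_000_000, 293)  # about 414
```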



