Microsoft® iSCSI Target 3.2 availability and performance white paper (2010)

noname studio

published: December 2010


Abstract

In April 2010 noname studio independently conducted a set of preliminary iSCSI performance benchmarks using Microsoft® iSCSI Target 3.2, which is distributed to OEMs with Windows Storage Server 2008 editions. The test series focused on gigabit Ethernet over copper (1000BASE-T) with a single link. Compared to other software iSCSI targets and entry-level OEM iSCSI SANs, the results were satisfactory.

Our objective

Our first goal was to test whether Microsoft® iSCSI Target 3.2 can offer good performance compared to other software iSCSI target solutions. Furthermore, we wanted to check whether software iSCSI targets can saturate a single 1000 Mb/s channel. For this purpose we checked whether there are any performance differences between the out-of-the-box TCP/IP parameters of the OS and the parameters proposed in various SAN OEM forums (see Appendix C). We also ran the stress tests with jumbo frames enabled to measure the alleged improvements.

Strictly speaking, the objective is informative rather than comparative. As such, the values and results should serve as points of orientation. Two other SAN appliances provided additional reference points: one was Open-E on the same hardware workbench, the other was an iSCSI SAN solution from EasyRAID (see Appendix A for hardware details).

Note: This white paper should not be taken as an ultimate reference under any circumstances! Firstly, all the tests were run with a single benchmark application called h2benchw, which provides reliable results for directly attached SATA/SAS HDDs but is not optimized for iSCSI, i.e. it cannot set variable profiles for data block sizes, randomization, outstanding I/Os etc. Secondly, the number of test iterations was afterwards considered insufficient because of abnormally high deviations (around 10-12%) between results on the same workbench. As such, the results and our opinions here should rather be considered a starting point for further investigations.

Our Observations

IPERF

The preliminary tests started with a sanity check of the Ethernet connection: we used iperf and visualized the results in an Excel sheet. Since the theoretical bandwidth of 1GbE is 125 megabytes/s (1000 Mb/s divided by 8), we expected 110-124 MB/s in one direction. The results were a little lower, though.
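As a rough cross-check of the expected range, the sketch below estimates the upper bound for TCP payload throughput on a 1000 Mb/s link once the per-frame overhead is subtracted. It assumes standard Ethernet framing and IPv4/TCP headers without options; the function name is ours and not part of any tool used in the tests.

```python
# Per-frame overhead in bytes (standard Ethernet, IPv4 and TCP values).
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20         # IPv4 + TCP, assuming no header options

LINK_BYTES_PER_S = 1_000_000_000 / 8   # 1000 Mb/s = 125,000,000 bytes/s


def max_tcp_payload_mb_s(mtu: int) -> float:
    """Upper bound for TCP payload throughput in MB/s (10^6 bytes)."""
    payload = mtu - IP_TCP_HEADERS      # bytes of user data per frame
    on_wire = mtu + ETH_OVERHEAD        # bytes the frame occupies on the wire
    return LINK_BYTES_PER_S * payload / on_wire / 1e6


for mtu in (1500, 9000):
    print(f"MTU {mtu}: at most {max_tcp_payload_mb_s(mtu):.1f} MB/s")
```

Under these assumptions the ceiling is roughly 118.7 MB/s with a 1500-byte MTU and 123.9 MB/s with a 9000-byte MTU, so jumbo frames can add only a few percent even in the ideal case.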

A matrix of three TCP window sizes and two MTU sizes was used (see Appendix B for details). Whereas the MTU size did not contribute a measurable performance boost, the TCP window size was crucial: for 1000BASE-T with latency below 1 ms, window sizes smaller than 128 KB proved unsatisfactory.

As can be seen from the figure, with TCP window sizes below 128 KB the bandwidth was not only throttled but also unstable. Any other attempts to improve the network speed were ineffective: either the current drivers or the TCP/IP stack prevented peak values higher than 112 MB/s.
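The role of the window size can be illustrated with the bandwidth-delay product: a sender can have at most one window of data in flight per round trip. The sketch below is a simplified model, not a reproduction of the measurements. The 1 ms round-trip time is an assumption (the upper bound named above, the real value was not recorded), and the 8/128/512 KiB windows are our reading of the iperf settings in Appendix B.

```python
LINK_BYTES_PER_S = 125_000_000   # 1000 Mb/s expressed in bytes per second
ASSUMED_RTT_S = 0.001            # assumption: 1 ms round-trip time


def window_limited_mb_s(window_bytes: int, rtt_s: float = ASSUMED_RTT_S) -> float:
    """Throughput ceiling in MB/s when one window is sent per round trip."""
    ceiling = min(window_bytes / rtt_s, LINK_BYTES_PER_S)
    return ceiling / 1e6


def window_to_fill_link(rtt_s: float = ASSUMED_RTT_S) -> float:
    """Bandwidth-delay product in bytes: window needed to keep the link busy."""
    return LINK_BYTES_PER_S * rtt_s


for kib in (8, 128, 512):   # assumed readings of 0.01 / 0.13 / 0.50 MByte
    print(f"{kib:>3} KiB window: at most {window_limited_mb_s(kib * 1024):.1f} MB/s")
print(f"window needed at 1 ms RTT: {window_to_fill_link() / 1024:.0f} KiB")
```

In this model a window of about 122 KiB is needed to fill the link at 1 ms, which is in line with the observation that settings below 128 KB throttled the bandwidth.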

Reference with OEM SAN systems

As stated above, we needed a reference point for the three chosen software iSCSI targets (Microsoft® iSCSI Target, Starwind iSCSI Server and iStorage Server). We had two appliances at our disposal. The first was EasyRAID, which came configured out of the box with 16x 1TB SATA II HDDs in one RAID-6. The other was Open-E DSS 6: we used the same hardware as for the software iSCSI server tests and booted DSS6 directly from the CD. Since Open-E does not explicitly support a RAMDISK as LUN target, we had to test the hardware RAID configuration provided by an Adaptec RAID controller. Under Microsoft® Windows Server 2008 we then installed the latest Adaptec drivers and used h2benchw to measure the performance of the directly attached logical volumes.

The first figure concentrates on the maximum sequential reads/writes and the latency (access time). Please refer to the two lowest bars. Since the Windows Server OS was installed on Array0, RAID-1 could only be tested in read mode, but its write performance is expected to be similar to or higher than the RAID-5 values. As expected, the latency on RAID-1 was very low (0.32 ms) because of the SSDs and the lack of parity calculation. Still, even the SATA drives on RAID-5 were amazingly “fast” with 695.36 MB/s read and 381.47 MB/s write, presumably due to Adaptec’s smart caching algorithms.

In the zone measurement reads/writes and the application index, RAID-1 was, as expected, the fastest volume (423.4 MB/s) and, because of its very low latency, also achieved the best AppIndex mark. RAID-5 also performed well above expectations with 378.9 MB/s read and 297.8 MB/s write. This setup shows that SATA II HDDs can be used even in high-end SANs, given an appropriate hardware RAID controller, and can reliably serve large sequential I/Os. In that case SAS (especially 15K rpm) or SSD drives are only needed for applications that require fast access times (i.e. less than 10 ms read / 3 ms write).

Still, the most important observation from this sanity check was that the Adaptec RAID controller could handle transfer rates as high as the 125 MB/s that are technically possible on the link, so it would not be the bottleneck during the experiments with Open-E.

H2BENCHW

The second type of test was conducted using the block-based disk benchmark h2benchw, available from heise.de. This tool is widely used to measure the performance of physical disks and includes test profiles such as zone measurement, sustained read and access time. The results can therefore be used to compare current SATA/SAS hard disk tech specs with iSCSI disk performance benchmarks.

The following chart summarizes the maximum read and write throughput achieved by the three software solutions and the two OEM SAN solutions. The reader is warned not to compare apples with oranges: EasyRAID and Open-E were both using SATA HDDs, whereas the software iSCSI targets were using RAMDRIVEs. As such, the latency of the OEM solutions was expectedly much higher. Much more relevant was the disturbing 3.53 ms read latency of Starwind’s software. Whether this was due to a bad implementation of the RAMDRIVE or to the iSCSI target software itself could not be determined. Microsoft® iSCSI Target, on the other hand, performed relatively well in both disciplines: max write 86.17 MB/s; max read 88.33 MB/s; latency write 0.24 ms; latency read 0.42 ms.

The overall observation was that none of our candidates could approach the synthetically achieved Ethernet speed of 108 MB/s. The peak value of 95.84 MB/s used only around 89 percent of the possible capacity.

The same applies to the zone measurement benchmarks, where values ranged between 51.1 MB/s (iStorage) and 79.8 MB/s (ZM read, Microsoft®). The best write results were delivered by Starwind. Still, in the overall-performance AppIndex test Microsoft was measurably faster, most likely because of its comparatively lower latency.

What the above charts do not illustrate are the differences between the tests with TCP registry changes and with larger MTU sizes (jumbo frames). As a matter of fact, at least with the h2benchw tool, the differences were relatively small and can be interpreted as lying within the statistical standard deviation.
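The utilization figure is a plain ratio of measured throughput to the iperf baseline. The short sketch below redoes that arithmetic for the throughput values quoted in this section; the helper name is ours.

```python
IPERF_BASELINE_MB_S = 108.0   # synthetic Ethernet speed quoted in the text

# Throughput values in MB/s as quoted in this section.
reported = {
    "peak value": 95.84,
    "Microsoft max read": 88.33,
    "Microsoft max write": 86.17,
}


def utilization_percent(throughput: float, baseline: float = IPERF_BASELINE_MB_S) -> float:
    """Share of the baseline bandwidth that a measured throughput reaches."""
    return 100.0 * throughput / baseline


for label, value in reported.items():
    print(f"{label}: {utilization_percent(value):.1f} % of the iperf baseline")
```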

Jumbo frames and switches

The following table provides a nice example:

The values were scattered and as such did not allow a reliable interpretation: whereas the read/write speeds seem lower with the switch, the relative degradation for the BCM5721 NIC was much smaller than for the BCM5708 Ethernet card. With other iSCSI targets the switch performed comparatively better, and on some occasions the latencies through the switch were shorter than through the direct cable, which did not make any sense. So for the comparison between the different iSCSI solutions we had to widen the deviation range to the worst and best values and report the arithmetic mean of all values.

Note: in the above table the reader may have noticed the unusually low maximum write speeds. This was caused by poor driver support for this generation of Broadcom NICs, which was corrected in a later build from the OEM provider.
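The approach of reporting the mean together with the worst and best values can be sketched as follows. The readings in the example are hypothetical placeholders, not measurements from this paper, and the overlap check is only one simple way to decide whether two configurations can be told apart.

```python
from statistics import mean


def summarize(runs):
    """Arithmetic mean plus worst and best value of repeated runs."""
    return {"mean": mean(runs), "worst": min(runs), "best": max(runs)}


def ranges_overlap(runs_a, runs_b):
    """True if the worst-to-best ranges of two configurations overlap."""
    a, b = summarize(runs_a), summarize(runs_b)
    return a["worst"] <= b["best"] and b["worst"] <= a["best"]


# Hypothetical MB/s readings, NOT measurements from this paper.
direct_cable = [80.0, 90.0, 85.0]
via_switch = [84.0, 92.0, 88.0]

print(summarize(direct_cable))
print(summarize(via_switch))
print("distinguishable:", not ranges_overlap(direct_cable, via_switch))
```

With overlapping ranges, as in this made-up example, a difference between the two means should not be read as a real effect.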

TCP window size

The last example of scattered benchmarks is illustrated by the following table:

For the two tested iSCSI targets the results were straightforward: for Target 1 the TCP window size “tweak” had the lowest values for max read and the second-worst values for max write. For Target 2 it had the median value for max read and the worst value for max write. The overall picture for all software solutions was even more complex.

Conclusions

To summarize the questions in our objective:

  1. Microsoft® iSCSI Target 3.2 performed very well compared to the other software solutions: its zone measurement marks were comparable to those of Starwind, and it had the highest AppIndex.
  2. Concerning Ethernet speeds, none of the candidates could fully utilize a single 1000 Mb/s link. The worst performers actually used only half of the technical capacity of the link.
  3. As for jumbo frames, we could imagine that an MTU of 9000 bytes benefits Microsoft SQL transactions (allegedly using 8 KB block sizes, which fit within a single Ethernet frame). In real-world cases where the iSCSI link is used primarily for sequential reads/writes (such as large data transfers), however, jumbo frames could not be shown to improve performance.
  4. The same goes for TCP window sizes larger than the Windows OS standard of 64 KB: iperf clearly showed better and more reliable link bandwidth, but as soon as the higher values were implemented in the Windows TCP/IP stack via registry keys, marks above those without TCP tweaks were rare. Whether this was due to a bad TCP/IP stack implementation could not be investigated.
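The remark about 8 KB blocks and jumbo frames can be made concrete by counting how many Ethernet frames one block needs. This is a simplified sketch: it assumes each block travels as its own iSCSI PDU with a 48-byte basic header segment, IPv4/TCP headers without options, and no coalescing by the stack. The 256 KiB block is a hypothetical stand-in for a large sequential transfer.

```python
import math

IP_TCP_HEADERS = 40   # IPv4 + TCP headers, assuming no options
ISCSI_BHS = 48        # iSCSI basic header segment


def frames_per_block(block_bytes: int, mtu: int) -> int:
    """Ethernet frames needed to carry one block plus its iSCSI header."""
    payload_per_frame = mtu - IP_TCP_HEADERS
    return math.ceil((block_bytes + ISCSI_BHS) / payload_per_frame)


for block in (8 * 1024, 256 * 1024):   # 8 KB page; hypothetical 256 KiB transfer
    for mtu in (1500, 9000):
        print(f"{block:>6} bytes at MTU {mtu}: {frames_per_block(block, mtu)} frame(s)")
```

Under these assumptions an 8 KB block drops from six frames to one with jumbo frames, while a large sequential block still needs many frames either way, so the relative saving there is limited to the per-frame header overhead.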

As such, the final remark is: never rely blindly on statements from an iSCSI provider or OS manufacturer. You will have to run preliminary tests for your exact hardware, OS, updates, drivers and software workbench to decide whether to implement non-default configurations before you introduce the iSCSI solution to the production environment.

Appendix A: test server specifications

For the tests we used two hardware systems, one configured as a Storage Server (iSCSI Target), the other as a Client Server (iSCSI Initiator).

Hardware Storage Server

Supermicro X8DT3

CPU – 2x Intel® Xeon® E5502 (Dual Core @ 1.87 GHz)

RAM – 6GB (6x 1GB PC3-6400E-6-6-6-14)

NIC – Intel® 82576 Gigabit Dual Port (Drivers 11.4.7.0 from 04.12.2009)

The connection was built over Cat6E double-shielded (F/FTP) cable; no switch was used

Only Protocol IPv4 was enabled (see Appendix C)

Hardware Client Server

DELL PowerEdge 860

CPU – Intel® Xeon® X3220 (Quad Core @ 2.40 GHz)

RAM – 4GB (4x 1GB PC2-5300E-5-5-5-12)

NIC – Broadcom Dual NetXTreme Gigabit Ethernet (BCM5721 B1, Drivers 14.0.0.7 from 3/12/2010)

NIC (when with jumbo frames) – Broadcom Dual NetXTreme II Gigabit Ethernet (BCM5708 C, Drivers 14.0.0.7 from 3/12/2010)

The connection was built over Cat6E double-shielded (F/FTP) cable; benchmarks were run with directly attached cables and, in a second iteration, through a Cisco Catalyst 2960G switch with jumbo frames enabled

Only Protocol IPv4 was enabled (see Appendix C)

OS

Microsoft® Windows Server 2008 x64 English

All updates available online as of April 16, 2010

Software

(Only on the Storage Server)

iSCSI_Software_Target_32 with RAMDRIVE 2GB

iStorageServer.x64.1.60.exe with RAMDRIVE 2GB

Starwind iSCSI Server 5.3.1310

Open-E underlying hardware (iSCSI Target)

Supermicro X8DT3

CPU – 2x Intel® Xeon® E5502 (Dual Core @ 1.87 GHz)

RAM – 6GB (6x 1GB PC3-6400E-6-6-6-14)

NIC – Intel® PRO/1000 PT Quad Port (Drivers 11.4.7.0 from 04.12.2009)

The connection was built over Cat6E double-shielded (F/FTP) cable on dedicated Cisco C2960G switch

Only Protocol IPv4 was enabled (see Appendix C)

RAID:

Adaptec 5445Z RAID Controller; 512MB cache write-back

Array0 – RAID-1; 2x SSDs (Kingston SNV325S; 120GB; cache write-back)

Array1 – RAID-5; 6x SATA HDDs (Western Digital WD20EVDS-63T; 2TB; cache write-back)

Open-E specs: DSS6; update 12; Build 3836

EasyRAID tech specs

Model – Q16QS-4GR3

CPU – XSC3-IOP8134x

RAM – ECC Unbuffered DDR-II 1024MB

Cache – 546MB global; write-back

RAID:

Array0 – RAID-6; 16x SATA HDDs (Seagate ST31000340NS; 1TB; cache write-back)

Appendix B: workload and test procedures

The following synthetic and native workload programs were used during the test phase:

iperf-1.7.0

Tests consisted of one iteration, 20 seconds long, with a single thread. The following options were varied in turn:

  1. TCP Window Size: 0.01MByte (default); 0.13MByte; 0.50MByte
  2. Jumbo Frames disabled (1500 MTU) or enabled (9000 MTU)
  3. As such a 3×2 matrix was created
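The matrix above can be enumerated with a few lines of code. The sketch below only builds the iperf client command lines and does not run them. `TARGET_IP` is a placeholder, and the window values 8K, 128K and 512K are our reading of the 0.01, 0.13 and 0.50 MByte settings. The MTU is not an iperf option: it has to be set on the NICs (and on the switch, if one is used) before each pass.

```python
from itertools import product

TARGET = "TARGET_IP"               # placeholder, not taken from the paper
WINDOWS = ["8K", "128K", "512K"]   # assumed readings of 0.01 / 0.13 / 0.50 MByte
MTUS = [1500, 9000]                # configured on the NICs, not by iperf


def build_matrix():
    """Return (mtu, window, command) tuples for the 3x2 test matrix."""
    return [
        (mtu, window, f"iperf -c {TARGET} -t 20 -w {window}")
        for mtu, window in product(MTUS, WINDOWS)
    ]


for mtu, window, command in build_matrix():
    print(f"MTU {mtu:>4} | window {window:>4} | {command}")
```

Here `-c` selects client mode, `-t 20` sets the 20-second duration and `-w` sets the TCP window size; a single thread is the iperf default.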

H2benchw-3.12

Version 3.12 includes application benchmarks, sequential read/write and zone measurement tests (including latency measurement).

The command used was

h2benchw 1 -a -!

The txt outputs are available on request.

Appendix C: software and registry optimization specifications

We decided to test the reliability and performance deltas resulting from changes to the TCP/IP (v4) stack.

We named the two scenarios “notweaks” and “TCPtweaks”. Each scenario was implemented in parallel on both servers to ensure that the systems were in a consistent state.

Notweaks

This configuration was based on a vanilla installation of Windows Server 2008 Standard Edition. Since an IPv4 iSCSI configuration was deployed, all other protocols were disabled on the iSCSI interfaces. Preliminary checks with iperf showed that disabling those additional protocols has no negative impact.

Furthermore, the netsh int tcp set global parameters were reset to their defaults, in case any of the Microsoft® OS updates had changed the values unattended.

TCPtweaks

This configuration was based on the “onlyISCSItweaks” and additionally the TCP/IP stack was tweaked as follows:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Tcpip\Parameters\Interfaces\{<iSCSI NIC ID here>}]

"TcpDelAckTicks"=dword:00000001

"TcpWindowSize"=dword:00080000

"GlobalMaxTcpWindowSize"=dword:00080000

"Tcp1323Opts"=dword:00000003

"SackOpts"=dword:00000001

"TcpAckFrequency"=dword:00000001
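For readers unfamiliar with the hexadecimal dword notation, the sketch below decodes the values listed above. It only interprets the numbers and does not touch the registry. The comments on Tcp1323Opts follow the documented meaning of that value (bit 0 enables window scaling, bit 1 enables timestamps); the variable names are ours.

```python
# Registry values from the listing above, as hexadecimal dword strings.
TCP_TWEAKS = {
    "TcpDelAckTicks": "00000001",
    "TcpWindowSize": "00080000",
    "GlobalMaxTcpWindowSize": "00080000",
    "Tcp1323Opts": "00000003",
    "SackOpts": "00000001",
    "TcpAckFrequency": "00000001",
}

decoded = {name: int(raw, 16) for name, raw in TCP_TWEAKS.items()}

window_kib = decoded["TcpWindowSize"] // 1024             # window size in KiB
window_scaling = bool(decoded["Tcp1323Opts"] & 0b01)      # bit 0: window scaling
timestamps = bool(decoded["Tcp1323Opts"] & 0b10)          # bit 1: timestamps
needs_scaling = decoded["TcpWindowSize"] > 65535          # 16-bit window field limit

print(f"TcpWindowSize = {decoded['TcpWindowSize']} bytes ({window_kib} KiB)")
print(f"window scaling: {window_scaling}, timestamps: {timestamps}")
print(f"window exceeds the unscaled 64 KB limit: {needs_scaling}")
```

The decoded window of 512 KiB corresponds to the largest iperf setting in Appendix B, and a window of that size can only be advertised when window scaling is enabled, which the Tcp1323Opts value of 3 provides.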
