I installed Fedora 18 on a KVM virtual machine via virt-manager and passed /dev/md10 (the 6-disk RAID10 array) into the virtual machine as a virtio device. I then did raw I/O to /dev/vdb (which is how the array showed up inside the virtual machine) and found that I was getting roughly the same performance as native -- which, as you recall, was 311MB/sec. I was getting 308MB/sec, which is close enough to be no real difference. The downside was that this used 130% of a CPU core between the virtio driver and kflushd (I was using write-back mode rather than write-through mode), i.e., one full CPU core plus roughly a third of another just to move the data from the VM to the LSI driver. For the purposes of this test, that is acceptable -- I have 8 cores in this machine, remember.
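For readers who want to reproduce this kind of streaming-write measurement without 'dd', here is a minimal Python sketch. It is not the script used for the numbers above; the function name `stream_write`, the 1 MiB block size, and the use of zero-filled blocks are all assumptions made for illustration.

```python
import os
import time


def stream_write(path, total_bytes, block_size=1 << 20):
    """Sequentially write total_bytes of zeros to path.

    Returns (bytes_written, throughput_in_MB_per_sec).
    Hypothetical helper for illustration, not the original test script.
    """
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    # buffering=0 gives a raw file object, so each write() is one system call.
    with open(path, "wb", buffering=0) as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            # A raw write may be partial; count what was actually written.
            written += f.write(chunk)
        # Push cached data to the device so the timing includes the real I/O.
        os.fsync(f.fileno())
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written, written / elapsed / 1e6
```

Calling `stream_write("/tmp/stream.bin", 4 * 1024**3)` would time a 4 GiB sequential write to a file. Pointing it at a block device such as /dev/vdb instead would overwrite whatever is on that device, so that is only appropriate for a scratch array like the one in this test.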
The next question was whether XFS would show the same excellent results in the VM that it showed natively. This proved to be somewhat disappointing. The final result was around 280MB/sec -- barely faster than what I was getting from ZFS. My guess is that natively XFS tries to align writes with RAID stripes for the sake of performance, but with the RAID array hidden behind the emulation layer provided by the virtualization system, it was not able to do so. On top of that, the VM only had half as much buffer cache to work with (I split the RAM between the KVM virtual machine and the host OS, 10GB apiece), which made it more difficult to schedule I/O effectively. I/O on the KVM side was "bursty" -- it would burst up to 1 gigabyte per second, then drop to zero, as shown by 'dstat'. This in turn caused I/O on the host side to be somewhat "bursty" as well. This also tends to support the assertion that it's the SPL (Solaris Porting Layer) that's causing ZFS's relatively poor streaming performance when compared to BTRFS, since the SPL effectively puts the filesystem behind an emulation layer too. It likewise supports the assertion that the Linux kernel writers have spent a *lot* of time optimizing the filesystem/block layer interface in recent Linux kernels. And it raises the question of whether hardware RAID controllers -- which similarly hide the physical layout of the actual RAID system behind a firmware-provided abstraction layer -- would have a similar negative impact upon filesystem performance. If I manage to snag a hardware RAID controller for cheap I might investigate that hypothesis, but it's rather irrelevant at present.
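To make the stripe-alignment guess above concrete, here is a rough Python sketch of the geometry arithmetic for a RAID10 array. The 512 KiB chunk size is an assumption (the array's actual chunk size is not stated here), as are the helper names. The `su`/`sw` values are the stripe unit and stripe width options that mkfs.xfs accepts via `-d`; whether supplying them by hand inside the VM would have recovered the lost throughput was not tested.

```python
def stripe_geometry(disks, chunk_kib, copies=2):
    """Return (data_disks, full_stripe_kib) for a simple RAID10 layout.

    Assumes every block is stored `copies` times and that the disk count
    divides evenly, so only disks // copies disks hold unique data.
    """
    if disks % copies != 0:
        raise ValueError("disk count must be a multiple of the copy count")
    data_disks = disks // copies
    return data_disks, data_disks * chunk_kib


def mkfs_xfs_hint(disks, chunk_kib, copies=2):
    """Format the geometry as an su/sw string for mkfs.xfs -d (illustrative)."""
    data_disks, _ = stripe_geometry(disks, chunk_kib, copies)
    return f"su={chunk_kib}k,sw={data_disks}"


# 6-disk RAID10 as in this test, with an ASSUMED 512 KiB chunk size.
data_disks, full_stripe_kib = stripe_geometry(6, 512)
print(data_disks, full_stripe_kib)  # 3 1536
print(mkfs_xfs_hint(6, 512))        # su=512k,sw=3
```

Under those assumptions a full stripe is 1536 KiB, so a filesystem that knows the geometry can try to issue writes in multiples of that size, while one that sees only an opaque virtio disk has nothing to align to.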
What this did bring out is that testing NTFS throughput via a Windows virtual machine is unlikely to produce accurate data. Still, I can compare it to the Linux XFS result from inside the VM, which should at least tell me whether its performance is within an order of magnitude for streaming loads. So that's the next step of this four-part series, delayed because I need to write some Java code to do what my script with 'dd' did.
Update: My scrap heap assemblage of spare parts disintegrated -- the motherboard suddenly decided it was in 6-beep "help I can't see memory!" heaven and no amount of processor and/or memory swapping made it happy -- and thus the NTFS test never got done. Oh well.