OS/2 Filesystem Shootout
A Tale of Intrigue and Discovery
by Michal Necasek, March 2002
The days of DOS when you could choose any filesystem as long as it was
FAT are long gone. Since the introduction of OS/2 Warp Server for e-Business,
OS/2 and eCS users can choose between three IBM-supplied filesystems (FAT,
HPFS, JFS) and four filesystem drivers - for HPFS there are two possibilities,
"regular" HPFS and HPFS386, although the latter is only available in server
versions and at considerable extra cost. I don't even count the variety
of third-party IFSs - most of them have very specific purposes and are not
suitable as generic filesystem drivers due to poor performance, lack of features
or special intended uses.
Because I'm a curious person, I wanted to know which one of these IFSs
is the best. The following paragraphs are an attempt to find an answer to
this question. Perhaps not too surprisingly, there is no clear answer... but
read on.
If we want to choose the "best" IFS, we need to be able to tell if one
IFS is better than another. I decided to compare the filesystems based on
performance and features. Features are rather difficult to express in numbers
- which leaves us with performance, and to compare the filesystem performance
it is necessary to run benchmarks. Unlike some hot benchmarking "pros" I'm
not going to pretend that my benchmarks are universally applicable or necessarily
even important. Draw your own conclusions from the numbers - and at your own
risk. If you want to be really sure, roll your own benchmarks which model
how you use your computer.
First I will describe the hardware used for the benchmarks. It is a 600 MHz
Pentium III with 256MB RAM and a Maxtor 40GB 7200RPM EIDE disk. I did not run
the tests on this disk, however - for that I used a Fujitsu Ultra-160 18.2GB
7200RPM SCSI disk attached to a venerable Adaptec 2940UW PCI SCSI host controller.
This setup allowed me to create one relatively small 1GB volume and reformat
it with various filesystems. That ensured that all tests were run on a clean,
unfragmented volume and that there was no skew introduced by different speeds
of different parts of the disk. With this setup I am reasonably certain that
the differences between test results were caused by the IFSs alone and not
by other factors.
In retrospect, using the fast SCSI drive for benchmarking wasn't a very
good idea for a simple reason: a fast drive tends to equalize the filesystem
differences in performance. Look at it this way - if the disk were infinitely
fast, there would be no differences in filesystem performance. The slower
the disk, the more important the IFS efficiency becomes. Still, even with
the fast drive there were clearly discernible trends in filesystem performance.
For both HPFS386 and JFS I used 32MB caches with default settings for lazy
write timeouts etc. For plain HPFS I used the maximum 2MB cache and for FAT
the maximum 14MB cache. The base system was eCS GA with HPFS386.IFS from WSeB
GA and JFS.IFS from October 16, 2001.
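For reference, this is roughly how such cache settings are expressed in
CONFIG.SYS. The lines below are only a sketch - the drive letters, paths and
exact values are illustrative rather than a copy of my configuration, and
HPFS386 is a special case because it takes its cache size from HPFS386.INI
instead of the IFS line:

    REM Plain HPFS - cache size in KB on the IFS line (2048KB is the maximum)
    IFS=C:\OS2\HPFS.IFS /CACHE:2048 /AUTOCHECK:C

    REM JFS - cache size also in KB on the IFS line (32MB here)
    IFS=C:\OS2\JFS.IFS /CACHE:32768 /AUTOCHECK:*

    REM FAT - cached through DISKCACHE, size again in KB
    DISKCACHE=14400,LW

    REM HPFS386 - the IFS line only loads the driver; cache size and lazy
    REM write settings live in HPFS386.INI
    IFS=C:\IBM386FS\HPFS386.IFS /AUTOCHECK:C

Only one of the two HPFS lines would of course be active at a time, since
plain HPFS and HPFS386 cannot be loaded together.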
At first I intended to run "classic" benchmarks from Sysbench but then
decided against it for two important reasons:
- Sysbench for some reason refused to run on my JFS volumes. That alone
was a very good reason not to use it.
- I realized that Sysbench runs read/write tests in 1:1 ratio. That
is very different from real world usage where the read:write ratio is usually
more like 10:1 or even much higher.
So I decided to set up my own tests. Yes, these tests were completely arbitrary.
And yes, I believe they told me more than Sysbench would have - because
they were "real world" tests. I will briefly describe the tests:
- ZIP test. A largish (120MB) ZIP file containing mostly large files
(about a hundred total) was unzipped, zipped up into another archive, and the
uncompressed files deleted. This test primarily stressed the read and write
throughput of the filesystem, with relatively little file creation and deletion.
- Build test. Running "dmake build" on the SciTech MGL libraries using
Watcom 11.0c C/C++ compiler. The choice of compiler and source code is largely
irrelevant, I'm sure the results would be similar with other projects and
other compilers. This test primarily stressed cache efficiency and also the
creation and deletion of a large number of mostly very small files (lots
of small temporary files get created and deleted in the build process).
- Read test. Reading a big (about 600MB) file. This test stressed raw
filesystem read throughput (the results were rather intriguing); a sketch of
this kind of timing loop follows the list.
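The read test in particular is easy to reproduce. The following little C
program is a rough sketch of the idea rather than the exact program I used -
it reads a file sequentially and divides the byte count by the elapsed
wall-clock time; the buffer size and the file name passed on the command line
are arbitrary:

    /* readtest.c - sequential read throughput sketch.
     * Reads the named file in 64KB chunks and reports MB/s.
     * Wall-clock time via time() has only 1-second resolution,
     * which is fine for runs lasting tens of seconds. */
    #include <stdio.h>
    #include <time.h>

    #define BUF_SIZE (64 * 1024)

    int main(int argc, char **argv)
    {
        static char buf[BUF_SIZE];
        FILE   *f;
        size_t  n;
        double  total = 0.0, secs, mb;
        time_t  start;

        if (argc < 2) {
            fprintf(stderr, "usage: readtest <file>\n");
            return 1;
        }
        f = fopen(argv[1], "rb");
        if (f == NULL) {
            perror("fopen");
            return 1;
        }
        start = time(NULL);
        while ((n = fread(buf, 1, BUF_SIZE, f)) > 0)
            total += (double)n;
        secs = difftime(time(NULL), start);
        if (secs < 1.0)             /* guard against sub-second runs */
            secs = 1.0;
        fclose(f);

        mb = total / (1024.0 * 1024.0);
        printf("%.2f MB in %.2f s = %.1f MB/s\n", mb, secs, mb / secs);
        return 0;
    }

Compiled with Watcom (wcl386 readtest.c) or any other C compiler and pointed
at a freshly copied 600MB file, it gives a reasonable approximation of
application-level sequential read throughput.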
The following table summarizes the test results. Each test was run at least
three times and results were averaged to suppress any flukes. All figures
are given in seconds, hence smaller numbers are better. For clarity, the best
score in each test is marked with an asterisk (*) and the worst with an
exclamation mark (!).
                |        HPFS        |      HPFS386       |        JFS         |        FAT
  ZIP Test      |        171         |       159 *        |        163         |       185 !
  Build Test    |        107         |        97 *        |        97 *        |       135 !
  Read Test     |  38.71 (16.0MB/s)! |  25.41 (24.4MB/s)  |  20.13 (30.7MB/s)* |  23.30 (26.5MB/s)
The above table makes several points very clear:
- The performance of plain HPFS is not very good. The maximum cache size
is ridiculously small but that doesn't explain the surprisingly poor performance
of large sequential reads.
- The performance delta between HPFS386 and JFS is very small.
- HPFS386 is the fastest on writes; JFS is somewhat slowed down by
the journaling overhead.
- JFS has clearly the best read throughput, most likely due to a straighter
path through the kernel. I suspect that FAT has a similar advantage (though
for different reasons).
- The differences in raw read throughput are simply amazing. The winner
(JFS) was very nearly 100% faster than the loser (HPFS). I was impressed by
JFS's performance because the theoretical maximum throughput of UW SCSI is
40MB/sec. I consider achieving slightly over 75% of the theoretical maximum
at application level quite good.
- It is necessary to differentiate between the filesystem layout on
storage media and the actual filesystem driver. The latter is obviously tremendously
important as the comparison of HPFS versus HPFS386 shows. The performance
difference is striking when we consider that both IFSs organize the data on
storage media in exactly the same way.
- It is interesting that out of only three tests and four filesystems,
no filesystem consistently scored best or worst. That shows how difficult
it is to pick a winner.
So which filesystem is the best? The answer is "it depends" - that is,
it depends on the user's needs. To make things simpler, let's first see which
filesystems are not the best:
- FAT - the performance isn't terribly good even with a big fat cache.
And when it comes to features, FAT is the clear loser. The lack of long filenames
and the 2GB maximum volume size preclude FAT from serious use. Its only
saving grace is wide compatibility with other OSes and the fact that FAT is
still a good filesystem for floppies.
- HPFS - the features are almost as good as HPFS386's but the performance
isn't stellar. The extremely small cache size limit seems to be HPFS's worst
deficiency, but sequential read performance isn't very impressive either - HPFS was by
far the slowest in that test, slower even than FAT. My recommendation: use
plain HPFS as little as possible.
That leaves two contestants ahead of the pack: HPFS386 and JFS. There is
no clear winner. There is little difference between these two IFSs performance-wise.
HPFS386 is faster on writes but JFS has a clear edge when it comes to reading
big chunks of data. Both have very efficient caches - in the build test the
CPU was 100% utilized almost all the time with both filesystems. Unless you
actually take a stopwatch, both IFSs perform equally well, although each of
them has specific strengths and weaknesses.
These differences are not surprising given these filesystems' very different
history: HPFS386 was reportedly written by Gordon Letwin, the father of
HPFS, in the late 1980s. JFS was developed for IBM's variant of Unix, AIX (probably
in the early 1990s). HPFS386 is hand-optimized 386 assembly code, JFS is written
in C - but the above numbers show that there's more to performance than tight
loops.
If we can't decide which IFS is better on performance, we can try to base
the decision on features. As I hinted above, this is not easy because features
are impossible to quantify. Only an individual user with specific needs can decide
which feature set suits him or her best. A detailed description of the respective
filesystems' feature sets can be found elsewhere so I'll just summarize the
points that I find most important:
- HPFS386 is bootable, JFS is not (although that might change in the future).
- HPFS386 supports files with a maximum size of 2GB; the JFS limit is
far larger.
- CHKDSK on a large HPFS386 volume may take hours, on JFS it's usually
seconds.
- JFS comes with newer versions of OS/2 and eCS, HPFS386 costs extra.
I have been using both JFS and HPFS386 for the past few years. I have found both
to be fast and reliable - those are my own experiences, I can't speak
for anyone else.
My personal choice is JFS for one simple reason: HPFS386 is a dead end.
It is partially owned by Microsoft and no one can seriously expect any new
development to be done on it. That was the reason why IBM decided to include
JFS in WSeB after all. JFS, on the other hand, is available in source code
form, and JFS support has been added to Linux. In other words - nobody can take
JFS away, which is more than can be said for HPFS386.
Conclusion
I'm not going to make a choice for anyone else. You decide what's best for
you. I attempted to present some interesting (and perhaps even unexpected)
data here but you'll have to draw your own conclusions. I won't pretend I
know what's best for you. Maybe you don't either, but that's another
matter.