The Conifer Systems Blog

Why a Windows NT file system driver?

In Cascade 0.1.x, CFS under Windows was implemented as an SMB server called “cfs_smb.exe”.  SMB (also known as CIFS) is the Windows network file sharing protocol.  CFS tricked your computer into thinking that it was talking to a remote network file server, when in fact the file server was running off your own computer.  (This was the reason that setting up CFS required installing a “Microsoft Loopback Adapter” virtual network device.)

Starting in Cascade 0.2.0, the SMB server implementation of CFS has been retired.  Now, CFS is implemented as a true kernel-mode Windows NT file system driver called “cfs_nt.sys”.  Actually, most of the implementation still lives in user space: cfs_nt.sys relies on a user-space service called “cfs_nt.exe” to implement most file system operations.  Since some of you may have legitimate hesitations about installing a kernel driver, we’d like to explain why we made this decision.

The new Windows CFS implementation is conceptually quite similar to the Linux and Macintosh implementations.  On those operating systems, CFS is implemented as a user-space server process called “cfs_fuse” that relies on a third-party kernel module called FUSE (Filesystem in Userspace).  The main difference on Windows is that we’ve supplied the kernel driver ourselves.

From time to time, there has been discussion in the open-source community of porting FUSE to Windows, so that any FUSE-based file system could run automatically on Windows.  Unfortunately, it isn’t quite so simple.  Windows and Unix file system semantics are not the same.  The best-known difference is that Windows file systems are case-insensitive while Unix file systems are case-sensitive, but there are a number of other important differences under the covers.  We believe it’s very important to provide, to the best of our ability, correct Windows NT file system semantics, and this rules out a Windows-based FUSE solution, even if one were currently available.

There is another open-source project called Dokan that aims to provide a FUSE-like (though not entirely FUSE-compatible) interface for user-space file systems on Windows.  Unfortunately, after some investigation, we concluded that, for a number of reasons, Dokan is not currently a production-worthy solution for our needs.

In that case, what are the alternatives?  We can either provide our own kernel driver, or we can take advantage of one of the network file system clients built into Windows — the most ubiquitous of which is the SMB/CIFS client driver.  This is certainly a tempting implementation approach.  Not only does it not require you to install any special kernel drivers, but also you might expect that the performance cost of going through a network file system layer is relatively small.  After all, you’re not really sending packets over a real network.

As always, the devil is in the details.

The Microsoft SMB client uses only a single TCP connection to talk to any SMB server, regardless of how many threads or processes are simultaneously making file system requests to that server.  The SMB protocol is not pipelined: in most cases, when you send a request, you must wait for a reply before you can send another request.  Further, SMB has a number of buffer size limitations and other protocol limitations that hurt performance; for example, a single SMB read or write request is limited to 64 KB of data, whereas the Windows NT kernel allows far larger reads or writes in a single request.  Thus, SMB performance will never match local disk performance.  This performance gap will only get worse as multi-core CPUs grow in popularity.
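
To get a feel for what the 64 KB limit and the lack of pipelining cost, here is a rough back-of-the-envelope model in Python.  It is only a sketch: the function names and the 0.1 ms round-trip figure are made up for illustration and are not measurements of the Microsoft SMB client.

```python
SMB_MAX_CHUNK = 64 * 1024  # per-request data limit mentioned above, in bytes


def smb_requests_needed(total_bytes: int, chunk: int = SMB_MAX_CHUNK) -> int:
    """Number of read or write requests needed to move total_bytes."""
    if total_bytes < 0 or chunk <= 0:
        raise ValueError("total_bytes must be >= 0 and chunk must be > 0")
    return -(-total_bytes // chunk)  # ceiling division using integers only


def serialized_latency_s(total_bytes: int, round_trip_s: float,
                         chunk: int = SMB_MAX_CHUNK) -> float:
    """Latency paid when every request must wait for the previous reply."""
    return smb_requests_needed(total_bytes, chunk) * round_trip_s


if __name__ == "__main__":
    size = 100 * 1024 * 1024   # a 100 MiB file
    assumed_round_trip = 1e-4  # hypothetical 0.1 ms per request/reply pair
    print(smb_requests_needed(size), "requests")
    print(serialized_latency_s(size, assumed_round_trip), "seconds of pure waiting")
```

In this model a 100 MiB read needs 1600 separate requests, and because they cannot overlap, the per-request latency is paid 1600 times no matter how many cores or threads are waiting on the data.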

On top of the inefficiencies inherent in SMB, we’ve also observed that the Microsoft SMB client does not seem to use the protocol in a particularly efficient fashion, sending large numbers of seemingly redundant queries.

Unfortunately, performance is not the only issue with the SMB-based implementation.  OS compatibility is also a problem.  Microsoft has changed the rules several times on whether and how applications are allowed to bind to TCP ports 139 and 445, and on how the OS decides which port to connect to.  These rules differ between Windows 2000 and Windows XP, for instance.  The biggest issue, though, has to do with Windows Vista.  Microsoft released a patch, “KB942624”, that fixes a security hole in Vista’s SMB implementation.  This patch is also part of Vista SP1.  Unfortunately, this patch seems to prevent any application from binding to port 445.  The same problem affects other applications, such as SSH port forwarding.  The only known workaround is to uninstall the patch and/or uninstall SP1, which is unappealing at best and impossible for users running OEM versions of Vista that have SP1 preinstalled out of the box.

On top of this, SMB is not a simple protocol, and each major version of Windows (Windows 2000, Windows XP, Windows Server 2003, Windows Vista) has a slightly different implementation.  It is challenging to write an SMB server that interoperates correctly with all versions of Windows, whereas at the file system driver API level, the differences between OS versions are much smaller.

Another major issue with SMB is the timeout handling.  Microsoft’s SMB client has an adaptive timeout mechanism where the time required to handle previous SMB requests is used to estimate an appropriate timeout for future requests.  This works if the server’s performance is either predictably slow or predictably fast, but it is inappropriate for CFS.  CFS’s response time is very fast if the data is already locally cached and slower if it is not.  These fast response times can deceive the SMB client into reducing the timeouts; then, if you access a large file that isn’t already cached locally, it may time out much too quickly.
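
The failure mode is easier to see in a toy model.  The sketch below is not Microsoft’s algorithm, which we can only observe from the outside; it assumes a simple exponentially weighted moving average of response times, and every name and constant in it (AdaptiveTimeout, alpha, multiplier, floor_s) is hypothetical.

```python
class AdaptiveTimeout:
    """Toy adaptive timeout: a multiple of a moving average of response times."""

    def __init__(self, initial_s: float = 30.0, alpha: float = 0.25,
                 multiplier: float = 4.0, floor_s: float = 0.5) -> None:
        self.alpha = alpha            # weight given to the newest sample
        self.multiplier = multiplier  # safety factor applied to the average
        self.floor_s = floor_s        # the timeout never drops below this
        self.estimate_s = initial_s / multiplier

    def observe(self, response_s: float) -> None:
        """Fold one completed request's response time into the average."""
        self.estimate_s = ((1 - self.alpha) * self.estimate_s
                           + self.alpha * response_s)

    @property
    def timeout_s(self) -> float:
        return max(self.floor_s, self.multiplier * self.estimate_s)

    def would_time_out(self, response_s: float) -> bool:
        return response_s > self.timeout_s


if __name__ == "__main__":
    t = AdaptiveTimeout()
    print("initial timeout:", t.timeout_s)
    for _ in range(50):   # fifty requests served from the local cache
        t.observe(0.001)  # assumed 1 ms each
    print("timeout after cached hits:", t.timeout_s)
    print("5 s uncached fetch times out:", t.would_time_out(5.0))
```

After a run of fast cached responses the estimate collapses to the floor, so a multi-second fetch of uncached data that would have been fine under the initial timeout is now treated as a failure.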

File system request timeouts are inappropriate for some uses of CFS — do you really want a build to mysteriously fail in an automated build system just because a single file system request is taking longer than expected?  Wouldn’t it make more sense to have a higher-level timeout mechanism that kills a build that is taking longer than expected?  With a standard Windows NT file system driver, these decisions are placed back in the hands of applications.  Applications can “cancel” a file system request that is taking too long to finish, and when a process or thread is forcibly terminated, its file system requests are automatically cancelled.  Or, if you want to wait for the request to finish, no matter how long it takes, that is your right also.
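
As an illustration of what a higher-level timeout might look like, here is a minimal Python sketch that puts a deadline on a whole build rather than on individual file system requests.  The run_build helper and the example command are hypothetical and are not part of Cascade.

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_build(command: Sequence[str], limit_s: float) -> Optional[int]:
    """Run a build command with one overall deadline.

    Returns the process exit code, or None if the build was killed
    because it ran longer than limit_s seconds.
    """
    try:
        # subprocess.run kills the child and waits for it before
        # re-raising TimeoutExpired, so no zombie process is left behind.
        completed = subprocess.run(command, timeout=limit_s)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


if __name__ == "__main__":
    # Stand-in for a real build: a child process that sleeps too long.
    slow_build = [sys.executable, "-c", "import time; time.sleep(30)"]
    result = run_build(slow_build, limit_s=1.0)
    print("build killed" if result is None else f"build exited with {result}")
```

Note that this only kills the direct child process; a real build system would also need to clean up any processes the build itself spawned.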

As if all this weren’t enough, there were several other miscellaneous problems:

  • Installation of the Microsoft Loopback Adapter was not automated and was somewhat tricky.  While it is possible to automatically install a device driver like this, the device needs an IP address, and the IP address we choose might conflict with another computer on a private network.  Another possible source of trouble is that some users may already have a Microsoft Loopback Adapter installed for some other reason.
  • It was easy to unintentionally expose your local CFS SMB server to the network, creating a security hole where another user not logged into your computer could access your CFS trees.
  • The .NET Framework considered CFS to be in the largely untrusted “Internet” security zone.  This caused problems trying to run .NET applications off CFS.  .NET can be told to trust CFS, but this isn’t the default and isn’t easy to set up.
  • CFS would appear in Windows Explorer as a “Disconnected Network Drive” instead of as “Cascade File System.”

Given all these issues with the SMB approach, we hope you will understand our decision to provide a file system driver instead.  It ends up being much simpler implementation-wise, certainly; cfs_nt.sys plus cfs_nt.exe is less code than the old cfs_smb.exe.  It is, in our benchmarking, substantially faster, and it will help us provide even better performance in the future.  It improves OS compatibility.  It fixes the timeout problem.  It eliminates the Microsoft Loopback Adapter and the .NET Framework security workaround.  What’s not to like?

Written by Matt

September 9th, 2008 at 1:46 am

Posted in Cascade
