Chris Evans

Windows Server 2012 (Windows Server “8”) – Resilient File System

This is one of a series of posts discussing the new features in Windows Server 2012, due to be shipped this year and currently in public beta as Windows Server 8.  You can find references to other related posts at the end of this article.  This post reviews the new file system format known as the Resilient File System or REFS.

Since the introduction of Windows NT 3.1, NTFS has been the preferred file system for Microsoft operating systems.  It displaced FAT, providing a higher degree of reliability and scalability.  NTFS itself has been revised and improved steadily with each release of both the client and server versions of Windows.  With Windows Server 8 (aka Windows Server 2012), Microsoft has introduced a new file system – the Resilient File System or REFS.

REFS looks to improve upon NTFS, while providing some specific benefits:

  • Auto-correct data corruptions
  • Cater for extremely high scale capacity
  • Provide high availability – always have the file system online
  • Integrate with Storage Spaces for high resiliency

Microsoft were clearly concerned about introducing a completely new file system format; new code means new bugs and also requires the rewriting of many internal and external interfaces – compatibility and reliability become a problem.  As a result, REFS is based on large parts of the NTFS code, with what look to be tweaks and enhancements.  There are some notable features.  The list quoted by Microsoft is:
  • Metadata integrity with checksums
  • Integrity streams providing optional user data integrity
  • Allocate on write transactional model for robust disk updates (also known as copy on write)
  • Large volume, file and directory sizes
  • Storage pooling and virtualization makes file system creation and management easy
  • Data striping for performance (bandwidth can be managed) and redundancy for fault tolerance
  • Disk scrubbing for protection against latent disk errors
  • Resiliency to corruptions with “salvage” for maximum volume availability in all cases
  • Shared storage pools across machines for additional failure tolerance and load balancing

Some interesting changes in design address common corruption scenarios.

Torn Writes

A torn write occurs when an I/O write operation is interrupted (for example during a power loss) as metadata is being updated, leaving it partly old and partly new.  REFS uses an “allocate on write” technique to ensure metadata is not updated in place, but is written to a new location, so the previous version stays intact if the write does not complete.
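
The idea can be illustrated with a toy model.  This is only a sketch of the general allocate on write pattern, not of how REFS lays out data on disk; the class, its method names and the crash_before_commit flag are all made up for illustration.

```python
class AllocateOnWriteStore:
    """Toy block store: an update goes to a fresh block, then a pointer flips.

    Hypothetical illustration only; it does not mirror REFS internals.
    """

    def __init__(self):
        self.blocks = {}    # block id -> bytes
        self.root = None    # id of the currently committed block
        self._next_id = 0

    def write(self, data, crash_before_commit=False):
        new_id = self._next_id
        self._next_id += 1
        self.blocks[new_id] = data   # step 1: write to a new location
        if crash_before_commit:
            return                   # simulated power loss: root is untouched
        old_id = self.root
        self.root = new_id           # step 2: switching the pointer commits
        if old_id is not None:
            del self.blocks[old_id]  # step 3: the old block can now be freed

    def read(self):
        return None if self.root is None else self.blocks[self.root]


store = AllocateOnWriteStore()
store.write(b"metadata v1")
store.write(b"metadata v2", crash_before_commit=True)  # interrupted update
print(store.read())  # the committed v1 copy is still readable
```

Because the old copy is never overwritten, an interrupted update leaves behind at worst an unreferenced block rather than a half-written one.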

Bit Rot

Bit Rot – undetected corruptions of data – represents a problem in large scale file systems.  REFS relies on Storage Spaces to perform data scrubbing, validating data across the multiple mirrors of a volume.  This involves reading data and metadata and validating them against their checksums – a new feature of REFS.
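
As a rough sketch of how checksum-based scrubbing across mirrors might work, the example below checks every copy of each block against a stored checksum and repairs bad copies from a good one.  The scrub function and its data layout are hypothetical, and CRC32 is used purely for convenience; nothing here is meant to imply which checksum REFS actually uses.

```python
import zlib


def scrub(mirrors, checksums):
    """Validate every copy of every block and repair bad copies in place.

    mirrors   -- list of mirror copies, each a list of bytes blocks
    checksums -- expected CRC32 per block (an assumption for this sketch)
    Returns a list of (mirror_index, block_index) pairs that were repaired.
    """
    repaired = []
    for block_no, expected in enumerate(checksums):
        # Find any copy of this block that still matches its checksum.
        good = next(
            (m[block_no] for m in mirrors if zlib.crc32(m[block_no]) == expected),
            None,
        )
        if good is None:
            continue  # every copy is bad; nothing to repair from
        for mirror_no, mirror in enumerate(mirrors):
            if zlib.crc32(mirror[block_no]) != expected:
                mirror[block_no] = good  # overwrite the corrupt copy
                repaired.append((mirror_no, block_no))
    return repaired


blocks = [b"alpha", b"beta"]
checksums = [zlib.crc32(b) for b in blocks]
mirrors = [list(blocks), [b"alpha", b"bEta"]]  # second mirror has a flipped byte
print(scrub(mirrors, checksums))
```

The point is that a checksum tells the scrubber which mirror is wrong, something a plain comparison of two mirrors cannot do.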

Integrity Streams

Another new feature in REFS is integrity streams.  When enabled, data is stored in a new location on each write.  This protects the existing data if a write fails part-way, but can be overridden for applications that are highly dependent on managing their own data layout.  This and other features seem to have been added specifically to work with JBODs rather than SAN devices, which doesn’t mean Microsoft are moving away from SAN – merely that they are providing options.

As a test, I created two LUNs on my Drobo 1200i in the lab, both of equal size but formatted differently – one with REFS integrity enabled, one without.  The integrity streams feature, from what I can tell, can only be specified when formatting a volume at the command line (see the screenshot).  The test consisted of a fixed queue depth of 8, with block sizes from 512 bytes to 64K and 100% sequential, 100% write data.

The results are certainly interesting.  For throughput, with integrity enabled, the achieved figures were about half those achieved without it.  For IOPS, the Drobo clearly excelled at 4K (this is the version with SSD, which has been tuned for 4K blocks).  Integrity streams randomize the I/O traffic by writing to new locations.  This could be a significant issue for SATA JBODs in a server and is worth evaluating before use.
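
When comparing the throughput and IOPS figures, it helps to remember that the two are linked by block size.  The helper below is a minimal sketch of that arithmetic; the function name is made up, the block size list assumes the test stepped through powers of two, and the IOPS value in the example is an arbitrary number, not a measurement from this test.

```python
def throughput_mb_per_s(iops, block_size_bytes):
    """Throughput in MB/s (1 MB = 1,000,000 bytes) for a given IOPS rate."""
    return iops * block_size_bytes / 1_000_000


# Block sizes spanning the tested range, assuming powers of two:
# 512 bytes up to 65536 bytes (64K).
BLOCK_SIZES = [512 * 2**i for i in range(8)]

# Arbitrary example rate, purely to show the arithmetic.
for size in BLOCK_SIZES:
    print(size, throughput_mb_per_s(1000, size))
```

At a fixed IOPS rate throughput grows linearly with block size, which is why a device can lead on IOPS at 4K yet show its highest throughput at larger blocks.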

Screenshots: Throughput and Integrity Streams #1; Throughput and Integrity Streams #2; Formatting a REFS volume; Enabling Integrity Streams via CLI; REFS in Computer Management

