January 18, 2022

Western Digital introduces new non-SMR 20TB HDDs with onboard NAND – Ars Technica

This isn't Western Digital's first 20TB drive, but it is the first shipping drive to achieve that density without the use of Shingled Magnetic Recording (SMR) technology.

Western Digital

At Western Digital’s HDD Reimagine Event yesterday, the company introduced its newest hard drive architecture—a hybrid spinning rust/NAND flash design it calls OptiNAND. But as WD President of Technology and Strategy Dr. Siva Sivaram told Ars in an interview, OptiNAND bears almost no resemblance to the much-maligned hybrid SSHD drives first introduced in 2011 and 2012.

Instead of promising SSD-like speeds via caching of customer data, OptiNAND offers increased areal density by removing firmware-accessible metadata from the disk itself and storing it on NAND instead.

20TB per disk without SMR

The most tangible milestone achieved by Western Digital’s newly announced architecture is a nine-platter, 20TB drive that does not require Shingled Magnetic Recording (SMR) techniques. The new disk uses a subset of Western Digital’s EAMR technology, which has been rebranded ePMR—presumably to emphasize that it’s not SMR, which has severe performance and usability implications for many common workloads.

The new drive uses the same EAMR and triple-stage actuator technology as last July’s 18TB and 20TB offerings but gets its boost in areal density from offloading onboard metadata to the flash side of the OptiNAND architecture. The metadata we’re talking about isn’t filesystem metadata—it’s hardware metadata, accessible by the drive’s firmware but not exposed outside the drive itself.

Repeatable Runout (RRO) is a description of a rotational system’s inaccuracies that can be predicted ahead of time—for example, a steady wobble caused by the microscopically imperfect alignment of a rotor. RRO data is specific to each individual drive and is generated at the factory during manufacturing and stored on the disk itself.
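To make the idea of a "repeatable" error concrete, here is a minimal Python sketch. It is a toy model, not Western Digital's calibration procedure: the sector count, revolution count, wobble amplitude, and noise level are all made-up values chosen for illustration. It shows that averaging position error over many revolutions isolates the part that repeats, and that subtracting that stored table leaves only the random remainder.

```python
import math
import random

SECTORS = 64        # hypothetical number of position samples per revolution
REVOLUTIONS = 200   # hypothetical number of revolutions measured at the factory


def true_rro(sector):
    # A steady once-per-revolution wobble, identical on every revolution.
    return 0.5 * math.sin(2 * math.pi * sector / SECTORS)


def measure_revolution(rng):
    # Each measurement is the repeatable wobble plus random, non-repeatable noise.
    return [true_rro(s) + rng.gauss(0, 0.1) for s in range(SECTORS)]


def build_rro_table(rng):
    # Averaging many revolutions cancels the random part and keeps the repeatable part.
    sums = [0.0] * SECTORS
    for _ in range(REVOLUTIONS):
        for s, err in enumerate(measure_revolution(rng)):
            sums[s] += err
    return [total / REVOLUTIONS for total in sums]


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


rng = random.Random(42)
rro_table = build_rro_table(rng)   # the per-drive metadata that has to be stored somewhere
raw = measure_revolution(rng)
compensated = [err - rro_table[s] for s, err in enumerate(raw)]

print(f"RMS position error without table: {rms(raw):.3f}")
print(f"RMS position error with table:    {rms(compensated):.3f}")
```

The `rro_table` list stands in for the per-drive metadata the article describes. Real drives keep far more of it than this toy does, which is why where it lives matters for capacity.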

In conventional drives, RRO metadata is interleaved with customer-accessible data on the platters themselves. That leaves fewer tracks per inch (TPI) available for customer data, which lowers the disk’s usable areal density. OptiNAND architecture allows Western Digital to move this metadata off the platters and onto the onboard NAND.

In order to hit 20TB on a nine-platter drive with current recording technologies, you need a next-generation “edge” of some sort. In last year’s 20TB drives, that edge was SMR—in this year’s newest models, it’s OptiNAND.
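For a rough sense of scale, the arithmetic behind the headline figure can be sketched in a few lines of Python. The 20TB capacity and nine-platter count come from the article. The comparison drive is an assumption: it treats last July's 18TB non-SMR model as also using nine platters, which the article does not state.

```python
DRIVE_CAPACITY_TB = 20   # from the article
PLATTERS = 9             # from the article
PREVIOUS_CMR_TB = 18     # assumption: earlier non-SMR drive with the same platter count

per_platter = DRIVE_CAPACITY_TB / PLATTERS
previous_per_platter = PREVIOUS_CMR_TB / PLATTERS
gain = per_platter / previous_per_platter - 1

print(f"{per_platter:.2f} TB per platter")
print(f"{gain:.1%} more capacity per platter than the assumed 18TB baseline")
```

Under that assumption, the capacity per platter rises from 2TB to roughly 2.22TB, an increase of about 11 percent without resorting to shingling.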

Sync write acceleration

On this Seagate Ironwolf NAS disk, sync writes are an order of magnitude slower than async. OptiNAND should do away with the majority of that performance bottleneck.

Although Sivaram opened our interview by declaring that OptiNAND is not for storing customer data at all, there is one exception: the contents of the drive’s DRAM cache can be flushed to OptiNAND in the event of an unexpected power loss.

This has a massive impact on the drive’s performance when the drive is asked to perform sync writes—a special mode of operation typically requested by databases, virtual machines, and most NFS exports. When an application requests a “sync write” to a drive, all of the normal write aggregation and caching operations become unavailable—the application tells the operating system, “I’m not doing anything else until you verify that this data is stored permanently and safely on disk.”
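From the application side, the difference between the two modes looks roughly like the Python sketch below. It is illustrative only: the record size, record count, and file names are arbitrary, and the timings it prints depend entirely on the storage underneath. The buffered path lets the operating system aggregate writes, while the sync path calls `os.fsync()` after every record and waits for the device to report the data as durable.

```python
import os
import tempfile
import time

RECORD = b"x" * 4096   # arbitrary record size for illustration
COUNT = 50             # arbitrary number of records


def write_records(path, sync):
    """Write COUNT records; with sync=True, force each one to stable storage."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(COUNT):
            f.write(RECORD)
            if sync:
                f.flush()              # push Python's buffer to the OS
                os.fsync(f.fileno())   # block until the OS reports the data is on the device
    return time.perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    async_path = os.path.join(tmp, "async.bin")
    sync_path = os.path.join(tmp, "sync.bin")
    async_seconds = write_records(async_path, sync=False)
    sync_seconds = write_records(sync_path, sync=True)
    async_size = os.path.getsize(async_path)
    sync_size = os.path.getsize(sync_path)

print(f"buffered: {async_seconds:.4f}s  sync: {sync_seconds:.4f}s")
```

Both files end up with identical contents. The only difference is how long the application waits, and on a conventional hard drive each of those waits can include a head seek.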

OptiNAND architecture allows drives to complete those sync() calls instantaneously without lying about the safety of sync write data. Instead of pausing everything the drive is doing to seek the heads to the proper places and write the data to the magnetic medium, the drive can simply say, “Your data is safe” once the writes are accepted into the drive’s own DRAM buffer.

Normally, this would be considered an unsafe violation of write barrier mandates. But on the OptiNAND drives, up to 100MiB of “cache-disabled” writes can be accepted into DRAM because, in an emergency power-off event, the drive’s onboard capacitor can keep the DRAM viable for long enough to flush the write cache to OptiNAND. On restoration of power, the dirty data temporarily written to OptiNAND is read back and immediately committed to rotational storage.
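The sequence described above can be modeled with a small Python toy. This is a sketch under stated assumptions, not Western Digital's firmware: the class name, its methods, and the write-through fallback when the buffer is full are all hypothetical. Only the 100MiB limit and the DRAM-to-NAND-to-platter recovery path come from the article.

```python
MIB = 1024 * 1024


class ToyOptiNandDrive:
    """Toy model of a power-loss-protected write cache (hypothetical names)."""

    DRAM_LIMIT = 100 * MIB   # figure quoted in the article

    def __init__(self):
        self.dram = {}       # volatile write cache: block address -> bytes
        self.nand = {}       # non-volatile flash
        self.platters = {}   # rotational storage
        self.dram_bytes = 0

    def sync_write(self, lba, data):
        # Drop any older cached copy of this block so stale data is never flushed later.
        old = self.dram.pop(lba, b"")
        self.dram_bytes -= len(old)
        if self.dram_bytes + len(data) <= self.DRAM_LIMIT:
            self.dram[lba] = data
            self.dram_bytes += len(data)
            return "acked from DRAM"
        # Assumption: when the protected buffer is full, fall back to a real platter write.
        self.platters[lba] = data
        return "acked after platter write"

    def power_loss(self):
        # The capacitor keeps DRAM alive long enough to copy dirty data to NAND.
        self.nand.update(self.dram)
        self.dram.clear()
        self.dram_bytes = 0

    def power_restore(self):
        # Dirty data is read back from NAND and committed to rotational storage.
        self.platters.update(self.nand)
        self.nand.clear()


drive = ToyOptiNandDrive()
status = drive.sync_write(7, b"journal entry")
drive.power_loss()
drive.power_restore()
recovered = drive.platters[7]
print(status, recovered)
```

The point of the model is the ordering: the write is acknowledged before it reaches the platters, yet it still survives the power cut because the NAND catches whatever was in DRAM.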

Allowing the drive to “fudge” on write barriers by relying on capacitor power to flush dirty writes from DRAM to OptiNAND can improve performance for those writes drastically—in many cases, probably by a factor of ten or more.

Use cases and availability

According to Sivaram, the new OptiNAND drives will be useful anywhere conventional hard drives are used. Without the performance bottleneck of either drive-managed or host-managed SMR, the drives can be expected to function as plug-and-play replacements for older, smaller drives in everything from the NAS to the data center.

The 20TB OptiNAND disks are currently in the early stages of production, with samples shipping to select Western Digital enterprise customers only. However, the technology is expected to serve as “the foundation for future designs and innovations” across Western Digital’s entire rotational storage product line, with market-specific products becoming available later this year.

We have requested product samples for direct testing and review.
