In part one of this article, Dalet Director of Marketing Ben Davenport lists and explains the key concepts to master when selecting storage for media workflows.
Part two, authored by Quantum Senior Product Marketing Manager Janet Lafleur, focuses on storage technologies and usage.

The first time I edited any media, I did it with a razor and some sticky tape. It wasn’t a complicated edit – I was stitching together audio recordings of two movements of a Mozart piano concerto. It also wasn’t that long ago, and I confess that on every subsequent occasion I have used a DAW (Digital Audio Workstation). I’m guessing there aren’t many (or possibly any) readers of this blog who remember splicing video tape together (that died off with helical scan), but there are probably a fair few who have, in the past, performed a linear edit with two or more tape machines and a switcher. Today, however, most media operations (even down to media consumption) are non-linear, and this presents some interesting challenges when storing – and, possibly more importantly, recalling – media. To understand why this is so challenging, we first need to think about the elements of the media itself and then the way in which these elements are accessed.
Media Elements
The biggest element, both in terms of complexity and data volume, is video. High Definition (HD) video, for example, passes “uncompressed” down a serial digital interface (SDI) cable at 1.5Gbps. Storing and moving content at these data rates is impractical for most media facilities, so we compress the signal by removing psychovisually, spatially, and often temporally redundant elements. Most compression schemes ensure that decompressing or decoding the file requires fewer processing cycles than the compression process. However, some cycles are inevitably necessary and, as video playback has a critical temporal element, it will always be necessary to “read ahead” in a video file and buffer at the playback client. Where temporally redundant components are also removed, such as in an MPEG LongGOP compression scheme like Sony XDCAM HD, the buffering requirements increase significantly, as the client will need to read all the temporal references – typically a minimum of one second of video, or around 6MB of data at 50Mbps.
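As a rough back-of-envelope check on those figures, the short Python sketch below converts the bit rates quoted above into the amount of data a client actually has to read from storage (the rates are the ones from the article; the conversion is simple arithmetic):

```python
# Back-of-envelope conversion of the bit rates quoted above into
# storage read rates: 1.5 Gbps uncompressed HD-SDI vs 50 Mbps XDCAM HD.

UNCOMPRESSED_HD_SDI_BPS = 1.5e9  # bits per second on the SDI cable
XDCAM_HD_BPS = 50e6              # bits per second, compressed essence

def mb_per_second(bits_per_second: float) -> float:
    """Convert a bit rate into megabytes read from storage per second."""
    return bits_per_second / 8 / 1e6

print(f"Uncompressed HD: {mb_per_second(UNCOMPRESSED_HD_SDI_BPS):.1f} MB/s")  # ~187.5 MB/s
print(f"XDCAM HD:        {mb_per_second(XDCAM_HD_BPS):.2f} MB/s")             # ~6.25 MB/s
# A one-second LongGOP read-ahead at 50 Mbps is therefore roughly 6 MB.
```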
When compared to video, the data rate of audio and ancillary data (captions, etc.) is small enough that it is often stored “uncompressed”, and it therefore requires fewer CPU cycles ahead of playback. This does, however, introduce some challenges for storage in the way that audio samples and ancillary data are accessed.
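To put numbers on “small”, here is the same kind of arithmetic for uncompressed PCM audio. The 48kHz/24-bit sampling is standard in broadcast; the 8-channel count and 25fps frame rate are assumptions for illustration:

```python
# Uncompressed PCM audio data rates. 48 kHz / 24-bit is standard in
# broadcast; 8 channels and 25 fps are assumed here for illustration.

SAMPLE_RATE_HZ = 48_000
BITS_PER_SAMPLE = 24
CHANNELS = 8
FRAME_RATE_FPS = 25

total_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
print(f"8 channels of audio: {total_bps / 1e6:.1f} Mbps")  # ~9.2 Mbps

one_channel_frame_bytes = SAMPLE_RATE_HZ * BITS_PER_SAMPLE / 8 / FRAME_RATE_FPS
print(f"One channel, one video frame: {one_channel_frame_bytes / 1e3:.2f} KB")  # ~5.76 KB
```

The ~9Mbps total sits within the 8-16Mbps range quoted in the list below, and the roughly 6KB-per-frame figure reappears in the read-size breakdown later in this article.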
Media Access
Files containing video, even when compressed, are big – 50Mbps is about as low a bit rate as most media organizations will go. On its own, that might sound well within the capabilities of even consumer devices – a typical 7200rpm hard disk has a “disk-to-buffer” transfer rate of around 1Gbps – but this is not the whole story:
- 50Mbps is the video bit rate – audio and ancillary data result in an additional 8-16Mbps
- Many operations will run “as fast as possible” – processing cycles are often the limiting factor here, but even a playback or review process will likely include “off-speed” playback at up to 8 or 16 times real-time, the latter requiring over 1Gbps (see the sketch after this list)
- Many operations will utilize multiple streams of video
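A minimal sketch of the aggregate bandwidth arithmetic behind those bullets, assuming a 50Mbps video stream plus 16Mbps of audio and ancillary data (the top of the range above):

```python
# Aggregate bandwidth for the scenarios listed above, assuming a 50 Mbps
# video stream plus 16 Mbps of audio/ancillary data (top of the range).

STREAM_BPS = 50e6 + 16e6  # one stream, all essence types

for speed in (1, 8, 16):
    print(f"{speed:>2}x playback: {speed * STREAM_BPS / 1e9:.2f} Gbps")
#  1x: 0.07 Gbps,  8x: 0.53 Gbps,  16x: 1.06 Gbps - over the ~1 Gbps a
#  single 7200rpm disk can sustain, before any other client is served.

print(f"4 clients at 8x: {4 * 8 * STREAM_BPS / 1e9:.2f} Gbps")  # ~2.11 Gbps
```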
Sufficient bandwidth is therefore the first requirement for media operations, but it is not the only consideration. Take a simple example: a user reviewing a piece of long-form material, a documentary for instance, in a typical manual QC that checks the beginning, middle and end of the media. As the media is loaded into the playback client, the start of the file(s) will be read from storage and, more than likely, buffered into memory. The user’s actions here are fairly predictable, and therefore developing and optimizing a storage system with deterministic behavior in this scenario is highly achievable. However, the user then jumps to a pseudo-random point in the middle of the program, at which point the playback client needs to do a number of things. First, the player will likely need to read the header (or footer) of the file(s) to find the location of the video/audio/ancillary data samples the user has chosen – a small, contained read operation where any form of buffering is probably undesirable. The player will then read the media elements themselves, but these too are read operations of varying sizes (illustrated in the sketch after the list):
- Video: if a “LongGOP” encoded file, potentially up to twice the duration of the GOP – in XDCAM HD, around 1 sec, or ~6MB
- Audio: a minimum of one video frame’s worth of samples – ~6KB
- Ancillary data: dependent on what is stored, but for captions and picture descriptions, ~6B
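To make the shape of this access pattern concrete, here is a hypothetical sketch of the reads a player might issue after such a jump. The sizes follow the list above; the file layout, offsets, and helper names are illustrative, not any particular player’s implementation:

```python
# Hypothetical read pattern after a random seek into a LongGOP file.
# Sizes follow the list above; layout and names are illustrative only.

import io

HEADER_BYTES = 64 * 1024     # header/index: small, contained read
VIDEO_BYTES = 6 * 1024**2    # up to ~2 GOPs of video, ~6 MB in XDCAM HD
AUDIO_BYTES = 6 * 1024       # one video frame's worth of audio samples
ANC_BYTES = 6                # a few bytes of captions/descriptions

def seek_and_read(f: io.BufferedIOBase, offset: int, size: int) -> bytes:
    """One storage request: position, then read a contiguous extent."""
    f.seek(offset)
    return f.read(size)

def service_random_jump(f, video_off: int, audio_off: int, anc_off: int):
    # 1. Consult the header/index to locate the chosen samples
    #    (buffering beyond the request is probably wasted effort here).
    index = seek_and_read(f, 0, HEADER_BYTES)
    # 2. Three reads spanning six orders of magnitude in size,
    #    all of which must complete before playback can resume.
    video = seek_and_read(f, video_off, VIDEO_BYTES)
    audio = seek_and_read(f, audio_off, AUDIO_BYTES)
    anc = seek_and_read(f, anc_off, ANC_BYTES)
    return index, video, audio, anc
```

The point of the sketch is the mix: a single user interaction fans out into storage requests whose sizes differ by a factor of a million.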
Architecting a storage system that ensures these reads – spanning several orders of magnitude in size – happen quickly and efficiently, and that provides a responsive and deterministic experience for dozens of clients often accessing the exact same file(s), requires significant expertise and testing.
Check back tomorrow for part two of “Shared Storage for Media Workflows,” where Janet Lafleur looks at how storage can be designed and architected to respond to these demands!