Advanced Authoring Format - Whitepaper

The Advanced Authoring Format, or AAF, is an industry-driven, cross-platform, multimedia file format that will allow interchange of media and compositional information between AAF-compliant applications. These applications are primarily content creation tools such as Adobe Premiere, Photoshop and AfterEffects, Avid Media Composer and Cinema, Sonic Foundry's Sound Forge, and SOFTIMAGE|DS, to name a few. These applications typically run on top of hardware supplied by Matrox, Pinnacle Systems, and Truevision.

Background

High-end, rich content authoring is a delicate struggle: wrestling together highly disparate source media and arranging all of these elements to form a coherent whole.

Consider the scenario of putting together all of the audio elements for a film soundtrack: this involves transferring all of the music tracks, the ambient sound tracks, the performers' dialogue, and the Foley effects from their original sources, remixing or editing all of them, and synchronizing them to the motion picture elements with split-second accuracy. This process requires a lot of information about each audio source element, as well as information about other media associated with it at the moment of playback.

The media industry uses a wide range of source materials, as well as a set of highly varied capture tools with very different constraints (cameras, keyboards, audio input sources, scanners). This wide variety leads to a great deal of time and effort spent converting media into formats that can be used by the wide variety of authoring applications. Other issues include synchronization accuracy for time-based media (film, video, audio, animation), operating system and hardware dependencies for interactive media titles, and download, streaming and playback performance in Internet media applications.

The Society of Motion Picture and Television Engineers (SMPTE) has addressed these problems in the dedicated hardware world by creating a set of standards that has worked very well through its history. Computer-based media tool vendors have come up with many varied, mostly proprietary approaches that all have many strengths as well as weaknesses. As digital technology for media capture, editing, compositing, authoring, and distribution approaches ubiquity, the industry demands better interoperability and standard practices. This document is a specification for a new media industry standard file format, designed to meet information interchange needs.

Digital Media File Formats and Issues

Rich media authoring often involves manipulating several types of digital media files concurrently and managing interactions and relationships between them. These types of media generally fall into the following categories:

Despite the relatively small number of categories, the sheer number of available digital media file formats, each with its own strength or specific quality (e.g. preferred compression codec, optimized file size, preferred color resolution, support for transparency, support for sequential display, analog-to-digital fidelity, or operating system platform), results in many format-to-format conversions to produce a high-quality end product.

The following formats are just a few of the many in use today:

The Advanced Authoring Format helps the content creation and authoring process by addressing the shortcomings of these and other formats. In this way, AAF allows creative energy to be focused on the quality of the media compositions rather than on unnecessary and painful interchange issues, and allows software developers to focus on improving the feature sets of their authoring applications.

Digital Media Authoring

The multimedia content authoring process generally involves 1) opening one or more source media files, 2) manipulating or editing the media, and 3) saving the results. Multimedia authoring applications read and manipulate certain types of media, and save the resulting file to their own proprietary format, usually specific to a particular hardware platform or operating system. This closed approach generally makes the reuse or repurposing of media extremely difficult. In particular, the compositional metadata (the data that describes the construction of the composition and not the actual media data itself) is not transferable between authoring applications.

The Advanced Authoring Format defines authoring as the creation of multimedia content including related metadata. In the authoring process, it is important to record not only the editing and scripting decisions that have been made, but also the steps used to reach the final output, the sources used to create the output, the equipment configuration, intermediate data, and any alternative choices that may be selected during a later stage of the process.

For example, using DigiDesign's Pro Tools, an audio engineer might be recording, editing and mixing the sound for a video. She could record or load the source media tracks, do gain normalization, and then mix the tracks while applying pan, volume, and time compression transforms to the individual tracks. When the work is complete, she can save the files in two different formats. One format is Pro Tools' native file format, the Sound Designer 2 audio file (SD2F), which is the transformed output file information (with a little bit of metadata available in the resource fork such as number of channels or sampling frequency). The second format is the Pro Tools Session Files format, which saves the metadata information (edit decisions, volume gradient transforms, audio processing) separate from the original source media, allowing for additional changes to be made to the sound output in a non-destructive fashion.

If the authoring application saves the resulting media information as a single, "flattened" file, then changes cannot be made without repeating all of the steps and processes involved (edits, etc.). Users may spend much time and energy re-converting and transferring information, reentering instructions, and ultimately rewriting the entire file.

If the authoring application saves the editing and transform data separately from the media data, then the media can be changed directly by a sound editing application without having to open the authoring application. However, the metadata (data used to describe any compositional positioning, layering, playback behavior, editing cut lists, media mixing or manipulation, etc.) is not accessible without opening the authoring application. In an ideal environment a user would be able to use many different applications and not be concerned with interchange. The media data and the decisions made in one application would be visible to a user in another application.
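The difference between a flattened file and separately stored edit decisions can be illustrated with a minimal Python sketch. The names used here (GainEdit, render) are hypothetical and are not part of Pro Tools or AAF; audio is simplified to a list of floating-point samples.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GainEdit:
    """Hypothetical edit decision: scale samples in [start, end) by gain."""
    start: int
    end: int
    gain: float


def render(source: List[float], edits: List[GainEdit]) -> List[float]:
    """Apply the edit decisions to a copy; the source media is never modified."""
    out = list(source)
    for edit in edits:
        for i in range(edit.start, min(edit.end, len(out))):
            out[i] *= edit.gain
    return out


source = [1.0, 1.0, 1.0, 1.0]          # original media, kept untouched
edits = [GainEdit(0, 2, 0.5)]          # metadata stored separately

flattened = render(source, edits)      # what a "flattened" file would contain

# Revising a decision only means changing the metadata and rendering again.
edits[0] = GainEdit(0, 2, 0.25)
revised = render(source, edits)
```

Because the decisions are kept apart from the samples, the revision does not require recovering the original media from the flattened result.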

The Advanced Authoring Format's unified interchange model enables interoperability between applications. This offers distinct advantages over the current model of separate formats and authoring tools for each media type:

The authoring process requires a wide range of applications that can combine and modify media. Although applications may have very different domains, such as an audio editing application and a 3-D graphics animation application, the authoring process requires both applications to work together to produce the final presentation.

An application can extract valuable information about the media data in an AAF file even when it does not understand the media data format. It can display this information, which allows the user to better coordinate the authoring process. By enabling interoperability between authoring applications, AAF enables the user to focus on the creative production processes rather than struggling with conversions during the authoring and production phases of the project. Although there are many other issues related to completely transparent interoperability, the significant benefit that AAF provides to end users is assurance that compositions output by AAF-compliant applications will be accessible by the right tool for the job, without risk of being "stranded" by proprietary file format restrictions.

The authoring applications that can use AAF for interchange include:

Television studio systems, including picture and sound editors, servers, effects processors, archiving, and broadcast automation systems

Digital Media Interchange

The Advanced Authoring Format provides applications with a mechanism to interchange a broad range of media formats and metadata, but applications may have interchange restrictions due to other considerations. For this reason, it is important to understand the different kinds of interchange possible and to describe the various levels of interchange between authoring applications. The following is a general description of the levels of AAF interchange that applications can adopt. For detailed information on a specific product's AAF support level, see that product's documentation.

Interchange of limited set of media data

Interchange of broad set of media data with some related metadata

Interchange of media data and rich set of metadata including compositions but having limited support for some media types

Full interchange of all media types and all metadata described in this specification and preserving any additional private information stored in the AAF file

The Advanced Authoring Format is designed to be a universal file format for interchange between systems and applications. It incorporates existing multimedia data types such as video, audio, still image, text, and graphics. Applications can store application-specific data in an AAF file and can use AAF as the application's native file format. AAF does not impose a universal format for storing media content data. It has some commonly used formats built in, such as CDCI and RGBA images and WAV and AIFC audio, but also provides an extension framework for new or proprietary formats. As standard formats for media are adopted by groups such as the Society of Motion Picture and Television Engineers (SMPTE) and the Audio Engineering Society (AES), AAF will provide built-in support for these formats.

Data Encapsulation

At its most basic level, AAF encapsulates and identifies media data to allow applications to identify the format used to store media data. This makes it unnecessary to provide a separate mechanism to identify the format of the data. For example, AAF can encapsulate and label WAV audio data and RGB video data.
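As a rough illustration of encapsulation, the following Python sketch pairs a payload with a format label so that a reader can identify the data without inspecting it. The class, the label strings, and the handler table are assumptions made for this example; they do not reflect the actual AAF object model or SDK.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class EncapsulatedMedia:
    """Hypothetical wrapper: media bytes plus a label naming their format."""
    format_label: str   # illustrative labels such as "WAV" or "RGBA"
    payload: bytes


def describe(item: EncapsulatedMedia) -> str:
    """Report what the item holds without decoding the payload."""
    return f"{item.format_label} data, {len(item.payload)} bytes"


# Hypothetical handler table: this reader only knows how to decode WAV.
HANDLERS: Dict[str, Callable[[bytes], str]] = {
    "WAV": lambda data: f"decoding {len(data)} bytes of WAV audio",
}


def open_media(item: EncapsulatedMedia) -> str:
    """Dispatch on the label; unknown formats are reported, not guessed at."""
    handler = HANDLERS.get(item.format_label)
    if handler is None:
        return f"no handler for {item.format_label}; payload left untouched"
    return handler(item.payload)


audio = EncapsulatedMedia("WAV", b"\x00" * 8)
image = EncapsulatedMedia("RGBA", b"\xff" * 4)
```

A reader that lacks a handler for the image can still report its format and size through describe, which is the kind of information an application can show even when it cannot decode the data.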

Compositional Information

The actual audio, video, still, and other media data makes up only part of the information involved in authoring. There is also compositional information, which describes how sections of audio, video or still images are combined and modified. Given the many creative decisions involved in composing the separate elements into a final presentation, interchanging compositional information as well as media data is extremely desirable, especially when using a diverse set of authoring tools. AAF includes a rich base set of media effects (such as transitions or chroma-key effects), which can be used to modify or transform the media in a composition. These effects use the same binary plug-in model that supports codecs, media handlers, and other digital processes applied to the media to achieve the desired result.
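Compositional information can be pictured as data that refers to sections of media rather than containing them. The Python sketch below is an assumption-laden simplification: Clip, Composition, and their fields are hypothetical names, positions are counted in frames, and effects are omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Clip:
    """Hypothetical reference to a section of a named source."""
    source_name: str
    in_point: int    # first frame used (inclusive)
    out_point: int   # frame after the last one used (exclusive)

    @property
    def length(self) -> int:
        return self.out_point - self.in_point


@dataclass
class Composition:
    """Hypothetical sequence of clips played back to back."""
    clips: List[Clip] = field(default_factory=list)

    def duration(self) -> int:
        return sum(clip.length for clip in self.clips)


composition = Composition([
    Clip("interview take 3", in_point=10, out_point=40),
    Clip("city exterior", in_point=0, out_point=25),
])
total_frames = composition.duration()
```

Only the references and in- and out-points are stored here, so another tool could re-trim a clip by changing two integers without touching the media itself.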

Media Derivation

One of AAF's strengths is its ability to describe the process by which one kind of media was derived from another. AAF files contain the information needed to return to an original source of media in case it needs to be used in a different way. For example, when an AAF file contains digital audio and video data whose original source was film, the AAF file may contain descriptive information about the film source, including edgecode and in- and out-point information from the intermediate videotape. This type of information is useful if the content creator needs to repurpose material, for instance, for countries with different television standards. Derivation information can also describe the creation of computer-generated media: if a visual composition was generated from compositing 3-D animation and still images, the AAF file can contain the information to go back to the original animation sources and make changes without having to regenerate the entire composition.
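The derivation idea can be sketched as a chain of records, each pointing to the source it was made from. This Python example is illustrative only: the Source class and the sample names are hypothetical, and real derivation data would also carry position information such as edgecode or timecode.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Source:
    """Hypothetical record of a piece of media and what it was derived from."""
    name: str
    kind: str
    derived_from: Optional["Source"] = None


def derivation_chain(source: Source) -> List[str]:
    """Walk back from a source to the original it ultimately came from."""
    chain = []
    node: Optional[Source] = source
    while node is not None:
        chain.append(node.name)
        node = node.derived_from
    return chain


film = Source("film reel A", "film")
tape = Source("videotape 12", "videotape", derived_from=film)
clip = Source("digitized clip", "digital video", derived_from=tape)

chain = derivation_chain(clip)
```

Following the chain from the digitized clip leads back through the intermediate videotape to the film, which is the path a content creator would need when repurposing the material.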

Flexibility and Efficiency

The Advanced Authoring Format is not designed to be a streaming media format, but it is designed to be suitable for native capture and playback of media, and to have flexible storage of large data objects. For example, AAF allows sections of data to be broken into pieces for storage efficiency, as well as inclusion of external references to media data. AAF also allows in-place editing; it is not necessary to rewrite the entire file to make changes to the metadata.

Extensibility

The Advanced Authoring Format defines extensible mechanisms for storing metadata and media data. This ensures that AAF will be able to include new media types and media data formats as they become commonly used. The extensibility of the effects model allows ISVs or tool vendors to develop a rich library of new and engaging effects or processes to be utilized with AAF files. The binary plug-in model gives AAF-compliant applications the flexibility to determine when a given effect or codec has been referenced inside of the AAF file, determine if that effect or codec is available, and if not, to find it and load it on demand.
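The on-demand behavior described above can be approximated by a registry that asks a loader for an effect only when it is first referenced. This Python sketch is a stand-in, not the AAF plug-in interface: PluginRegistry, demo_loader, and the "invert" effect are hypothetical, and a real loader would locate and load a binary plug-in rather than return a Python function.

```python
from typing import Callable, Dict, List

Effect = Callable[[List[float]], List[float]]


class PluginRegistry:
    """Hypothetical registry that loads effects the first time they are needed."""

    def __init__(self, loader: Callable[[str], Effect]) -> None:
        self._loaded: Dict[str, Effect] = {}
        self._loader = loader

    def is_available(self, effect_id: str) -> bool:
        return effect_id in self._loaded

    def resolve(self, effect_id: str) -> Effect:
        # Load on demand, then reuse the loaded effect on later requests.
        if effect_id not in self._loaded:
            self._loaded[effect_id] = self._loader(effect_id)
        return self._loaded[effect_id]


def demo_loader(effect_id: str) -> Effect:
    """Stand-in for finding a plug-in; only one effect exists in this demo."""
    catalog: Dict[str, Effect] = {
        "invert": lambda samples: [-s for s in samples],
    }
    if effect_id not in catalog:
        raise LookupError(f"effect not found: {effect_id}")
    return catalog[effect_id]


registry = PluginRegistry(demo_loader)
available_before = registry.is_available("invert")
inverted = registry.resolve("invert")([1.0, -2.0])
available_after = registry.is_available("invert")
```

An effect that cannot be found raises LookupError, leaving the application free to decide how to present a composition whose effect is unavailable.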

Digital Media Delivery

In contrast to authoring systems, delivery systems and mechanisms are primarily used to transport and deliver a complete multimedia program. Although it would be ideal to use a single format for both authoring and delivery, these processes have different requirements. With authoring as its primary focus, AAF's metadata persistence enables optimal interchange during the authoring process. By allowing the content files to be saved without the metadata (i.e. "stripping out the metadata" or "flattening the file"), AAF optimizes completed compositions for delivery, without restricting features needed for authoring. From a technical standpoint, digital media content delivery has at least two major considerations: 1) target playback hardware (TV, audio equipment, PC) and 2) distribution vehicle (Film, Broadcast TV, DVD and other digital media, network). When content is delivered, the delivery format is usually optimized for the particular delivery vehicle (DVD, DTV, and others), and the media data is often compressed to conserve space or enable fast download.

We expect that the content created using AAF in the authoring process will be delivered by many different vehicles including broadcast television, packaged media, film, and over networks. These delivery vehicles will use data formats such as baseband video, MPEG-2 Transport Stream and the Advanced Streaming Format (ASF). These formats do not need the rich set of metadata used during the authoring process, and can be optimized for delivery by stripping out this metadata or "flattening" the file.

AAF File Format

The Advanced Authoring Format is a structured container for media and metadata that provides a single object-oriented model to interchange a broad variety of media types including video, audio, still images, graphics, text, MIDI files, animation, compositional information and event triggers. The AAF format contains the media assets and preserves their file-specific intrinsic information, as well as the authoring information (in- and out-point, volume, pan, time and frame markers, etc.) involving those media assets and any interactions between them.

To meet the rich content authoring and interchange needs, AAF must be a robust, extensible, platform-independent structured storage file format, able to store a variety of raw media file formats and the complex metadata that describes the usage of the media data, and be capable of efficient playback and incremental updates. As the evolution of digital media technology brings the high-end and low-end creation processes into convergence, AAF must also be thoroughly scalable and usable by the very high-end professional applications as well as consumer-level applications.

Structured storage, one of the technical underpinnings of AAF, refers to a data storage architecture that organizes data as a "file system within a file." This container format is to be a public domain format, allowing interested parties to add future developments or enhancements in a due process environment. Microsoft is specifically upgrading its core compound file technology on all platforms (Microsoft Windows®, Apple® Macintosh®, UNIX) to address the needs of AAF, such as files larger than 2 gigabytes and large data block sizes.
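The "file system within a file" idea can be pictured as named storages that contain either further storages or streams of bytes. The Python sketch below models only that hierarchy with nested dictionaries; it says nothing about the on-disk layout of compound files, and the names read_stream and container are hypothetical.

```python
def read_stream(root: dict, path: str) -> bytes:
    """Follow a slash-separated path through nested storages to a stream."""
    node = root
    for part in path.strip("/").split("/"):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(path)
        node = node[part]
    if not isinstance(node, bytes):
        raise ValueError(f"{path} is a storage, not a stream")
    return node


# A storage is modeled as a dict; a stream is modeled as bytes.
container = {
    "header": b"demo",
    "media": {
        "audio0": b"\x00\x01",
        "video0": b"\x02",
    },
}

audio_bytes = read_stream(container, "/media/audio0")
```

Because each stream is addressed by its own path, one stream can in principle be replaced without rewriting its siblings, which matches the goal of incremental updates stated above.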

Other important features of AAF include:

Version control, allowing an AAF file's data to be edited and revised while retaining the history of the changes, so that an older version of the file can be recalled if necessary.

Retention of information about the original sources, so that the resulting edited media can be traced back to them.

References to external media files, with files located on remote computers in heterogeneous networks.

An extensible video and audio effects architecture with a rich set of built-in base effects.

Support for a cross-platform binary plug-in model.

AAF Specification Development

The Advanced Authoring Format is the product of nine industry-leading companies, each contributing valuable solutions and technologies. The AAF task force members are Microsoft, Avid Technology, Adobe, DigiDesign, Matrox, Pinnacle Systems, Softimage, Sonic Foundry, and Truevision. As the Advanced Authoring Format specification evolves, the promoting companies will concurrently integrate AAF support with their product offerings. In addition, the Advanced Authoring Format Software Development Kit (AAF SDK) will enable other adopting companies to readily provide AAF support in their own products.

 


Copyright © 2000
Advanced Authoring Format Association. All rights reserved
Last modified: March 17, 2000