AAF Object Specification

1. Introduction

The Advanced Authoring Format, or AAF, is an industry-driven, cross-platform, multimedia file format that will allow interchange of essence and compositional information between AAF-compliant applications. These applications are primarily content creation tools such as Adobe® Premiere®, Photoshop®, and After Effects®, Avid® Media Composer®, Softimage® DS, Avid Cinema®, and Sonic Foundry's Sound Forge®, to name a few. These applications typically run on top of hardware supplied by Matrox and Pinnacle Systems.

Background

High-end, rich content authoring is a delicate struggle: it means wrestling together highly disparate source media and arranging all of these elements to form a coherent whole.

Consider the scenario of putting together all of the audio elements for a film soundtrack: this involves transferring all of the music tracks, the ambient sound tracks, the performers' dialogue, and the Foley effects from their original sources, remixing or editing all of them, and synchronizing them to the motion picture elements with split-second accuracy. This process requires a lot of information about each audio source element, as well as information about other essence associated with it at the moment of playback.

The media industry uses a wide range of source materials, as well as a set of highly varied capture tools with very different constraints (cameras, keyboards, audio input sources, scanners). This wide variety leads to a great deal of time and effort spent converting data into formats that can be used by the wide variety of authoring applications. Other issues include synchronization accuracy for time-based data (film, video, audio, animation); operating system and hardware dependencies for interactive media titles; and download, streaming and playback performance in Internet media applications.

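AAF is an industry-driven, cross-platform, multimedia file format that allows interchange of data between AAF-compliant applications. Two kinds of data can be interchanged using AAF: essence data, such as audio, video, still images, graphics, and text; and metadata, such as the compositional information that describes how sections of essence are combined and modified.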

The Society of Motion Picture and Television Engineers (SMPTE) has addressed these problems in the dedicated hardware world by creating a set of standards that has worked very well through its history. Computer-based media tool vendors have come up with many varied, mostly proprietary approaches that all have many strengths as well as weaknesses. As digital technology for essence capture, editing, compositing, authoring, and distribution approaches ubiquity, the industry demands better interoperability and standard practices. This document is a specification for a new media industry standard file format, designed to meet information interchange needs.

Digital Essence File Formats and Issues

Rich media authoring often involves manipulating several types of digital essence files concurrently and managing interactions and relationships between them. These types of essence generally fall into the following categories:

Despite the relatively small number of categories, the sheer number of available digital essence file formats, each with its own strength or specific quality (for example, preferred compression codec, optimized file size, preferred color resolution, support for transparency, support for sequential display, analog-to-digital fidelity, or operating system platform), results in many file format-to-file format conversions to produce a high-quality end product.

The following formats are just a few of the many in use today:

The Advanced Authoring Format helps the content creation and authoring process by addressing the shortcomings of these and other formats. In this way, AAF will allow creative energies to be focused on the quality of the compositions rather than on unnecessary and painful interchange issues, and will allow software development to focus on improvements to the authoring application's feature set.

Digital Essence Authoring

The multimedia content authoring process generally involves 1) opening one or more source essence files, 2) manipulating or editing the essence, and 3) saving the results. Multimedia authoring applications read and manipulate certain types of essence and save the resulting file to their own proprietary format, which is usually specific to a particular hardware platform or operating system. This closed approach generally makes the reuse or repurposing of essence extremely difficult. In particular, the compositional metadata (the data that describes the construction of the composition and not the actual essence data itself) is not transferable between authoring applications.

Authoring process: compositing media

The Advanced Authoring Format defines authoring as the creation of multimedia content including related metadata. In the authoring process, it is important to record not only the editing and scripting decisions that have been made, but also the steps used to reach the final output, the sources used to create the output, the equipment configuration, intermediate data, and any alternative choices that may be selected during a later stage of the process.

For example, using Avid DigiDesign® Pro Tools®, an audio engineer might be recording, editing, and mixing the sound for a video. She could record or load the source media tracks, do gain normalization, and then mix the tracks while applying pan, volume, and time compression transforms to the individual tracks. When the work is complete, she can save the files in two different formats. One format is the Pro Tools native file format, the Sound Designer II™ audio file (SD2F), which holds the transformed output (with a small amount of metadata available in the resource fork, such as the number of channels or the sampling frequency). The second format is the Pro Tools Session Files format, which saves the metadata (edit decisions, volume gradient transforms, audio processing) separately from the original source essence, allowing additional changes to be made to the sound output in a nondestructive fashion.

If the authoring application saves the resulting essence information as a single, "flattened" file, then changes cannot be made without repeating all of the steps and processes involved. Users may spend much time and energy reconverting and transferring information, reentering instructions, and ultimately rewriting the entire file.

If the authoring application saves the editing and transform data separately from the essence data, then the essence can be changed directly by a sound-editing application without having to open the authoring application. However, the metadata (data used to describe any compositional positioning, layering, playback behavior, editing cut lists, essence mixing, or manipulation) is not accessible unless the authoring application is opened.
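The difference between a flattened file and separately stored edit data can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration (the `EditDecision` type, its fields, and the `render` function); it is not an AAF or Pro Tools data structure. The point is that the source samples are never modified, so the decisions can be revised and the output rendered again.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EditDecision:
    """Hypothetical edit record: play source[start:end] at the given gain."""
    start: int    # first sample index (inclusive)
    end: int      # last sample index (exclusive)
    gain: float   # volume multiplier applied at render time


def render(source: List[float], decisions: List[EditDecision]) -> List[float]:
    """Build the output from the untouched source plus the edit decisions."""
    output: List[float] = []
    for decision in decisions:
        segment = source[decision.start:decision.end]
        output.extend(sample * decision.gain for sample in segment)
    return output


# The source essence stays as recorded; only the decisions change.
source = [1.0, 2.0, 3.0, 4.0]
decisions = [EditDecision(0, 2, 0.5), EditDecision(2, 4, 2.0)]
mix = render(source, decisions)
```

A flattened file corresponds to keeping only `mix`; keeping `source` and `decisions` as well is what makes later, nondestructive changes possible.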

In an ideal environment a user would be able to use many different applications and not be concerned with interchange. The essence data and the decisions made in one application would be visible to a user in another application.

The Advanced Authoring Format's unified interchange model enables interoperability between applications. This offers distinct advantages over the current model of separate formats and authoring tools for each essence type:

Authoring application file interchange

By enabling interoperability between authoring applications, AAF enables the user to focus on the creative production processes rather than struggling with conversions during the authoring and production phases of the project. Although there are many other issues related to completely transparent interoperability, the significant benefit that AAF provides to end users is assurance that compositions output by AAF-compliant applications will be accessible by the right tool for the job, without risk of being "stranded" by proprietary file format restrictions.

The authoring applications that can use AAF for interchange include:

Digital Essence Interchange

The Advanced Authoring Format provides applications with a mechanism to interchange a broad range of essence formats and metadata, but applications may have interchange restrictions due to other considerations. For this reason, it is important to understand the different kinds of interchange possible and to describe the various levels of interchange between authoring applications.

The following is a general description of the levels of AAF interchange that applications can adopt. For detailed information on a specific product's AAF support level, see that product's documentation.

The Advanced Authoring Format is designed to be a universal file format for interchange between systems and applications. It incorporates existing multimedia data types such as video, audio, still image, text, and graphics. Applications can store application-specific data in an AAF file and can use AAF as the application's native file format. AAF does not impose a universal format for storing essence content data. It has some commonly used formats built in, such as CDCI and RGBA images and WAV and AIFC audio, but it also provides an extension framework for new or proprietary formats. As standard formats for essence are adopted by groups such as the SMPTE and the Audio Engineering Society (AES), AAF will provide built-in support for these formats.

Data Encapsulation

At its most basic level, AAF encapsulates and identifies essence data to allow applications to identify the format used to store essence data. This makes it unnecessary to provide a separate mechanism to identify the format of the data. For example, AAF can encapsulate and label WAV audio data and RGB video data.
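As a rough illustration of labeled encapsulation, the following Python sketch pairs raw essence bytes with a format label and uses the label to choose a handler. The class name, the string labels, and the handler table are assumptions made for this example; the sketch does not reproduce how AAF itself identifies formats.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class EncapsulatedEssence:
    """Hypothetical wrapper: essence bytes plus a label naming their format."""
    format_label: str   # illustrative label such as "WAV" or "RGB"
    data: bytes


def pick_handler(essence: EncapsulatedEssence,
                 handlers: Dict[str, str]) -> Optional[str]:
    """Return the handler registered for the essence's label, or None."""
    return handlers.get(essence.format_label)


# Illustrative handler table; the values stand in for real decoders.
handlers = {"WAV": "audio-decoder", "RGB": "image-decoder"}
clip = EncapsulatedEssence("WAV", b"\x00\x01")
unknown = EncapsulatedEssence("XYZ", b"")
```

Because the label travels with the data, the reading application does not have to inspect the bytes or rely on a file extension to decide how to process them.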

Compositional Information

The actual audio, video, still, and other essence data makes up only part of the information involved in authoring. There is also compositional information, which describes how sections of audio, video, or still images are combined and modified. Given the many creative decisions involved in composing the separate elements into a final presentation, interchanging compositional information as well as essence data is extremely desirable, especially when using a diverse set of authoring tools. AAF includes a rich base set of essence effects (such as transitions or chroma-key effects), which can be used to modify or transform the essence in a composition. These effects use the same binary plug-in model that supports codecs, essence handlers, and other digital processes applied to the essence to create the desired impact.

Media Derivation

One of AAF's strengths is its ability to describe the process by which one kind of media was derived from another. AAF files contain the information needed to return to an original media source in case it needs to be used in a different way. For example, when an AAF file contains digital audio and video data whose original source was film, the AAF file may contain descriptive information about the film source, including edgecode and in- and out-point information from the intermediate videotape. This type of information is useful if the content creator needs to repurpose material, for instance, for countries with different television standards. Derivation information can also describe the creation of computer-generated essence: if a visual composition was generated from compositing 3D animation and still images, the AAF file can contain the information to go back to the original animation sources and make changes without having to regenerate the entire composition.
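The film example above can be pictured as a chain of sources, each pointing to the source it was derived from. The Python sketch below is a simplified illustration under that assumption; the `Source` class and its fields are hypothetical and do not correspond to objects defined by this specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Source:
    """Hypothetical source record with a link to the source it came from."""
    name: str
    kind: str
    derived_from: Optional["Source"] = None


def derivation_chain(source: Source) -> List[str]:
    """Walk from a source back to its original, collecting names in order."""
    chain: List[str] = []
    node: Optional[Source] = source
    while node is not None:
        chain.append(node.name)
        node = node.derived_from
    return chain


# Digital file <- intermediate videotape <- original film.
film = Source("film reel", "film")
tape = Source("videotape transfer", "videotape", derived_from=film)
digital = Source("digital video file", "file", derived_from=tape)
```

Calling `derivation_chain(digital)` walks back through the videotape to the film, which is the kind of lookup a content creator needs when repurposing material from the original source.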

Flexibility and Efficiency

The Advanced Authoring Format is not designed to be a streaming essence format, but it is designed to be suitable for native capture and playback of essence, and to have flexible storage of large data objects. For example, AAF allows sections of data to be broken into pieces for storage efficiency, as well as including external references to essence data. AAF also allows in-place editing; it is not necessary to rewrite the entire file to make changes to the metadata.
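A minimal Python sketch of these two storage options, assuming a hypothetical `EssenceStorage` record and `split_into_chunks` helper (neither is part of AAF): essence is either embedded as a list of pieces or referenced by an external location.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Break data into consecutive pieces of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


@dataclass
class EssenceStorage:
    """Hypothetical record: embedded chunks or an external reference."""
    chunks: List[bytes] = field(default_factory=list)
    external_path: Optional[str] = None   # illustrative location string

    def is_external(self) -> bool:
        return self.external_path is not None

    def embedded_size(self) -> int:
        return sum(len(chunk) for chunk in self.chunks)


embedded = EssenceStorage(chunks=split_into_chunks(b"abcdefgh", 3))
referenced = EssenceStorage(external_path="media/clip01.wav")
```

In this sketch, changing `external_path` or other metadata never touches the chunk contents, which loosely mirrors the idea of editing metadata without rewriting the essence.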

Extensibility

The Advanced Authoring Format defines extensible mechanisms for storing metadata and essence data. This ensures that AAF will be able to include new essence types and essence data formats as they become commonly used. The extensibility of the effects model allows ISVs or tool vendors to develop a rich library of new and engaging effects or processes to be utilized with AAF files. The binary plug-in model gives AAF-compliant applications the flexibility to determine when a given effect or codec has been referenced inside of the AAF file, to determine if that effect or codec is available, and if not, to find it and load it on demand.
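The on-demand behavior described above (detect a referenced effect or codec, check whether it is available, and otherwise find and load it) could be sketched as follows. This is a hedged Python illustration: `PluginRegistry`, the `locator` callable, and the identifier strings are all assumptions for the example, not the AAF plug-in interface.

```python
from typing import Callable, Dict, Optional


class PluginRegistry:
    """Hypothetical registry that loads plug-ins the first time they are needed."""

    def __init__(self, locator: Callable[[str], Optional[object]]) -> None:
        self._loaded: Dict[str, object] = {}
        self._locator = locator   # looks a plug-in up by identifier, or returns None

    def is_loaded(self, plugin_id: str) -> bool:
        return plugin_id in self._loaded

    def resolve(self, plugin_id: str) -> object:
        """Return the plug-in for an identifier, locating and caching it if needed."""
        if plugin_id in self._loaded:
            return self._loaded[plugin_id]
        plugin = self._locator(plugin_id)
        if plugin is None:
            raise LookupError(f"no plug-in available for {plugin_id!r}")
        self._loaded[plugin_id] = plugin
        return plugin


# Illustrative catalog standing in for wherever plug-ins would be found.
available = {"effect.dissolve": "dissolve-plugin"}
registry = PluginRegistry(available.get)
```

An application reading a file would call `resolve` for each effect or codec identifier it encounters and handle `LookupError` when nothing suitable can be found.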

Digital Essence Delivery

In contrast to authoring systems, delivery systems and mechanisms are primarily used to transport and deliver a complete multimedia program. Although it would be ideal to use a single format for both authoring and delivery, these processes have different requirements. With authoring as its primary focus, AAF's metadata persistence enables optimal interchange during the authoring process. By allowing the content files to be saved without the metadata (that is, by stripping out the metadata or flattening the file), AAF optimizes completed compositions for delivery, without restricting features needed for authoring.

From a technical standpoint, digital media content delivery has at least two major considerations: 1) target playback hardware (TV, audio equipment, PC) and 2) distribution vehicle (Film, Broadcast TV, DVD and other digital media, and network). When content is delivered, the delivery format is usually optimized for the particular delivery vehicle (DVD, DTV, and others), and the essence data is often compressed to conserve space or enable fast download.

We expect that the content created using AAF in the authoring process will be delivered by many different vehicles, including broadcast television, packaged media, film, and networks. These delivery vehicles will use data formats such as baseband video, MPEG-2 Transport Stream, QuickTime 4, and the Advanced Streaming Format (ASF). These formats do not need the rich set of metadata used during the authoring process, and can be optimized for delivery by stripping out this metadata or flattening the file.

AAF File Format

The Advanced Authoring Format is a structured container for essence and metadata that provides a single object-oriented model to interchange a broad variety of essence types including video, audio, still images, graphics, text, MIDI files, animation, compositional information and event triggers. The AAF format contains the essence assets and preserves their file-specific intrinsic information, as well as the authoring information (in- and out-point, volume, pan, time and frame markers, and so on) involving those essence assets and any interactions between them.

To meet the rich content authoring and interchange needs, AAF must be a robust, extensible, platform-independent structured storage file format, able to store a variety of raw essence file formats and the complex metadata that describes the usage of the essence data, and must be capable of efficient playback and incremental updates. As the evolution of digital media technology brings the high-end and low-end creation processes into convergence, AAF must also be thoroughly scalable and usable by the very high-end professional applications as well as consumer-level applications.

Structured storage, one of the technical underpinnings of AAF, refers to a data storage architecture that uses a "file system within a file" architecture. This container format is to be a public domain format, allowing interested parties to add future developments or enhancements in a due process environment. Microsoft is specifically upgrading the core compound file format technology on all platforms (Microsoft Windows®, Apple® Macintosh®, UNIX®) to address the needs of AAF, for instance, by supporting files larger than 2 gigabytes and large data block sizes.
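The "file system within a file" idea can be pictured as nested storages (like directories) that hold named streams (like files). The Python sketch below models this with plain dictionaries and byte strings; it is only a conceptual illustration under that assumption and says nothing about the actual on-disk layout of the compound file format. The `open_stream` function and the storage and stream names are hypothetical.

```python
from typing import Dict, Union

# A storage maps names to either nested storages or streams of bytes.
Storage = Dict[str, Union[bytes, "Storage"]]


def open_stream(root: Storage, path: str) -> bytes:
    """Follow a slash-separated path through nested storages to a stream."""
    node: Union[bytes, Storage] = root
    for part in path.strip("/").split("/"):
        if isinstance(node, bytes):
            raise ValueError(f"{part!r} lies below a stream, not a storage")
        node = node[part]   # raises KeyError if the name does not exist
    if not isinstance(node, bytes):
        raise ValueError(f"{path!r} names a storage, not a stream")
    return node


# Illustrative container: one header stream and an "essence" storage.
container: Storage = {
    "header": b"AAF",
    "essence": {"audio": b"\x01\x02", "video": b"\x03"},
}
```

Reading `/essence/audio` returns only that stream, so an application can reach one piece of data by name without scanning the rest of the container.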

Other important features of AAF include:

AAF Specification Development

The Advanced Authoring Format is the product of seven industry-leading companies, each contributing valuable solutions and technologies. The AAF task force members include Microsoft, Avid, Adobe, Matrox, Pinnacle Systems, Softimage, and Sonic Foundry. As the Advanced Authoring Format specification evolves, the promoting companies will concurrently integrate AAF support with their product offerings. In addition, the Advanced Authoring Format Software Development Kit (AAF SDK) will enable other adopting companies to readily provide AAF support in their products.

 


Copyright © 2000
Advanced Authoring Format Association. All rights reserved
Last modified: March 17, 2000