Hey there! Jan Morgenstern here. I’m responsible for the scoring, sound design, and re-recording mix of Sintel, and I just realized that I’ve been cowardly ducking any blogging duties that might have arisen ever since I started working on this project. Clearly, I’m due. For my blog inauguration, I’ve chosen the proven subject of “what am I doing here anyway”.
Quick note: I boldly assume that most of you are more at home in the realm of 3D and video than audio, and seeing that I regularly go “wait, rig what with the what now?” when I read posts from the rest of the team, I’ve put a little glossary at the end of this article. I hope that this keeps things more readable than liberal use of parentheses would.
Taking care of all audio aspects of a film by oneself provides some great opportunities, but the sheer number of things that need to come together to build a cohesive soundtrack can be a bit daunting, at least to me. Sintel is a relatively complex production in that the action is pretty condensed and takes place in a lot of different settings during a short amount of time. Because of this, I’ve been looking for strategies to avoid wearing too many hats at once. Here’s what seems to work for me:
– Split the whole task into smaller, more manageable chunks, loosely modelled after the audio workflow of a feature film production.
– Keep these chunks in separate DAW projects.
– Work on each of them in isolation until it can stand on its own.
– Worry about glueing them together later on.
Apart from keeping me from going bonkers, this approach has another big advantage, which has to do with the fact that scoring and mixing, while traditionally being post production tasks (and thus taking place after the film has been shot), are happening alongside the main production itself in our case. We’re trying our best to follow a strict process for controlling which aspects of the movie are “frozen” and thus ready for post production, but it’s still inevitable that some things slip through the process and get reconsidered at a later point. If I kept everything in one place, it’d make accommodating these late changes an exercise in frustration.
Whenever I’m finished with a subproject (for reasonable values of “finished”), I bounce it to one or more submixes, called stems; these are just audio files, which then get incorporated into the next stage of production. It’s pretty similar to the process of rendering scenes in layers, which are then post-processed and combined in compositing. In this manner, the subprojects form a dependency graph, which in the case of Sintel looks like this:
If something changes way up in the graph (say, in the Dialogue Edit), it takes a bit of work to propagate those changes down to the final mix; in that example (which does occur from time to time), it comes down to bouncing the dialogue tracks of the characters that appear in the affected scene, picking up the new bounces in the respective Scene Mix, bouncing the dialogue stem of that mix, and picking up the new stem in the Master Mix. As you can imagine, it’s a good idea to get the topmost elements as close to final as possible before advancing to the next stage.
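For the programmers in the audience, the re-bounce chain described above can be sketched as a walk through a small dependency graph. This is only an illustration: the stage names are taken from this article, but the exact edges (and the function name `stages_to_rebounce`) are assumptions, and the separate music mixing project between Scoring and the Score Master Mix is left out for brevity.

```python
from collections import deque

# Edges point from a stage to the stages that consume its stems.
# Stage names come from the article; the edges themselves are an assumption.
DEPENDENTS = {
    "Dialogue Edit": ["Scene Mixes"],
    "Sound Design": ["Scene Mixes"],
    "Scene Mixes": ["Master Mix"],
    "Scoring": ["Score Master Mix"],
    "Score Master Mix": ["Master Mix"],
    "Master Mix": [],
}


def stages_to_rebounce(changed):
    """Return every stage downstream of `changed`, ordered so that a stage
    only appears after all of its affected inputs."""
    # Step 1: collect everything reachable from the changed stage.
    affected = set()
    queue = deque([changed])
    while queue:
        stage = queue.popleft()
        for nxt in DEPENDENTS[stage]:
            if nxt not in affected:
                affected.add(nxt)
                queue.append(nxt)

    # Step 2: topological sort (Kahn's algorithm) over the affected stages.
    indegree = {stage: 0 for stage in affected}
    for stage in affected:
        for nxt in DEPENDENTS[stage]:
            indegree[nxt] += 1
    ready = sorted(stage for stage, deg in indegree.items() if deg == 0)
    order = []
    while ready:
        stage = ready.pop(0)
        order.append(stage)
        for nxt in DEPENDENTS[stage]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return order


print(stages_to_rebounce("Dialogue Edit"))
```

Changing the Dialogue Edit touches the Scene Mixes and then the Master Mix, while a change in the Master Mix itself touches nothing else, which is why late changes hurt most when they happen at the top of the graph.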
Here’s what happens in each stage:
Dialogue Edit. This is a fairly simple 3-track project in which I assemble the 200-ish takes from our dialogue recording session into contiguous dialogue tracks for each of the characters. Each character’s dialogue is then bounced to a simple mono file with no automation or further processing (these happen within the Scene Mixes, as I need context from the rest of the soundscape in order to do that properly).
Sound Design. Some of the special effects are a bit more complex, as they’re made up of layered and processed sounds; this is especially true for the Dragon utterances, which also use granular synthesis a lot. It would be rather impractical to handle this in the actual Scene Mixes, as it would drive their track counts even higher and make them more confusing and prone to mistakes. For this reason, I’m building “palettes” of raw source sounds inside separate projects, which I then pick up as stereo spot effects in the…
Scene Mixes. Here’s where everything that makes up the soundscape of a scene (except the music) comes together for the first time. These projects are by far the most complex and automation-heavy in the whole workflow; they often have around 60-80 multichannel tracks. Each track in these projects falls into one of 4 categories: Dialogue, Spot Effects, Foley, or Ambience (as you can see in the screenshots, these are color-coded). When I’m done with a scene, I bounce the tracks in each of these categories into a 5.1 stem, which is then combined with the other Scenes in the Master Mix.
Scoring. Here’s where I compose and arrange the film music for each scene (or combination of scenes, in case a music cue spans several scenes). These are just traditional stereo music production projects, but they can get pretty entangled all by themselves (my basic orchestral template consists of 100-ish MIDI tracks, each of which addresses an average of 15-20 switchable instrument articulations), so I’ve been moving away from handling the mixing within the same project. Instead, I’ve begun outsourcing the mix into yet another project (wheee!) and bouncing the orchestral sections (Woodwinds, Brass, Percussion, Choirs, Strings…) into separate, unprocessed stereo stems, which then go into the…
Score Master Mix. In this single audio-only project, I pick up the 7 to 8 stereo stems from each of the music cues and combine them into a continuous music stem for the whole movie. This is a pretty linear process, not unlike working with a tape machine. This is also where the stereo stems are upmixed into a 5.1 stem. It wasn’t easy to let go of the idea of mixing the music in 5.1 natively from the beginning, but I experimented with that during Elephants Dream and Big Buck Bunny, and I just didn’t feel that the end result justified the extra effort.
Master Mix. Here I combine the output stems from the scene mixes and the score mix into the final soundtrack. In order to accommodate smooth scene transitions, some of the stems need to be dovetailed into one another (instead of cut abruptly), so I have at least two tracks for every kind of stem.
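To make “dovetailing” a little more concrete, here is a minimal sketch of an equal-power crossfade between the end of one stem and the start of the next. The fade curve, the function name `dovetail`, and the use of plain Python lists of mono samples are all assumptions made for illustration; in a DAW, this is done with overlapping tracks and fade handles rather than code.

```python
import math


def dovetail(outgoing, incoming, overlap):
    """Join two mono sample lists, crossfading them over `overlap` samples.

    Uses an equal-power (sine/cosine) curve, so the two gains always
    satisfy fade_out**2 + fade_in**2 == 1. The curve is an assumption.
    """
    if overlap > min(len(outgoing), len(incoming)):
        raise ValueError("overlap is longer than one of the stems")
    if overlap == 0:
        return list(outgoing) + list(incoming)  # hard cut

    start = len(outgoing) - overlap
    head = list(outgoing[:start])   # untouched part of the outgoing stem
    tail = list(incoming[overlap:])  # untouched part of the incoming stem

    blended = []
    for i in range(overlap):
        t = (i + 0.5) / overlap  # position within the fade, between 0 and 1
        fade_out = math.cos(t * math.pi / 2)
        fade_in = math.sin(t * math.pi / 2)
        blended.append(outgoing[start + i] * fade_out + incoming[i] * fade_in)
    return head + blended + tail


# Two 10-sample stems overlapping by 4 samples yield 16 samples in total.
print(len(dovetail([1.0] * 10, [0.5] * 10, 4)))
```

Having two tracks per kind of stem is what makes this overlap possible in the first place: while one scene’s stem is fading out on the first track, the next scene’s stem is already fading in on the second.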
Once the Master Mix is finished, it’s almost time for me to whip out the 30-year-old scotch. The 5.1 output of the mix ends up on the DVD and the surround web releases, and is downmixed to stereo for other release formats. For the theatrical release, I’ll deliver the stems to a Dolby-certified mastering lab that prepares SR-D and SR printmasters for duplication. I’ll also try to get access to a cinema-style dubbing stage for the final mix, as that always gives a much more accurate picture of how things will sound in cinemas out there than my humble studio could.
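As a rough idea of what a 5.1-to-stereo downmix involves, here is a sketch for a single sample frame. The coefficients (center and surrounds attenuated by about 3 dB, LFE dropped) are a commonly used convention and an assumption here, not necessarily what is used for Sintel; the channel order and the function name `downmix_51_to_stereo` are likewise made up for this example.

```python
import math

# About -3 dB. A common convention, assumed here for illustration only.
ATTENUATION = 1 / math.sqrt(2)


def downmix_51_to_stereo(frame):
    """Fold one 5.1 sample frame down to a stereo pair.

    `frame` is assumed to be ordered (L, R, C, LFE, Ls, Rs).
    The LFE channel is discarded, which many downmixes do.
    """
    left, right, center, _lfe, surround_l, surround_r = frame
    out_left = left + ATTENUATION * center + ATTENUATION * surround_l
    out_right = right + ATTENUATION * center + ATTENUATION * surround_r
    return out_left, out_right


# A signal that only lives in the center channel ends up equally
# in both stereo channels, slightly attenuated.
print(downmix_51_to_stereo((0.0, 0.0, 1.0, 0.0, 0.0, 0.0)))
```

Note that this sketch does nothing about the summed signal getting louder than any single input channel; a real downmix also has to leave enough headroom (or scale the result) to avoid clipping.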
Finally, some big words which you can use to impress audio engineers at the next cocktail party (I’m kidding, we’re all miserable social misfits who don’t get invited to cocktail parties):
Automation: A programmed variation of a parameter over time. Commonly automated parameters are level, panorama position, equalizer or filter settings, reverb send levels, and so on. Automation can be pretty useful in music production; in post production, it’s absolutely essential.
Bounce: The audio equivalent of “render”.
DAW: Digital Audio Workstation. The central piece of software that an audio guy uses to do stuff; in highly proprietary systems such as Pro Tools, the term often includes the hardware, too. My DAW of choice is Steinberg’s Nuendo. Nuendo isn’t free software, which makes me kind of a cool renegade underdog.*
Dolby SR-D and SR: Widespread systems of getting multichannel sound on physical 35mm copies. SR-D is the professional equivalent to Dolby Digital on DVDs (which provides 5.1 discrete channels), while SR refers to the old method of encoding 4 non-discrete channels into an analog optical soundtrack. If you get SR-D on a copy, you always need to provide an SR soundtrack as a fallback, and both need to go through the hands of a Dolby-approved voodoo priest before a duplication lab will print them.
Downmix: Catch-all phrase for reducing a multichannel signal to fewer channels. When you’re watching a DVD with a 5.1 soundtrack on your stereo TV set, your DVD player downmixes the signal on-the-fly, so that information from the center and surround speakers doesn’t get lost.
Foley: Footsteps, cloth movements, subtle sounds of characters interacting with things. The stuff that even film audio geeks don’t notice unless it’s either missing or done wrong.
Stem: A submix of tracks that usually belong to a common category, such as dialogue, ambience, or music. Usually, everything up to the final mix is handled in stems, which is a good compromise between having total control and keeping things practical. A lot of audio post contractors, such as mastering studios, expect their clients to deliver material in stems instead of monolithic mixes so that they can do finishing touches.
* It was a joke. Please don’t pummel me and take my lunch money again.