
Exploring The Virtual Orchestra Through Blood, Sweat, and Tears

Part I: Creating a Virtual Orchestration from a Pre-Existing Score. 

by Elizer Kramer

Published: May 6th, 2021


1. Introduction

I was introduced to the virtual orchestra in a composition for film course that I took as part of my master’s at the University of Montreal in 2015. My initial experience was one of disappointment and frustration because using the virtual orchestra negatively impacted the quality of my compositions. Furthermore, I could not make it sound nearly as compelling as the real orchestra. Eventually, the disappointment began to diminish, and the frustration led me to the question central to my doctorate: can one develop virtual orchestration as an independent art form by developing an aesthetic that is proper and unique to it?

To investigate this, I am creating virtual orchestrations of pieces that I have written and recorded for live orchestra to gain insight into the strengths and weaknesses of the virtual orchestra, and composing pieces for the virtual orchestra that showcase its independence from human performers. This TOR entry outlines my first attempt at making a virtual orchestration from one of my compositions, Récitatif, chant et danse. In addition, it provides some background on virtual orchestration and discusses the different elements involved in creating one. At a later stage, I will follow up with an entry in which I discuss the process and outcome of composing a piece specifically for the virtual orchestra. 

Although virtual orchestrations can be found in abundance in film and television, it does not appear to be a topic often studied in academia. By sharing a window into the creative process of making a virtual orchestration, I hope it will provide insight to the reader who is unfamiliar with this art form. For the virtual orchestrator reading this entry, I hope it serves as a reminder that you do not suffer alone.

2.   Virtual Orchestration

Virtual orchestration is the practice of creating an orchestration or an orchestral composition in a digital audio workstation (DAW) using virtual instruments or sample libraries. Note that this is not equivalent to the automatically generated simulations one finds in notation software such as Sibelius, Finale, or Dorico: virtual orchestrations are created within a DAW by working extensively on a piece’s instrumental playback, pacing, and spatial properties. These highly realistic simulations are often used to propose music to film directors before making the final recordings (Furduj 2019, 2), but in some cases, where time and financial constraints do not allow for a recording, the virtual orchestrations serve as the final product (Pejrolo and DeRosa 2017, xvi). Alternatively, when a full orchestral recording is not an option, composers will sometimes supplement their virtual orchestrations with layers of some recorded instruments.

The sounds of the virtual orchestra are generated by virtual instruments, which are typically composed of recordings of actual performers (Morgan 2016, 22) playing individual pitches across the entire dynamic and chromatic range of a given instrument (Furduj 2019, 103). The instruments’ parameters (dynamics, timbre, vibrato, etc.) are manipulated through MIDI control changes (CCs), which are usually assigned to a MIDI automation controller, such as a mod wheel, a fader, or a breath controller (Pejrolo and DeRosa 2017, 7).  

Sampled virtual instruments use velocity or dynamic layers to simulate the change of timbre in relation to the volume at which a note is played. For instance, the clarinet in the Vienna Symphonic Library Special Edition Volume 1 contains three velocity layers, meaning that each pitch was recorded at three different dynamic levels, resulting in three different timbres. The shift between velocity layers sounds convincing when performing short articulations because each note is rearticulated before the change occurs. When performing sustained notes, however, the transition from one velocity layer to the next can sound unrealistic, especially when exposed. Whereas with a real instrument,[1] the timbre of a sustained note will develop gradually in relation to its volume, the timbres of a sustained note on a sampled instrument will crossfade without the same progressive transformation of colour. To avoid sudden shifts in timbre, one can increase the volume of sustained notes without triggering changes in velocity layers. This, of course, ignores the laws of physics and can result in an unconvincing sound if not done with caution.[2]
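The crossfade between velocity layers can be pictured as a pair of complementary gain curves applied to the two adjacent layers. A minimal sketch in Python; the equal-power curve used here is a common design choice in audio software, not necessarily what any particular sample player implements:

```python
import math

def crossfade_gains(position):
    """Equal-power gains for blending two adjacent velocity layers.

    position: 0.0 = only the lower (softer) layer sounds,
              1.0 = only the upper (louder) layer sounds.
    Returns (gain_lower, gain_upper).
    """
    theta = position * math.pi / 2
    return math.cos(theta), math.sin(theta)

# Midway between layers, both samples sound at ~0.707 gain, so the
# combined power matches a single layer at full gain -- but the two
# recorded timbres are simply superimposed, which is why an exposed
# sustained crossfade can sound unrealistic.
low, high = crossfade_gains(0.5)
```

The point of the sketch is that no intermediate timbre exists between the two recordings: the crossfade only balances their levels, whereas a real instrument's colour transforms continuously with its volume.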

A virtual orchestration is generally considered most successful if it makes the listener believe that she is hearing a live orchestral performance.[3] One of the few studies on the realism of the virtual orchestra concludes that the average music consumer can hardly distinguish between real recordings and virtual simulations of Stravinsky’s The Rite of Spring (Kopiez et al. 2016, 1).[4] A high-quality virtual orchestration relies not only on a composer’s skill at composition and orchestration, but also on the sample libraries at her disposal: a composer may often imagine music that would sound compelling if performed by a real orchestra but does not when performed by a virtual one, due to “technical limitations in virtual instrument technology” (Furduj 2019, 74). Similarly, the rules and logic governing software may change the nature of a composer’s output by complicating or facilitating certain compositional tasks (Adenot 2012, 154). For instance, if composing in Cubase, one may feel reluctant to write tuplets greater than triplets because it is not possible to set the grid value to them (see figure). Lastly, the absence of specific articulations from a sample library will limit what one can write.

Cubase grid

“You can’t write for the orchestra and then mock it up. You really have to write for the limitations of the sounds you have.” ‒Christophe Beck, film and television composer (Karlin and Wright, 102).

3.   A Multidisciplinary Art

Unlike with the real orchestra, in which the conductor and instrumentalists bridge the gap between the written notes and musical experience, the virtual orchestrator is responsible for creating the final product that her audience will listen to. To do this, she must additionally become the trinity of performer, conductor, and sound engineer (Furduj 2019, 97).

3.1 Performer

One of the reasons why proper virtual orchestrations sound much better than automatically generated simulations is that one can edit and control the instrument’s parameters, effectively creating one’s own performance. The three most common ways to input a virtual instrument’s notes into a DAW are:

1)  By drawing them directly into the MIDI editor using the pencil tool:

 

2) By using a MIDI keyboard in conjunction with step input:

 

3) By recording one’s own performance:

The video shows me performing a melody while controlling several of the instrument parameters with MIDI controllers. Flutter tongue, mutes, and volume are all controlled by moving the faders, vibrato by biting the breath controller, and dynamics/timbre by a combination of breathing into the breath controller and playing the MIDI keyboard; the speed at which the key is struck modifies the sound.

 

Like the houses in the fable of The Three Little Pigs, these three methods show a progressive increase in substance and craftsmanship. Drawing with the pencil tool, while easy, does not allow for simultaneous control over any CC values. Step input requires limited keyboard skills, but only records the notes’ velocity values. Moreover, it does not offer very human-sounding playback because the inputted notes are automatically quantized, resulting in complete rhythmic accuracy. Finally, recording one’s performance requires the most skill but can, without question, give the most musical and realistic results; in addition to playing the piano, one should manipulate one or several CC controllers to, for instance, shape the virtual instrument’s timbre and volume. The combination of playing while using these controllers in real-time effectively requires one to learn a new instrument. Much as a pianist needs to adjust to the different pianos she performs on, so too must the virtual orchestrator adapt to the virtual instruments she uses.[5] 

These methods are not mutually exclusive. In fact, one will likely make use of all of them in a given project. Latency issues deriving from software, hardware, or virtual instruments may cause a virtuoso pianist to find passages she would otherwise perform with ease on an acoustic piano to be unplayable in a DAW on a MIDI keyboard. In these cases, the virtual orchestrator must input the music using step input or the pencil tool.

3.2 Conductor

The virtual orchestrator must sculpt musical phrases by meticulously working on their balance and timing. Even if the individual instrumental lines have been performed and programmed well, the orchestrator may notice that they are not satisfactory in the context of the entire ensemble. As a conductor may do in a rehearsal, the virtual orchestrator should, for instance, listen to the trumpets alone, confirm that they are well-balanced with the other brass instruments, and then balance them with the rest of the orchestra. The virtual orchestrator must also modify the tempo so that it is flexible enough (or in some cases rigid enough) for the appropriate phrasing. Unlike in a live performance, where the conductor can bend the tempo to her will, the tempo of a virtual orchestration is fixed once set. There are two ways to approach tempo in a virtual orchestration:

1) By performing freely, ignoring the project tempo.

2) By performing with the metronome and subsequently automating the project’s tempo to create the desired phrasing.

Performing freely allows for the same control and spontaneity as a regular performance but at the cost of the formal structure of a piece or a passage. For instance, a ritardando over three measures in a written score may equate to a five-bar passage when performed freely in a DAW. This method may be viable when there is limited activity in the orchestra, for instance, in solo passages, but it quickly becomes impractical with textures consisting of several elements that need to be synchronized. In contrast, performing to the project’s tempo results in a well-organized workspace but creates a disconnect between the performance and the result. Timing is inextricably linked to phrasing, and if one separates the two, then other aspects of the performance, such as touch and tone, will undoubtedly be affected.

While there are circumstances that can benefit from either method, it would be unwise to use the first when creating a virtual orchestration from a pre-existing score that is defined by a rigid temporal structure: sacrificing the project’s structure would create more inconvenience than the benefits of a free performance could justify.

3.3 Sound Engineer

Regardless of how well-performed or fine-tuned a virtual orchestration is, it will rarely sound convincing if it is not properly mixed and processed. This is because the virtual orchestrator’s sound palette typically consists of sample libraries that were recorded in different spaces, some with reverberation (wet) and others without (dry). Dry virtual instruments lack a sense of body and depth but are flexible for this very reason and can be sculpted to fit into almost any space. Libraries with baked-in reverberation offer spatial realism but do not always blend well with other wet libraries. 

The term “blend” differs in virtual and live orchestration. In live music, blend refers to the extent to which timbres fuse and is based on a variety of acoustic factors (McAdams and Giordano 2016, 5), proper orchestration, and the musicians’ capacity to adjust in real-time. In virtual orchestration, although blend can also be used to describe timbral fusion, it more typically refers to creating cohesion between different sample libraries, regardless of their timbre. For instance, two violin ensembles from different sample libraries playing in unison may not blend well because they were recorded in different locations, with different positioning of the instruments, or with different microphone setups. To unify wet libraries, one typically sends them to the same hall reverberation channel and adjusts their send levels until they sound cohesive.

Other effects, such as equalization, compression, and delay, also help provide a sense of realism and polish to virtual instruments. It is common practice to create a template in which both these effects and the volume of the sample libraries have been adjusted, allowing one to begin composing with a well-balanced and realistic-sounding orchestra at hand. The benefit of this is twofold: it saves time, and it encourages creativity by removing this more technical stage from the beginning of every new project. The drawback of a template is that it may influence one to always compose using the same sound palette.

4. Récitatif, chant et danse

Récitatif, chant et danse is an orchestral piece that I composed for the Orchestre de la Francophonie’s orchestral composition workshop in 2019.[6] As part of the workshop, I was allocated two one-hour rehearsals and a thirty-minute recording session. This generous amount of rehearsal time made Récitatif, chant et danse an ideal candidate for a virtual orchestration: it allowed me to become intimately acquainted with the balance and blend of the piece, and it allowed the musicians enough practice to perform it accurately. Furthermore, Récitatif, chant et danse’s rather traditional writing would, in theory, lend itself well to the virtual setting. The lack of extended techniques means that most sample libraries would be compatible with its reproduction, and its straightforward rhythm would allow me to record myself performing the individual instrumental lines, creating a more human performance. To be clear, the goal of the simulation was not to replicate the performance of the recording by mimicking its tempo and phrasing, but to offer my own interpretation of the piece. (PDF of score)

The live recording

The virtual orchestration

4.1 Software, Equipment, and Sample Libraries

The following is a list of the software, equipment, and virtual instruments that I used for the virtual orchestration. Due to the expense of almost anything related to virtual orchestration, I had to use the tools already at my disposal.

Software

  • Cubase 10.5‒a digital audio workstation.

  • Vienna Ensemble Pro 6‒a program used to host virtual instruments.

  • East West Space II‒a reverb plugin.

Equipment

  • M-Audio Keystation 88es‒a MIDI keyboard.

  • TEControl USB MIDI Breath and Bite Controller 2‒a MIDI controller that allows one to control CC values through breathing, biting, tilting, and nodding the head.

Virtual Instruments

  • Vienna Symphonic Library Special Edition 1‒woodwinds, timpani, most percussion, harp, solo strings.

  • Sample Modelling‒brass.

  • Kontakt Factory Library‒glockenspiel, triangle, and woodblocks.

  • Cinematic Studio Strings‒string ensembles.

Workstation

  • Windows 10

  • i9-10850K @ 3.6 GHz

  • 64 GB RAM

4.2 Performing the Simulation

To avoid sounding mechanical, I performed most of the instrumental lines directly into my DAW while controlling their timbre with my breath controller. After performing the lines, I removed any blemishes in the CC data (my breaths, for example) and added CC data for the parameters that I had not been able to control in real-time. Latency issues required me to adjust the start positions of the notes, and occasional errors in performance sometimes required me to adjust their rhythmic values. The latency issues also made passages with rapid gestures, such as the climax of the chant militaire, extremely difficult and inefficient to perform. To save time, I copied and pasted the MIDI notes that I had imported from Dorico and randomized their lengths and start positions so that they would not be played with 100% rhythmic accuracy.[7] I was unable to perform most of the legato string parts because, as mentioned, Cinematic Studio Strings (CSS) contains three different delays based on the legato transition that is triggered.[8] To remedy this, I used the MIDI from Dorico and adjusted the start position of the notes based on the legato transitions that I had chosen. Once these were in place, I recorded myself manipulating the CC data corresponding to timbre and volume with my breath controller and modulation wheel, respectively.
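The randomization of lengths and start positions described above amounts to a small jitter pass over the MIDI data. A hypothetical sketch in Python; the tick values and jitter ranges are illustrative (a DAW such as Cubase provides this through its own randomize functions):

```python
import random

def humanize(notes, timing_jitter=10, length_jitter=15, seed=None):
    """Offset note start times and lengths (in MIDI ticks) by small
    random amounts so quantized notes lose their mechanical precision.

    notes: list of (start_tick, length_tick, pitch) tuples.
    Returns a new list; pitches are left untouched.
    """
    rng = random.Random(seed)
    out = []
    for start, length, pitch in notes:
        new_start = max(0, start + rng.randint(-timing_jitter, timing_jitter))
        new_length = max(1, length + rng.randint(-length_jitter, length_jitter))
        out.append((new_start, new_length, pitch))
    return out
```

Keeping the jitter within a few ticks preserves the passage's rhythmic identity while removing the perfect grid alignment that betrays programmed MIDI.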

4.3 Obstacles

Despite the favourable conditions outlined for creating the virtual orchestration of Récitatif, chant et danse, I encountered several obstacles, often due to the lack of articulations in my sample libraries. The need for adjustments is to be expected when translating something from one medium to another; however, I was surprised by the extent of the compromises I had to make. The following describes the problems that I encountered and my solutions to them.

String divisi and multiple stops

Although string divisi is common practice in live orchestration, it is a feature included in only a few sample libraries. To simulate divisi, I duplicated my string sections and lowered their volume. Although this allows for all the notes in the divisi sections to be played, it does not offer the most realistic results. For instance, CSS contains ten first violins, and so a passage played divisi a 2 becomes two sections of ten violins playing, rather than two sections of five.

Multiple stops can be faked within one instance of the same instrument, but otherwise follow the same principle as divisi (a double stop with CSS translates to two sections of ten violins each playing one note). To simulate multiple stops in which all the pitches of the chord cannot be attacked simultaneously, I offset the appropriate notes in the chord and used sforzando and marcato articulations to create an acciaccatura-like effect (see figure).

 

Writing a multiple stop in the score vs in Cubase (m. 40, chant militaire)

 

A comparison of divisi in the recording and the virtual orchestration.

A comparison of multiple stops in the recording and the virtual orchestration.
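The offsetting of chord notes described above can be sketched as a simple roll: a hypothetical helper that staggers the attack of each chord tone from bottom to top, imitating a multiple stop whose pitches cannot all be attacked at once. The 30-tick spread is an illustrative value, not a prescription:

```python
def roll_chord(pitches, start_tick, spread=30):
    """Offset the notes of a chord (MIDI note numbers) so they speak
    in quick succession from bottom to top, like a string player
    breaking a triple or quadruple stop.

    Returns a list of (start_tick, pitch) pairs, lowest note first.
    """
    return [(start_tick + i * spread, p)
            for i, p in enumerate(sorted(pitches))]

# A C major triple stop starting at tick 0, rolled upward:
rolled = roll_chord([67, 60, 64], 0)
```

Pairing the offset lower notes with short sforzando or marcato samples, as in the figure, is what produces the acciaccatura-like snap.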

Woodwinds à 2

Because VSLS1 only contains one of each woodwind, any passage to be played in unison by two identical instruments (for instance, flutes I and II) would trigger the same sample twice and result in phasing. To remedy this, I input the notes for the second instrument a whole step higher and transposed them down a whole step using the pitch bend function.[9] 
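The whole-step transposition relies on MIDI pitch bend, a 14-bit value centred on 8192. Assuming the instrument's bend range is set to ±2 semitones (a common default; VSL's actual setting may differ), the bend value for a given transposition can be computed as follows. This is a general sketch of the MIDI arithmetic, not a VSL-specific recipe:

```python
def pitch_bend_value(semitones, bend_range=2.0):
    """14-bit MIDI pitch bend value (0-16383, centre 8192) that
    transposes by `semitones`, given the instrument's bend range.

    A whole step down (-2 semitones) with a +/-2 semitone range
    maps to the minimum value, 0.
    """
    if abs(semitones) > bend_range:
        raise ValueError("transposition exceeds the bend range")
    value = round(8192 + (semitones / bend_range) * 8192)
    return max(0, min(16383, value))
```

Because the detuned samples are whole-step transpositions of different recordings, the two unison lines no longer share identical waveforms, which is what eliminates the phasing.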

Flutter tongue

To give the impression of flutter tongue in the woodwinds[10], I used fast repeated note articulations. Although the timbre of the repeated note articulations does not resemble flutter tongue in the slightest, the articulations provide movement that adds to the intensity of the passages in which flutter tongue is supposed to be used. Had the flutter tongue been exposed, this would not have worked.

Examples of flutter tongue in the woodwinds and brass:

Refer to the recordings from m. 37-43 above [recording or simulation].

Glissandi

In the composition, glissandi occur in the trombones, harp, and strings. Of these three, only the trombones contained samples of them. To give the impression of a harp glissando, I performed a white-note glissando on my MIDI keyboard and then modified the pitches to fit the desired scale. While far from a perfect solution, it succeeded in creating the sweeping gesture that characterizes a harp glissando. The same was not possible for the string glissandi. Instead, I connected the departure and arrival notes of the glissando with a portamento articulation. While the effect of the glissando was lost, it at least allowed the two notes to be connected by a slide.  

Examples of glissandi in the trombones:

Examples of glissandi in the harp and strings:

Refer to the beginning of the recordings of m. 37-43 above (strings only) [recording or simulation].

Refer to the end of the recordings of m. 45-50 of Danse above [recording or simulation].
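Remapping a performed white-note glissando onto a target scale, as described for the harp, can be expressed as a simple lookup. A hypothetical Python sketch, assuming one MIDI note per white key and a seven-note target scale given as semitone offsets (for instance, a harp pedal configuration):

```python
# Semitone offsets of the white keys within one octave (C major).
WHITE = [0, 2, 4, 5, 7, 9, 11]

def remap_glissando(pitches, scale):
    """Map a white-key keyboard glissando onto a seven-note scale.

    pitches: MIDI note numbers of the performed white-key glissando.
    scale:   seven semitone offsets (0-11) of the desired scale,
             e.g. [0, 2, 3, 5, 7, 8, 10] for C melodic minor descending.
    """
    out = []
    for p in pitches:
        octave, degree = divmod(p, 12)
        step = WHITE.index(degree)       # which white key was struck
        out.append(octave * 12 + scale[step])
    return out
```

The sweeping gesture of the performance is preserved because only the pitches change; the recorded timing and velocities of the glissando stay intact.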

4.4 General Impressions

Of all the instruments, the strings were most problematic. In general, they lack clarity and definition and do not contain a wide enough range of expression. I managed to improve their definition by layering[11] CSS with the solo strings from VSLS1, but they still leave something to be desired. Measures 11-14 in the Récitatif section illustrate the difference in expression between the recording and simulation. In the recording, the strings grow in intensity and volume and lead to a satisfying point of arrival. The simulation feels “black and white” in comparison: it sounds as though the musicians are playing without “feeling” because there is not the same increase in intensity. It is difficult to diagnose the exact cause of these issues, but it is probably a combination of inexperience, the inability to perform most of the string parts, the limitations of my sample libraries, and the limitations of virtual instrument technology. 

While many aspects of virtual instruments transfer from one library to the next, each library has its own set of peculiarities that need to be overcome to obtain the best results. Moreover, virtual instruments often need to be sculpted using different audio processing techniques to adapt and perform well in a given orchestration. Admittedly, I am least comfortable with this aspect of virtual orchestration, so it seems plausible that there is room for improvement in this area. For instance, my difficulties setting up the reverb resulted in a dull and muddy sound that certainly caused the strings to sound less transparent. Another likely reason is the sheer complexity of string instruments: bow pressure, the position of the bow, the speed of the bow, and vibrato speed are only some of the many elements that define the timbre and expressivity of a string passage. While some of these can be manipulated with virtual instruments, realistic and complete control over them is simply not possible at this time.[12]

Although producing sound with woodwinds and brass is also complex, I was much more satisfied with the results. This is almost certainly due to my having performed most of the parts. Furthermore, in using a breath controller to control timbre, I could simulate a performance that partially mirrors how wind instruments produce sound. Notwithstanding this, I question whether my satisfaction with the winds over the strings stems from an attachment to my performance rather than an objective view that they were better realized. Lastly, given that the woodwind virtual instruments I used only contain three velocity layers, I recognize that their timbral spectrum is emaciated. 

It was striking to what extent the quality of the playback from the individual instruments differed from that of the ensemble. Although I strived for musical and realistic playback for each instrumental line, sometimes the virtual instruments used did not allow for optimal results. When listening to the virtual harp alone, for example, I find myself questioning the integrity of the artistic director and wondering how the harpist made it past the first round of the auditions. In the context of the entire piece, however, I am much more forgiving. 

5. Final Thoughts

Composing Récitatif, chant et danse took one month from the first note penned to the last part printed. Creating the virtual orchestration, on the other hand, took three months. Composing the piece was an inspiring and creative process, while simulating it was a frustrating and technical one. Upon finishing the composition, I was proud of what I had accomplished and excited to hear it played. Upon completing the simulation, I was relieved but disappointed that it did not and could not sound as good as the live recording. Comparing these processes may not be entirely fair, since one focuses on creation and the other mainly involves production. Furthermore, I have a lot more experience with composition than with virtual orchestration and so, naturally, I am much more at ease with it. My intention is not to detract from the artistic merits of virtual orchestration: it is an art form that requires musicality, creativity, and an incredible amount of work for success. I would like to stress, however, that for all its merits, if used in its typical fashion, it will remain in the shadow of the real orchestra. Since the virtual orchestra lacks what gives the real orchestra life, one cannot expect to deliver the same level of expression with it. Instead, why not ask, “what can the virtual orchestra do that the real orchestra cannot?” What kind of unique textures could it yield if one were to view the virtual orchestra’s imperfections as its defining characteristics? Rather than viewing the increase of a note’s volume without a change of velocity layer as unrealistic, could it become the basis for a piece? What kind of new and interesting textures could one create with unplayable passages or by redefining the relationship between the timbre and volume of notes in a chord? These are questions that interest me and are reasons why I think virtual orchestration has the potential to assume a voice of its own.

Bibliography
