Static orchestration

Orchidea

Oct 20

Written By Carmine Cella and Daniele Ghisi


Dichotomies

It is our belief that no machine or algorithm can compose in your place. Our toolbox is meant to give you some help in orchestrating: this help is more about making calculations in your place than about taking musical decisions. These calculations, of course, are driven by some pre-compositional choices that either you or the designer of the tool you are using must make.

When designing a tool for assisted orchestration you are faced with several important choices. What does it mean to make a 'good' automatic orchestration? Can we really evaluate the quality of an orchestration? Our choice, then, has been to generate orchestrations that are close to the target sound in terms of spectral content. Generally speaking, we tried to obtain a mimetic copy of the original sound rather than a completely new interpretation of it. This choice can be changed, however, by carefully selecting some parameters in order to obtain more personal interpretations of the target sound. Mimesis and katharsis are important philosophical cornerstones for assisted orchestration: what do you want to achieve?

A kathartic approach is meant to generate orchestrations that are not always close to the target but are considered musically interesting...

...a mimetic approach, instead, can produce results that sound similar to the target, but are probably less interesting for composers

The first important choice you have to make when performing assisted orchestration is about the relative importance of pitch and timbre: do you prefer to preserve the real pitches present in the target, or do you want to maximize the timbre similarity? In the former case you should use the Fourier spectrum as the feature to be optimized; in the latter case, MFCCs (Mel-frequency cepstral coefficients) are a good choice.

Pitch is related to the position of the peaks in the spectrum; the Fourier transform is a good feature to describe these positions.

Timbre is difficult to define; in this context we will accept that it is related to the overall shape of the spectrum. MFCCs are a good description of this shape.
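The difference between the two kinds of feature can be illustrated with a small sketch. It is not the analysis used by Orchidea: the naive DFT, the sample rate, the synthetic two-partial signal and the helper names are all assumptions made for illustration. In particular, the coarse band envelope is only a rough stand-in for MFCCs, which additionally involve a mel filterbank, a logarithm and a cosine transform.

```python
import cmath
import math

def magnitude_spectrum(signal):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but dependency-free)."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def spectral_envelope(magnitudes, bands):
    """Coarse overall shape: mean magnitude in a few equal-width bands."""
    width = len(magnitudes) // bands
    return [sum(magnitudes[i * width:(i + 1) * width]) / width for i in range(bands)]

SAMPLE_RATE = 8000  # hypothetical value, chosen so partials fall on exact bins
N = 256

# Synthetic target: a strong partial at 500 Hz and a weaker one at 1000 Hz.
signal = [
    math.sin(2 * math.pi * 500 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
    for t in range(N)
]

mags = magnitude_spectrum(signal)

# Pitch-oriented view: where is the strongest peak?
peak_bin = max(range(len(mags)), key=mags.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / N

# Timbre-oriented view: how is the energy distributed overall?
envelope = spectral_envelope(mags, bands=4)
```

Here `peak_hz` tells you which pitch dominates, while `envelope` only tells you that most of the energy sits in the lower bands, regardless of the exact frequencies involved.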

You can use a set of different databases together; the only constraint is that every database must contain the same type of features (e.g. MFCC). The main object used to perform assisted orchestration is orchidea.solve.

Tip: look at the help file of the orchidea.db.query object to get information about databases

Fundamental parameters

The first important parameter to choose is the orchestra. A message specifies the instruments of the orchestra to be used; the names depend on the database used. The character '|' between two or more instruments means that those instruments are played by a single player and are considered doublings; the algorithm will pick one of them depending on the optimization.
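To make the role of '|' concrete, here is a minimal sketch of how such an orchestra message could be read. It assumes that the message is a list of space-separated tokens; the function name and the instrument names are hypothetical, since the real names depend on the database in use.

```python
def parse_orchestra(message):
    """Split an orchestra message into players; '|' separates doublings."""
    return [token.split("|") for token in message.split()]

# Hypothetical instrument names; real ones depend on the database in use.
orchestra = parse_orchestra("Fl|Picc Ob ClBb Bn Hn Vn Vn Va Vc Cb")

for number, choices in enumerate(orchestra, start=1):
    print(f"player {number}: {' or '.join(choices)}")
```

Each inner list is one player. A player with several entries, like the first one here, will contribute at most one of those instruments to any given solution.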

The optimization parameters are really important, but normally you don't need to change them unless the solutions are unsatisfactory. The population size and the number of epochs determine the width of the search; usually the bigger the better, but the computation time can become long. For static targets this is often not a problem, and using 300 for both is fine.

To determine the initial population for the optimization, we use a method called 'relaxed pursuit'. Generally speaking, the method tries to generate right away some solutions that are close to the target. This is usually better, but it can produce final solutions that are too homogeneous. Using 0, the initial population is random; using 1, the pursuit is not relaxed and the initial population is made of a single solution repeated (so this is not a good choice). Increasing this value, for example to 3, means that the initial population will be made of a number of different solutions that are all reasonably close to the target. Suggestion: use 0 if you are unsure, 3 or more if you understand this parameter.

Cross-over rate and mutation rate have the meaning usually adopted in genetic optimization. Generally speaking, you should not change the cross-over rate, and you should increase the mutation rate only to increase the diversity of the solutions, at the expense of consistency. Suggestion: the mutation rate should stay between 0.001 and 0.1.
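If you are not familiar with genetic optimization, the toy search below shows where population size, number of epochs, cross-over rate and mutation rate typically enter. It is a generic textbook-style sketch, not the optimizer used by Orchidea: the selection scheme, the one-point cross-over, the default cross-over rate of 0.8 and the numeric toy target are all assumptions made for illustration.

```python
import random

def evolve(fitness, genome_length, alphabet, pop_size=300, epochs=300,
           crossover_rate=0.8, mutation_rate=0.01, seed=0):
    """Toy genetic search; returns the best genome found (lower is better)."""
    rng = random.Random(seed)
    population = [[rng.choice(alphabet) for _ in range(genome_length)]
                  for _ in range(pop_size)]
    for _ in range(epochs):
        population.sort(key=fitness)
        parents = population[:pop_size // 2]  # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, genome_length)  # one-point cross-over
                child = a[:cut] + b[cut:]
            else:
                child = list(a)
            # Mutation: each gene is replaced at random with a small probability.
            child = [rng.choice(alphabet) if rng.random() < mutation_rate else gene
                     for gene in child]
            children.append(child)
        population = parents + children
    return min(population, key=fitness)

# Toy problem: get as close as possible to a hidden list of numbers.
target = [3, 1, 4, 1, 5, 9, 2, 6]

def distance(genome):
    return sum(abs(g - t) for g, t in zip(genome, target))

best = evolve(distance, len(target), alphabet=list(range(10)),
              pop_size=60, epochs=60)
print(best, distance(best))
```

A larger population or more epochs widens the search at the cost of time, while a higher mutation rate injects more random variation into each generation, which is why it trades consistency for diversity.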

A first attempt

We are now ready to try a real static orchestration! Static, in this context, means that the target sound is treated as a single timbre averaged over time, and no onsets are considered. The outcome of a static orchestration is a single chord.
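The idea of a single average timbre can be pictured as collapsing the analysis frames of the target into one feature vector. The sketch below assumes a plain arithmetic mean over frames; how Orchidea actually computes the average is not specified here, and the function name and the numbers are made up.

```python
def average_frame(frames):
    """Collapse a list of per-frame feature vectors into their mean."""
    count = len(frames)
    return [sum(column) / count for column in zip(*frames)]

# Three hypothetical analysis frames with three feature values each.
frames = [
    [1.0, 0.0, 2.0],
    [3.0, 0.0, 2.0],
    [2.0, 3.0, 2.0],
]
static_target = average_frame(frames)
print(static_target)
```

Note how the second feature, which appears only in the last frame, survives in the average at reduced strength: temporal evolution is lost, which is exactly what 'static' means.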

The effect of sparsity

Sparsity determines the freedom that the algorithm has to remove instruments of the given orchestra from the solution. If sparsity is 0, each solution will use all instruments; if this value increases, the probability of dropping an instrument increases correspondingly. Suggestion: use 0.01 for general targets, 0.001 for symphonic targets, 0.1 for single notes of instruments used as targets.
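A simple way to picture this parameter is to treat it as the probability that each player is left out when a candidate solution is generated. That direct mapping is an assumption made for this sketch, as are the function and instrument names; the actual relation between the sparsity value and the dropping probability in Orchidea may differ.

```python
import random

def random_solution(orchestra, sparsity, rng):
    """Pick one instrument per player, or None when the player is dropped."""
    return [None if rng.random() < sparsity else rng.choice(player)
            for player in orchestra]

rng = random.Random(1)
# Hypothetical orchestra; the first player doubles on two instruments.
orchestra = [["Fl", "Picc"], ["Ob"], ["Hn"], ["Vn"], ["Vc"]]

full = random_solution(orchestra, 0.0, rng)   # nobody can be dropped
thin = random_solution(orchestra, 1.0, rng)   # extreme case: everybody is dropped
some = random_solution(orchestra, 0.1, rng)   # occasional dropouts
```

The value 1.0 is used here only to show the extreme case; with the small values suggested above, most players stay in the solution.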

Pitched orchestrations: controlling target analysis

If your target sound has some pitches that you want to capture during orchestration, then you need to help the algorithm by tuning the analysis phase. There are two main parameters for the target analysis: the window size and the partials filtering. The window size (in samples) should be large if you know that the target contains very low frequencies (e.g. A1). Partials filtering works as follows: if it is 0, there is no filtering and all the sounds in the database are used. If it is a value x with 0 < x < 1, the database is reduced by removing the sounds that do not correspond to the partials in the target. Increasing this value produces a stricter selection in which only the strongest partials (in terms of dynamics) are retained. Suggestion: use 0 for inharmonic or noisy sounds where timbre is more important than pitch, and a value in the range 0.1 < x < 0.3 if pitches are also important.
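Both parameters can be illustrated with a small sketch. The first helper uses the standard relation between window size and frequency resolution (sample rate divided by window size). The second shows one possible reading of partials filtering, where the threshold is compared with each partial's amplitude relative to the loudest one. That reading, the 10 Hz matching tolerance, the function names and the tiny database are assumptions for illustration, not Orchidea's actual implementation.

```python
def bin_spacing_hz(sample_rate, window_size):
    """Distance in Hz between two adjacent analysis bins."""
    return sample_rate / window_size

def filter_database(sounds, partials, threshold, tolerance_hz=10.0):
    """Keep sounds whose pitch matches a sufficiently strong target partial.

    sounds: list of (name, fundamental_hz); partials: list of (hz, amplitude).
    """
    if threshold == 0:
        return list(sounds)  # no filtering: the whole database is used
    loudest = max(amp for _, amp in partials)
    strong = [hz for hz, amp in partials if amp / loudest >= threshold]
    return [(name, f0) for name, f0 in sounds
            if any(abs(f0 - hz) <= tolerance_hz for hz in strong)]

# A1 is 55 Hz: a short window cannot resolve its closely spaced partials.
coarse = bin_spacing_hz(44100, 1024)   # about 43 Hz per bin
fine = bin_spacing_hz(44100, 8192)     # about 5.4 Hz per bin

# Hypothetical target partials and database entries.
partials = [(220.0, 1.0), (440.0, 0.5), (660.0, 0.05)]
sounds = [("Vc-A3", 220.0), ("Fl-A4", 440.0), ("Ob-E5", 659.3), ("Hn-C4", 261.6)]

everything = filter_database(sounds, partials, 0)
loose = filter_database(sounds, partials, 0.01)
strict = filter_database(sounds, partials, 0.2)
```

With a threshold of 0.2 the weak partial at 660 Hz is ignored, so the oboe E5 disappears from the candidates; with 0 even the horn C4, which matches no partial at all, remains available.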

Tip: follow the next tutorial to learn more about dynamic orchestrations!
