Spectral Envelope

by Amélie Bernier-Robert
Timbre Lingo | Timbre and Orchestration Writings

Published: November 6, 2023

When analyzing the spectrum of a sound, it is sometimes useful to describe its spectral properties in terms of energy distribution rather than mapping each of its components individually. To do so, we invoke the concept of spectral envelope, a curve that can be obtained by successively connecting the peaks of the partials shown in the frequency representation of the sound (i.e., with frequency on the x-axis and energy or amplitude on the y-axis). Spectral envelopes are important factors in timbre perception: they reflect the acoustic properties of a sound-producing object in terms of how its energy is distributed across the frequency spectrum. Here is an example showing the spectral envelope of an oboe sound next to that of a clarinet sound, with amplitude measured in decibels (dB):

Figure 1: Comparison of the spectral envelopes of the oboe and the clarinet. We can see from these spectral envelopes (blue curves) that the oboe’s spectrum has more energy in the higher harmonics than the clarinet’s. This correlates with a higher spectral centroid for the oboe.

Oboe: (play sample)

Clarinet: (play sample)
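The idea of connecting the peaks of the partials can be sketched in a few lines of Python with NumPy. This is only a minimal illustration, assuming that the partial frequencies and their levels have already been measured. The function name `spectral_envelope` and the partial levels below are made up for the example and are not taken from the oboe or clarinet recordings.

```python
import numpy as np

def spectral_envelope(partial_freqs, partial_amps_db, query_freqs):
    """Connect successive partial peaks with straight line segments.

    Hypothetical helper: the envelope is approximated by linear
    interpolation between the (frequency, level) points of the partials.
    """
    freqs = np.asarray(partial_freqs, dtype=float)
    amps = np.asarray(partial_amps_db, dtype=float)
    order = np.argsort(freqs)  # np.interp needs increasing x values
    return np.interp(query_freqs, freqs[order], amps[order])

# Made-up harmonic tone: 220 Hz fundamental, 8 partials, levels in dB
freqs = 220.0 * np.arange(1, 9)
amps_db = np.array([-6.0, -12.0, -9.0, -20.0, -18.0, -30.0, -28.0, -40.0])

# Evaluate the envelope on a fine frequency grid between the first and last partial
grid = np.linspace(freqs[0], freqs[-1], 200)
envelope = spectral_envelope(freqs, amps_db, grid)
```

Analysis tools often use smoother estimates than straight segments. The piecewise-linear version is used here only because it matches the description above most directly.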

Spectral envelopes can also be described with other descriptors, including:

Spectral Spread 

Spectral spread represents the standard deviation of the amplitude distribution around a sound’s spectral centroid. It is calculated as the square root of the distribution’s second-order moment about the centroid (see this video about moments in statistics, if you want to know more). A sound with a high spectral spread will look “wider” and “flatter” (literally more “spread”) than a sound with a lower spectral spread value.

Figure 2: Comparison of spectral spreads for a crash cymbal (left) and clarinet (right). The frequency spectrum extracted from the sound of two crash cymbals (on the left) is characterized by a higher spectral spread value than that of a clarinet sound (on the right) as shown by the width of its spectral envelope. This means that the sound components are generally more distant from the spectral centroid value (red line) in the case of crash cymbals. 

Crash cymbals: (play sample)

Clarinet: (play sample)
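As a rough sketch of the calculation described above, the spectral centroid and spectral spread can be computed by treating the normalized amplitudes as weights on the frequency axis. The function names and the three-component spectra below are assumptions made for illustration; they do not correspond to the cymbal or clarinet recordings.

```python
import numpy as np

def spectral_centroid(freqs, amps):
    """Amplitude-weighted mean frequency."""
    freqs = np.asarray(freqs, dtype=float)
    weights = np.asarray(amps, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return float(np.sum(freqs * weights))

def spectral_spread(freqs, amps):
    """Square root of the second-order moment about the centroid."""
    freqs = np.asarray(freqs, dtype=float)
    weights = np.asarray(amps, dtype=float)
    weights = weights / weights.sum()
    centroid = np.sum(freqs * weights)
    variance = np.sum((freqs - centroid) ** 2 * weights)
    return float(np.sqrt(variance))

# Two made-up spectra sharing the same centroid (200 Hz)
freqs = np.array([100.0, 200.0, 300.0])
peaked = np.array([1.0, 2.0, 1.0])  # amplitude concentrated near the centroid
flat = np.array([1.0, 1.0, 1.0])    # amplitude spread evenly

spread_peaked = spectral_spread(freqs, peaked)
spread_flat = spectral_spread(freqs, flat)
```

In this toy case the flat spectrum has the larger spread, since its components sit, on average, farther from the centroid.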

Spectral Skewness 

Spectral skewness measures the asymmetry of the spectrum around its spectral centroid. It is calculated using the third-order moment of the amplitude distribution. A positive value indicates that more energy lies in the frequencies below the spectral centroid (i.e., the right “tail” of the spectrum is longer, hence the positive value), while a negative value means that more energy lies in the frequencies above the spectral centroid (i.e., the left “tail” of the spectrum is longer).

Figure 3: Comparison of spectral skewness curves for pink noise (left side) and blue noise (right side). The spectral skewness of pink noise (on the left of figure 3) is positive because more of its energy lies below the spectral centroid (red vertical line). Notice how its energy is concentrated on the left, creating a longer “tail” to the right of the spectral centroid. The spectral skewness of blue noise, on the other hand, shows a slight negative tendency because more of its energy lies above the spectral centroid (on the right).

Pink noise: (play sample)

Blue noise: (play sample)

Pink noise and blue noise are two common types of filtered noise, as discussed in this article. 
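The sign convention for skewness can be checked with a small sketch. It assumes the common normalization in which the third-order moment about the centroid is divided by the cube of the spectral spread. The falling and rising spectra are simplified stand-ins for pink-like and blue-like noise, not measurements of the samples above, and the function name is hypothetical.

```python
import numpy as np

def spectral_skewness(freqs, weights):
    """Third-order moment about the centroid, divided by spread cubed.

    The normalization by spread**3 is an assumption of this sketch.
    """
    freqs = np.asarray(freqs, dtype=float)
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()  # normalize so the weights sum to 1
    centroid = np.sum(freqs * p)
    spread = np.sqrt(np.sum((freqs - centroid) ** 2 * p))
    third_moment = np.sum((freqs - centroid) ** 3 * p)
    return float(third_moment / spread ** 3)

freqs = np.linspace(100.0, 10000.0, 100)
falling = 1.0 / freqs   # weight decreases with frequency (pink-like stand-in)
rising = freqs.copy()   # weight increases with frequency (blue-like stand-in)

skew_falling = spectral_skewness(freqs, falling)  # expected to be positive
skew_rising = spectral_skewness(freqs, rising)    # expected to be negative
```

The falling spectrum piles its weight up on the left and leaves a long right tail, which gives a positive value; the rising spectrum does the opposite.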

Spectral Kurtosis 

Spectral kurtosis quantifies the “peakedness” or “tailedness” of the spectrum around its spectral centroid (calculated using the fourth-order moment of the amplitude distribution). A higher kurtosis value means that more energy is concentrated in the region of the peak (which will be more “pointy”), while a lower value usually describes a flatter distribution. At first, this may seem very similar to what spectral spread describes. However, one can change the spectral kurtosis without modifying the spectral spread value, and vice versa.

Figure 4: Comparative representation of different spectral kurtosis values of filtered crash cymbals. Both plots present the amplitude spectra and spectral envelopes of filtered crash cymbals, which were processed in a way that the spectral spread remains constant with different spectral kurtosis values. See how the left spectrum is “flatter” (lower kurtosis) while the right picture shows a more “pointed” distribution, with thicker tails on the sides (higher kurtosis). In both cases, the energy is distributed following essentially the same average frequency distance from the spectral centroid, which is why the spectral spread values are identical. 

Filtered crash cymbals (lower kurtosis): (play sample)

Filtered crash cymbals (higher kurtosis): (play sample)
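The claim that kurtosis can change while spread stays fixed can be illustrated with two tiny, made-up spectra. The sketch below assumes kurtosis is the fourth-order moment about the centroid divided by the fourth power of the spread. The frequencies and weights are chosen by hand so that both spectra share a centroid of 5000 Hz and a spread of 1000 Hz; they are not derived from the filtered cymbal samples.

```python
import numpy as np

def spread_and_kurtosis(freqs, weights):
    """Return (spread, kurtosis) of a weighted frequency distribution.

    Kurtosis is taken as the fourth-order moment about the centroid
    divided by spread**4 (an assumption of this sketch).
    """
    freqs = np.asarray(freqs, dtype=float)
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()  # normalize so the weights sum to 1
    centroid = np.sum(freqs * p)
    variance = np.sum((freqs - centroid) ** 2 * p)
    fourth_moment = np.sum((freqs - centroid) ** 4 * p)
    return float(np.sqrt(variance)), float(fourth_moment / variance ** 2)

# Spectrum A: all weight sits exactly one spread away from the centroid
spread_a, kurt_a = spread_and_kurtosis([4000.0, 5000.0, 6000.0],
                                       [0.5, 0.0, 0.5])

# Spectrum B: strong central peak plus weaker, more distant components
spread_b, kurt_b = spread_and_kurtosis([3000.0, 5000.0, 7000.0],
                                       [0.125, 0.75, 0.125])
```

Spectrum B has a sharper central peak and more distant side components, so its kurtosis is higher even though its spread matches that of spectrum A.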

Spectral Slope 

The spectral slope indicates how quickly amplitude decreases as frequency increases. It is calculated using a linear regression, that is, by finding the straight line that best fits the general tendency of the spectrum and taking its gradient.

Figure 5: Amplitude spectrum with a regression line (in black). The spectral slope corresponds to the gradient of the line (from [1], p.14). 
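A minimal sketch of the regression step, assuming a plain least-squares line fitted to amplitude as a function of frequency with `np.polyfit`. Published definitions may add a normalization that is not shown here, and the test spectrum is synthetic.

```python
import numpy as np

def spectral_slope(freqs, amps):
    """Gradient of the least-squares line fitted to amplitude vs. frequency."""
    freqs = np.asarray(freqs, dtype=float)
    amps = np.asarray(amps, dtype=float)
    gradient, intercept = np.polyfit(freqs, amps, 1)  # degree-1 polynomial fit
    return float(gradient)

# Synthetic spectrum whose amplitude falls linearly with frequency
freqs = np.linspace(0.0, 8000.0, 81)
amps = 1.0 - 1.0e-4 * freqs

slope = spectral_slope(freqs, amps)  # close to -1e-4 (amplitude units per Hz)
```

A negative gradient means that amplitude tends to decrease toward higher frequencies, and a steeper negative gradient means a faster decrease.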

 

Spectral Rolloff  

The spectral rolloff point is the frequency below which 95% of the signal’s energy is located.

Figure 6: Frequency spectrum with a red vertical line indicating the spectral rolloff point (from [1], p.15). The red line marks the point under which 95% of the energy is stored.
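The rolloff point can be sketched as a cumulative sum over the energy in each frequency bin. This example assumes the energy per bin is already available and returns the first bin frequency at which the running total reaches 95% of the total energy. The function name and the four-bin spectrum are illustrative only.

```python
import numpy as np

def spectral_rolloff(freqs, energy, fraction=0.95):
    """First frequency at which cumulative energy reaches `fraction` of the total."""
    freqs = np.asarray(freqs, dtype=float)
    energy = np.asarray(energy, dtype=float)
    cumulative = np.cumsum(energy)
    threshold = fraction * cumulative[-1]
    # Index of the first bin whose cumulative energy is >= threshold
    index = int(np.searchsorted(cumulative, threshold))
    return float(freqs[index])

# Made-up spectrum: cumulative energy is 50, 80, 97, 100
freqs = np.array([100.0, 200.0, 300.0, 400.0])
energy = np.array([50.0, 30.0, 17.0, 3.0])

rolloff = spectral_rolloff(freqs, energy)
```

Here the running total first reaches 95% of the energy in the third bin, so the rolloff point is 300 Hz. The `fraction` parameter is exposed because other thresholds are sometimes used.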

 

Spectral spread, skewness, kurtosis, slope, and rolloff enable us to characterize and classify sounds, which can be useful in the study of timbre perception. For instance, these are the kinds of descriptors examined when looking for acoustic correlates in the study of timbre space. They are also very often used in phonetics (e.g., the sibilant fricatives /sh/ and /s/ mainly differ in spectral centroid, skewness and kurtosis [2]) and in the automatic recognition of instrumental timbre with machine learning [3].

References

[1] Peeters, G. (2004). A large set of audio features for sound description (similarity and classification) in the CUIDADO project. IRCAM. 

[2] Jongman, A., Wayland, R. and Wong, S. (2000). Acoustic characteristics of English fricatives. Journal of the Acoustical Society of America, 108(3), 1252–1263.  

[3] Fujinaga, H. (1999). Machine Recognition of Timbre Using Steady-State Tone of Acoustic Musical Instruments. Peabody Conservatory of Music, Johns Hopkins University.
