Sound Spectrum Definition
Understanding the sound spectrum requires a few key definitions.
A sound spectrum is a plot of the simple wave components of a complex wave.
The definition of a sound spectrum relies on the concepts of simple and complex waves.
A simple wave, also known as a sine wave, is the result of simple repeated motion.
A complex wave is a combination of simple waves.
Any complex, repeating wave can be broken down into a series of simple waves. Each simple wave is called a component of the complex wave.
Fig. 1 - All of these simple waves combine to make one complex wave.
You can think of a sound spectrum as a list of all the simple waves that make up a given complex wave.
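As a rough illustration, the idea of a complex wave as a sum of simple waves can be sketched in Python. The component list (100, 200, and 300 Hz with made-up amplitudes), the 8000 Hz sampling rate, and the function names are assumptions chosen for this example, not values taken from Fig. 1.

```python
import math

# Hypothetical components: (frequency in Hz, relative amplitude).
COMPONENTS = [(100.0, 1.0), (200.0, 0.5), (300.0, 0.25)]

def simple_wave(frequency, amplitude, t):
    """Value of one sine (simple) wave at time t, in seconds."""
    return amplitude * math.sin(2 * math.pi * frequency * t)

def complex_wave(components, t):
    """Value of the complex wave at time t: the sum of its simple waves."""
    return sum(simple_wave(f, a, t) for f, a in components)

# Sample the complex wave over one 10 ms cycle of the 100 Hz component,
# assuming 8000 samples per second.
samples = [complex_wave(COMPONENTS, n / 8000) for n in range(80)]
```

In this sketch, the list COMPONENTS plays the role of the spectrum: it holds everything needed to rebuild the complex wave.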
The plural of spectrum is spectra.
The Spectrum of a Sound Wave
The spectrum of a sound wave displays frequency on the x-axis and amplitude on the y-axis.
Frequency is the number of times a wave repeats itself in one second.
Amplitude is the intensity of a wave. The amplitude of a sound wave is perceived as "loudness."
Amplitude is often measured in decibels (dB). Relative amplitude, i.e., how loud one sound is compared to another sound, is most relevant for linguistic analysis.
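Relative amplitude in decibels can be computed with the usual formula for amplitude ratios, 20 times the base-10 logarithm of the ratio. The sketch below assumes both amplitudes are positive and measured in the same units; the function name is made up for this example.

```python
import math

def relative_level_db(amplitude, reference_amplitude):
    """Level of one amplitude relative to another, in decibels."""
    return 20 * math.log10(amplitude / reference_amplitude)

print(relative_level_db(2.0, 1.0))  # about 6.02 dB: twice the amplitude
print(relative_level_db(0.5, 1.0))  # about -6.02 dB: half the amplitude
```

A positive result means the first sound has the larger amplitude, and a negative result means it has the smaller one.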
Frequency and amplitude are the two dimensions of the spectrum of a sound wave. So how do we build the sound spectrum for a complex wave? Here are the four main steps:
Break the complex wave down to its simple wave components.
Measure the frequency and amplitude of each component.
List out each component's frequency in a row from lowest to highest.
Plot each frequency in this list against that component's relative amplitude.
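The four steps above can be sketched with a plain discrete Fourier transform, one common way to break a sampled wave into simple components. This is a minimal illustration, not how any particular analysis program is implemented. The sampling rate, the test wave, and the function name are assumptions for the example.

```python
import cmath
import math

SAMPLE_RATE = 800   # samples per second (assumed)
N = 80              # 0.1 s of audio, so frequency bins are 10 Hz apart

# A made-up complex wave: 100 Hz at amplitude 1.0 plus 200 Hz at 0.5.
wave = [
    1.0 * math.sin(2 * math.pi * 100 * n / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
    for n in range(N)
]

def spectrum(samples, sample_rate):
    """Return (frequency in Hz, amplitude) pairs, lowest frequency first."""
    n_samples = len(samples)
    pairs = []
    for k in range(n_samples // 2):
        # Steps 1-2: correlate the wave with a simple wave at bin k's
        # frequency to measure how much of that component is present.
        total = sum(
            x * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n, x in enumerate(samples)
        )
        frequency = k * sample_rate / n_samples
        # This scaling recovers the sine amplitude for bins above 0 Hz.
        amplitude = 2 * abs(total) / n_samples
        pairs.append((frequency, amplitude))
    # Steps 3-4: the pairs are ordered by frequency and ready to plot.
    return pairs

for frequency, amplitude in spectrum(wave, SAMPLE_RATE):
    if amplitude > 0.01:
        print(f"{frequency:.0f} Hz: amplitude {amplitude:.2f}")
```

For this made-up wave, the loop should list only the two components that were put in: 100 Hz at amplitude 1.00 and 200 Hz at amplitude 0.50.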
Luckily, linguistic analysis programs do this work automatically. However, understanding how a program generates a spectrum can help you interpret the spectrum for linguistic analysis. Now let's take a look at frequency in more detail:
Frequency on the Sound Spectrum
The sound spectrum shows the frequency of each component of the complex wave. Frequency is typically measured in hertz (Hz), meaning cycles per second.
If a wave has a frequency of 100 Hz, then its pattern repeats 100 times every second.
The frequency of a complex wave is the number of times the entire complex wave pattern repeats in one second. This matches the frequency of the wave's lowest-frequency component and is called the fundamental frequency.
On a spectrum, the fundamental frequency is the first complete visible peak. If the first peak on the spectrum has a frequency of 100 Hz, then the fundamental frequency of the entire wave is 100 Hz.
Fig. 2 - The first full "peak" visible on the spectrum (shown by the arrow) is the wave's fundamental frequency. [2]
Each peak to the right of the fundamental frequency represents a higher-frequency component.
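Reading the fundamental frequency off a spectrum amounts to finding the first peak. Here is a minimal sketch, assuming the spectrum is already available as a list of (frequency, amplitude) pairs sorted by frequency. The threshold and the example values are hypothetical.

```python
def fundamental_frequency(spectrum, threshold):
    """Return the frequency of the first peak at or above threshold, else None.

    spectrum is a list of (frequency in Hz, amplitude) pairs sorted by frequency.
    """
    for i in range(1, len(spectrum) - 1):
        _, left = spectrum[i - 1]
        frequency, amplitude = spectrum[i]
        _, right = spectrum[i + 1]
        # A peak is a point higher than both of its neighbours.
        if amplitude >= threshold and amplitude > left and amplitude > right:
            return frequency
    return None

# Hypothetical spectrum with peaks at 100, 200 and 300 Hz.
example = [(50, 0.02), (100, 0.9), (150, 0.03), (200, 0.6),
           (250, 0.02), (300, 0.4), (350, 0.01)]
print(fundamental_frequency(example, threshold=0.1))  # 100
```

The threshold keeps very small bumps from being mistaken for the first peak.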
You can calculate the frequency of a wave by measuring the time it takes for the wave to repeat itself a single time. Looking at a waveform, subtract the time point at the beginning of a cycle from the time point at the end of the cycle. This time measurement is called the period of the wave. The reciprocal of the period (1 divided by the period) is the wave's frequency.
For example, say that a wave takes 0.25 seconds to repeat itself once. The period of the wave is 0.25 seconds. The frequency is 1 divided by 0.25 seconds, which equals 4 cycles per second. This means that the wave repeats itself four times every second and that the wave's frequency is 4 Hz.
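The same calculation can be written as a short Python function. The function name and the two time points (0.50 s and 0.75 s) are assumptions for this sketch.

```python
def frequency_from_cycle(start_time, end_time):
    """Frequency in Hz, given the start and end times (in seconds) of one cycle."""
    period = end_time - start_time
    if period <= 0:
        raise ValueError("the cycle must end after it starts")
    # Frequency is the reciprocal of the period.
    return 1 / period

print(frequency_from_cycle(0.50, 0.75))  # 4.0
```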
Sound Spectrum Analysis
Sound spectrum analysis can be conducted for a variety of purposes. For example, some scientists conducted sound spectrum analysis on baby birds to compare bird songs both before and after they slept. They found that the baby birds were learning the songs in their sleep!
Here are a few things that are helpful to keep in mind when using a sound spectrum for speech analysis.
The "peaks" on the spectrum represent the components of the glottal source wave.
The harmonic frequencies of the glottal source wave components are whole number multiples of the fundamental frequency.
The wavy amplitude pattern on the spectrum represents vowel formants.
The key terms here (glottal source wave, harmonic frequencies, and formants) may be tricky to understand, so let's unpack them:
When you produce a vowel, you push air past the vocal folds in your throat, causing them to vibrate. This vibration is the glottal source wave. Every peak on the sound spectrum of a vowel is a component of the glottal source wave.
Once you know the fundamental frequency of the glottal source wave, you can easily calculate the rest of the harmonic frequencies. A harmonic frequency is a whole number multiple of the fundamental frequency. For example:
The sound wave in Fig. 2 has a fundamental frequency of 100 Hz. The frequency of every other peak is a whole number multiple of 100. The next few harmonic frequencies are:
100*2 = 200 Hz
100*3 = 300 Hz
100*4 = 400 Hz
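The same multiplication can be sketched in Python. The function name is made up for this example, and the returned list starts with the fundamental itself.

```python
def harmonic_frequencies(fundamental, count):
    """First `count` harmonics in Hz, starting from the fundamental itself."""
    return [fundamental * n for n in range(1, count + 1)]

print(harmonic_frequencies(100, 4))  # [100, 200, 300, 400]
```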
The sound spectrum of a vowel looks "wavy" at the top. The high points on this wavy pattern represent frequency ranges with higher amplitude. These high-amplitude ranges are the vowel's formants.
A formant is a high-amplitude frequency band within a vowel that acoustically differentiates one vowel from another.
Formants help identify the posture of the vocal tract during different vowels. For example, formant frequencies can distinguish [a] from [i].
Sound Spectrum Issues
Some issues can arise when using a sound spectrum for real speech analysis. The biggest issue is that sound spectra in real speech are more difficult to read than sound spectra in synthesized speech.
A synthesized sound wave can have a neat, clean spectrum with clear and intelligible components. A recording of human speech, on the other hand, consists of imprecise waves and often contains background noise that confuses the program.
Fig. 3 - The smaller peaks on this spectrum represent background noise and imperfections in the speaker's voice. [2]
The sound spectrum of a natural human vowel tends to look jagged and rough compared to the smooth, uniform spectrum of a computer-synthesized vowel. This is because a human speech recording often comes with background noise, and the human voice doesn't produce perfectly even, pure tones.
In the spectrum of a human recording, only the largest peaks represent the harmonics of the glottal source wave. The rest are the result of background noise and imperfections in the speaker's voice.
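One possible way to separate harmonics from the other peaks is to keep only the peaks that sit close to a whole number multiple of the fundamental frequency. The sketch below assumes the peaks have already been measured. The peak values, the 5 Hz tolerance, and the function name are hypothetical.

```python
def harmonic_peaks(peaks, fundamental, tolerance=5.0):
    """Keep peaks whose frequency is within `tolerance` Hz of a harmonic."""
    kept = []
    for frequency, amplitude in peaks:
        # Nearest whole number multiple of the fundamental.
        multiple = round(frequency / fundamental)
        if multiple >= 1 and abs(frequency - multiple * fundamental) <= tolerance:
            kept.append((frequency, amplitude))
    return kept

# Hypothetical peaks (frequency in Hz, amplitude in dB) from a noisy recording.
peaks = [(101, 60), (137, 22), (199, 55), (260, 18), (302, 48)]
print(harmonic_peaks(peaks, fundamental=100))
# [(101, 60), (199, 55), (302, 48)]
```

In this made-up data, the peaks that survive are also the largest ones, which matches the pattern described above.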
A recording of human speech will never provide a completely pure sound. However, the sound spectrum still serves as a useful tool for speech analysis.
Sound Spectrum - Key takeaways
- A sound spectrum is a plot of the simple wave components of a complex wave.
- The spectrum of a sound wave displays frequency on the x-axis and amplitude on the y-axis.
- The fundamental frequency of a complex wave is the number of times the entire wave repeats in one second. On a spectrum, the fundamental frequency is the first complete visible peak.
- In human speech, the peaks on the spectrum represent the components of the glottal source wave.
- The wavy amplitude pattern on the spectrum represents vowel formants. Formants are high-amplitude frequency ranges that help differentiate one vowel from another.
References
- Fig. 1 - Waveform Addition (https://commons.wikimedia.org/wiki/File:Complex_wave_with_its_components.jpg) by BoH (https://commons.wikimedia.org/wiki/User:BoH) is licensed under CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)
- Boersma, Paul & Weenink, David (2022). Praat: doing phonetics by computer [Computer program]. Version 6.3.01, retrieved 21 November 2022 from http://www.praat.org/.
Content Creation Process:
Lily Hulatt is a Digital Content Specialist with over three years of experience in content strategy and curriculum design. She gained her PhD in English Literature from Durham University in 2022, taught in Durham University’s English Studies Department, and has contributed to a number of publications. Lily specialises in English Literature, English Language, History, and Philosophy.
Content Quality Monitored by:
Gabriel Freitas is an AI Engineer with solid experience in software development, machine learning algorithms, and generative AI, including applications of large language models (LLMs). He graduated in Electrical Engineering from the University of São Paulo and is currently pursuing an MSc in Computer Engineering at the University of Campinas, specializing in machine learning topics. Gabriel has a strong background in software engineering and has worked on projects involving computer vision, embedded AI, and LLM applications.