In this section, we will cover three main questions:
How do we hear sound?
What exactly is sound?
What is analog storage?
Neuroscience of music
There are two parts to our connection with music. The first is the neuroscience behind how we hear sound, and the second is how we perceive it.
I. Neuroscience of Sound
Let's be honest, the science behind music would not be as relevant if our brains were unable to process the sounds around us. Sounds are essentially patterns of compression in the air that our ears pick up and our brains interpret. The process converts mechanical energy (changes in air pressure, the moving parts of our ear) into electrical energy (action potentials in our neurons). When air pressure changes and air is displaced (which is what sound is), the vibration reaches our eardrum, which moves the three tiny bones in the middle ear. These bones pass the vibration on to tiny hair cells in the inner ear, which produce electrical signals\(^1\). These electrical impulses are then carried to our brain by the auditory nerve.
II. How we perceive sound
The field of study that looks into how humans perceive sound is called psychoacoustics\(^2\). It examines how the brain makes sense of the sounds that travel from our ears to the neurons in the auditory cortex, meaning it's truly a sensory experience we're studying.
For one, our range of hearing is limited: humans cannot hear all sounds. For example, dog whistles are outside the human hearing range. Generally, we can hear sounds at frequencies from about 20 Hz to 20,000 Hz, though we hear best from about 1,000 Hz to 5,000 Hz, where human speech is centered\(^3\).
But we can do really cool things with our hearing. Sound localization is one of them: the brain uses subtle differences in loudness, tone, and timing between the two ears to figure out where a sound is coming from\(^4\).
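To get a feel for how subtle those timing differences are, here is a rough sketch. It assumes a very simplified model that is not from our sources: the two ears are points a fixed distance apart, the sound source is far away, and the extra travel time to the far ear is `spacing * sin(angle) / speed of sound`. The ear spacing value and the function name are illustrative assumptions.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature
EAR_SPACING_M = 0.2         # assumed distance between the ears (illustrative value)


def interaural_time_difference(angle_deg: float) -> float:
    """Extra travel time (in seconds) to the far ear for a distant source.

    angle_deg is measured from straight ahead:
    0 = directly in front, 90 = directly to one side.
    Simplified straight-line model: delay = spacing * sin(angle) / speed.
    """
    angle_rad = math.radians(angle_deg)
    return EAR_SPACING_M * math.sin(angle_rad) / SPEED_OF_SOUND_M_S


# Print the delay for a few source directions, in microseconds.
for angle in (0, 30, 90):
    delay_us = interaural_time_difference(angle) * 1e6
    print(f"{angle:>2} degrees -> {delay_us:.0f} microseconds")
```

Even in the most extreme case of this toy model (a sound directly to one side), the delay is well under a millisecond, which gives a sense of how fine the timing cues are.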
Another perceptual effect is auditory masking. There are two types: simultaneous masking and non-simultaneous (temporal) masking. Simultaneous masking happens when a signal and a stronger masker are played together, so the listener cannot hear the weaker signal. Temporal masking happens when the masker occurs shortly before or after the weaker signal rather than at the same time. In addition, the closer in time the weaker sound is to the louder masker, the louder it needs to be in order to remain perceivable\(^5\).
Non-simultaneous/temporal masking in three regions. Citation: https://community.sw.siemens.com/s/article/masking
As a little foreshadowing, psychoacoustics is relevant to our later section on lossy data compression. Here's a hint: Do we really need all the sounds to hear music?
How does sound work?
I. Sound in the Physical World
Congrats! If you've made it this far, you've developed some understanding of what it means to perceive sound. But maybe you have some lingering questions.
Like, what's a "frequency," and why is it measured in "Hz" (however that's pronounced!)? What does it mean to "displace" air? What's a signal? There's a more fundamental question at the root of all of these: what is sound?
Let's discuss what happens when you hum into a kazoo. What precisely is going on, and how does it make sound?
When you hum into the mouthpiece of the kazoo (the wide end), you send air towards that circular vent at the top of the body. The air wants to rush out, but it crashes into a flexible membrane also known as a diaphragm. The diaphragm can be made of paper or cellophane, and it is sandwiched between a ring-shaped cover and the body of the kazoo. Observe the anatomy of a kazoo below:
Your breath hits the diaphragm and the air around it begins to vibrate, causing periodic variation in air pressure, or sound! Sound is what occurs when air molecules get shoved around - or displaced - over a period of time. We can model it with sinusoids (sine or cosine graphs) like the one on the right.
Tip: If you're unsure what a sinusoid is, don't fret! Check out our quick section on sinusoids under the "Digital Storage" tab.
The more air molecules are displaced, the louder the sound - this is the physical reason why a high amplitude indicates a loud sound.
That said, a frequency is just the rate at which a vibration occurs (how many cycles of a wave fit in a given period of time?), and "Hz" (pronounced "hertz") is simply the SI unit for frequency: one hertz is one cycle per second. If that sounds abstract (or if you haven't yet studied physics), feel free to think of frequency as how tightly the waves in the graph seen above are packed together: the higher the frequency, the higher the pitch we hear.
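If you like seeing ideas as code, here is a minimal sketch of the sinusoid model with its amplitude and frequency spelled out. The 440 Hz tone, the unit amplitude, and the function names are assumptions chosen for illustration, not measurements of an actual kazoo.

```python
import math


def pressure(t: float, amplitude: float, frequency_hz: float) -> float:
    """Air-pressure variation at time t (in seconds) for a single pure tone."""
    return amplitude * math.sin(2 * math.pi * frequency_hz * t)


def period_seconds(frequency_hz: float) -> float:
    """Duration of one full cycle: the reciprocal of the frequency."""
    return 1.0 / frequency_hz


frequency_hz = 440.0  # assumed example tone: 440 cycles every second
amplitude = 1.0       # arbitrary units; a larger value models a louder sound

# Sample one full cycle at evenly spaced points in time.
cycle = period_seconds(frequency_hz)
for i in range(9):
    t = i * cycle / 8
    value = pressure(t, amplitude, frequency_hz)
    print(f"t = {t * 1000:.3f} ms  pressure = {value:+.3f}")
```

Doubling `amplitude` doubles the height of every sample (a louder sound), while doubling `frequency_hz` halves the time one cycle takes (a higher pitch).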
But that's enough about sound for the time being. Let's talk about music storage!
Analog recording is the first music storage method that we will cover in our blogpost! Analog recording methods store sound as a continuous signal in or on the storage medium. It is the oldest way of storing sound, dating all the way back to the 1870s, when French inventor Charles Cros described a way to transfer phonautograph recordings to grooves on a disc\(^6\).
Thomas Edison revolutionized analog recording by creating the first device that could both store and play back sound. His invention, the phonograph cylinder, dates to July 18, 1877. The idea is simple: sound waves hit a diaphragm and jiggle a needle attached to it, and the needle presses imprints that represent the sound into a cylinder covered in tinfoil\(^7\).
However, the phonograph had many issues. One was the physical contact between the needle and the tinfoil, which was required to play the recording. This contact wore down the grooves in the tinfoil with each play, so the recording was gradually lost. In addition, the phonograph's sound quality was poor: it did not accurately reproduce the original sound (low fidelity)\(^8\).
Needle on record groove
Long-playing (LP) records work similarly, but the sound waves are stored on a flat disc rather than a cylinder. Sound is recorded on an LP as a groove whose squiggles vary in size and spacing depending on the sound. For example, bass frequencies create larger squiggles in the groove, and pitch determines how closely the squiggles are spaced\(^9\).
While analog recording revolutionized music by making it possible to store sound on a medium and replay it later, it also has its limitations. The main issue is that the medium on which the recording is stored physically changes over time, as noted above for the phonograph. In addition, the medium itself limits how the music can be stored. For example, LP records suffer from a phenomenon known as diameter loss: near the center of the disc, higher frequencies (the fastest, smallest squiggles) get scrunched together so much that they are hard to reproduce\(^9\).
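A quick back-of-the-envelope calculation shows why the squiggles get scrunched near the center. An LP spins at a constant 33 1/3 revolutions per minute, so the groove slides under the needle more slowly at small radii, and one cycle of a tone has to fit into a shorter stretch of groove. The sketch below assumes rough groove radii of 14.6 cm (outer) and 6 cm (inner); those numbers and the function names are illustrative assumptions, not values from our sources.

```python
import math

RPM = 100 / 3  # LP rotation speed: 33 1/3 revolutions per minute


def groove_speed_m_s(radius_m: float) -> float:
    """Linear speed of the groove passing under the needle at a given radius."""
    return 2 * math.pi * radius_m * RPM / 60


def wavelength_on_disc_um(frequency_hz: float, radius_m: float) -> float:
    """Length of groove (in micrometers) taken up by one cycle of a tone."""
    return groove_speed_m_s(radius_m) / frequency_hz * 1e6


OUTER_RADIUS_M = 0.146  # assumed radius of the outermost groove
INNER_RADIUS_M = 0.060  # assumed radius of the innermost groove

# Compare how much groove one cycle gets at the edge versus near the center.
for frequency_hz in (100.0, 1_000.0, 20_000.0):
    outer = wavelength_on_disc_um(frequency_hz, OUTER_RADIUS_M)
    inner = wavelength_on_disc_um(frequency_hz, INNER_RADIUS_M)
    print(f"{frequency_hz:>7.0f} Hz: {outer:8.1f} um (outer)  {inner:8.1f} um (inner)")
```

Under these assumptions, every tone gets less than half as much groove per cycle near the center as it does at the edge, and the highest frequencies are left with the least room of all.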
Fun fact: Vinyl records are LP records that are made of polyvinyl chloride ("vinyl").