Would it surprise you to learn that what is commonly taught about digital audio is actually wrong? Almost anyone you talk to – even very smart and successful audio engineers – will explain digital audio as a representation of sound created when a computer (an analog-to-digital converter, or just “converter” in common speech) takes many “pictures” of the audio very quickly. These pictures are called “samples.” How often does a converter take these pictures/samples? The most common rate is 44,100 times per second (written as 44.1 kHz). This is called the sampling frequency.
Because audio waveforms are depicted as smooth curves – especially simple sine wave tones – it seems reasonable that discrete “pictures” – static representations of something dynamic (like still photos compared to a movie), lined up sequentially next to each other – might get close, but could never truly capture the real sound accurately. The tops of these “picture bars” would leave a stair-step pattern that, when laid over the actual smooth sound wave, would show little gaps between the flat tops and the curve. You could get closer by increasing the number of pictures, but you could never be fully accurate. Well, guess what? All of that is wrong! In the VERY early days of digital audio there was some truth to it. But the technology is now such that the whole stair-step thing is simply no longer true – the converter’s output stage reconstructs a smooth wave from the samples.
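Each “picture” is really just a measurement of the waveform’s height at one instant. Here’s a minimal Python sketch of sampling a sine tone at 44.1 kHz (the 440 Hz tone is my arbitrary example, not anything from a real converter):

```python
import math

sample_rate = 44_100   # samples ("pictures") per second
freq = 440.0           # an A440 sine tone, chosen just for illustration

# The first five samples: each is simply the wave's height at that instant.
samples = [math.sin(2 * math.pi * freq * n / sample_rate) for n in range(5)]
print(samples)
```

A real converter measures voltage instead of computing a sine, but the idea is the same: a list of instantaneous heights, not a stair-step picture.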
Besides, the above is talking about sampling frequency and not bit-depth. We haven’t even mentioned bits yet. Bit-depth refers to how many bits are used in each sample (picture) to convey information about dynamic range – how much of the source audio can be accurately represented in each sample. I wrote a much better explanation of this in the post – 16-Bit Audio Recording – What The Heck Does It Mean? It even uses a champagne metaphor;).
Anyway, there is a common explanation out there that bit depth is very much like “resolution” in video. It turns out that this is a bad comparison. We all know that 8-bit video looks pretty sucky – all pixelated and stuff. But in audio, we’re merely talking about dynamic range. If we only had 2 bits available per sample, we could not represent enough of the original audio accurately, so there would be a lot of noisy hiss. More bits give us less noise, in theory. But we only need enough bits to push the noise low enough that we humans can’t hear it. Employing more bits to reduce noise we already can’t hear seems silly, doesn’t it? Why yes, yes it does. And it is! This always reminds me of the joke about two people running away from a hungry tiger. One of them says, “I don’t have to be faster than the tiger. I only have to be faster than YOU!”
So 16 bits has been the standard for audio on CDs from the beginning. With 16 bits, that low-level hiss is pushed all the way down to −96 decibels. That is more dynamic range than we really need, and it allows very quiet audio to be heard with no audible hiss. In fact, you could even go down to 8 bits and the audio would still sound a LOT better (in terms of hissy noise) than cassette tape!
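That −96 dB figure comes straight from the bit count – each bit buys you roughly 6 dB. A quick sketch of the math:

```python
import math

def dynamic_range_db(bits):
    # Each extra bit doubles the number of amplitude steps, and every
    # doubling adds 20 * log10(2) ≈ 6.02 dB of dynamic range.
    return 20 * math.log10(2 ** bits)

print(round(dynamic_range_db(16)))  # 16-bit CD audio -> about 96 dB
print(round(dynamic_range_db(8)))   # even 8 bits -> about 48 dB
```

48 dB of range from 8 bits is already well beyond what a typical cassette deck managed, which is the point of the comparison above.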
There is a fantastic article about all of this where you can delve more into the nitty and (not so) gritty (ha! A resolution joke – get it?) details here: http://www.sonicscoop.com/2013/08/29/why-almost-everything-you-thought-you-knew-about-bit-depth-is-probably-wrong/
Cheers!
Ken
Neil Young's Pono High Resolution Audio Project
In 2012, Neil Young started making waves (ha! – OK, that was an accidental pun) about how the public is consuming audio in a very low-resolution way. For instance, the files you listen to on your iPod, iPhone or other mobile device are data-compressed. That means that files like mp3s and aac audio files have had information removed and rearranged from the original master audio in order to reduce the file size so they can be streamed more easily over the internet. This is why you can put thousands of songs on an iPod too. Young wants the public to be able to listen to the full sound of “high resolution” audio. So he invented a thing called “Pono.” It’s a digital player that would let you hear “the full audio,” instead of the mp3s and aacs we’re listening to now.
The process of creating good-sounding versions of audio at much smaller file sizes began in the early 90s with mp2 and then mp3. To give you an idea of the difference in file size between what is on a CD and the mp3 version: if you take a song from a CD and convert it (people call this “ripping” for some reason) to mp3, you end up with a file about a tenth the size of the original. For example, a song on a CD might be a 40MB wav file, but when ripped to mp3 it ends up only around 4MB.
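That ten-to-one ratio is easy to sanity-check. CD audio is 44,100 16-bit (2-byte) samples per second on each of two channels, so a song of around 3.8 minutes (my example duration) lands right at 40MB uncompressed:

```python
def wav_size_mb(seconds, sample_rate=44_100, bits=16, channels=2):
    # Uncompressed size = samples/sec * bytes per sample * channels * duration
    return seconds * sample_rate * (bits // 8) * channels / 1_000_000

cd_size = wav_size_mb(227)     # a ~3.8-minute song
print(round(cd_size))          # about 40 MB as a wav
print(round(cd_size / 10, 1))  # an mp3 rip comes in near a tenth of that
```

A typical 128 kbps mp3 of the same song really does come out in the 3–4MB neighborhood, so the rule of thumb holds up.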
Additionally, even the files on a CD are technically lower quality than the master recording. The CD standard requires a 16-bit file (see our article on bit depth in audio recording here) and a 44.1 kHz sampling rate (see our article on sampling frequency here). But the master recordings used to make those CDs are almost always 24-bit and either 48 kHz or 96 kHz. So even CDs are “lower rez” than the master recording.
On top of all that, master recordings can be even higher resolution than 24/96! Some masters use 192 kHz. So it might be natural to think that since 24-bit/96 kHz audio (some call this the minimum for “bona fide high-resolution audio”) is being reduced to 16-bit/44.1 kHz on a CD, and then reduced further to create mp3s, the end product on your device should sound like ass, right? Well, I’m betting that a good “99-point-a-whole-bunch-more-9s” percent of people could not tell the difference between any of these resolutions. It isn’t like your TV screen, where the difference between standard and HD is painfully obvious to most people. In audio, it just isn’t so obvious.
However, a small percentage of true audiophiles can certainly hear a difference, and they think it’s a shame that the audio-consuming public is mostly listening to what they consider inferior-quality audio. One such person is Neil Young. There was a large publicity push about Pono in 2012, but then it seemed to quietly fade as an idea. So what is the latest status? Take a look at this article for the latest on Pono.
http://news.cnet.com/8301-13645_3-57608168-47/whats-up-with-neil-youngs-pono-high-resolution-music-system/
What Is Sampling Frequency?
Sampling frequency in audio recording can perhaps best be described by first taking a look at your favorite Hollywood movie. Even though life happens continuously, movies show us fast chains of still images that our brains interpret as continuous movement. Usually the rate is around 24 frames per second. Sound is also continuous, but it takes a lot more than 24 “frames” of sound per second to sound decent. Although a film camera is taking pictures of people and real objects, audio equipment is taking “pictures” of sound waves. These pictures are called samples, and the most common rate is 44,100 sound samples per second, or 44.1 kHz.
You Don’t Remember Physics 101?
You might remember from a boring day in physics class that sound waves of different frequencies produce different pitches. When taking samples of these sounds, the sampling frequency must be fast enough not to miss any of the exceptionally fast (high-frequency) waves. Some really smart people – notably Harry Nyquist in the 1920s, with the theory formalized by Claude Shannon later on – figured out what “fast enough” is when it comes to recording. It’s something known as the sampling theorem. To put it simply, your sampling frequency needs to be at least twice the highest frequency you are recording in order to have all of the samples necessary to sound continuous when played back. Knowing that the high end of human hearing stops at around 20 kHz, and that a little wiggle room is nice to have, we can see why 44.1 kHz is used for most playback applications.
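The arithmetic behind that choice is about as simple as it gets – here’s a sketch, using 20 kHz as the top of human hearing:

```python
def min_sample_rate(highest_freq_hz):
    # Sampling theorem: you need more than two samples per cycle of the
    # highest frequency you want to capture on playback.
    return 2 * highest_freq_hz

print(min_sample_rate(20_000))  # 40,000 Hz minimum; 44,100 adds wiggle room
```

Anything above the theorem’s minimum works; 44.1 kHz leaves headroom for the filtering that happens at the edge of that 20 kHz limit.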
Sampling Frequencies Above 44.1
Of course there are more sampling frequencies than just 44.1 kHz. Despite what the sampling theorem suggests, some people claim to be able to hear an improvement in recordings made at sampling frequencies much higher than 44.1 kHz. 48 kHz is popular in video-related applications, and 88.2 and 96 kHz are both popular since they are exactly double 44.1 and 48 kHz respectively, making the conversion down to the end product easy for the computer. All of this is about as clear as mud. Is it worth it to record at a higher sampling frequency? That’s an ongoing debate, with some very passionate arguments on either side.
What it all comes down to is storage space. Storing 96,000 24-bit (see our post on 16-bit audio recording for more info on bit depth) samples per second of audio is going to take up a fair amount of space. If you do a lot of recording at home, or if you’re like Jimi Hendrix (or me) and love multi-track recording, you might end up running out of storage space faster than you anticipated. Big record companies might not bat an eyelash at using almost a gigabyte of storage for a multi-track song, but that sure seems wasteful to me.
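You can put a number on that wastefulness with some quick math. The 24-track session below is my example; the per-track arithmetic is just samples per second times bytes per sample times duration:

```python
def session_size_gb(minutes, tracks, sample_rate=96_000, bits=24):
    # Mono tracks: samples/sec * bytes per sample * seconds * track count
    bytes_total = minutes * 60 * sample_rate * (bits // 8) * tracks
    return bytes_total / 1_000_000_000

print(round(session_size_gb(4, 1), 2))   # one 4-minute track: ~0.07 GB
print(round(session_size_gb(4, 24), 2))  # a 24-track song: ~1.66 GB
```

So a single multi-track song at 24-bit/96 kHz really can eat well over a gigabyte before you’ve even started adding alternate takes.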
The Real World – Can You Or Can’t You…Really?
If you want to hear the differences for yourself, record a short dialog twice: once at 44.1 kHz and once at 96 kHz. Have a friend or family member rename the two files so that only they know which is which. Listen to each of them and try your best to figure it out by how each one sounds. If you can’t tell a difference (“the dog barked in the 44.1 kHz recording” doesn’t count!), then your listeners probably won’t be able to either.
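If your friend is the scripting type, the “rename so only they know which is which” step can be sketched in a few lines. The filenames here are placeholders, and this sketch only builds the secret mapping rather than renaming anything on disk:

```python
import random

originals = ["dialog_44100.wav", "dialog_96000.wav"]  # placeholder names

shuffled = originals[:]
random.shuffle(shuffled)
# Only whoever keeps this dict knows which mystery file is which.
mapping = {f"mystery_{i}.wav": name
           for i, name in enumerate(shuffled, start=1)}
print(sorted(mapping))
```

Your friend keeps `mapping` to themselves until you’ve made your guesses – that’s the whole blind in “blind test.”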
One parting note: be careful not to confuse sampling frequency with bit depth when moving through the digital audio universe. It is a pretty common thing.
Sample Rate of 88.2 Kilohertz – Ouch, My Brain Hurts
I don’t know about you, but when I start reading about audio sample rates, and scary numbers with decimal points and symbols like “kHz” start showing up, my brain tries to escape from my skull. Jeez, I’m a musician, not a tech geek (though not for lack of trying).
Unfortunately, if we are going to get into audio recording, we should train our brains to stay still long enough for some fundamentals. Just as it is not necessary to understand why our iPhones work in order to operate them, we don’t truly need to know what a “kHz” is in order to grasp how it might be important to our recordings. It stands for kilohertz (1,000 cycles per second), and all you really need to know is that the music you listen to on your CDs is 44.1 kHz. So however you record your audio in your home recording studio, when it’s finished, it should be 44.1 kHz.
Some folks believe you should record at higher rates, like 88.2 (stay with me!) kHz, converting down to 44.1 at the end. Personally I don’t see the point (get it? I made a decimal joke). Yes, technically the audio will be “higher definition” (pardon the video metaphor), but I don’t think most folks would be able to tell the difference. Meh, to each their own.
Here is my article on sampling frequency: https://www.homebrewaudio.com/what-is-sampling-frequency/
Here is an article that tries to make the case for always recording at 88.2 kHz:
http://theproaudiofiles.com/3-reasons-to-record-at-88-2-khz/