www.loudspeakerindustrysourcebook.com - The Loudspeaker Industry Sourcebook
Posted by LIS STAFF on 03/21/2019

Limiting Audio Reproduction Bandwidth to 20 kHz May Also Be Limiting Your Potential Market

Dan Foley

(Eastern Region Sales, Audio Precision, and WPI Adjunct Faculty Member)

For decades, 20 Hz to 20 kHz has been the accepted frequency range for recording and playing back music. Because of this, audio reproduction technology, in particular the tweeter, has been designed to have a relatively flat response to 20 kHz before rolling off. When recording technology moved from analog to digital, 44.1 kHz and 48 kHz became, and still are, the sampling rates of choice for recording and playback, since their Nyquist frequencies (22.05 kHz and 24 kHz) sit just above 20 kHz.
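
The link between sample rate and usable bandwidth is simple arithmetic: the Nyquist frequency is half the sample rate. The short sketch below is only an illustration (the function name is hypothetical); it prints the Nyquist frequency for the two conventional rates and for two common high-resolution rates.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency that can be represented without aliasing."""
    return sample_rate_hz / 2.0


# Conventional rates first, then two common high-resolution rates.
for rate_hz in (44_100, 48_000, 96_000, 192_000):
    print(f"{rate_hz / 1000:g} kHz sampling -> Nyquist {nyquist_hz(rate_hz) / 1000:g} kHz")
```

At 192 kHz the Nyquist frequency is 96 kHz, which leaves room for any content an instrument may produce well above 20 kHz.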

In the past several years, there has been a great deal of interest in providing music to the consumer that has been recorded at sample rates well above 48 kHz. However, the vast majority of tweeters and finished loudspeakers currently on the market do not provide frequency response much beyond 20 kHz. In addition, there is a lot of controversy as to whether or not recording above 20 kHz is even worth the effort.

Such a premise makes sense if the instruments being recorded (e.g., percussion, brass, strings, and even vocals) produce little-to-no acoustic energy above 20 kHz. Why bother trying to record a signal that does not exist? But what if these instruments do produce energy above 20 kHz? To determine if this is the case, I supervised a student research project in 2010 conducted by two students from Worcester Polytechnic Institute located in Worcester, MA. The purpose of this project was to answer two basic questions:


Does acoustic music created by common instruments contain ultrasonic energy? 

If yes, how high in frequency is this energy and what is the corresponding level?


We chose to record the following instruments:

Trumpet

Conga drum

Violin

Nylon string guitar

Female soprano vocalist


Each instrument was recorded individually in a professional voice-over booth similar to that shown in Photo 1. A Prism Sound Orpheus was used as the analog-to-digital converter (ADC) as it can record up to eight channels simultaneously at 192 kHz, 24-bit. Three microphones were used in a side-by-side configuration and were placed at typical distances used for close-mic recording of these instruments. The microphones used were the GRAS 46BF 1/4” microphone set, the GRAS 46AC 1/2” microphone set, and the Shure SM81 microphone. 

Table 1 shows the corresponding frequency response curves and sensitivities for these microphones. 

Table 1: The microphone frequency response and sensitivity are shown for the three different microphones. For the GRAS 46BF and the GRAS 46AC, the flat response corresponds to free-field measurements.



The musician would enter the voice-over recording booth and position his or her instrument a few inches away from the side-by-side microphones. The musician would then play a musical passage, or in the case of the vocalist sing a passage, that would be 30 seconds to 50 seconds in length. 

Audacity, a free, open-source program used for audio recording, editing, and mixing, was used to generate spectrogram plots of the corresponding 192 kHz recordings. To easily view the spectrogram of each instrument, a blank track was created to contain the three separate .wav files. This track consisted of three sections: the GRAS 1/4” microphone recording, the GRAS 1/2” microphone recording, and the Shure SM81 recording.

Figure 1 shows the track layout using the conga drum as an example. Although each of the three .wav files may have small variances from one another, they are all the same take for each instrument including the soprano vocals.

Figure 1: Audacity Track Layout with microphone bandwidth shown in parentheses.


The spectrogram function was used to determine whether or not ultrasonic energy is present in these recordings, especially in the transient portions of the music signal. A spectrogram is built from Fast Fourier Transform (FFT) analysis, but instead of a single plot of magnitude (Y-axis) versus frequency (X-axis), it adds time as a third dimension. The Audacity spectrogram displays elapsed time on the X-axis and frequency on the Y-axis. Magnitude is displayed as a spectrum of colors, with white representing magnitude at or close to 0 dBFS and blue representing levels that are -60 dBFS or lower.
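
For readers who want to see the idea in code, here is a minimal sketch of a spectrogram using only the Python standard library. It is not how Audacity is implemented; the function names, the 50% frame overlap, and the -120 dB floor are assumptions made for illustration. Each row of the result is one time frame and each column is one frequency bin, with levels expressed in dB relative to a full-scale sine.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def frame_spectrum_dbfs(frame, floor_db=-120.0):
    """Hann-windowed DFT magnitudes of one frame, in dB re a full-scale sine."""
    n = len(frame)
    window = hann(n)
    # Scale so a full-scale sine centred on a bin reads 0 dBFS
    # (the DC and Nyquist bins would need a different factor).
    full_scale = sum(window) / 2.0
    levels = []
    for k in range(n // 2 + 1):
        acc = sum(window[i] * frame[i] * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        mag = abs(acc) / full_scale
        levels.append(max(20 * math.log10(mag), floor_db) if mag > 0 else floor_db)
    return levels


def spectrogram(samples, frame_len=128, hop=64):
    """One row per time frame, one column per frequency bin."""
    return [frame_spectrum_dbfs(samples[start:start + frame_len])
            for start in range(0, len(samples) - frame_len + 1, hop)]


# Demo: a full-scale 30 kHz tone sampled at 192 kHz.
SAMPLE_RATE_HZ = 192_000
tone = [math.sin(2 * math.pi * 30_000 * i / SAMPLE_RATE_HZ) for i in range(512)]
rows = spectrogram(tone)
peak_bin = max(range(len(rows[0])), key=lambda k: rows[0][k])
print(len(rows), "frames; peak at", peak_bin * SAMPLE_RATE_HZ / 128, "Hz")
```

A plotting tool would then map each level to a color, which is what the white-to-blue scale described above does.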

The Recording Results

To better interpret the color scale, I created a stepped-level sine sweep in Audacity with the first part at 0 dBFS and each subsequent level attenuated by 6 dB. Figure 2 shows the corresponding stepped-level track along with its associated spectrogram. The white represents a signal with a magnitude at or close to 0 dBFS, white with a small amount of red mixed in is a signal at -6 dBFS, and so on until the very light blue represents levels that are 60 dB or more below digital full scale. The spectrogram settings were as follows:

- FFT buffer of 128 samples (667 µs total buffer length) so as to ensure the least amount of smearing of transient energy

- Linear frequency scale (Y-axis) from 0 Hz to 70 kHz

- Hanning window
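
The choice of a short FFT buffer is a trade between time resolution and frequency resolution. The sketch below works through that arithmetic for the 128-sample setting at 192 kHz and, for comparison, two larger sizes that are assumptions chosen for illustration and were not used in the study.

```python
SAMPLE_RATE_HZ = 192_000
FFT_SIZE = 128

# Length of one analysis frame in microseconds (about 667 µs).
frame_duration_us = FFT_SIZE / SAMPLE_RATE_HZ * 1e6
# Spacing between adjacent frequency bins in hertz.
bin_spacing_hz = SAMPLE_RATE_HZ / FFT_SIZE
# Number of bins at or below the 70 kHz top of the displayed scale.
bins_to_70_khz = int(70_000 // bin_spacing_hz) + 1

print(f"{FFT_SIZE} samples: {frame_duration_us:.0f} us per frame, "
      f"{bin_spacing_hz:.0f} Hz per bin, {bins_to_70_khz} bins up to 70 kHz")

# Larger buffers (hypothetical alternatives) sharpen frequency detail
# but smear transients over a longer time span.
for size in (1024, 8192):
    print(f"{size} samples: {size / SAMPLE_RATE_HZ * 1e3:.2f} ms per frame, "
          f"{SAMPLE_RATE_HZ / size:.1f} Hz per bin")
```

With 128 samples, each bin is 1.5 kHz wide, which is coarse in frequency but short enough in time to localize the transient of a drum hit or a plucked string.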

Figure 2: This is the color scale of spectrogram plots that I created with Audacity.
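
A comparable calibration signal can be sketched in a few lines. This is an assumption-laden simplification, not the Audacity procedure used for Figure 2: it uses a fixed 1 kHz tone instead of a sweep, 10 ms per step, and hypothetical function names. Each step is 6 dB lower than the one before, from 0 dBFS down to -60 dBFS.

```python
import math


def dbfs_to_amplitude(level_dbfs: float) -> float:
    """Convert a level in dBFS to a linear peak amplitude (1.0 = full scale)."""
    return 10 ** (level_dbfs / 20)


def stepped_tone(freq_hz=1_000.0, sample_rate_hz=192_000,
                 step_db=6.0, steps=11, step_seconds=0.01):
    """Sine tone whose level drops by step_db at each successive step."""
    samples_per_step = round(sample_rate_hz * step_seconds)
    samples = []
    for step in range(steps):
        amplitude = dbfs_to_amplitude(-step_db * step)
        samples.extend(
            amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(samples_per_step)
        )
    return samples


signal = stepped_tone()
for step in range(11):
    print(f"{-6 * step:4d} dBFS -> peak amplitude {dbfs_to_amplitude(-6 * step):.4f}")
```

Note that a 6 dB step is very nearly a halving of amplitude, so the last step at -60 dBFS has a peak of one thousandth of full scale.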


Figure 3 shows the spectrogram of the conga drum. The dashed red line is positioned at 20 kHz, and the majority of the higher-intensity acoustic energy is within the audio range. However, it is very apparent that ultrasonic energy is present, especially during the initial transient of the hand/fingers hitting the conga drum head. Since the GRAS 1/4” microphone has a flat response out to 100 kHz, it better captured the ultrasonic energy present in the conga drum transients. The Shure SM81, designed to roll off above 20 kHz, did not capture any ultrasonic energy above 40 kHz. Surprisingly, the GRAS 1/2” microphone did capture a fair amount of this energy despite it starting to roll off at 40 kHz.

Figure 3: Spectrogram of conga drum


Figure 4 shows the resulting spectrogram for a nylon string guitar where the musician was playing a flamenco-style piece. The spectrogram for the GRAS 1/4” mic recording shows a lot of noise. The recording gain was set so that peak levels would stay well below the clipping level of the Orpheus ADC, and the GRAS 1/4” microphone has lower sensitivity than the other two microphones. The combination of low recording gain and low microphone sensitivity resulted in more electronic noise in the recordings made with this microphone.

Figure 4: Spectrogram for nylon string guitar


Despite the low sensitivity of the GRAS 1/4” microphone, it did capture higher levels of ultrasonic energy in two specific guitar transients. The dotted-line red ovals show a reddish color in the 47-kHz-to-52-kHz range, whereas the GRAS 1/2” microphone recording shows a very light blue, corresponding to around -60 dBFS, and the Shure SM81 barely captures any of this energy.

I cannot explain the “noise” at 55 kHz for the GRAS 1/4” and the GRAS 1/2” recordings, other than this could have been caused by electrical interference from other equipment in the recording studio operating during the session. I consulted with a well-respected converter designer about this and he said he had seen similar behavior in A-D and D-A converter boxes and attributed it to switching power supply interference. Despite this electrical/electronic interference, this source of noise did not mask any of the ultrasonic energy contained in the guitar transients.

Figure 5 shows the resulting spectrogram for the soprano vocalist. This was the most surprising result from this research as it showed how a cappella female vocals can generate ultrasonic energy albeit at very low levels. This is more evident with the GRAS 1/2” microphone recording as shown in the dotted-line square.

Figure 5: The most surprising results came from the spectrogram for the soprano vocalist.


Figure 6 shows the spectrogram results of a solo violin. The musician was playing very fast (presto) throughout this recording. From the spectrogram, one can conclude that this particular playing style generates a lot of ultrasonic energy. Even at 70 kHz, the GRAS 1/2” microphone recording shows dozens of ultrasonic “bands” even though its frequency response could be attenuated by 10 dB or more in that frequency range.

Figure 6: Spectrogram results of the solo violin


Figure 7 shows that little-to-no ultrasonic energy was produced by the musician playing his trumpet. The reason that the .wav file of the Shure SM81 recording is flipped is that the other microphones invert the acoustic signals. Although practically no ultrasonic energy is shown in the spectrogram, this does not necessarily mean that a trumpet will never produce energy above 20 kHz. The result could depend on many factors, such as playing technique, the piece being played, and so forth. More trumpet recordings will need to be studied before concluding that this particular instrument does not generate acoustic energy above 20 kHz.

Figure 7: Little-to-no ultrasonic energy was produced by the musician playing his trumpet.


Impact of These Findings

How do these results impact the loudspeaker industry? Sales of recordings with sampling rates at or above 96 kHz have been increasing for the past several years. Figure 8 is from findhdmusic.com (http://www.findhdmusic.com/article/growth-in-high-resolution-audio-2012-2014/300) showing the growth in availability of high-resolution albums. Although this graph is a bit dated, it clearly shows the growth in these types of recordings.  

Figure 8: Growth in the availability of high-resolution albums (source: findhdmusic.com).


Given this growth and the results of this study, how should the loudspeaker industry respond? I think the industry should make an effort to develop and market transducers and finished products that have a reasonable frequency response above 20 kHz. The term “reasonable” is deliberately used since there are cost implications when designing and manufacturing transducers that are flat to 50 kHz or higher vs. a transducer that may be several decibels down at 30 kHz. However, this effort has to also go hand-in-hand with the recording industry providing true high-resolution recordings to the consumer.

Figure 9 shows a percussion-only .wav file and corresponding spectrogram plot of a 192 kHz, 24-bit album downloaded from a reputable website devoted to the sales and marketing of high-resolution albums. This particular album has many tracks of various genres specifically for the purpose of consumers being able to audition high-end, audiophile-level playback hardware. 

Figure 9: This percussion-only .wav file and corresponding spectrogram plot of a 192 kHz, 24-bit album was downloaded from a reputable website devoted to the sales and marketing of high-resolution albums.


Despite this file being sampled at 192 kHz, there is virtually no energy above 20 kHz. If the original recording had been made at 192 kHz, there would be some amount of energy above 20 kHz, as was shown in several of the previous recordings made with the Shure SM81. In particular, the end of this recording should have a great deal of ultrasonic energy (areas enclosed by dotted-line red ovals), but nothing exists above 20 kHz. Either this recording was originally sampled at 44.1 kHz or 48 kHz and then upsampled to 192 kHz, or 20 kHz low-pass filters were used somewhere in the tracking/mixing/mastering phases, or some combination of a low original sample rate and low-pass filtering in post-production was involved.
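
The visual check described above can also be expressed as a number. The sketch below is not the method used in this article, which relied on inspecting spectrograms; it is a rough illustration, with hypothetical names and synthetic test tones, of comparing the energy above 20 kHz with the energy at or below it for one short block of samples.

```python
import cmath
import math


def ultrasonic_ratio_db(samples, sample_rate_hz, cutoff_hz=20_000.0):
    """Energy above cutoff_hz relative to energy at or below it, in dB."""
    n = len(samples)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    below = above = 0.0
    for k in range(1, n // 2):  # skip the DC and Nyquist bins
        acc = sum(window[i] * samples[i] * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        power = abs(acc) ** 2
        if k * sample_rate_hz / n > cutoff_hz:
            above += power
        else:
            below += power
    tiny = 1e-30  # avoids log of zero for silent or perfectly band-limited input
    return 10 * math.log10((above + tiny) / (below + tiny))


RATE_HZ = 192_000
N = 256
# Synthetic examples: a 3 kHz tone alone, and the same tone plus a
# 45 kHz component at one tenth of its amplitude (20 dB lower).
audible_only = [math.sin(2 * math.pi * 3_000 * i / RATE_HZ) for i in range(N)]
with_ultrasonic = [a + 0.1 * math.sin(2 * math.pi * 45_000 * i / RATE_HZ)
                   for i, a in enumerate(audible_only)]

ratio_audible = ultrasonic_ratio_db(audible_only, RATE_HZ)
ratio_mixed = ultrasonic_ratio_db(with_ultrasonic, RATE_HZ)
print(f"audible only: {ratio_audible:.1f} dB, with 45 kHz component: {ratio_mixed:.1f} dB")
```

A very low ratio on a 192 kHz file, especially during transients, would be consistent with upsampling or low-pass filtering, although the number alone cannot say which of the two occurred.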

Figure 10 shows the .wav file and corresponding spectrogram of a snare drum track I recorded at 192 kHz using a high-bandwidth microphone. Such recording methods do capture the ultrasonic energy that is created by the snare drum. The limited number of A-B comparisons I have made so far, with professional musicians hearing their performances recorded at 192 kHz using high-bandwidth microphones, show that these recordings are preferred when compared to similar recordings made at lower sample rates using conventional microphones. The most common response is that the 192 kHz recordings sound more like the instrument(s) they play.

Figure 10: Here is a .wav file and corresponding spectrogram of a snare drum track that I recorded at 192 kHz, 24-bit using a high-bandwidth 1/2” microphone.



One may argue as to whether or not capturing this ultrasonic energy results in a snare sound preferred by the consumer, but if the acoustic energy is present when the stick hits the drum head, I feel it should be accurately captured in recordings.

As such recordings begin to proliferate, in which the ultrasonic energy generated is not filtered out by anti-aliasing filters, mixing/mastering techniques, and/or the recording microphones themselves, transducer engineers will have access to proper source material for subjective and objective assessment of their true high-resolution designs. Given the growth in high-res audio, this could bode well for higher-end, and more profitable, transducers and finished loudspeaker products.


About the Author

Dan Foley has been in the audio test and measurement industry for more than 35 years and has a broad background in analog and digital audio test, acoustics, electro-acoustics, telecom audio, as well as vibration measurement and analysis. He is a member of the Audio Engineering Society (AES) and the Institute of Electrical and Electronics Engineers (IEEE), and has many close ties to the audio industry, having worked for the likes of Bose, Listen, and Brüel & Kjær. Foley has developed and taught seminars regarding digital signal processing techniques used in acoustic, vibration, and audio test and measurement applications. He currently serves on the IEEE Transmission Access & Optical Systems Committee as well as AES standards committees. Foley is also an Adjunct Faculty Member at Worcester Polytechnic Institute, where he is developing a curriculum in audio product design engineering. He is a published author with ASME and AES and has an engineering degree from the University of Hartford, West Hartford, CT.