BASIC PROBLEMS OF SOUND REPRODUCTION THAT APPLY TO DIGITAL AND ANALOG
©1998 The Anstendig Institute
With the new digital standard and the analog vs. digital
controversy occupying the worlds of computers and sound reproduction, a review
of the problems of all sound reproduction, analog or digital, is timely.
Sound reproduction was a badly flawed field long before digital
came into being. People have, for a century, been listening to recordings of
live events that do not sound even close to the original. This is true for
radio and TV as well as sound reinforcement (amplified live sound programs). It
started with the awful, early-in-the-century sound associated with the original
Caruso recordings and those of other famous artists of the period. The most
beautiful voices and instruments sounded like a kazoo on the machines of the
time and, uncorrected, still sound that way on present-day machines. But people
loved these recordings because they still made them cry and/or gave them an
emotional lift. Part of the reason for this is that the ear is more forgiving
of such distortions at low volume levels and most machines up to the late 60s
could only play well at low volume levels. Films also had less than desirable
sound, but by the 1930s, when film sound began, recordings had progressed
technically.
The main problem with early sound recording was that the
recording process favored some frequencies over others, so that some
frequencies were recorded much louder than others. The main
imbalance was an exaggeration of the mid-range frequencies around 400 to 1000Hz
because the recording equipment was most sensitive in this range and
progressively less sensitive at frequencies above and below it. That
gives the old recordings their characteristic “hooty” sound. There were also
problems of surface and background noise. The various pieces of equipment involved in the
whole recording and playback process had other frequency exaggerations of their
own, called resonances, which would vastly overemphasize a particular small
frequency range. Microphones were the biggest culprits.
For the most famous and grossest example, most recordings from
the 40s to the 80s were made with a particular Neumann microphone favored by
the industry. That mic had a huge overemphasis of the frequencies in the 1200
to 2000 Hz range (centered at about 1600 Hz). This greatly distorts the sound
and must be corrected if the playback is to sound like the original. Almost all of
the many thousands of important recordings made during that long period have
this frequency emphasis, including the famous Mercury recordings and most
others prized as having excellent sound. Yet millions of those recordings are
in circulation, including thousands re-released on CDs, and practically no one
recognizes this huge resonance that changes even what human voices sound like.
This points out a lack of hearing acuity in most of the population of the
world, including that of the critics who review these records. And it is not
the only emphasis on most recordings, new and old (yes, modern recordings have
them too). The extraordinary Maria Callas suffered particularly from resonances
in recordings, especially a 350-600Hz emphasis that gives her voice a “hooty”
sound and gave her a reputation of singing with a hoot, since few got to hear
her sing live. In fact, her voice production was quite excellent and can be
experienced in its full glory when this emphasis and that of the Neumann mic
(1600 Hz) are reduced with equalizers.
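As a rough illustration of the kind of correction described above, the following is a minimal sketch of a single digital equalizer band that cuts a resonance centered at 1600 Hz. It uses the widely published "Audio EQ Cookbook" peaking-filter formulas by Robert Bristow-Johnson, which is an assumption on our part and not a description of any particular equalizer. The paper gives only the center frequency; the depth of the cut (6 dB), the bandwidth setting (Q of 1.4), and the 44,100 Hz sample rate are hypothetical values chosen purely for illustration.

```python
import cmath
import math


def peaking_eq_coeffs(center_hz, gain_db, q, sample_rate_hz):
    """Return normalized biquad coefficients (b, a) for one peaking EQ band.

    A negative gain_db cuts the band; a positive gain_db boosts it.
    """
    amp = 10 ** (gain_db / 40)  # square root of the linear gain at the center
    w0 = 2 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    b = [1 + alpha * amp, -2 * cos_w0, 1 - alpha * amp]
    a = [1 + alpha / amp, -2 * cos_w0, 1 - alpha / amp]
    # Normalize so that a[0] == 1, the usual convention for applying the filter.
    return [x / a[0] for x in b], [x / a[0] for x in a]


def response_db(b, a, freq_hz, sample_rate_hz):
    """Gain of the filter, in decibels, at a single frequency."""
    z1 = cmath.exp(-1j * 2 * math.pi * freq_hz / sample_rate_hz)  # z^-1
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20 * math.log10(abs(h))


def apply_biquad(b, a, samples):
    """Filter a sequence of samples (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


if __name__ == "__main__":
    # Hypothetical settings: a 6 dB cut at 1600 Hz with Q = 1.4.
    b, a = peaking_eq_coeffs(1600.0, -6.0, 1.4, 44100.0)
    for f in (100.0, 800.0, 1600.0, 3200.0, 10000.0):
        print(f, round(response_db(b, a, f, 44100.0), 2))
```

A second band with a different center frequency could be cascaded in the same way for the 350-600 Hz emphasis. The right amount of cut in practice has to be judged by ear against the recording in question.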
Some of the resonances and emphases arise when the
recording is played back in the listening room because the speakers are, in
reality, no different from musical instruments creating new sounds in air.
They, too, create overtones and other effects associated with the process of
creating sounds in air. The Anstendig Institute has described these added
distortions in our paper “A Massing of Overtones”.
But, beyond the problems of recording and playback, there is
another characteristic in the way we hear the frequencies that colors the
sounds we hear.
In the early 1930s, two researchers at the Bell Telephone
Laboratories carried out the most important experiments in the history of
sound. Their results are well known throughout the industry as the Fletcher-Munson Equal Loudness
Curves (see illustration below). Fletcher and Munson were carrying out research
to determine how telephone handset speakers should be manufactured to sound
most natural. What they found is that
1) when all the frequencies are playing
equally loudly, we do not hear them equally loudly. We hear some frequencies
louder than others, and not just in one particular range: we hear various
frequency groups, low and high, louder than those above and below them. The
region we hear loudest is that between 2000 and 4000 Hz, which is exactly the
range where the frequencies of most instruments and voices peak. But there are
other imbalances nearly as strong.
2) that the amount of emphasis with which
we hear various frequencies changes with the volume level. That means that,
when all the frequencies are played at one particular volume, we hear a certain
amount of imbalance, but that imbalance changes dramatically when the
overall volume level of the frequencies changes.
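The equal-loudness curves themselves are tabulated measurements and are not reproduced here. As a stand-in, the sketch below evaluates the standard A-weighting curve used in sound level meters, which was loosely derived from the equal-loudness contour for fairly quiet sounds (about 40 phons). This is a substitute technique, not Fletcher and Munson's data: it gives a rough sense of the first finding (unequal sensitivity across frequencies) but, being a single fixed curve, it cannot show the second finding (that the imbalance changes with volume level).

```python
import math


def a_weighting_db(freq_hz):
    """Approximate relative sensitivity of the ear, in dB, per the A-weighting curve.

    0 dB is the reference at 1000 Hz; negative values mean a tone of the same
    physical level is weighted as quieter.
    """
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 * f2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20 * math.log10(numerator / denominator) + 2.00


if __name__ == "__main__":
    for f in (50, 100, 400, 1000, 2500, 4000, 10000):
        print(f, round(a_weighting_db(f), 1))
```

Run as written, the curve falls steeply in the bass and rises slightly above its 1000 Hz reference in the region of a few thousand hertz, which is consistent with the range named in the first finding.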
The meaning of these findings is that, if there were no other
resonances or balance problems in the technology of the recording, the only way
the playback could sound like the original performance is if it is played back
at exactly the same volume level as the original performance in a
room with the same acoustics as that of the performance. But there is no
standardization of volume levels and no way of knowing if one is listening at
the original volume level. So a sizeable amount of equalization would be
necessary for that reason alone. Add to that the various other resonances and
imbalances inherent in the recording and playback processes and there is no way
anyone is going to hear sound accurately, or even close to the original, without
a large amount of equalization.
Furthermore, because all recording media, including digital,
cannot capture the whole dynamic range of real life sound, the volume levels of
the performance are changed during the recording, reducing the loud passages and
increasing the soft passages. After Fletcher and Munson’s discoveries, the
recording industry should have come up with a means of compensating for the EQ
changes in the way we hear sound when overall volume levels change. But the
industry has not done so. Therefore, another distortion is added to the sound
every time the overall volume level is changed during the
recording. (“Compression and expansion” is the term for this technique when done
automatically; “gain riding” is the term when done manually by the recording
engineer.) The only way to compensate for this is to manually raise or lower
the volume levels during playback.
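To make the effect of automatic compression concrete, here is a minimal sketch of a static compression curve working on levels expressed in decibels. It is a textbook simplification, not a model of any particular recording chain: real compressors also have attack and release timing, which is omitted. The threshold (-20 dB), ratio (4:1), make-up gain (15 dB), and the passage levels are all hypothetical values chosen for illustration.

```python
def compress_level_db(level_db, threshold_db=-20.0, ratio=4.0, makeup_db=0.0):
    """Map an input level to an output level using a static compression curve.

    Levels above the threshold rise only 1 dB for every `ratio` dB of input.
    The make-up gain then raises everything, which is what lifts the soft
    passages relative to the loud ones.
    """
    if ratio < 1.0:
        raise ValueError("ratio must be at least 1 for compression")
    if level_db > threshold_db:
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + makeup_db


def dynamic_range_db(levels_db):
    """Difference between the loudest and softest level in a list."""
    return max(levels_db) - min(levels_db)


if __name__ == "__main__":
    # Hypothetical passage levels, from very soft to full climax.
    original = [-40.0, -30.0, -10.0, 0.0]
    recorded = [compress_level_db(x, makeup_db=15.0) for x in original]
    print(original, dynamic_range_db(original))
    print(recorded, dynamic_range_db(recorded))
```

With these illustrative settings the spread between the softest and loudest passage shrinks, so every passage reaches the listener at a level other than the one at which it was performed, and is therefore heard with a different frequency balance.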
Even live performances, especially of shows, suffer greatly from
this problem. I attended a performance of “Les Miserables” in San Francisco, in
which the sound was amplified and a technician used gain riding to increase the
sound to deafening levels (at least 115 decibels and probably louder) during
climaxes and to decrease it to nearly inaudible levels during quiet, lyrical
passages. I walked out as soon as I could, after it became apparent what was
happening. In the medical community, exposure to sound pressure levels above
100 decibels is believed to cause hearing damage (some even consider levels above
90 decibels to be dangerous). Therefore, one could expect those who stayed to
have suffered at least some hearing damage, besides suffering discomfort from
the distortions of the sound.
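Because the decibel scale is logarithmic, the gap between the figures quoted above is larger than the numbers suggest. The short sketch below converts a difference in decibels into ratios of sound pressure and of sound intensity, using the standard definitions of the decibel. It says nothing about hearing risk, which also depends on the length of exposure.

```python
def pressure_ratio(delta_db):
    """Ratio of sound pressures corresponding to a level difference in dB."""
    return 10 ** (delta_db / 20)


def intensity_ratio(delta_db):
    """Ratio of sound intensities (acoustic power) for a level difference in dB."""
    return 10 ** (delta_db / 10)


if __name__ == "__main__":
    # 115 dB (the level estimated at the performance) against the 100 dB and
    # 90 dB figures mentioned in the text.
    for reference_db in (100, 90):
        delta = 115 - reference_db
        print(delta, round(pressure_ratio(delta), 1), round(intensity_ratio(delta), 1))
```

A difference of 15 decibels corresponds to a sound pressure between five and six times greater, and to more than thirty times the acoustic power.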
The above-described distortions do not even begin to touch on
the problems of the equipment itself. The most important point to know about
equipment is that, until the mid-seventies, amplifiers had very low output,
usually 10 watts or lower. Therefore, loud listening is a very recent development. Up to
the last two decades, all radio, TV, and recordings were heard at low volume
levels. As mentioned, the ear is more forgiving of the distortions at low
volume levels. The louder the sound, the greater the effect of these
distortions on the listening experience and the more apparent the distortions
become. With the advent of high-powered amplification, volume levels have risen
and the distortions due to the sound reproduction as well as those due to our
unequal perception of loudness have become much more disturbingly apparent, to
the point where they usually ruin our perception of the expressive content.
The Anstendig Institute has carried out year-long research that
demonstrated that these resonances, especially those in the 2,000 to 4,000 Hz range,
destroy our perception of expressive nuance, keeping us from hearing the more
delicate emotional qualities in music and sound when they are present.1
A further problem of sound reproduction and sound reinforcement
is the loudspeakers. The only loudspeakers I have found that are capable of
reproducing the finer nuances at most volume levels are very large, horn-loaded
(the higher frequency drivers are connected to and dispersed by a horn), and
vented (they have open boxes, in which the frequencies from the back of the
speaker escape into the room through a vent). These are usually theater loudspeakers.
But this type of speaker usually has other resonances of its own, which also
have to be equalized. All speakers in closed boxes that I have heard dampened
the nuances as well as the frequencies from the back of the driver.
What is the effect of all this? First of all, most people in
today’s world, at least in the developed countries, hear most of their music through
sound reproduction. They have been doing so for a century. Society has slowly
lost its concept of natural, undistorted sound. It has also stopped listening
for or expecting anything more than the relatively gross, unsubtle emotional
content that comes across in typical recorded sound.
Present digital recording, which lacks much of the subtle
differentiation of nuance and actually falsifies the expressive nuances, simply
put the final nail in the coffin. Digital could only be accepted because people
were already accustomed to not hearing all of the expressive information and
did not miss it. The following is a quote from our paper “Our Loss of Emotional Richness Due To Bad Sound
Reproduction”. What Dr. Ostwald says goes for old analog as well:
A noted psychiatrist at the Langley Porter
Institute at the University of California in San Francisco, Dr. Peter Ostwald,
M.D., recognized our Institute’s warnings, dating back to the early 1980s,
about the probable effects of the acceptance of unperfected digital technology:
“I was fascinated by his (Mark’s) original theories, which included the daring
proposition that due to its inability to record subtle changes between notes,
the then-developing digital technology might be detracting from listeners’
perception of emotional nuances in musical instruments and the human voice.”
After nearly two decades of digital recordings, that has, in fact, already
happened: we now live in a society that suffers from a general impairment of
its ability to perceive and experience emotional nuances. Worse, people no
longer are aware of the finer differentiations of emotional qualities and no
longer listen for them.
Someone of great personal sensitivity came over recently to hear
Menuhin’s recording of the Beethoven Violin Concerto with Klemperer conducting.
Afterwards, he confessed that he simply hadn’t expected anything like the depth,
sweetness and seriousness of that experience, but particularly the sweetness of
the expression, which is the first thing destroyed by the uncorrected
distortions of all recorded sound, analog or digital. The difference between
the two is that analog at least has the full depth of expression on the
recording and the playback sound can be adjusted so that the expression can be
heard. Current digital does not even capture that expression. But with either
analog or digital, sound reproduction cannot be even close to accurate without
compensating for the frequency imbalances during playback. The inescapable
conclusion is that sound reproduction that closely matches the original sounds
and allows the listener to experience the emotional content of the original is
not possible with analog or with an adequate digital system unless the sound is
equalized during the playback.
1 Our papers on sound reproduction, particularly “Sound Equalization in
Relation to the Way We Perceive Sound”, deal with the effect of distortions
on our perception of nuances of expression, especially those resonances in the
frequency ranges where we are most sensitive.
The Anstendig Institute is a non-profit, tax-exempt, research institute that was founded to investigate stress-producing vibrational influences in our lives and to pursue research in the fields of sight and sound; to provide material designed to help the public become aware of and understand stressful vibrational influences; to instruct the public in how to improve the quality of those influences in their lives; and to provide research and explanations for a practical understanding of the psychology of seeing and hearing.