THE TRUTH ABOUT CD AND DIGITAL
or
THE TRAGEDY OF THE MISSING INFORMATION
©1984 Mark B. Anstendig
Without
exception, the public should avoid all products using digital sound processing,
including digitally re-mastered records as well as CD discs. The digital
process currently used for recordings and CD discs was introduced long before
the requisite technology was perfected. The current technology is incapable of
preserving all of the information necessary for the accurate reproduction of a
musical performance or any other real-life sound event. The adoption of this
flawed technology as the accepted standard for the whole industry makes it
impossible to perfect commercially available digital recordings without
changing all of the hardware, i.e., without making all digital equipment and CD
discs obsolete.
The
differences between digital and analog recorded sound are clear, obvious, and
not the least bit subtle. If an experienced listener with good hearing cannot
hear these differences when comparing the same recording in digital and analog
versions, the sound-system must be faulty. Unfortunately, that is almost
universally the case, as most sound-systems are not capable of resolving enough
detail to reproduce the musical experience contained in the original
performance. In fact, the impression that digital is an improvement over analog
recordings stems from the fact that most playback systems, particularly the record-playing
components (pickup-cartridge, tone-arm and turntable), are not capable of
resolving all the detail in the record grooves. Most listeners have, therefore,
not yet heard all of the information on their analog records. The vast majority
of owners of analog sound systems should, therefore, upgrade their sound
systems so they can hear how well their records can sound, rather than invest
in a permanently flawed digital system.
For
the following explanation of the problems of digital recording, we are indebted
to Mitchell A. Cotter, one of the most respected electronic, acoustic, and audiological authorities in the country.
Traditional,
or "analog", sound reproduction saves the sound in a form similar to
(analogous to) the original sounds. Digital recording, on the other hand, uses
an analog-to-digital converter to convert sound into a train of numbers by
sampling the sounds at a fixed number of times per second. The speed of this
sampling rate determines the detail and subtleties that the digital train will
represent. It determines the high-frequency limit of the reproduction and the
amount of dynamic subtleties that are captured.
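The conversion described above can be pictured with a small Python sketch. It is only an illustration: the function name `digitize`, the 1,000 Hz test tone, and the 16-bit number range are assumptions made here and are not part of the explanation above. The rate of 44,000 samples per second is the industry figure cited in this paper.

```python
import math

def digitize(signal, rate_hz, count, bits=16):
    """Sample `signal(t)` (values between -1 and 1) `count` times.

    Hypothetical helper: sample n is taken at time n / rate_hz and
    rounded to the nearest whole number the chosen bit depth allows.
    """
    full_scale = 2 ** (bits - 1) - 1  # 32767 for the assumed 16 bits
    return [round(signal(n / rate_hz) * full_scale) for n in range(count)]

def tone(t):
    # A steady 1,000 Hz sine tone, chosen only for illustration.
    return math.sin(2 * math.pi * 1_000 * t)

# One cycle of the tone occupies 44 samples at 44,000 samples per second.
numbers = digitize(tone, 44_000, 44)
print(len(numbers), numbers[:5])
```

The resulting list is the "train of numbers": nothing of the sound between two sampling instants is stored.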
The
sampling rate of 44,000 samples per second, currently used by the industry, is
not frequent enough to capture accurately all the subtlety of the higher-pitched
musical transients. A much-quoted theory, the Nyquist theory, states that, in order to reproduce a simple, steady, unvarying sound,
one must have a sampling rate that is at least twice the frequency of the
sound. In other words, a 20,000 hertz tone, which is considered the upper limit
of hearing, needs a sampling rate of at least 40,000 samples per second.
Think
of each single vibration of a 20,000 Hz tone. Each vibration begins at zero,
goes up to a peak above zero, down to a trough below zero, and returns to zero.
That occurs 20,000 times per second. A sampling rate of double the frequency,
in this case 40,000 samples per second, is therefore the slowest sampling rate
that could theoretically sample both a peak and a trough of each vibration. But
that would be true only if the sampler were lucky enough to be absolutely
synchronized with the peaks and troughs of the sound being sampled. If they are
not absolutely in synch, the sampler might be sampling each vibration at zero,
which would result in data indicating no sound, or anywhere on the way up or
down. Since it is impossible to precisely synchronize the sampler with the
peaks and troughs of musical or other non-mechanically produced vibrations, it
is misleading to speak about the Nyquist limit in
relation to sound-reproduction.
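The dependence on timing at exactly double the frequency can be checked numerically. The following is a minimal Python sketch, not part of Mr. Cotter's explanation; the function name `sample_tone` and the two chosen phases are assumptions made for illustration. It samples a steady 20,000 Hz sine tone at exactly 40,000 samples per second, once with the samples landing on the peaks and troughs and once with them landing on the zero crossings.

```python
import math

def sample_tone(freq_hz, rate_hz, phase_rad, count):
    """Return `count` samples of a unit-amplitude sine tone.

    Hypothetical helper: sample n is taken at time n / rate_hz.
    """
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz + phase_rad)
            for n in range(count)]

# Sampler aligned with the peaks and troughs (a quarter-cycle offset).
on_peaks = sample_tone(20_000, 40_000, math.pi / 2, 8)
# Sampler aligned with the zero crossings (no offset).
on_zeros = sample_tone(20_000, 40_000, 0.0, 8)

print([round(s, 6) for s in on_peaks])
print([round(s, 6) for s in on_zeros])
```

The first list alternates between full positive and full negative values, while the second is zero to within rounding error, which is the "no sound" case described above.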
(It
should also be pointed out that, while a sampling rate of double the frequency
is the theoretical minimum for reproducing a steady, unvarying tone, no
circuits are 100% efficient. In fact, the efficiency
of current machines is quite low. But, even with an ideal, completely efficient
machine, the ambiguity of the data with a 20,000 Hz tone and a sampling rate of
40,000 samples per second would be 100%, because each frequency can only be
sampled twice per cycle.)
If
the sampling rate is not exactly double the vibration, as is the case with most
frequencies, the tones will be sampled at constantly changing positions in the
cycle of each frequency. Think of a disc with a white spot on it that is
turning clockwise at 100 times per minute and is lit by a strobe flashing
exactly at 100 times per minute. The white spot will stand still. If you speed
up the flash a little, the spot will appear to be slowly moving
counterclockwise. In digital, such an effect happens at all frequencies unless
the sampling rate is dense enough to catch the whole waveform of all sound
vibrations.
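The strobe illustration can be worked through as arithmetic. The Python sketch below is only an illustration; the function name `apparent_turn_per_flash` and the figure of 101 flashes per minute for "a little" faster are assumptions. It computes how far the white spot appears to move between one flash and the next.

```python
from fractions import Fraction

def apparent_turn_per_flash(disc_rpm, flashes_per_min):
    """Apparent rotation of the spot between flashes, in turns.

    Positive means clockwise (the disc's true direction), negative
    means counterclockwise. Hypothetical helper for illustration.
    """
    # True clockwise rotation between two consecutive flashes.
    true_turns = Fraction(disc_rpm, flashes_per_min)
    # Only the position within one turn is visible, so fold the
    # result into the range [-1/2, 1/2) to get the apparent motion.
    return (true_turns + Fraction(1, 2)) % 1 - Fraction(1, 2)

print(apparent_turn_per_flash(100, 100))  # 0: the spot stands still
print(apparent_turn_per_flash(100, 101))  # -1/101: slow counterclockwise drift
```

With the flash slightly faster than the disc, the spot appears to creep backward by 1/101 of a turn per flash, even though the disc is really turning clockwise at full speed.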
Furthermore,
because the sampling rate is too slow, it generates other distortions as well
as its own additional sounds that degrade the final result in all frequencies.
Because these degradations of the sound belong only to digital, they are
completely new and are, therefore, not at all reflected in the analog-type
specifications used to tout digital as an extremely accurate form of recording.
Those specifications, which describe analog problems, do not even apply to
digital. For example, no digital machine could have any wow or flutter. If it did, it
would not be a little better or a little worse than other machines; it would
simply be defective.
The
patterns of musical sounds are not steady, unvarying tones. Most sound patterns
consist of highly complex mixtures of tones of constantly varying dynamics and
subtly varying pulse. The rise and fall of the tones as they progress in time
(modulation) and their tonal qualities must also be accurately reproduced and
differentiated. The sampling rate of 44,000 samples per second is not even
close to being fast enough to reproduce that information accurately. Mr. Cotter
emphasizes that this limitation is not a matter of imperfection of circuitry
but, rather, a fundamental mathematical limitation from the lack of sampling density.
It is, therefore, simple ignorance to speak of the Nyquist theory in relation to the recording of sound information. Anything approaching
a true resemblance of the original sound-patterns first begins at least 10
times the Nyquist limit, or twenty times the
frequency of the tone being reproduced. And much greater sampling density than
that is necessary for true accuracy that can match the fidelity of analog
recordings. To improve upon the best of today's analog sound a sampling rate
approaching one million samples per second (one megahertz) would be necessary.
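The figures in this paragraph follow from simple arithmetic. As a rough Python sketch (the function names are assumptions, and the factor of twenty is the criterion stated in this paper, not an industry standard), the number of samples falling within each cycle of a tone, and the sampling rate that the twenty-times criterion would call for, can be computed as follows.

```python
def samples_per_cycle(rate_hz, tone_hz):
    """How many samples fall within one cycle of a steady tone."""
    return rate_hz / tone_hz

def rate_for_criterion(tone_hz, samples_wanted_per_cycle=20):
    """Sampling rate needed to place the wanted number of samples
    in each cycle (twenty by default, per the criterion above)."""
    return tone_hz * samples_wanted_per_cycle

for tone_hz in (1_000, 10_000, 20_000):
    print(tone_hz,
          samples_per_cycle(44_000, tone_hz),
          rate_for_criterion(tone_hz))
```

For a 20,000 Hz tone the criterion gives 400,000 samples per second, against the 44,000 currently used.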
Mr.
Cotter explains that the reason for so much misunderstanding and misinformation
regarding digital is that most of the important research and development of
digital technology was done under government contract for defense purposes,
particularly in the development of radar technologies to disguise the presence
of radar signals and in further technologies aimed at detecting those disguised
signals. Therefore, the most important research into the necessary sampling
density for detection of transients, etc., belongs to different disciplines,
much of which lies under the blanket of National Security and is not readily
accessible, even to the professional audio world.
The
Anstendig Institute has used digital processors for half a year and has
carefully investigated the sound quality, including comparisons of audience
reaction to programs of digital and programs of analog sound. In addition, Mr.
Anstendig has been able to evaluate the digital sound and compare it to analog
sound in a listening facility, designed and executed by Mr. Cotter, that is
considered by many to be acoustically and electronically perfect and the finest
installation of its kind.
Everything
in this room, including the walls, ceilings, speakers, electronics, and even
the resonance factors of the building materials has been precisely computed to
be an integral part of the sound reproduction. It is a perfect acoustical
environment: live, but with absolutely no ringing and no
resonances. The speakers, custom-built to Cotter's specifications,
radiate in such a manner that the sound remains the same throughout the room.
State-of-the-art
recordings that were simultaneously recorded in analog (direct-to-disc) and in
digital versions were compared, with and without equalization. (For a true
comparison, it is important to be able to change the frequency balance of each
recording so that they match each other, because the frequency balances of the
analog and CD versions are not the same. In this particular comparison, the CD
had louder highs.)1 The difference between analog and digital
recordings is clearly audible on The Anstendig Institute's own sound system.
But it is definitive to hear the differences in such a perfect room.
In
The Anstendig Institute's experience, the various faults of digital limit the
fine detailing of the sound and the subtle nuances of dynamics that make up the
expressive content. The resultant sound reproduction is, therefore, quite
different from, and inferior in expression to, the original performance. This is
the worst possible flaw, because the most important aspect of music, the
expressive content, is changed and degraded.
With
professional analog recordings, the necessary information does get preserved on
the tape or disc. The frequencies may be out of balance (unequalized) or, as in
early recordings, limited in frequency range, but all of the dynamic qualities
are saved and can be retrieved.2 (Supposedly, the missing
frequencies can now even be electronically reconstructed.) But with digital,
that is not the case. Definitely from 1000 Hz on up (and, in our experience,
much lower), enough information to reproduce the dynamic time-factors of the
music (the precise dynamic fluctuations of the sounds as they progress in time)
simply is not on the tape and the modulation of the rest of the frequency range
is imperfect. Since it was not saved in the master recording, this
information can never be retrieved. In actual listening, the most easily
bearable problem is that the sound of the frequencies above 1000 Hz is coarser
and grainier. But, more important, the bloom of the tones is gone and the
expression is compromised and changed. The rise and decay of the tones are
lost. The sounds are expressionless and the music sounds curiously dead,
dramatically so if one is familiar with the original. The differences immediately
show up in a comparison of simultaneously recorded analog and digital versions
of the same performance using first-rate equipment.
So
far, the knowledgeable of the audio world have realized that the low sampling
rate limits digital's ability to capture dynamic
transients (the very short, sharp, isolated sounds). But it is more important
to understand that the ability to capture all dynamic nuance is limited.
The
Anstendig Institute has, with invited guests, compared four types of recorders:
the finest cassette-recorder on the market, a reel-to-reel tape machine at 15
IPS, a digital processor, and a top-of-the-line Beta Hi-Fi video recorder. All four machines were set up together and simultaneously
recorded the same record. The results: the only machines capable of saving
enough information to claim to be reproducing the source are the reel-to-reel
(at the higher speeds) and the Beta Hi-Fi. The
others simply are not reproducing the music, with digital producing the worst
results.
Cassette
recordings are not much better than digital. The sound is not quite as bad, but
the subtleties are not all present in the lower registers and the highs are
coarse, without bloom or luster. During our test, no one was moved by the music
when the cassette recording was played. The music is almost as curiously dead
as with the digital process.
The
lack of the important human expressive qualities in digital recordings goes
unnoticed for two reasons. First of all, since one does, of course, hear some
expression, albeit a falsification, there is no way the listener can know that
the expression is wrong and that something important is missing. Secondly, most people (and most musicians too) are no longer used to listening
for subtlety because it is not present in most sound systems or in many of the
other sounds they are accustomed to hearing in the modern world. Because, for
nearly a century, playback systems have not been able to reproduce the most
subtle expressive nuances and exquisite modulation of music, many people hear
those qualities so seldom that they no longer know about or listen for them.
But the Anstendig Institute has found that most people do respond to subtlety
when they are exposed to it under the right conditions and their attention is directed
towards it. It is, therefore, important that these essential civilized
qualities are not lost to society.
Unfortunately,
the prevalence of bad radios and cheap sound-systems that do not reproduce what
is on the records, as well as of recordings that only approximate the actual
recorded performance, has given rise to the idea that with such approximations
one can still experience the classical masterworks. But the truth is that
either the sound is an accurate reproduction or it is not. If it is not,
the listener is not hearing that music. What is heard is a distortion that is
really something else quite different from the original art-work that
was recorded.3 With digital, there is more
than simple distortions of the signal. Part of the signal is simply missing and
the rest is adversely affected and truncated. Indicatively, as already
mentioned, the many distortions causing these problems cannot be reflected in
the type of measurements (specs) used to advertise digital recordings.
Unfortunately,
the public has come to rely on specs in its purchase of equipment. It is,
therefore, important to understand that digital has brought a whole new set of
distortions and that measuring techniques for expressing these distortions as
universally meaningful specifications have either not yet been developed or not
yet adopted by the industry. The same specifications used to describe analog
recordings are used in describing digital even though they apply only to analog
and not at all to digital. To quote Cotter: "They are not only talking
about apples and oranges, they are talking about totally irrelevant things.
They tell you about all of the analog frailties that the digital system does not
have, as though the digital systems could ever have them (they can't possibly).
And they tell you nothing about the digital system's frailties, which the
analog system has none of. In balance, the frailties of the analog system are
more musical, even if it has frequency irregularities and .5% distortions, etc.
That is like life; we live with those and similar distortions all the time. But
we don't live with those digital sampling processes and their unnatural
distortions. The description of the perfection of these digital audio systems
using total harmonic distortion and all that kind of thing is the sheerest
irrelevancy."
Even
with analog recordings, the buyer is not offered specifications that describe
anything more than limited aspects of how that machine would reproduce steady,
unwavering, and, therefore, unnatural, mechanical sounds. The currently popular
specs tell the reader nothing about how even an analog machine will reproduce
real-life, live sounds, in particular the expressive subtleties.
Also,
much touted measurements, such as those of digital's dynamic range, are inaccurate when comparing the limitations of the dynamic
range of analog with the limitations of dynamic range in digital. For example,
the analog recording actually can preserve a much wider dynamic range of
information than its specs indicate, while digital, in reality, provides far
less dynamic range than the specs claim. And the specified upper and lower
dynamic range of digital cannot be fully utilized because, as the signal approaches
or exceeds those limits, an unavoidable reaction takes place which causes the
machine to create new, ugly, unlistenable sounds
which are added to the signal. In all digital, it is, therefore, necessary to
keep the signal quite a bit away from the dynamic limits defined by the specs.
A stated dynamic range of 80 decibels (dB) might, therefore, only amount to
about 60 dB or less of usable dynamic range. But, with analog recording, a
great deal of usable information can still be recorded for quite a ways above
and below the dynamic range implied by the specs.
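To give a sense of scale for the decibel figures above, the following Python sketch converts them into plain ratios. It assumes the usual convention that a dynamic range in decibels is twenty times the base-10 logarithm of the ratio between the largest and smallest signal amplitudes; the function name is hypothetical, and the 80 dB and 60 dB values are the examples from the text.

```python
def db_to_amplitude_ratio(db):
    """Convert decibels to an amplitude ratio (20 * log10 convention)."""
    return 10 ** (db / 20)

stated_db = 80  # stated dynamic range, from the example above
usable_db = 60  # usable dynamic range, from the example above

print(db_to_amplitude_ratio(stated_db))              # loudest-to-softest ratio as stated
print(db_to_amplitude_ratio(usable_db))              # ratio actually usable
print(db_to_amplitude_ratio(stated_db - usable_db))  # factor lost between the two
```

Under that convention, 80 dB corresponds to a ratio of 10,000 to 1 and 60 dB to 1,000 to 1, so the 20 dB shortfall in the example is a tenfold reduction in the span of amplitudes that can be used.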
The
big difference between analog and current digital is that when no sound is
being played, the digital is always silent, while the analog recording will
often have some low-level background noise. But the ear easily adapts itself to
and ignores steady unvarying low-level noise, while the distortions of digital
are more disrupting to the ear (if less overtly noticeable) because they are
constantly fluctuating along with the audio signal.
The
present, extraordinarily ugly situation is fraught with irony because, as a
technology, digital recording is not limited. With an adequate sampling rate
and adequate dynamic range, it would be the ideal recording medium. But the
sampling rate would have to be quite a bit higher than 200,000 samples per second just to approach
decent hi-fi sound and it should be about 1 megahertz if it is to be an
improvement over what is possible with analog recording. Also, the dynamic
range would have to be improved by raising the number of bits of the processor,
independently of the processes involved in reproducing the time relationships.
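The relation between the number of bits and the dynamic range can be sketched with the common textbook idealization, which is an assumption brought in here and not a figure from this paper: a perfect converter gains roughly 6 dB of theoretical dynamic range for every added bit. The Python function name below is hypothetical, and real machines fall short of these ideal values.

```python
import math

def ideal_dynamic_range_db(bits):
    """Theoretical dynamic range of an ideal converter, in decibels.

    20 * log10(2 ** bits) is about 6.02 dB per bit; the constant
    10 * log10(1.5), about 1.76 dB, applies to a full-scale sine tone.
    """
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

for bits in (14, 16, 18, 20):
    print(bits, round(ideal_dynamic_range_db(bits), 1))
```

The sampling rate does not appear anywhere in this calculation, which reflects the point made above: the number of bits and the reproduction of the time relationships are separate matters.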
At
present, the public has access to no examples of what the reproduction of music should
really sound like. The problems of distinguishing the differences
between various kinds of recorded sound without adequate equipment make it
particularly urgent that a room with a perfect acoustic and sound-system be
available to the public and particularly to the professionals who need such a
frame of reference. In fact, many such facilities should exist around the
country, if not the world. A direct result of the lack of availability of such
a reference standard is the current confusion in the field of acoustics and the
well-known, major crisis in the music world due to the deterioration in the
quality of music-making, which is, to a great degree, the result of decades of
hearing deficient sound-reproduction.
People
no longer know what music should sound like. They are hearing substantially
less than the content of the great recorded performances. They are being robbed
of a crucial part of their cultural heritage in the great recordings of this
century, many of them by the composers themselves. In fact, the confusion at
present is so total, and wrong, inaccurate sound-reproduction is so universally
prevalent, that it is almost too late. That the gross flaws of digital are not
immediately apparent, even to professionals, is a major symptom of this general
malaise.
Musicians
as well as the public have been listening to imperfect cassette-players and
sound systems for so long that the poor sound quality of digital is not readily
noticed. The tragedy is that some day the world will wake up and realize that
all these new digital recordings might just as well be thrown away because they
do not contain the most important information and that missing information is
irretrievably lost. Along with cassettes and most sound-systems, they are a
misrepresentation of art, a subtle, insidious disease that has long been eroding
cultural life. The only thing worse--unthinkable even, but possibly more
likely--would be for the world to accept today's digital sound and not wake up.
1 The need for equalization is explained in our paper "Sound
Equalization in Relation to the Way We Hear".
2 Technically, analog tape-recording also has a sampling rate. The
bias frequency is really a sampling rate. But the bias frequency is much higher
than the current sampling rate of digital. According to Cotter, analog tape
recording can adequately resolve the frequency spectrum up to about 10,000 Hz,
after which the signal quality begins deteriorating.
3 This problem is paralleled in the visual world by the poor
quality of photo-reproductions of art works and, especially, by the widespread
use of inadequate 35mm slide-reproductions to judge visual art by the
National Endowment for the Arts and other philanthropic organizations. Due to
failings in the photo-reproduction process itself, particularly the inability
of any available focusing device to achieve precise focus, these slides are
technically incapable of accurate reproduction of art works (or of anything
else). What therefore is being judged are falsifications that are completely
different art works.
The Anstendig Institute is a non-profit, tax-exempt, research institute that was founded to investigate stress-producing vibrational influences in our lives and to pursue research in the fields of sight and sound; to provide material designed to help the public become aware of and understand stressful vibrational influences; to instruct the public in how to improve the quality of those influences in their lives; and to provide the research and explanations that are necessary for an understanding of how we see and hear.