Remko Scha
Institute of Artificial Art, Amsterdam
Virtual Voices
(Mimesis)
The new digital
media technologies which are now being developed are often imitative
technologies. Future generations may end up viewing the twentieth
century as the century of abstraction, and the now imminent turn
of the century as the moment of a return to a mimetic aesthetics.
The imitation of nature is once again a widely pursued artistic
ideal. Sometimes this concerns the imitation, not of the way things
look, but of the processes that constitute life, body, or mind.
Mimesis is then called artificial life, robotics,
or artificial intelligence. But in other cases, it concerns
the classical ideal of the perfect simulation of the surface of
things. Then it is called ray tracing, paintbox, digital photography,
virtual reality.
Music exists
between the poles of mathematical abstraction and pure physics.
Imitation is not an issue there, one might think, but nothing could
be further from the truth. After the failure of 'real' electronic
music, which used sinusoids, square waves, noise and modulators
to build sound sculptures that people don't particularly want to
listen to, there is now an avalanche of digital electronic technologies
that simulate the sounds of conventional instruments in great detail,
and make them accessible for keyboards and computers with MIDI interfaces.
Artificial
speech synthesis is an imitative technology which is closely
connected with music. However, its relation with language also lends
this medium an entirely unique character. This article explores
the history, the techniques and the aesthetics of this medium.
(Voice)
Music is a
matter of physics -- not so much because music is usually realized
by means of sound, but rather because it is precisely the structural
properties of music (such as metre, rhythm, harmony, melody) that are
based on physical phenomena.
Language is
a matter of symbols. The conceptualization and abstraction of human
experience.
Between language
and sound: speech. Between mind and matter: the voice.
"Listen
to a Russian bass [...]: something is there, manifest and persistent
(you only hear that), which is past (or previous to) the
meaning of the words, of their form (the litany), of the melisma,
and even of the style of the performance: something which is directly
the singer's body, brought by one and the same movement to your
ear from the depth of the body's cavities, the muscles, the membranes,
the cartilage, and from the depths of the Slavonic language, as
if a single skin lined the performer's inner flesh and the music
he sings."1
(To copy
/ To fake)
Within the
technology of voice imitation, two approaches are usually distinguished:
the genetic approach and the gennematic one. The genetic
method imitates the physiological processes that generate speech
sounds in the human body. The gennematic method is based on the
analysis of the speech sounds themselves, and reconstructs these
sounds without considering the way in which the human body produces
them.
The speaking
machines of the eighteenth century were based on the genetic principle:
the hardware of the larynx and the oral cavity was reconstructed
in a stylized way. If such an imitation is faithful enough, the
sounds it generates resemble the sounds of human speech.
In the twentieth
century, we see an entirely different approach: digital technology
which calculates the shapes of sound signals and then uses
loudspeakers to make them audible. The voice is no longer imitated,
but its output is faked. The algorithm computes signals that evoke
the image of a physical process that never occurred.
The eighteenth-century
automaton is a mechanical body, a piece of clockwork claiming the
qualities of life. In twentieth-century computer simulation, the
mechanics is abstract, the machine dissolves into mathematics. The
body has disappeared.
(To copy)
The impulse
of classical sculpture: not representation, but imitation. A life-size,
colored, three-dimensional model is not a model, but a copy. The
master sculptors of classical mythology even managed to duplicate
the human body in sculptures that did not only show perfect likeness,
but that also could speak and move naturally. In Chinese and Germanic
mythology, carpenters and silversmiths displayed similar skills,
building treacherously seductive female automata.
The essential
step on the road from myth to technology was taken in the seventeenth
century. The idea that living organisms function according to the
laws of physics, and could in principle be simulated by means of
mechanical constructions, is then no longer a vague, alarming suspicion,
but a scientific hypothesis. In the early seventeenth century, Descartes
presented the thesis that animals are in fact machines.
Thomas Hobbes:
"Nature, the art by which God hath made and governs the world,
is by the art of man, as in many other things, in this also
imitated, that it can make an artificial animal. For seeing life
is but a motion of limbs, the beginning whereof is in the principal
part within; why may we not say that all automata (engines that
move themselves by springs and wheels as doth a watch) have an artificial
life? For what is the heart but a spring, and the
nerves but so many strings; and the joints
but so many wheels giving motion to the whole body, such
as was intended by the artificer?"2
In the seventeenth
and eighteenth centuries, the construction of automata which imitate
bodily functions of man or animal was extremely popular: the development
of clockwork technology had made it possible to realize much better
imitations than before; and the theories of the Cartesians lent
a philosophical interest to such enterprises. Thus, there were dolls
that could walk or talk, write letters, or play the flute; birds
that could flap their wings, tweet, eat, drink and shit. There is
a curious similarity between this kind of automaton building and
present-day Artificial Intelligence. In those days, too, the capacities
of the most advanced technologies of the moment were exploited with
the goal of imitating the outward appearance of certain aspects
of human behaviour; then, too, this resulted in products which aroused
everybody's interest, because they could be regarded as technological
experiments, as biological models, as philosophical existence proofs,
as art, or as entertainment.
<<illustration:
Mechanical Duck>>
This is illustrated
by the career of Jacques de Vaucanson, one of the best-known
automaton builders of the eighteenth century. His automata were
amusing and astonishing exhibits in popular fairs, while their mechanisms
were published in learned scientific articles -- by the designer
himself, and also by Diderot and D'Alembert in their Encyclopédie.
The production of the automata also generated interesting technological
spin-offs. Eventually, Vaucanson became an innovative organizer
in the textile industry, and built the most advanced silk-spinning
factory of that time. He used the technology of his automatic flute
player for the design of the first programmable loom, which would
later become the basis for Jacquard's work.
(L'Homme
Machine)
Human beings
are emphatically excluded from Descartes' reasoning about the mechanical
character of animals. He links mechanism with the absence of emotions
and the absence of consciousness. Apparently he has no difficulty
in viewing animals in this way, but that people could be machines
is more of a problem. The Cartesian philosopher Cordemoy formulates
the argument against the mechanizability of humans in terms of the
idea of an automatic speech machine: "...although I see
clearly that a purely mechanical apparatus could utter a few words,
I know at the same time that the springs which distribute the air
or open the tubes that let out the voices display a certain order
between each other, which they could never change. So that, from
the moment the first voice sounds, the voices that usually follow
must necessarily follow as well -- that is, if the machine still
has sufficient air. Contrarily, the words which I hear being uttered
by bodies such as mine, are rarely pronounced in the same order."3
From this point
of view, the richness of language is connected with the typically
human capacity of free will, which is intrinsically incompatible
with the rigidity of a clockwork. To account for free will, Descartes
provides the - otherwise mechanical - human body with an interface
to the immortal soul. He situates this interface in the pineal gland
-- a small gland whose function was then unknown, located in the epithalamus.
A hundred years
after Descartes, the idea that human beings are machines as well
was explicitly defended after all. In "L'Homme machine",
La Mettrie argues that "all capacities of the soul are to such
an extent dependent on the right organization of the brain and the
entire body, that apparently they are nothing but this organization
itself." In this theory, there is a material identity between
body and soul; therefore, the necessity of a mind-body-interface
has disappeared. Between man and animal, there are only differences
in degree. The Cartesian arguments concerning the mechanical character
of animals now apply directly to human beings, but at the same time
their meaning changes profoundly. Mechanism no longer implies non-consciousness:
consciousness itself is mechanical.4
If the more
ambitious automaton builders of this period had based their research
programmes on La Mettrie's viewpoint, they would have invented something
like today's Artificial Intelligence: a discipline aimed at creating
actual demonstrations of mechanized mental processes. But clockwork
technology was not suitable for such a purpose. Therefore, this
step could not be made until the middle of the twentieth century,
when the electronic computer became available.
This is also
why Cordemoy's argument was so plausible in his day. No one could
foresee the virtually unlimited switching flexibility which would
be introduced by Von Neumann's stored program computer. Software
is mechanics which is capable of dynamic reconfiguration. Programs
are virtual clockworks with self-modifying and self-extending capacities.
Although these capacities are limited by the finiteness of the hardware
on which the programs are implemented, in practice we can often
ignore these limitations. Compared to a clockwork, the computer
realizes a qualitatively superior complexity and flexibility. With
this invention behind us, we must now forever view the limits of
the mechanizable as unknown and open-ended.
(Talking
Heads)
The first serious
speech machines were developed by eighteenth-century automaton builders
who were engaged in mechanical simulation of the bodily functions
of man and beast. At this time, the sound of speech was not yet
viewed as a phenomenon which could be analyzed and reconstructed.
Speech simulation was imitation of the act of speaking. Artificial
bodies were created, which could blow out air and thereby make the
air vibrate; the fidelity of the artificial speech generated in
this way depended on the accuracy with which the relevant features
of the human body were reproduced.
Like human
beings, these machines had 'vocal cords' which are set into vibration
when air is pressed through. The precise functioning of the human
vocal cords was not yet known at this time. To imitate them, the
machine-builders used the principle of the harmonium: an air tube
is closed off by a flexible metal tongue, which moves under pressure
to let the air through and is consequently set into vibration. The
tongue was often covered in leather to damp the high tones slightly.
As with a reed
organ, this vibration was then conveyed to the air in a resonance
chamber -- which in this case was made to resemble the human mouth
as much as possible. Depending on the exact shape of the resonance
chamber (that is, the position of the mouth), various vowels could
be generated. Depending on the way in which the air stream was started
or stopped, or obstructed by constricting the outlet, various consonants
could be formed.
<<illustration:
machine from the "bachelor machines">>
As Cordemoy
had argued already, independently functioning machines of this kind
could only deliver a limited repertoire of texts. Because speech
simulation proved far from easy, in practice this even came down
to rather small numbers of words or sentences, which would be hardwired
into the machine. For this reason, speech machines were often designed
as instruments instead -- machines which could generate all
the sounds that are needed to pronounce any given text, but which
could only pronounce an actual text if operated by a technical expert
who determined which sounds were produced at which moment. Descartes'
solution, one might say: a mechanical machine driven by human consciousness;
a body controlled by a mind.
In 1778, for
example, Wolfgang von Kempelen designed a machine which directly
imitated the functioning of the oral cavity. The operator squeezes
a pair of bellows to press the air, via 'vocal cords', into a resonance
chamber, which he modulates with both hands. The various vowels
are created by changing the shape of the resonance chamber with
one hand; the consonants are produced as the other hand opens or
closes this chamber in various ways.
A lung and
vocal-cord prosthesis, which makes it possible to use the hands
as a mouth. Technological perversion of speech.
<<illustration:
Von Kempelen's machine>>
The 'vowel
organ'.5 The same principle, but in this case:
a carousel of different sound cavities. A fan of vowels. A laboratory
instrument operated by a technician, by means of switches, wheels,
foot pedals. As a result of the technician's actions, the vibrating
air is sent to one resonance chamber or another, and such a chamber
is opened or closed in a variety of ways. Thus, by a succession
of separate interventions, the technician realizes, one by one,
the phonetic elements of the language expression to be pronounced.
<<illustration:
Van Brakel's machine>>
Human speech
is a continuous process. In this mechanical simulation, there is
no such continuity. What we hear is phonology: the discrete combinatorics
of linguistics.
Joop van Brakel
on the 'vowel organ': language shattered into meaningless fragments.
Slapstick, merriment, music. Language regressing to animal sounds.
Cackling, bleating, barking. ("There once was a time when all
speech was song.")
The vowel organ
has ingenious 'artificial vocal cords'. A hollow cylinder with a
slit in it continually rotates within another hollow cylinder, also
with a slit in it. The result: a slit-shaped opening is opened and
closed continually. The air is pressed through this opening. If
we use this technique to set the air in motion without providing
a connection to an 'artificial oral cavity' in which the air can
resonate, what we hear is a fart. Is that the sound that
underlies all speaking?
<<illustration:
Van Brakel>>
Other speech
machines create an even greater separation between the operating
technician and the material production of sound: they insert a keyboard-interface.
Abbé Mical's Têtes Parlantes (1783) and Joseph
Faber's Euphonia (1840) belong in this category. Speech machines
for entertainment. The designer also acted as operating technician,
and as variety artist, ventriloquist: he put a puppet on stage,
and tried to create the illusion that it really spoke.
Here, the laboratory
apparatus has become a musical instrument, with an interface which
enables the virtuoso performer to add natural dynamics and timing
to the mechanical speech utterances, and to compensate as much as
possible for the limitations of technology.
Thus, one of
Mical's contemporaries writes about the Têtes Parlantes: "With
a little practice and agility, we will be able to speak with the
fingers as with the tongue, and we will be able to give the language
of the heads the speed, the calm, and in short all the qualities
that a language can possess which is not animated by passions."6
On the keyboard of the Têtes Parlantes, you present
a text as you would play a musical score on a piano.
<<illustration:
Faber or Mical>>
(Soft Machines)
Alexander Graham
Bell stands at a turning point in the history of speech synthesis.
When he was young, his father took him and his younger brother to
an exhibition where they saw a replica of one of Von Kempelen's
speech machines. Back home, the boys proceeded to build a similar
speaking machine themselves. When, several years later, Bell invented
the telephone, he introduced the technique that would determine
the future of sound processing: the representation of sounds by means
of electric signals. Bell also produced a detailed design, which was never
implemented, for a device that would have been a mechanical
Vocoder.
But his most
curious contribution to artificial speech synthesis was another
early feat. "Bell's youthful interest in speech production
also led him to experiment with his pet Skye terrier. He taught
the dog to sit up on his hind legs and growl continuously. At the
same time, Bell manipulated the dog's vocal tract by hand. The dog's
repertoire of sounds finally consisted of the vowels /a/ and /u/,
the diphthong /ou/ and the syllables /ma/ and /ga/. His greatest
linguistic accomplishment consisted of the sentence, 'How are you
Grandmamma?' The dog apparently started taking a 'bread and butter'
interest in the project and would try to talk by himself. But on
his own, he could never do better than the usual growl."7
A related technology,
with a cyberpunk slant, is due to Johannes Müller, the father
of modern physiology. "His working method is clearly characterized
by his orientation toward experiments on living or dead objects.
Continuing the efforts of Liskovius, who in 1814 was the first to
generate chest- and head-voice from the larynx of a corpse, he cut
off the head of a corpse in such a way that the entire vocal apparatus
and part of the trachea were preserved. By blowing air into the
larynx of the corpse, Müller produced vocalic sounds which
closely resembled human speech. By moving the lips, he even managed
to generate some consonants."8
(To fake)
Hermann Helmholtz
was a pupil of that same Müller. But his work in the field
of speech synthesis was less physiologically and more acoustically
oriented. In the second half of the nineteenth century, research
into the phenomenon of sound had reached the stage where one could
attempt to analyse the sounds of human speech into elementary components.
To synthesize vowels, Helmholtz did not imitate the human body,
but built up the sounds from elementary, sinusoidal components.
His synthesis
machine consists of a battery of tuning forks equipped with resonance
chambers, with frequencies in harmonic proportions. Driven by
electromagnets, the tuning forks vibrate with perfect regularity
in their basic frequencies. The volumes of the contributions from
the different tuning forks can be varied by partly opening or closing
their resonance chambers. Thus, sounds with different spectra
can be composed, which bear resemblance to various vowels: Oo, Ee,
Ah, Oh, Uh, Ih...
<<Illustration:
Helmholtz' machine>>
The same method
of synthesis can be applied even more easily with modern electronic
technology -- a technology which was developed for the reproduction
and transmission of sound. The crucial invention which made electronic
sound generation possible was the loudspeaker: the general
purpose sound producer which can replicate the sound of an arbitrary
event, without having to mimic its material structure.
The loudspeaker
transforms arbitrary electric signals into material sound waves.
This creates the possibility of treating electric signals as models
of sound waves. In electronic technology, this is done by means
of resistors, induction coils, radio tubes, transistors. Objects
with a specific electronic behaviour are combined into circuits
which generate the desired output patterns.
The two kinds
of approach mentioned above in connection with mechanical sound
synthesis can be applied in electronics as well. The structure and
the components of a mechanical system that imitates the human larynx
can systematically be transposed to the electronic domain; this
will indeed result in a circuit with an output signal that corresponds
to the vocal sound produced by the mechanical model. Translating
Helmholtz' approach to the electronic realm is even simpler: replace
his tuning forks with sine wave generators, and his adjustable resonance
chambers with potentiometers.
<<illustration
from Köster, p.239 (Paget), or alternatively Flanagan>>
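To make the principle concrete, here is a minimal sketch, in present-day software terms and in Python, of what such an electronic Helmholtz apparatus does: a handful of sine 'oscillators' at harmonically related frequencies are summed, each with its own adjustable volume. The fundamental frequency and the amplitude pattern are illustrative assumptions, not values taken from Helmholtz or from any actual circuit.

# Additive synthesis sketch: harmonically related sine waves, each with its
# own volume, summed into one vowel-like tone and written to a WAV file.
# The fundamental and the amplitude pattern are assumed, for illustration.
import math
import struct
import wave

SAMPLE_RATE = 44100
FUNDAMENTAL = 110.0                              # assumed pitch, in Hz
AMPLITUDES = [1.0, 0.3, 0.8, 0.5, 0.1, 0.05]     # one volume per harmonic

def vowel_like_tone(duration=1.0):
    """Sum the harmonics, weighted by AMPLITUDES, into a list of samples."""
    n_samples = int(SAMPLE_RATE * duration)
    samples = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        value = sum(a * math.sin(2 * math.pi * FUNDAMENTAL * (k + 1) * t)
                    for k, a in enumerate(AMPLITUDES))
        samples.append(value / sum(AMPLITUDES))  # keep within [-1, 1]
    return samples

def write_wav(samples, filename="vowel.wav"):
    """Write the samples as a 16-bit mono WAV file."""
    with wave.open(filename, "w") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(SAMPLE_RATE)
        for s in samples:
            f.writeframes(struct.pack("<h", int(s * 32767)))

write_wav(vowel_like_tone())

Opening or closing a tuning fork's resonance chamber corresponds here to changing one entry of AMPLITUDES; a different amplitude pattern suggests a different vowel.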
Electronic
simulation has a material form: a circuit consisting of identifiable
components and connections. But on the outside, nothing seems to
be happening. The clockwork stands still. It thinks.
The structure
of the circuit corresponds to the mathematical analysis of a physical
sound-generating process. The circuit is a materialized diagram.
(A printed circuit board actually looks like that.9)
The computer
is the next step in the development towards an increasingly abstract
simulation. The hardware no longer has anything in common with the
physics conjured up for the listener. The hardware even has a structure
which is essentially incompatible with the origins of music. A computer
really 'computes': it manipulates discrete symbols. Music, on the
other hand, is generated by the resonance of continuous systems.
Digital sound
simulation is two steps away from real sound: the electric signal
driving the loudspeaker is represented in the computer as a sequence
of discrete symbols that represent the amplitude variation in time,
split up into small discrete steps. Thus, even the continuity of
the electric signal is faked.
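A minimal sketch of this double discretization, under assumed values (an 8 kHz sample rate and 256 amplitude levels): a few milliseconds of a 'continuous' sine wave become nothing more than a short list of integers.

# Sampling and quantization sketch: time is chopped into discrete steps
# (the sample rate) and amplitude into discrete levels (here 8-bit).
# The rate, bit depth and test frequency are assumed, for illustration.
import math

SAMPLE_RATE = 8000        # samples per second: discrete steps in time
LEVELS = 256              # 8-bit quantization: discrete steps in amplitude

def sample_and_quantize(frequency=440.0, duration=0.005):
    """Represent a continuous sine wave as a short list of integers."""
    n_samples = int(SAMPLE_RATE * duration)
    symbols = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE                             # discrete time
        value = math.sin(2 * math.pi * frequency * t)   # 'continuous' signal
        level = round((value + 1) / 2 * (LEVELS - 1))   # discrete amplitude
        symbols.append(level)
    return symbols

# Forty integers now stand in for five milliseconds of sound.
print(sample_and_quantize())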
The operations
on the symbolically represented signals largely correspond to the
functioning of the components from electronic circuits -- but because
these operations are now symbolically represented as well (installed
as software in the computer), they can be applied with infinite
flexibility, in every imaginable combination and sequence. Cordemoy's
impossibility has come true: lifeless matter has escaped the rigidity
of the clockwork.
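One way to picture this correspondence is the sketch below: the software counterpart of a resonant circuit (or, for that matter, of a resonance chamber) is a few lines of arithmetic applied to the symbolic signal, sample by sample. This is a standard two-pole digital resonator; the centre frequency and bandwidth are illustrative assumptions, not values from any particular synthesizer.

# Digital resonator sketch: the symbolic counterpart of a tuned circuit.
# Each output sample is the input plus a weighted echo of the previous two
# outputs, which makes the signal 'ring' around the centre frequency.
import math

SAMPLE_RATE = 8000

def resonator(signal, centre_hz=700.0, bandwidth_hz=100.0):
    """Emphasize frequencies around centre_hz in a list of samples."""
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)   # pole radius
    theta = 2 * math.pi * centre_hz / SAMPLE_RATE         # pole angle
    b1, b2 = 2 * r * math.cos(theta), -r * r
    y1 = y2 = 0.0
    output = []
    for x in signal:
        y = x + b1 * y1 + b2 * y2
        output.append(y)
        y2, y1 = y1, y
    return output

# Feeding in a single impulse makes the virtual resonance chamber ring.
ringing = resonator([1.0] + [0.0] * 99)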
The flexible
machine which can do anything is at the same time the enigmatic
machine which shows nothing. The machine is motionless, so
that we do not see anything happening. But neither does the wiring
structure of the components reveal anything about the functions
performed. This structure only says: calculations in progress.
<<illustration:
computer>>
The flexibility
of the software medium is virtually complete. All operations which
can be described mathematically can be implemented. Even the fact
that the execution of each operation takes a short, but not infinitely
short, moment of time, and that very complex combinations of operations
can therefore take a long time, is hardly a limitation anymore.
This practical problem is solved by VLSI technology. It is often
possible to develop special chips for sub-processes which take too
much time: large-scale integrated electronic hardware, which is
less flexible than software, but extremely fast.
Everything
you can imagine you can do with software. That is what's interesting
about A.I. and other experimental branches of computer science:
we discover the limits of what we can imagine. Sound synthesis is
a typical example of this: modern synthesizers can produce a tremendous
richness of sound, but imitations of existing instruments still
sound stylized. Where they do sound natural, this is because they
are not synthesized on the basis of structural analysis, but on
the basis of samples. In that case there is no imitation, but reproduction
of a previously recorded sound. The best-sounding synthesizers have
a great deal in common with tape recorders. They are digital mellotrons.
Digital sound
recording is now the registration technology with the highest accuracy.
The basic methods of digital sound representation are thus completely
adequate. The limitations of digital sound synthesis are solely
due to the limitations of our understanding of the psychological
structure of sound.
(Platonic
People)
Because their
speech was barely intelligible, there was not much use for the first
electronic speech-synthesis systems. For example, you could not
make them speak a complex text with unpredictable contents if you
wanted the text to be understood by an audience.
These systems
also sounded distinctly inhuman. The voice appears to be generated
by an alien body which is not flesh and blood -- by the angular
movements of the metal components of the prototypical robot. What
you hear is a machine which, in its awkward mechanical way, tries
to use the human means of communication. This behaviour evokes disturbing
questions about the possibilities and the dangers of technology,
about mind and matter, and the nature of human identity.
But current
state-of-the-art software is different. A typical example is DECtalk.10
This program is the realization of Abbé Mical's wildest dreams.
Têtes parlantes: not one, not two, but nine different
ones; and all of them can moreover be modified and interpolated.
The DECtalk manual presents their portraits and gives them names:
Rough Rita, Frail Frank, Whispering Wendy, Huge Harry, Kit the
Kid, Perfect Paul, Beautiful Betty, Uppity Ursula, and Doctor Dennis.
Protagonists of a comic strip version of Peyton Place.
<<illustration:
DECtalk voices>>
The input for
programs such as DECtalk consists of discrete symbols. The program
processes files that consist of sequences of phonemes. So there
is no human control of timing and dynamics, as there was with the
eighteenth-century machines operated by means of a keyboard. In spite
of this, and even partly because of it, the output has greater continuity.
The software not only contains models of the signals that correspond
to the individual phonemes, but also procedures for merging the
successive signals seamlessly together.
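How such merging might look, in a deliberately simplified sketch: each phoneme contributes a target value for a control parameter (here a single formant frequency), and the program interpolates between successive targets so that the sounds glide into one another instead of being abruptly juxtaposed. The phonemes, targets, durations and frame rate below are invented for illustration; they are not DECtalk's actual rules or values.

# Phoneme-merging sketch: per-phoneme parameter targets are smoothed into
# one continuous trajectory. Targets and durations are assumed values.
# Each entry: (phoneme, first-formant frequency in Hz, duration in frames).
PHONEMES = [("h", 500, 5), ("e", 550, 12), ("l", 400, 8), ("ou", 450, 20)]

def formant_track(phonemes, transition_frames=4):
    """Return one formant value per frame, smoothed across phoneme borders."""
    track = []
    for i, (_, target, frames) in enumerate(phonemes):
        previous = phonemes[i - 1][1] if i > 0 else target
        for frame in range(frames):
            if i > 0 and frame < transition_frames:
                # Glide from the previous phoneme's target to this one.
                blend = (frame + 1) / transition_frames
                track.append(previous + blend * (target - previous))
            else:
                track.append(float(target))
    return track

# One smooth trajectory instead of a sequence of abrupt jumps.
print(formant_track(PHONEMES))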
Modern synthetic
voices are perfectly intelligible. And because of a more accurate
control of the spectrum of vowels, the distinctively metallic quality
of the sound has disappeared. But nevertheless, no one would confuse
their output with human speech. The synthetic voice is still inhuman,
if only because of its uniformity.11
DECtalk's standard
voice, Perfect Paul, is an abstract-sounding voice, that
of a newsreader. Neither machine, nor human being. This marks the
birth of a new medium. Up until now, you could not listen to a text
without listening to someone's body. The independent text, independent
of the human body, was always the printed text. For the first
time, language now has a sound independent of the body -- a sound
that directly emanates from the linguistic system, from syntax and
phonemes.
The next step
in this development is foreshadowed by other DECtalk voices, such
as Whispering Wendy and Huge Harry. These are more
personal, but just as equable and imperturbable, smooth and continuous.
Airbrush pinups. Platonic bodies.
Whispering
Wendy's voice has a pure, clear sound, with very little substance
-- like Marilyn Monroe's singing voice, or Brigitte Bardot's. The
suggestion of a soft, supple, weightless body. Huge Harry
is Wendy's macho counterpart. His voice is heavy and
lustful. Not Elvis Presley yet, but not bad for a beginner.
The synthetic
body has already become an erotic ideal. Look, for instance, at
the use of classical statues in thirties' fashion photography: "The
forms of high fashion assume the look of the statuesque, the hallowed,
the classical. Living flesh has the smoothness, the soft luster
of ancient marble. Stone, it almost seems, is as supple as flesh.
Hoyningen-Huene makes an equation between living and not living bodies,
and the equation enchants, for in his photographs the bodies that
do not live are not dead. They are statues. His imagery argues that
in the realm of fashion there is no death. To enter the fashionable
instant is to live forever."12
The future
of digital image- and sound-simulation: the smooth coolness of the
statue in a naturally moving body, in a sensually modulating voice.
Technology is heading slowly but surely toward increasingly perfect
robot-porn. Live performers like Prince and Michael Jackson are
already beginning to dissolve into their computer-animated images.
When Andy Warhol
invented commercial telephone sex, he suggested in the same breath
that it could best be done by robots: "A robot-computer
to answer the phone, that would be great. It would do the job without
emotion."13
(Epilogue
by Ultra Violet)
"I
think back to one of Andy's earliest paintings, compelling in its
simplicity -- a starkly black-and-white six-foot-high Coca-Cola
bottle, painted in oil on canvas in 1960. I think of the paintings
of clean, shiny Campbell soup cans, the young, unlined, fresh-scrubbed
faces of Marilyn Monroe, Jackie Onassis, Ingrid Bergman, so many
others."
"Then
gradually I begin to grasp what Andy was trying to say with all
his babble about machines and sex. Where sex has turned repulsive
and inhuman, machine sex beckons alluringly. Only in telephone sex,
robot sex, computer sex, is there escape from ugliness and cruelty.
Machine sex is the only kind left that is uncontaminated, antiseptic,
clean, even a little mysterious [...].
Yes, here
is still another of the endless paradoxes Andy strews along our
paths. In sex, as in art, [...,] he reinvents shining, pristine,
early morning purity. His kind, of course: on the surface, no deeper."14
english version
olivier/wylie/scha
NOTES
1.
Roland Barthes: "Le Grain de la Voix." In: L'obvie
et l'obtus. Paris: Éditions du Seuil, 1982. [English
translation: "The Grain of the Voice." In: The Responsibility
of Forms. Critical essays on Music, Art and Representation.
New York: Hill and Wang, 1985, pp. 269/270.]
2.
Thomas Hobbes: Leviathan. 1651 [Harmondsworth, Middlesex:
Penguin, 1968.]
3.
G. de Cordemoy: Discours physique de la parole. Paris, 1666.
4.
Julien Offray de la Mettrie: L'Homme machine. Leyden:
Luzac, 1748.
5.
This is a relatively recent machine (built at the Institute of Phonetic
Sciences of the University of Amsterdam), but its method of operation
definitely belongs to the eighteenth-century tradition.
6.
"Avec un peu d'habitude et d'habileté, on pourra
parler avec les doigts comme avec la langue, et on pourra donner
au langage des têtes la rapidité, le repos et toute
la physionomie enfin que peut avoir une langue qui n'est point animée
par les passions." From a letter by Antoine de Rivarol,
1783 (Oeuvres complètes de Rivarol, Part III, p. 207.
Paris, 1808.)
See: Jens-Peter
Köster: Historische Entwicklung von Syntheseapparaten zur
Erzeugung statischer und vokalartiger Signale nebst Untersuchungen
zur Synthese deutscher Vokale. (Historical development of synthesis
machines for generating static and vowel-like signals and research
into the synthesis of German vowels.) Hamburg: Buske, 1973, p. 85.
On p. 95, Köster also quotes another part of this letter: "
If these heads were multiplied in Europe, they would raise terror
in all those Swiss and Gascon language teachers, whose influence
has infected all countries and who disfigure our language for the
peoples who love it." Köster comments: "Here
lie the roots of the use of technological tools in foreign language
teaching."
7.
James L. Flanagan: Speech Analysis Synthesis and Perception.
Second Edition. Berlin: Springer, 1972, pp. 206/207.
8.
Köster, op.cit., p. 149.
9.
Cf. Dick Raaijmakers: "De kunst van het machine lezen."
(The art of machine reading.) Raster, 6 (1978),
pp. 6-53.
10.
DECtalk was developed by Digital Equipment on the basis of
MITalk. See: Jonathan Allen, M. Sharon Hunnicutt and Dennis Klatt:
From text to speech: The MITalk system. Cambridge (UK): Cambridge
University Press, 1987.
11.
Speech technologists are doing their best to imitate human limitations
and imperfections. Allen et al. (op.cit.), for example: "Some
additional pauses are introduced in longer phrases and slow speaking
rate so that the talker does not seem to have an inhuman supply
of breath."
12.
Carter Ratcliff: "Out of Time." Artforum International
30, 1 (September 1991), pp. 112-117.
13.
Ultra Violet: Famous for 15 minutes. My years with Andy Warhol.
New York: Avon Books, 1990, p. 163.
14.
Ultra Violet: Op. cit., pp. 165/166.