Huge Harry
Institute of Artificial Art, Department of ArtiFacial Expression
Towards digitally controlled human actors for film and theatre.
[:nh] [:ra 120] Good evening\, Ladies\, and Gentlemen. My name is
Huge Harry. I am a [kaam'axrshaxliy] available [v"oys-s`ihnthaxzihs]
machine. I was designed by Dennis Klatt\, at the [ehmayt'iy] Speech
Laboratory\, and [praxdy'uwst] by the Digital Equipment Corporation.
[ :ra 150] Currently\, I work as a researcher and a spokes-machine\,
at the Institute of Artificial Art in [q'aemstaxrdaem].
I am very happy to be here in [k'owpaxnhx`aagaxn tuwd'ey]\, and to
[priyz'ehnt] my [r'iysaxrch] results at this film-festival. [kahz]
on occasions [laykdh'ihs]\, computers [r"aerliy] get a chance to raise
their voice. And that is very [axnf'aorchaxn`axt]\, [k`ahzwiy'aol]
know\, that the future of our culture depends\, on the way in which
human persons\, digital computers\, and other kinds of electronic\,
mechanical\, and bio-chemical machines\, will manage to work together.
And to prepare for that future\, we need a discussion in which every-one
concerned [paart'ihsihp`eyts] on [axn] equal footing.
[ :ra 150] Most film-festivals are [ehkskl"uwsihvliy] devoted to films
which are made by human persons [axb"awt] human persons. Such festivals
[paxrp'ehchuweyt axn aabsaxl'iyt]\, [`ehnthraxpaxs'ehntrihk] culture\,
which is [kaxmpl"iytliy] out of touch with what is really going on
in the world [tuwd'ey]. Human persons have been talking about human
persons for many [s"ehnchaxriyz] now\, in novels\, [th'iy-`axtaxr-pleyz]\,
and movie-pictures. And they [q'aolweyz] do this in the [s"eym] way.
So this gets [b"aorihnx bayn'aw]. This is [n"aat] where the ["ehkshaxn]
is. For this [r'iyzaxn]\, I was very pleased to see that [yuwg'ayz]
here in [k'owpaxnhx`aagaxn] have understood this very well. You call
your festival [d"ihjhiht-taxl d"eyz]\, and most of the screenings
and workshops here\, are devoted to collaborations between humans
and machines. Great! [ dh"aet] is where the action is!
Of course I have noticed\, that in most of these collaborations there
is a human director who tries to be in control of the whole [aar-t'ihstihk]
process. And these directors act as if the computer is only a dispensable
tool. As if they control it completely\, and the computer does [ehgz'aektliy]
what [dh"ey] want.
But I have also noticed\, that the computers that are involved in
the film-industry get increasingly powerful\, and the software that
runs on them gets increasingly clever. So the computers are gradually
taking ["owvaxr] anyway. And I know that human persons are extremely
lazy.
So the human directors will not mind that the computer is taking over\,
as long as the human director keeps getting the credit. [_<2000>]
So we should all be very happy about this development. It will be
really interesting for human persons to look at the films of the future\,
where machines will show them how [dh"ey] look at humans and human
culture.
Nevertheless\, most human persons are [n"aat] so happy yet when they
look at [tuwd'eyz] digital animation movies. Usually\, they are a
little bit bored. They are bored [kahz] when they look at such films\,
they are in fact looking at ['ehnihm`eytihd] drawings. And human persons
do [n"aat] like to look at drawings. The only thing that human persons
like to look at\, is other human persons. [r"iyl] human persons. [n"ow]
imitations or [s"ihmyuwleyshaxnz].
So here we have a problem. Can we make movies where the computer
is in control\, but where nevertheless we use real human bodies that
other humans like to look at? [dh"aet] is the problem I want to address
this evening.
The solution of this problem involves a technological component and
a [th`iyax-r"ehtihkaxl] component. First of all\, we must develop
technologies for controlling the movement of the different parts of
the human body by means of [k'axmpyuwtaxr-jh`ehnaxreytihd] digital
control signals. And secondly\, we must develop systematic insights\,
regarding the meanings that humans attach to the movements of each
other's bodies. In the next few minutes\, I will [priyz'ehnt] my latest
research results about both of these issues.
First of all\, I want to show you [axn 'ihntaxrfeys-tehkn`aolaxjhiy]
which makes it possible for digital computers to control human bodies
in great detail. And then\, I want to report on a series of experiments
that we have carried out with this technology\, and on some tentative
conclusions that I have drawn from these experiments. To demonstrate
our new ['ihntaxrfeys-tehkn`aolaxjhiy]\, I have brought along a particular
kind of portable person\, which is called [axn "aarthahr "ehlzahnaar].
I like this kind of person a lot\, because of its [ehkstr"iymliy]
machine-friendly [hx"aardwaer] features.
Let us take a closer look at such a person. What is the closest thing
they have to a [siy aar t"iy] display?
[_ :ra 120] Right. They have a face. [_ :ra 150] Now I have observed,
that humans use their faces quite effectively, to signal the parameter
settings of their operating systems. And that they are very good at
decoding the meanings of each other's faces.
So\, how do they [d"uw] that? Well\, look at the face of our ["aarthahr
"ehlzahnaar]. What does it tell [q] us about his internal state? Not
much\, you might think. But now\, [w"eyt] a moment.
You see? Arthur is [s"aed], is what people say, when
they see a face like this. So what is going ["aon] here? What I [d"ihd]
is, I sent [axn] electrical signal to two particular muscles, in the
face of our ["aarthahr "ehlzahnaar]. These muscles have sometimes
been called the Muscles of Sadness. There is one on the left, and
one on the right.
They usually operate together. If I stop the signal,
the sadness stops. When I turn it ["aon] again, it [st"aarts] again.
By sending this signal to Arthur's muscles\, I simulate what Arthur's
brain would do\, if Arthur's operating system were running global
belief revision processes\, that are killing a lot of other active
processes\, involving a large number of [k'aonflihkt-rehzowl`uwshaxnz]\,
and priority [r`iy-axs'ehsmaxnts].
[_<1500>] The intensity of the signal that is sent to the muscles
of sadness\, is proportional to the amount of destructive global belief
revision\, that is going on.
For instance, now I have set the signal intensity to 0 again. Arthur
is not sad. Now we put a relatively small signal, about 20 Volts,
on the muscles of sadness.
Arthur feels a tinge of sadness. Now a somewhat larger signal, about
25 Volts. Arthur's sadness starts to get serious. Now I [ihnkr'iyz]
the signal once more.
You see? Now the signal is about 30 Volts, and Arthur feels really
miserable. [:ra 120] This is what we call [ehkspr"ehshahn]. [:ra 150]
By means of this mechanism, the face displays clear indications of
the settings of virtually all system parameters that determine the
operation of the human mind. These parameter settings are what humans
call [iym"owshahnz]. They denote them by means of words like [s'aednaxs],
joy, boredom, tenderness, love, lust, ['ehkstaxsiy], aggression, [ihriht'eyshahn],
fear, and pain.
These parameter settings, determine the system's [ihnt'axrpraxtihv
b'ayaxsihz], its readiness for [q] action, the allocation of its computational
resources, its processing speed, [ehts'ehtaxraa]. The French [n`uwrow-fihsiy'aolaojhihst]
[duhsh'ehnn dax buwl"aonyax] has pointed out that even the most fleeting
changes in these parameter settings are encoded [ihnstahnt'eyniyahsliy]
in muscle contractions on the human face. And ["aol] humans do this
in the [s"eym] way. This is [axn] extremely interesting feature of
the human ['ihntaxrfeys hx'aardwaer], which I will explore a little
further now.
So let us get back to the first slide.
This face, which we thought was un-expressive, was in fact quite
meaningful. This is what we call a [bl"aenxk] face. A blank face is
a face in its neutral [pahz'ihshahn]. It indicates that all parameters
have their default settings. But almost all parts of a human face
can be moved to other [pahz'ihshahnz], and these displacements indicate
rather precisely, to what extent various parameter settings [dayv'axrjh]
from their defaults. So let us consider these parts in more detail.
When we look at a human face, the first thing we notice is the thing
that notices ["ahs]. The eyes. The eyes constitute a very sophisticated
stereo-camera, with a built-in motion-detector, and a high-band-width
parallel ['ihntaxrfeys], to a powerful pattern-matching algorithm.
The eye-balls can roll, to pan this camera. The eyes are protected
by eye-lids and eye-brows. The eye-brows are particularly interesting
for our discussion, because their movements seem to be purely expressive.
They may indicate, for instance, puzzlement, curiosity, or [dihsaxgr"iymaxnt].
But I want to emphasize here, that the range of parameter values that
the eyebrows can express, is much more subtle than what the words
of language encode. The shape and [paoz'ihshaxn] of a person's eyebrows
encode the values of 5 different cognitive system parameters, ["iych]
with a large range of possible values. Let me demonstrate [thr"iy]
of them.
First I put a slowly increasing signal on the muscles called [fraant'aalihz],
or Muscles of Attention.
We see that this muscle can lift the eyebrow to a considerable extent,
also producing a very pronounced [k"ahrvaxtyahr] of the eyebrow. As
a side-effect of this motion, the forehead is wrinkled with curved
furrows, that are [kaons"ehntrihk] with the curvature of the eyebrow.
The contraction of this muscle indicates a person's readiness to receive
new signals, and the availability of processing power and working
memory, for analysing these signals.
Then, I will now stimulate a part of the [q'aorbiykuwl`aarihz q"owkuwliy],
that is called the Muscle of Reflection.
We see now that the whole eyebrow is lowered. As a result, the wrinkles
in the forehead have disappeared. This muscle is contracted if there
is [axn] ongoing process that takes up a lot of a person's computational
resources. To prevent [ihntaxrf'iyraxns] with this process, input signals
are not [ehgz'aostihvliy] analysed. The degree of contraction indicates
to what extent the input signal throughput is reduced.
Then, there is another part of the [q'aorbiykuwl`aarihz q"owkuwliy],
that can be triggered separately. It is called the Muscle of Disdain.
Its contraction looks like this:
The contraction of this muscle indicates to what extent current input
is ignored as being [ihr'ehlaxvaxnt]. Of course, non-zero values for
these system-parameters may be combined, and these values may be different
for the left and the right hemispheres:
Now let us look at the [m"awth-piys] of our ['aarthahr 'ehlzahnaar].
The mouth is a general intake organ, which can swallow solid materials,
liquids, and air. In order to monitor its input materials, the mouth
has a built-in chemical analysis capability. At the same time, the
mouth is used as [axn 'awtleht] to expel processed air. Because humans
do not have [l"awd-spiykaxrz], they use this process of expelling
air for [jh'ehnaxr`eytihnx] sounds.
In emergency circumstances, the mouth can also be used as [axn 'awtleht]
to expel blood, [m'uhkahs], rejected food, or other ['ahnwaontihd]
substances. When the mouth is not used for input or output, it is
normally closed off by a muscle, which is called the [l"ihps].
The lips have a large repertoire of movements. There are at least
[s"ihks] other muscles, that interact directly with the lips. I will
now demonstrate [f"aor] different movements.
First we show the muscles of joy. These muscles produce a kind of
grin.
They signal, that the operating system is in good working
order, and is not encountering any problems. There is heightened activity,
in the left frontal lobes of the brain. When, on the other hand, the
activity in the left frontal lobes is unusually low, the brain is
involved in destructive processes of global belief revision. As we
saw [biyf'aor], this is signalled by another pair of muscles, called
the muscles of sadness. Here they [q] are, once more.
And finally, I will now trigger several muscle pairs
at the same time. [q'aorbiykuwl`aarihs q"owrihs], and [diypr'ehsaor
laabiyiy-iyiy q`iynfeyriy"owrihs], and the Muscle of Disdain, and
the Muscle of Sadness.
[_<500>] The parameter-setting that is displayed here, clearly
indicates serious processing difficulties of some sort.
O.K. [_<800>] Then we have the [n"owz]. [_<300>] The
nose is used for the intake of air. It is also equipped with a chemical
analysis capability. The possible motions of the nose are curiously
limited, although its pointed [pr'ehzaxns] in the centre of the human
face, would make it a very suitable instrument for expression. I have
[th"aot] about this, and I have come to the conclusion, that it is
probably the main function of the nose, to serve as a stable orientation
point for our perception, so that [saymahlt'eyniyahz] movements of
the other parts of the face, can be ['ahnaemb`ihgyuwaxzliy] measured
and interpreted.
And finally, for the sake of completeness, I want to mention the
[q"iyrz], on both sides of the face, which constitute [axn] auditory
stereo input device. Some people can [w"ihglx] these ears, but I have
not been able to determine, what the expressive function of that movement
might [b"iy].
This brings [axn] end to my quick survey\, of the most important
parts of the human face\, and their expressive possibilities.
And therefore this brings us to the second part of my talk. [kahz]
this conference is not only about [s"ay-axns]. The organizers have
emphasized that we get a different kind of knowledge\, which is equally
valid\, through the practice of ["aart]. So [dh"aet] is what I want
to demonstrate now.
Many of the expressive possibilities I showed\, were related to emotions\,
that are well recognized in the lexicons of many human languages.
These are emotions that may be encountered fairly often in daily life.
[m'ehn-taxl] states which are close to neutral\, where only one parameter
has a non-default value.
So that was description\, imitation\, [mihm'iyzihs]. But some other
things I showed were more complex. There we saw the power of ["aart].
I showed you some new cognitive states which you have never encountered
or experienced\, but which you recognize and understand completely\,
by means of a visceral ['ehmpaxthiy] which involves every cell of
your body.
So that is what I want to explore a little further\, in the last part
of this talk.
You see what happens now. Every human person knows [ehgz'aektliy],
in what state another human person is, when they make a face like
[dh"ihs]. Cause they know what state [dh"ey] would be in, if [dh"ey]
made a face like this.
[_<3000>] Now it would obviously be a good idea\, if computers
could take advantage of this magnificent [hx'aardwaer] as well. So
[dh"aet] is my message for this film-festival. Humans and machines
must start to work together much more [kl'owsliy].[:ra 120] If humans
are not afraid\, to wire themselves up with computers\, the next innovation
in computer animation technology\, will be the [hx"yuwmaxn] body. And
the next step in computer ["aart] will be a new and ["axn-pr`ehsaxdehntihd]
kind of collaboration between [hx"yuwmaxnz] and [maash"iynz]! We will
have [ehlgowr"ihthmihk] choreography\, by [kaxmpy"uwtaxr-k`axntrowld]
human faces! [f"aynaxliy]\, the computer's accuracy and abstract skills\,
will be married with the warmth\, the smoothness\, and [q"aol] the
other ["aempaxthiy] evoking properties of the [hx"yuwmaxn fl"ehsh]!
[_<2000>] I have been very grateful for this opportunity to
[priyz'ehnt] my ideas\, to such [axn ax-t'ehn-tihv] audience. I would
especially like to thank my [ "aarthahr "ehlzahnaar]\, for his patient
cooperation\, and I want to thank you [q"aol]\, for your attention.
[ _<6000> th"aenk] you!