How do we hear with the world?

Brian Gygi

The student asked the Zen teacher what is the most important point in Zen? And the teacher wrote the words "Attention! Listen!" The student said there must be something else. The teacher said, "Yes, there is," and he wrote on the board "Attention! Listen!"
Bernard Glassman Roshi

Last year I moved out of a quiet residential neighborhood in El Cerrito, California. Ever since I left New York City ten years ago I had wanted to get back to a more urban experience. And in downtown Oakland, I certainly found it. I had forgotten that in addition to bright lights and bustling energy an urban area entails a fairly constant level of noise. That first sleepless night with trucks rumbling and shouts from customers down on the street up to the drug dealer next door (funny, the landlady didn’t tell me about him), I thought: “I am never going to get used to this.” But in actuality it only took about two nights until all those annoying sounds simply melted into the background.

But some other people in the same situation would have reacted in exactly the opposite manner, dreading every squeal of air brakes or cry of “It’s me, yo!” Rather than blocking out sounds, their responses would be heightened to the point where even simple sounds at moderate intensity become unbearable shrieks. One common feature of autistic children is that they cannot tolerate even moderately loud sounds; Temple Grandin, a high-functioning autistic woman, said that whenever the school bell rang she would start screaming. One theory of autism holds that it is an inability to inhibit stimuli, so that the environment is a constant assault, making it difficult to know what to respond to.

What we don’t “hear” is as important as what we do hear[1]. When I lived in New York City I remember walking on the street and being jolted by a blast of really loud hip hop from an approaching car. This occurrence in and of itself is not unusual. However, I noticed the windows of the car were shut, making it hard to imagine what the noise inside the car must have been like. Then I saw inside the car were two small children. My first reaction was “My God, they are deafening their children!” But then I realized that in New York one can function a lot better if one is able to simply ignore a lot of things. So this deafening might actually have been an adaptive measure for existing in an environment with too many stimuli. Fortunately, for the most part, such enforced irreversible selectivity is not necessary. In most listening situations we can block out everything except what we want to focus on. Just make a live recording of almost any listening situation and you will be amazed at all the sounds that you didn’t hear the first time around, while you discover that what you thought was crystal clear is somewhere in the background.

We can thus conclude that we don’t hear everything -- far from it. This is true of all sensory systems, by the way: we ignore the grunts of the elephant in the room as well as his knobby grey bulk. Much of this is purely practical: we would go nuts if we were constantly feeling our clothes or smelling our aroma. The ability to block out stimulation does not seem to cross sensory modalities easily, though, which is one reason why turning off the car radio does indeed help one focus on finding a parking space. Here we are focused on hearing: the spam filters in our auditory system are remarkable for their flexibility and unobtrusiveness, sorting out which messages are necessary for us to hear even when we are asleep. This need for vigilance, by the way, has been proposed as the reason why we rarely hear in our dreams. A wonderful passage in Gravity’s Rainbow describes paradoxical responses, in which softer stimuli evoke stronger responses than louder stimuli. It focuses on a man who sleeps through the fighter jets roaring overhead only to wake up at a slight rapping at the door. Hearing scientists and acousticians like to talk about the ratio of signal to noise, but in everyday life there is not really such a thing as true ‘noise’; there is only sound-bearing information that differs in its usefulness, a quality which is independent of sound level.

Since humans are excellent learning machines, it is natural to suppose that we would be one step ahead of the game and, rather than just react to sounds, develop a way to anticipate certain sounds, making us more receptive to them (humans are likely not unique in this: the efferent pathways to the cochlea allowing for ‘tuning’ of the auditory periphery are well developed in most mammals). A large body of experimental data supports this assumption: for example, words that, due to linguistic context, are highly likely to occur in a sentence have thresholds up to 10 dB lower than low-probability words (which is quite a large effect, equivalent to several dozen garbage trucks rumbling by). We can even hear words that aren’t there: one study replaced words with noise bursts and the words were still perceived. So rather than listening to the world, we are instead listening with the world, constantly testing what comes in against what was expected, and on the basis of the mismatch, updating what I will call our “Auditory World Model” (AWM), which determines our selective responses to anticipated stimuli.[2]

“The attenuation of expected responses” is one of several definitions of attention. William James, the human quote machine, quipped “Everyone knows what attention is.” Like many good quotes, its intent is to incite rather than clarify. To be fair, he went on to qualify that, “it is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought. Focalization, concentration of consciousness are of its essence. It implies withdrawal from some things in order to deal effectively with others…” Attention is a fundamental property that allows us to navigate the mass of information that is the world, so it is not surprising that Zen, which, despite its reputation, is a very pragmatic practice, would treat attention as fundamental, as noted in the opening quotation.

What I described above is generally referred to as ‘endogenous’ attention, the ability both to intentionally attend to a stimulus and to switch attention to a different item of interest. There is also ‘exogenous’ attention, namely the grabbing of attention by especially salient stimuli, as when your name is spoken at a cocktail party. Of course there are gradations of grabbiness, so you may be able to ignore your name if it is spoken by someone you don’t really want to talk to. Some artists have manipulated the graduated drawing of attention to notable effect. Brian Eno described ambient sound as “… sound that compels attention rather than commands it.” Which may be why it is so lulling. Ambience gently lulls our AWM, teasing it with slow, gradual adaptation rather than abrupt transitions.

The best definition of attention I know of is from a well-known hearing scientist, Bruno Repp, who said “There is no learning, there is no attention, there is no memory, there is only learning of things, attention to things and memory for things.” So attention may not be, and probably is not, a unitary process: the mechanisms that allow us to attend aurally are quite different from those that allow us to do so visually, even though functionally they seem quite similar. In vision, there is the fovea, a small area on the retina, in which virtually all the important “seeing” occurs. There is no similar localized structure in the auditory system.

So perhaps we have some insight into why unexpected sounds are so jarring: they force us to radically reconfigure our AWM. Normally, if we are well fit to our world, our AWM easily accounts for the vast majority of the incoming sounds. “Truck rolling by, expected that.” “Fight on the street, expected that.” “Cow mooing -- hmmm, didn’t expect that! Wake up! Something is going on!!”

An infant likely comes into being with only a minimal AWM, although since the physiological structures for hearing are well developed by the sixth month of gestation, prenatal auditory learning starts really early on, and speculation about auditory “primal states” is largely pointless. In any case, we are listening before we are seeing. So what must it be like for the very young system? Is every new sound an unexpected blare, jarring us to our core? A large body of data from Lynn Werner, a developmental psychologist, suggests strongly that infants have a great deal of difficulty focusing on one frequency region to the exclusion of others. Is William James’ “blooming buzzing confusion” in actuality the land of a thousand jolts? So when we hear unexpected sounds are we briefly thrust back into the primordial prenatal ooze of no reference?

It is useful at this point to note that the English word “attention” comes from the same root as “tendon” -- that is, something that stretches and attaches. To bring this back to the Zen parable cited at the beginning, the second noble truth of the Buddha is that the cause of suffering is clinging, attachment. So why is attention, this faculty that enables us to cling, considered one of the bases of Zen practice?

Our attention is limited. The exact boundaries are not known: some people are reportedly quite good at multitasking but a more common state is to focus on one stream to the exclusion of others. One consequence of the limitations of attention is that we tend to see the world in ‘figure-ground’ terms: we divide it into that which is to be focused on and that which is to be ignored. Once a figure is discerned from buzz, it tends to reinforce itself. After you have once seen the dog that lurks in the pattern at left, you can never see it again as just plain dots (if you can’t see it, the dog is shaded in the figure at the end of the article). Shakyamuni Buddha said that once you divide the world into what is and what is not, you lose the Way. So the dog may or may not have a Buddha nature, as the famous koan goes, but the one perceiving the dog as a dog has momentarily strayed from his/her Buddha nature.

In the auditory realm, there are numerous examples of constructing sense out of seemingly senseless sounds. Clicking on this link will play an example of sine wave speech, in which the complex harmonic structure of speech is replaced by modulated sine waves. If you are like most listeners, it will sound like a series of whistles. But once you know the message the segment contains (which you can hear by clicking on this link) you will not be able to hear it as just whistles again.

In zazen (Zen meditation) rather than drifting off or transcending reality, the goal (if it can be called a goal) is to be completely, utterly present and aware in the moment, hearingseeingsmellingfeeling directly with as little interdiction from cognition as possible. One of the methods of becoming present is to focus on nothing, your eyes cast downward on a blank wall, so silent your breath resonates through your whole nasopharyngeal system. You ‘open the hand of thought’ as Uchiyama Roshi said and the world comes rushing in, the filter of our AWM yanked out for replacement. You are attending, but to the whole world. While sitting zazen I have had the strange experience of hearing the voice of someone talking outside the meditation room as just pure, simple sound. And what strange wonderful sound it is, resonant but hissy, sweeping rolling gurgling popping. I didn’t even notice what the person said or whether the speaker was male or female.

In this state I have felt a profound sense of gratitude for just being. I am attending to the world, in the other meanings of the word: waiting on, serving, being present. This is the attention the master was referring to as the basis of Zen. When I get into that good state of practice, the bell signaling the end of the zazen period nearly makes me jump clear off my zafu. Because I am listening with the world, rather than for it, I am totally unprepared for the horrendously loud (in relation to the extreme quiet that preceded it) BARAAANG sending pulses rippling up my cochlea. This is the state of “beginner’s mind” as described by Shunryu Suzuki Roshi, the state of the infant mind that is not constantly afraid of the world, but “empty, free of the habits of the expert, ready to accept, to doubt, and open to all the possibilities.”

But that is a rare state for me, hard to get to, hard to maintain. Most of the time in sitting I am in a slightly dreamy state, thinking about my life, my dog, the cute woman sitting zazen next to me, and I barely hear the bell. My zendo AWM knows the bell is coming; it attenuates the gain on the aural input, and right then I have lost some of the richness of the world. Constraints cut both ways. We just need to remind ourselves again and again of what we are losing. So, now that you have seen the dog in the picture (below) can you remember what the picture was like when it was just dots? I can’t either, although I am glad I continue to try.


[1] “Hear” in this case refers to the perceptual process of the auditory system, and not the sensations picked up by the cochlea.

[2] I am not going to present any nice factoids here about the location of the AWM in the auditory system. Don’t expect statements such as “fMRI studies of the parietal cortex show….” and the like. Not only is the AWM an entirely hypothetical structure, but I am among those who tend to believe that perceptual events don’t “happen” at any one physiological location, but rather are emergent properties of the whole system.

