The Aging Voice

1 Concepts of Age
1.1 Chronological and Cultural Age
1.2 Biological Age
1.3 Perceptual Age

2 Vocal Aging
2.1 Basic Concepts of Phonation
2.2 The Aging Voice
2.3 Aging of the Speech Organs
2.3.1 The Subglottal System
2.3.2 The Larynx
2.3.3 The Supra-Laryngeal System
2.4 Aspects of the Aging Voice
2.4.1 Speech Rate and Speech Pauses
2.4.2 Acoustic Aspect of the Aging Voice
2.4.3 Formant Frequencies

3 Perception of the Aging Voice
3.1 Perception of Speaker Age
3.2 Acoustic Correlates of Perceived Age
3.2.1 Speech Rate
3.2.2 Speaking Fundamental Frequency
3.2.3 Formant Frequency

4 Conclusion

5 Literature



Voice recognition is a basic phenomenon of nature. The voice is one of the first sensations a human being encounters. Many scientists state that perceiving the maternal voice in early childhood, or even before birth, has a significant impact on cognitive abilities, learning aptitude, discernment, and language acquisition in later life.

Since voice recognition is an elementary ability of every human, the question arises how the specific processes of voice recognition work. Obviously, each of us can easily distinguish between male and female speakers, between children and older people, or even tell whether a speaker comes from, e.g., Africa or India. But what characterizes this recognition ability? What characterizes the aging process of the human speech apparatus? To what extent are voice frequencies and other physical features of the voice relevant for determining a speaker’s age? What are the correlates of vocal aging and of the perception of the aged voice?

In this paper I give an overview of the basic theories of vocal aging and of the perceptual recognition of a speaker’s age. To this end I will first illustrate prominent views of the aging process from biological, psychological, and cultural perspectives. In a second step I will present specific studies that deal with the experimental analysis of voice recordings, as well as technical methods of perceiving a speaker’s voice.

1 Concepts of Age

The concept of age is natural to the cognition of every human. Indeed, it is part of our everyday life. Every human experiences the varied effects of aging during his or her own individual process of maturing. Growing from infancy to adolescence demands considerable effort, and growing up brings more and more duties. Eventually age also reveals its negative effects, as our physical health increasingly becomes the focus of our daily life. Age is thus a concept that comprises a range of different characteristics, which can be rated as positive, negative, or neutral attributes of the aging process. A common definition describes aging as a “process that, as a function of time, leads to characteristic changes of state” (Brockhaus 2005-2006). This definition ties the characteristic changes of state directly to the concept of time[1]. In a wider sense, a concrete definition of the concept of age seems difficult to pin down, and Winkler (2008) likewise asserts that the concept of aging covers virtually everything that happens to an organism over time: “Aging is everything that happens to an organism over the course of time.” (Costa and McCrae quoted in Winkler 2008: 7).

In most contexts, though, the concept of age carries negative connotations. When we talk about age, we often talk about, e.g., the loss of physical capacity or the limitations associated with old age. This negative connotation appears not only in interpersonal exchange but also in the greater context of social communication. Terms like demographic change and retirement provision are part of many ongoing political and social discussions and are nearly omnipresent. Thus, when we think of age, we often think of chronological age. To describe the concept of age adequately, however, a more differentiated view is necessary: we need to distinguish between chronological and biological age, and in a linguistic context we also need to include perceptual age. These three concepts of age are illustrated in the following sections.

1.1 Chronological and Cultural Age

Chronological age basically means the number of years one has lived. It is the difference between one’s date of birth and the current date. The concept of chronological age is a rather general one, as it carries no information about one’s physical constitution, mental capacity, or social status.
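The date difference described above can be sketched in a few lines of Python. This is only an illustration: the function name `chronological_age` and the example dates are assumptions made here, not part of the cited literature, and the result is given in completed years.

```python
from datetime import date


def chronological_age(birth: date, today: date) -> int:
    """Return the number of completed years between birth and today."""
    if today < birth:
        raise ValueError("reference date precedes date of birth")
    # Subtract one year if this year's birthday has not yet occurred.
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)


# Hypothetical speaker born on 15 June 1950, recorded on 1 February 2010.
print(chronological_age(date(1950, 6, 15), date(2010, 2, 1)))  # 59
```

Note that the value depends only on two calendar dates, which is why it says nothing about physical constitution, mental capacity, or social status.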

The cultural aspect of aging is as multifaceted as the concept of aging as a whole: “Culture is a vital and pervasive force of humanity that molds a society’s social and political institutions, ideals, and norms.” (Linville 2001: 9). Culture can therefore be seen as the setting for many of the common associations with age and the elderly. In a way, one could say that culture covers people’s ideas of everything connected to the process of aging. It covers psychological as well as physical aspects of aging and, beyond that, frames the social conditions in which the process of aging takes place. Including this wider range of socio-cultural aspects allows the process and the concept of aging to be described more precisely.

1.2 Biological Age

The term biological age is more scientific than the terms chronological or cultural age. It covers the physiological condition of the human organs and body. For a holistic description of the process of aging, and especially of the aging of the human vocal apparatus, this term is indispensable. In the course of chronological aging the human body, like every other biological organism, undergoes significant changes in its structure[2]. These structural changes can be analyzed and serve as the basis for describing age-related changes in, e.g., the voice.

1.3 Perceptual Age

As this paper focuses on the acoustic aspects of vocal aging and on the perceptual estimation of a speaker’s age from the voice, the concept of perceptual age (and perceptual age estimation) is important. As already mentioned, the human or maternal voice is one of the first sensations a human being encounters. In early childhood we learn from perception how typical patterns of speech and speech production work. This process is not a conscious one but rather a natural and largely passive process of learning. In early infancy we learn to distinguish between our mother’s and our father’s voice. But the perception of voice does not only convey basic information such as gender or age. The perceived voice can carry a range of information about emotion, appeal, and meaning.[3]

A lot of research has been done on the mechanisms of perceptual speaker age estimation. The accuracy of age identification varies significantly from study to study, but overall the “[…] accuracy levels [are] significantly better than chance [10-12,2] […]” (Schötz 2007: 91). Furthermore, it seems difficult to give a general statement about “[…] how well listeners are able to judge speaker age. […]” (ibidem). Schötz argues that the various studies on this topic have varied significantly in “[…] method and speech material […]” (ibidem). Although it is difficult to give a concrete figure for the accuracy of speaker age perception, it is obvious that there is great potential to extract information such as age, gender, or even educational background from the voice. Perceived age is tied to the speech signal and is therefore a property of the speech signal itself (Winkler 2008: 9).

While speakers cannot influence their chronological or biological age, they can influence their perceived age. Although this influence is very limited, it should be mentioned as a possible source of error in the perceptual estimation of speaker age. Furthermore, perceived age should not be regarded as an absolute numerical value but rather as the average of several estimates.

Perceived age is uniquely determined as the mean of the individual listeners’ estimates. If, for example, a stimulus has been given an estimate by 20 listeners, the mean of these 20 estimates is the value of the perceived age for this utterance and constitutes the perceptual counterpart to the value of the speaker’s chronological age. (ibidem)
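The definition quoted above amounts to a simple arithmetic mean. The following Python sketch illustrates it under stated assumptions: the function name `perceived_age` and the 20 listener estimates are invented for illustration and are not data from Winkler (2008) or any other study.

```python
def perceived_age(estimates):
    """Return the mean of the listeners' age estimates for one stimulus."""
    if not estimates:
        raise ValueError("at least one listener estimate is required")
    return sum(estimates) / len(estimates)


# Hypothetical age estimates (in years) from 20 listeners for one utterance.
estimates = [62, 65, 70, 58, 66, 64, 71, 60, 63, 67,
             69, 61, 65, 68, 59, 66, 64, 72, 63, 67]

print(perceived_age(estimates))  # 65.0
```

The single resulting value can then be set against the speaker’s chronological age, even though no individual listener need have given exactly that estimate.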

2 Vocal Aging

As already mentioned, the human organism undergoes structural changes over time. The organs involved in speech production undergo the same structural changes.

A simple example of such a structural change is the voice change during puberty. At this stage of body growth, almost every male teenager experiences the phenomenon of the voice breaking. Due to the hormonally controlled growth of the body, the pitch of the voice drops (Brockhaus 2005-2006).

Besides structural changes that are natural or medically harmless, there are changes that can be traced back to physiological limitations of the speech organs. In addition, some changes of the speech organs can be attributed to muscular diseases or even to serious diseases such as cancer of the vocal folds. In general, a distinction has to be made between normal change and disease.

2.1 Basic Concepts of Phonation

Illustration I: The vocal tract and the articulators involved in speech production. (<http://copingwithstuttering.blogspot.com/2010/02/how-speech-sounds-are-formed.html>)

The speech sounds of many languages are produced by an airstream flowing out of the lungs. This egressive airstream is modified in many different ways on its way out. These modifications are brought about by the different articulators (Illustration I) involved in speech production. As the airstream leaves the lungs, the vocal folds can be set into vibration, which produces voiced sound. Generally, we can distinguish between movable (active) and non-movable (passive) articulators.

The larynx forms the upper end of the trachea and is made of cartilage. Its function in the articulation process is to support and move the vocal folds and regulate the pulmonic airstream.

The glottis is the opening between the vocal folds. Because the speaker can move the vocal folds, the glottis can take different shapes, which influence the resulting speech sound. Four states are distinguished: open glottis (voiceless sounds), narrow glottis (fricative sounds), vibrating vocal folds (voiced sounds), and closed glottis (glottal stop).


[1] The concept of time can be approached from many other directions but is not a central point of this paper. For further information, The Concept of Time by Roger Teichmann is recommended.

[2] The aging of the human vocal apparatus can be analyzed in detail for every single part of the vocal tract. As this paper does not go into too much detail on these structural changes, chapters 2-10 and 12-17 of Linville’s Vocal Aging are highly recommended.

[3] The different levels of information carried by a speech signal can be illustrated by the communication models of Schulz von Thun or Karl Bühler. For further information, Michael Fleischer’s Kommunikationstheorie is recommended.

