Advantages and Drawbacks of Gesture-based Interaction

Seminar Paper 2014 11 Pages




1. Introduction

2. Advantages
2.1 Immediate and powerful interaction
2.2 Intuitiveness and enjoyability

3. Drawbacks and possible solutions
3.1 Discoverability
3.2 Memorability
3.3 Fatigue
3.4 Recognition Errors
3.4.1 Immersion
3.4.2 Exit errors

4. Conclusion



With the increasing prevalence of smartphones, gesture-based interaction has arrived in our everyday life, but we still do not exploit its full potential. This paper describes the benefits and drawbacks of gestural input and presents interaction techniques that address these drawbacks.

Gestures provide the user with a new form of interaction that mirrors their experience in the real world. They feel natural and require neither interruption nor an additional device. Furthermore, they do not limit the user to a single point of input, but instead offer various forms of interaction.

However, gestures also raise issues that are not relevant with traditional methods of input. They need to be learned and remembered, which requires guides that promote the discoverability and memorability of gestures and that deal with input and recognition errors. Another aspect is the design of the gestures themselves, which should make them memorable as well as easy and comfortable to execute.


gesture; guide; learnability; gesture design; memorability; gesture recognition

1. Introduction

In the last few years, gesture-controlled interactive surfaces have become widespread. Since natural human communication consists mainly of voice, facial expressions and gestures, it is only logical that developers try to imitate that behaviour in their interfaces.

2. Advantages

The desktop computing paradigm limits users' flexibility by forcing them to interact through a device with two degrees of freedom (the mouse), while they are used to interacting with the physical world in much more differentiated ways (Bellucci, Malizia & Aedo, 2014). Gestures allow the user to handle multiple points of input and even to define several parameters at once. They are, therefore, a more natural form of communication.

2.1 Immediate and powerful interaction

Unlike traditional buttons and menus, gestures do not interrupt the user's activity by forcing him to move his hand to the location of a command. Instead, they can be performed directly from the current cursor position. (Bau & Mackay, 2008)

Also, they do not require any additional devices: the command and even its parameters can be specified by a simple hand movement (Baudel & Beaudouin-Lafon, 1993). Input devices narrow down the user's possibilities of interaction; a pen or a mouse, for example, limits input to single-touch interaction. Gestures performed with the user's hands, however, are versatile and do not have these constraints. As Wobbrock et al. put it: "almost anything one can do with one's hands could be a potential gesture" (Wobbrock, Morris & Wilson, 2009). This includes not only the movement or path of the hand, but also the movement and position of every finger as well as the general hand posture. (Brandl, Forlines, Wigdor, Haller & Shen, 2008)

2.2 Intuitiveness and enjoyability

Gestures feel very natural to perform since they mirror our experiences in the real world.

This may be why a study by Watson et al. showed that participants using touch input for a task enjoyed themselves more and felt more competent than participants using a mouse. They systematically favoured direct touch input over mouse input and also performed better in terms of speed and accuracy. (Watson, Hancock, Mandryk & Birk, 2013)

In addition, Cao, Ofek and Vronay found that gesture-controlled presentations were not only perceived as more enjoyable by the presenter but also as more attractive by the audience. The presenters were able to make eye contact more often and to use their body language to convey information. (2005)

3. Drawbacks and possible solutions

Gesture-based interfaces have many advantages and provide the user with a completely new form of interaction. However, this kind of input also raises issues that are not relevant with traditional input. On the user's side, gestures have to be learned, remembered and accurately executed. On the developer's side, the system has to recognize these gestures correctly. Freeman et al. remarked that observing gestures does not suffice to learn them, as the observer is unable to differentiate between relevant and irrelevant movements. (Freeman, Benko, Morris & Wigdor, 2009) Therefore, the developer not only has to ensure that gestures are quickly and correctly recognized, but also has to provide a guide that allows these gestures to be learned quickly and easily.

Teaching multi-touch and mid-air gestures is more difficult than teaching single-touch gestures. In the case of the latter, the hand posture is irrelevant - users only need to follow a path correctly to perform a command. But with an extension to multi-touch and mid-air gestures, the position and movement of several fingers or even the whole hand become relevant. Teaching systems usually instruct the user about the necessary hand movement and path for a gesture rather than the posture and form of contact, focusing on commands that can also be performed with a single-touch input device like a mouse or a pen. (Freeman et al., 2009)

3.1 Discoverability

A disadvantage of gestures, as already identified by Baudel and Beaudouin-Lafon in 1993, is the fact that they are neither self-revealing nor self-explanatory. A named button on a toolbar has an explicit purpose and is also easy to find; gestures, however, may be arbitrary and are usually more difficult to discover.

In order to solve this problem, Bau and Mackay (2008) proposed OctoPocus, a dynamic guide that combines feedforward and feedback mechanisms. After a press-and-wait gesture, a map of all possible gestures, visualized through coloured templates, is displayed around the current cursor position. As the user begins to follow a path, the other paths become progressively thinner, indicating that they are less likely to be recognized, until they disappear (see Figure 1).


Figure 1: The possible gestures Cut, Copy and Paste are displayed around the current cursor position, visualized as coloured paths with bolder prefixes. As the user begins to follow a path, the prefixes move accordingly and commands that differ too much from the current path become thinner (Cut) or even disappear (Paste). (Bau & Mackay, 2008)
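The thinning behaviour described for the guide can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each template and the drawn path are lists of points sampled at the same step size and starting at the cursor position, and it replaces the recognizer's likelihood with a simple mean point-to-point deviation. The names `MAX_WIDTH`, `HIDE_DISTANCE`, `mean_deviation` and `guide_widths` as well as the template coordinates are hypothetical.

```python
import math

MAX_WIDTH = 8.0       # assumed stroke width (pixels) of a perfectly matching template
HIDE_DISTANCE = 40.0  # assumed mean deviation (pixels) at which a template disappears


def mean_deviation(drawn, template):
    """Mean distance between the drawn points and the template prefix of equal length."""
    k = min(len(drawn), len(template))
    if k == 0:
        return 0.0  # nothing drawn yet: every template still matches
    return sum(math.dist(drawn[i], template[i]) for i in range(k)) / k


def guide_widths(drawn, templates):
    """Map each command to the width of its guide path; 0.0 means the path is hidden."""
    widths = {}
    for name, template in templates.items():
        deviation = mean_deviation(drawn, template)
        # The width shrinks linearly with the deviation and is clamped at zero.
        widths[name] = MAX_WIDTH * max(0.0, 1.0 - deviation / HIDE_DISTANCE)
    return widths


# Hypothetical templates, given relative to the cursor position at (0, 0).
TEMPLATES = {
    "Cut": [(0, 0), (20, 10), (40, 20), (60, 30)],
    "Copy": [(0, 0), (20, 0), (40, 0), (60, 0)],
    "Paste": [(0, 0), (-20, 0), (-40, 0), (-60, 0)],
}

# The user has started to follow the Copy path.
drawn_path = [(0, 0), (20, 0), (40, 0)]
print(guide_widths(drawn_path, TEMPLATES))
# {'Cut': 6.0, 'Copy': 8.0, 'Paste': 0.0}
```

In this sketch, Cut becomes thinner and Paste disappears while Copy keeps its full width, which corresponds to the situation shown in Figure 1. A real guide would take these widths from the score of the gesture recognizer in use rather than from a fixed distance threshold.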

A solution that is also suitable for multi-touch input is the ShadowGuides system by Freeman et al. A so-called user shadow visualizes the user's input, giving feedback on which parts of the hand are in contact with the surface. The user shadow annotations demonstrate the gestures available from the current hand pose, while the registration pose guide informs the user about alternative registration poses. (2009)

A somewhat different approach is GestureBar by Bragdon et al. While the aforementioned learning guides employ the "learning-by-doing" technique, GestureBar separates the learning area from the user's document and discloses information about a gesture only if needed. The system works like a traditional toolbar - the user can click an item to find details about the execution of the command and to test it in an experimental area. (Bragdon, Zeleznik, Williamson, Miller & LaViola, 2009)

3.2 Memorability

While conventional commands only have to be recognized, gestures need to be known and remembered before executing them (Bau & Mackay, 2008).

One way to create memorable gestures is to make them as intuitive as possible, as they are more likely to be remembered that way (Wachs, Kölsch, Stern & Edan, 2011). Wobbrock et al. researched these natural gestures and found that although there are common features used by nearly all of the participants, gestures are far from being "obvious" and that it is difficult to design a gesture set that feels natural for every user. People often used reversible gestures to achieve two opposing effects and used more fingers for moving larger objects, mirroring their experiences in the real world. They were also strongly influenced by their knowledge of traditional computers, using gestures that could also be performed with a mouse (even tapping their fingers as if clicking it) and locating the "Close" gesture at the top-right corner of objects as if they were using a Windows PC. (2009)

Another aspect to keep in mind is that the concept of intuitiveness strongly depends on culture and experience. Many mid-air gestures used in everyday life differ strongly from country to country - a nod, for example, is commonly interpreted as an indication of agreement, but there are some countries, such as Greece, where it stands for the exact opposite. Another example is the pinch-to-zoom gesture, which will come naturally to every regular smartphone user, but not to someone who has never seen a touchscreen.

An alternative to intuitive gestures is the so-called Marking Menu, which combines named commands in a pie menu with gestures. That way, it eases the transition from novice- to expert-mode usage. A further development of Marking Menus is Augmented Letters by Roy et al., where the user activates a pie menu by drawing the letter with which its elements start. After that, he can choose from the menu by extending the gesture (see Figure 2). (Roy, Malacria, Guiard, Lecolinet & Eagan, 2013)


Figure 2: The functionality of Augmented Letters: The user already knows that his desired command (Smile) starts with an S. Therefore, he draws the letter and invokes a menu from which he can choose the correct command by appending a tail stroke upwards. (Roy et al., 2013)
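The selection step shown in Figure 2 can be sketched as follows. This is only an illustration under several assumptions, not the implementation by Roy et al.: the letter is taken to be already recognized, the tail stroke is reduced to its start and end point, the y-axis points upwards, and the menu items are arranged counter-clockwise starting on the right. The menu contents (apart from Smile) and the names `MENUS`, `sector_index` and `choose_command` are hypothetical.

```python
import math

# Hypothetical pie menus keyed by the initial letter of their commands.
# Items are laid out counter-clockwise, starting on the right (east).
MENUS = {"S": ["Save", "Smile", "Search", "Send"]}


def sector_index(start, end, item_count):
    """Return the index of the pie sector that the stroke from start to end points into."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    width = 360.0 / item_count
    # Shift by half a sector so that every item is centred on its direction.
    return int(((angle + width / 2.0) % 360.0) // width)


def choose_command(letter, tail_start, tail_end):
    """Pick a command from the menu of the drawn letter using the tail stroke direction."""
    items = MENUS.get(letter)
    if items is None or tail_start == tail_end:
        return None  # unknown letter or no tail stroke: nothing is selected
    return items[sector_index(tail_start, tail_end, len(items))]


# The user draws an S and appends a tail stroke upwards.
print(choose_command("S", (0, 0), (0, 10)))
# Smile
```

Because only the direction of the tail stroke matters in this sketch, the same function covers both the novice, who reads the displayed menu, and the expert, who draws the stroke from memory.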

3.3 Fatigue

Gestures normally involve more muscles than other interaction techniques (Baudel & Beaudouin-Lafon, 1993). Mid-air gestures in particular, but also gestures that require muscle tension and complex movements over a long period of time, can be very exhausting. Therefore, developers should design gestures that are quick and comfortable to execute.

One approach is the use of so-called Microgestures - tiny gestures that can even be executed during other activities and therefore allow true multitasking. A possible area of usage is driving, where small tasks like changing the volume of the radio can thus be performed without the risk involved in releasing the steering wheel. This idea has, of course, already been partially implemented with the integration of additional control elements in the steering wheel. But microinteractions could also be incorporated in other fields of our everyday life, for example when writing with a pen or holding a cash card. (Wolf, 2011)

Another aspect that needs to be considered, addressed by Forlines et al., is that many bimanual interactions in the real world are asymmetric, with the non-dominant hand being slower and less precise than the dominant hand. Therefore, gestures need to be equally easy for left- and right-handed users and should not demand too much of a person's non-dominant hand. (Forlines, Wigdor, Shen & Balakrishnan, 2007)



Institution / College
LMU Munich – Institut für Informatik