
CollabKit – A Multi-User Multicast Collaboration System based on VNC

Diploma thesis, 2011, 118 pages

Computer Science - Applied Computer Science



1 Introduction
1.1 Problem Statement
1.2 Approach
1.3 Structure of this Work

2 Real-Time Collaboration Use Cases and Requirements
2.1 Use Cases
2.1.1 Presentations
2.1.2 Electronic Teaching
2.1.3 Professional Collaboration
2.2 Requirements Analysis
2.2.1 Non-Functional Requirements
2.2.2 Functional Requirements
2.2.3 Summary

3 State of the Art
3.1 Basic Principles regarding Real-Time Collaboration Systems
3.1.1 Classification
3.1.2 Common Technical Realisation
3.2 Survey of Existing Real-Time Collaboration Systems
3.2.1 Based on the X Window System
3.2.2 Based on VNC
3.2.3 Based on RDP
3.2.4 Others
3.3 Conclusion - Motivation for CollabKit

4 Design of a Multi-User Multicast Collaboration System
4.1 CollabKit Needed Functionality
4.2 Multi-User Support
4.2.1 Concurrent Multi-User Operation
4.2.2 Multi-User Graphical Annotations
4.2.3 Cross-Platform Client Application
4.2.4 Client-to-Server Window Sharing
4.3 Multicast Transmission of Image Data
4.3.1 Delivery of Multicast Group Address to Clients
4.3.2 Different VNC Pixel-Formats and Encodings
4.3.3 Accumulation of Update Requests
4.3.4 Datagrams Instead of Byte Streams
4.3.5 Multicast Flow Control
4.3.6 Introduction of New Message Types
4.3.7 Overall Resulting Design

5 CollabKit Implementation
5.1 Multi-User Functionality
5.1.1 VNC Server MPX Extension
5.1.2 Annotation Tool MPX Extension
5.1.3 Client Application
5.1.4 Client-to-Server Window Sharing
5.2 Multicast Extension of VNC
5.2.1 Declaration of Message Types
5.2.2 Implementation of Session Setup
5.2.3 Implementation of Message Handling
5.2.4 Implementation of the NACK mechanism
5.2.5 Implementation of Multicast Flow Control
5.2.6 Use with LibVNCServer

6 CollabKit Evaluation
6.1 Evaluation of Multi-User Functionality
6.1.1 Concurrent Multi-User View and Control
6.1.2 Multi-User Graphical Annotations
6.1.3 Cross-Platform Client
6.1.4 Client-to-Server Window Sharing
6.2 Evaluation of the MulticastVNC Extension
6.2.1 Throughput Properties
6.2.2 Latency Properties
6.2.3 Effectiveness of Multicast Flow Control

7 Summary and Future Prospects

List of Figures

List of Tables


1 Introduction

Collaboration means working together. Computer-supported real-time collaboration systems allow two or more participants to work together simultaneously. Being computer-supported, they can make electronic documents, multimedia content or interactive applications available to participants regardless of their location. The real-time properties of such systems enable users to concurrently ask and answer questions, brainstorm, and thus rapidly draw, reject, or accept conclusions. Done asynchronously, such activities would take much longer.

These characteristics make computer-supported real-time collaboration systems useful in both professional and educational contexts. On the one hand, they enable knowledge workers and scientists to exchange information and to jointly create, share and modify electronic artefacts. On the other hand, computer-supported real-time collaboration systems can be used for electronic teaching and learning: by employing such electronic classrooms for education, traditional teaching material can be made more interactive, abstract concepts can be visualised using simulations and multimedia content can be incorporated into the teaching process, allowing participants to interact with each other and the provided material.

1.1 Problem Statement

Lack of Fully Concurrent Multi-User Operation

The first area in which common computer-supported real-time collaboration systems are limited is support for fully concurrent multi-user interaction:

On the one hand, there is one class of collaboration systems that does support fully concurrent user interaction, but such systems are confined to one or a few built-in applications specifically designed for that system with multi-user support in mind. They do not allow users to interact with unmodified standard desktop applications.

On the other hand, the second class of computer-supported real-time collaboration systems does allow participants to use any kind of desktop application, but they only support user interaction in a turn-taking mode. Only one user at a time can be in control of the shared desktop; interaction is sequential, not concurrent.

Bad Scalability in Low-Throughput Shared-Medium Networks

The aspect of multi-user support inevitably leads to the second area in which existing systems have shortcomings: when sharing applications or whole desktops - especially on low-throughput computer networks characterised by shared medium access such as wireless local area networks - the system’s user-perceived performance degrades with an increasing number of connected users. This is because the same data is sent to each and every user individually: the more users are connected, the less throughput capacity is available to each one.

1.2 Approach

In order to address the first problem - lack of fully concurrent multi-user operation in existing systems - the first focal point of this work was to create a computer-supported real-time collaboration system with support for fully concurrent multi-user operation. The eventual goal was to develop, implement, and test an easy-to-use collaboration system that allows its users to simultaneously interact with any kind of application on a shared desktop. To achieve this, existing technologies were examined and suitable ones integrated to form a collaboration system with the desired features.

The second problem - bad scalability in low-throughput networks - could not be solved by simply integrating existing technologies. Instead, this required enhancing the way data representing shared applications is delivered to the system’s users. This meant designing and implementing an extension of an existing remote desktop technology that would make data transmissions use the shared medium more efficiently. The chosen approach to accomplish this was to fit the created system with support for multicast data transmission. This allows a high number of users to efficiently use the created collaboration system on a low-throughput shared-medium network.

illustration not visible in this excerpt

Figure 1: A computer-supported real-time collaboration system that supports concurrent multi-user interaction and transmits the shared desktop once to all clients using multicast.

1.3 Structure of this Work

First, specific use case scenarios of computer-supported real-time collaboration systems are analysed with regard to the functional and non-functional requirements they pose to the system in use. Then a survey of existing systems and basic technology is conducted, analysing conformance to the identified requirements. This includes common remote desktop technologies like the X Window System, Virtual Network Computing and the Remote Desktop Protocol as well as other commercial and academic solutions.

Using the findings of these investigations, the design of a computer-supported real-time collaboration system that supports fully concurrent multi-user interaction and multicast data transmission is presented.

Then, the implementation process of the devised system is documented and the system itself is evaluated with regard to its conformance to the initial requirements. A summary of findings regarding the work done and an outlook on possible future work that can build upon the developed system mark the conclusion of this work.

2 Real-Time Collaboration Use Cases and Requirements

There are a multitude of application possibilities for real-time collaboration systems. The benefits they provide can come in handy not only when used as electronic classrooms in teaching environments, but such systems can also be valuable in professional contexts in which a team of knowledge workers uses and edits electronic documents together. This section presents the most fundamental use cases ranging from simple single-user presentation scenarios to collaborative work settings with several participants.

2.1 Use Cases

2.1.1 Presentations

Presentations using a portable computer and a projector nowadays are quite common in areas like school, university or industry. The setting is always quite similar: The presenter stands in front of the audience and shows something to them on the projector’s screen, putting across some idea using simple slides or demonstrating a running application. In this scenario, the audience remains more or less passive.

A common problem for the presenter is how to flip back and forth through the slides quickly. When no remote control is available, presenters always have to get back to their computer in order to turn pages. A similar problem occurs when an application is to be demonstrated: presenters have to stop explaining something in front of the audience and have to return to their computer to operate the program. Doing so, interaction with the audience, like responding to remarks or hints, is hampered. Figure 2 illustrates this.

illustration not visible in this excerpt

Figure 2: Typical problem while presenting: explaining something to the audience and operating the presentation equipment have to happen concurrently [100].

There are two solutions to these problems: first, presenters could use a small and handy device like a sub-notebook or a smartphone. This would allow them to control the presentation and still be in close contact with the audience. However, operating a complex application with such a device can be cumbersome. The other solution is to leave presenters seated behind their large, easy-to-use notebook and to provide them with extended means of presentation that go beyond the usual mouse cursor, like the possibility of making graphical annotations on-screen in order to highlight some region of interest. This probably is the better approach for a task like application demonstration.

Finally, it is very beneficial for the audience if they can somehow interact with what is shown on the projector’s screen. Within a traditional presentation setup, only verbal interaction is easy. Drawing attention to some special region on the screen just by verbal means can be difficult, though. On the other hand, getting up and having to walk to the front to highlight something on the screen is equally cumbersome. It is a big advantage for the audience to be able to directly draw graphical annotations onto the projector’s display from where they are or to show applications running on their computers on the projector’s screen.

2.1.2 Electronic Teaching

The electronic teaching scenario, as seen here, is a situation in which a relatively small group of instructors and learners is intensively engaged in creating comprehension of some matter. Teaching materials can be simple text documents, websites or multimedia contents like audio and video files. In fact, the essential benefit of real-time collaboration systems used as electronic classrooms over more traditional ways of teaching is that animated electronic content can be shown and directly interacted with. Training the use of some complex program is a good example: instructor and learners can explore the usage of the application together.

illustration not visible in this excerpt

Figure 3: An electronic classroom scenario: students (green) can work together on the yellow computer that is optionally connected to a projector.

When students have (limited) access to the instructor’s desktop, they can ask questions regarding some specific parts of the user interface a lot more easily and exactly. On the other hand, if problems arise when the assignment was to carry out a specific task with the program by oneself, it can be very helpful to show one’s own desktop to the others.

In another slightly different scenario, illustrated by figure 3, students can work together on a single desktop to collaboratively solve assigned tasks or to learn how to use some program together, assisting and guiding each other. Such a multi-user approach to electronic classrooms employs the benefits of collaborative learning [88]: by allowing students to jointly interact with complex objects on the screen, new ways of learning and understanding [104, p. 132][118, p. 7] can be developed.

A third, to some extent educational scenario is that of an assistance system. In this use case, a computer user interface is normally operated by one user alone. Only if the user is in need of help, an assistant can join in so that the two operate the user interface together. This provides the assistant with a far more effective way of guiding the user than, for instance, support by phone can offer.

2.1.3 Professional Collaboration

Professional collaboration scenarios are situations in which a group of people work together trying to solve a problem. This can include joint brainstorming, collecting relevant bits of information, compiling these bits into documents and creating a solution using these documents - all done in teamwork, collaboratively.

illustration not visible in this excerpt

Figure 4: A scientific professional collaboration scenario with three participants, each operating a different application.

All participants may be in the same room, but it is equally possible that they communicate from different parts of the world. The professional collaboration use case also differs from the electronic classroom use case in that there is no instructor-student hierarchy: in a professional collaboration scenario, participants are not being taught but are equals working together. There is much more emphasis on collaborative work.

There are lots of possible fields of application for real-time collaboration systems. For instance, scientists can use them for collaborative brainstorming or developing ideas, very much like a shared whiteboard, but an electronic one with multimedia objects on it. Such a system can also be of use in industry: in business contexts, it is quite common that considerably large documents have to be created, often containing multimedia content and requiring teamwork to be compiled. Especially for geographically distributed teams, a real-time collaboration system used for computer-supported collaborative work can be beneficial [105], even more so if it allows fully concurrent user interaction [113].

2.2 Requirements Analysis

Taking the aforementioned use case scenarios as a basis, it is possible to deduce requirements the used software has to meet. First, high-level non-functional requirements specifying a system’s general characteristics and overall qualities [102, p. 187] are dealt with. Then, lower-level functional requirements which define particular abilities and functions of a system [102, p. 188] are investigated.

Naturally, requirements regarding software functionality differ the most between the considered use cases, whereas non-functional requirements specifying overall system qualities were found to be more uniform.

2.2.1 Non-Functional Requirements

Although the considered use cases differ in some particular aspects mostly regarding specific functionality, they all pose similar requirements when it comes to overall characteristics of the software used to operate a real-time collaboration system. Along with particular functionality offered, it is the fulfilment of these non-functional requirements that accounts for a good user experience.


First of all, all three use cases pose certain requirements regarding performance: especially for real-time collaboration, a slow system is a system nobody will like to use. When working together in real-time, the time span between a user action and the response triggered by that action should not be too long: excessive delays increase task completion time and user error rate [90]. Thus, the system should try to keep latency low. Additionally, in all three use cases rather bulky image data representing a whole desktop or single windows is transferred between participants. For a good user experience, this should happen as fast as possible, essentially meaning that the system in use should achieve high effective throughput.


Furthermore, in all of the considered use cases the number of participants is not predetermined: there may be just two users or well over twenty. Ideally, the used system should deliver the same high level of performance with any number of connected clients. In short, it should do well in terms of scalability.


Then, it can be expected for every use case that the level of technical computer knowledge among users differs. Besides, while experts would be able to use a system that is overly complicated to operate, doing so would still be unnecessarily time-consuming. Therefore, ease of use is another important non-functional requirement.


Additionally, users of such a collaboration system do not only have different levels of computer knowledge, it is also very likely that they use different operating systems, at least in the presentation and professional collaboration use cases. The used client application, if any, should therefore be platform-independent or at least portable.


Finally, security of the employed system is of importance. On the one hand, this includes the basic question of who is allowed to use the system and who is not. On the other hand, security includes guaranteeing confidentiality, availability and integrity of the communication that is taking place.

2.2.2 Functional Requirements

The following subsection deals with the functional requirements the software used to facilitate a real-time collaboration system has to meet. First, basic functional requirements pertaining to specific features and actual functionality of the used software are identified per use case and summed up. Then, those functional requirements are deduced whose fulfilment helps to satisfy the high-level non-functional requirements identified in subsection 2.2.1.

Basic Functional Requirements

Regarding Presentations. As mentioned before, it can be quite beneficial in presentation scenarios if the audience is able to interact somehow with what is shown on the projector’s screen, improving interaction between presenter and viewers. In terms of functionality this means the audience should be able to remote control the presenter’s desktop, i.e. to do basic things like moving windows or flipping back and forth in slides. But other more advanced features are conceivable as well: the ability to draw graphical annotations onto the presenter’s desktop would be very useful to highlight certain regions of the screen for the others. Also, functionality to show one’s own window on the presenter’s desktop can be a helpful feature for an attendee of the presentation, for example to show a certain document or application to other participants.

Thus, functionality requirements for the presentation use case were specified as:

- functionality to remote control the presenter’s desktop
- an annotation mode
- functionality for presentation viewers to export their own windows to the presenter’s desktop

Regarding Electronic Teaching. For the electronic classroom use case, functionality requirements were found to be similar, with the exception that there is a stronger emphasis on window or desktop sharing in both directions, from instructor to students and from students to instructor. On the one hand, it should be possible for students to share their desktops (or parts thereof) with the instructor or with others in order to get specific help. On the other hand, students should be able to view the instructor’s desktop, especially when they are remotely located. In addition, they should be able to exercise limited control over the instructor’s desktop, for example in order to ask about the usage of some control element or to demonstrate something to other students. For this purpose, a graphical annotation mode on the instructor’s desktop would be useful in this scenario as well.

Therefore the requirements regarding functionality for an electronic classroom use case are:

- functionality to view the instructor’s desktop and to remote control it to a limited extent
- an annotation mode on the instructor’s desktop
- functionality for students to export their own windows to the instructor’s desktop

Regarding Professional Collaboration. The professional collaboration use case has functionality requirements that are almost the same as those of the presentation and electronic teaching scenarios. The one additional requirement the professional collaboration use case has is that the system should support several participants concurrently working together at one desktop. Since the fundamental means of controlling a desktop are mouse pointer and keyboard and because fully concurrent collaboration provides advantages over turn-taking [113], the system thus ideally should provide every participant with their own mouse cursor and keyboard focus. Such a computer-supported real-time collaboration system allows users to interact with objects on the desktop jointly and simultaneously.

Functionality requirements for professional collaboration use can thus be summed up as:

- functionality to view and remote control the shared desktop
- a graphical annotation mode on the shared desktop
- functionality for participants to export their own windows to the shared desktop
- support of multiple mouse cursors and keyboard foci on the shared desktop

Functional Requirements Related to Non-Functional Ones

Related to Performance and Scalability. Since one of the main uses of the system is to transmit rather bulky image data, the underlying network’s maximum throughput is likely to pose a fundamental constraint. For some of the considered use cases, it is reasonable to expect wireless LANs with 54 MBit/s (802.11a/g) or just 11 MBit/s (802.11b) gross data rate. With this in mind, it is obvious that, for instance, delivering 25 fps of uncompressed RGB data to multiple participants will quickly exhaust the network’s capacity.
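The capacity argument can be made concrete with a short calculation. The following sketch is only an illustration: the desktop resolution of 1024x768 pixels and the 3 bytes per pixel are assumptions chosen here, whereas the frame rate of 25 fps and the gross WLAN data rates are the figures named in the text.

```python
# Gross data rates named in the text, in bit/s.
WLAN_80211G = 54_000_000
WLAN_80211B = 11_000_000


def raw_bitrate(width, height, bytes_per_pixel, fps):
    """Bit rate of an uncompressed image stream in bit/s."""
    return width * height * bytes_per_pixel * 8 * fps


def unicast_load(bitrate, clients):
    """Total load on a shared medium when every client gets its own copy."""
    return bitrate * clients


if __name__ == "__main__":
    # Assumed desktop: 1024x768 pixels, 24-bit RGB, 25 frames per second.
    stream = raw_bitrate(1024, 768, 3, 25)
    print(f"one stream: {stream / 1e6:.1f} MBit/s")
    for clients in (1, 2, 5, 20):
        load = unicast_load(stream, clients)
        print(f"{clients:2d} clients via unicast: {load / 1e6:.1f} MBit/s "
              f"({load / WLAN_80211G:.1f}x the 802.11g gross rate)")
```

Under these assumptions a single uncompressed stream already needs about 472 MBit/s, which is several times the gross rate of an 802.11a/g network, and the unicast load grows linearly with every additional client.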

illustration not visible in this excerpt

Figure 5: Multicast data transmission provides significant channel capacity savings compared to unicast.

Therefore, the maximum achievable data throughput of the underlying network was identified as the primary bottleneck regarding the system’s scalability. Taking into account the considered use cases, it is reasonable to assume that in most cases multiple users will be connected to the desktop they are jointly working on. Since the image data representing this remote desktop is the same for all connected participants, an obvious approach to alleviate the constraints posed by limited network capacity is to use multicast data transmission instead of unicast. This way data is sent just once to all connected users instead of being delivered to each and every one individually. It was concluded that by using multicast data transmission instead of unicast, the system’s performance would no longer degrade with an increasing number of participants, as illustrated by figure 5. Thus, the use of multicast transmission of image data was made a fundamental requirement regarding the system’s performance and scalability.
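How sending data "just once" looks at the socket level can be sketched with plain UDP/IP multicast. This is a generic illustration using Python's standard socket module, not the CollabKit implementation; the group address, the port and all function names are hypothetical values picked for this example.

```python
import socket
import struct

# Hypothetical values for illustration only; not taken from CollabKit.
GROUP = "239.0.0.1"  # address from the administratively scoped multicast range
PORT = 50000


def make_sender(ttl=1):
    """Create a UDP socket for sending datagrams to a multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # A TTL of 1 keeps the datagrams on the local network segment.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def membership_request(group, interface="0.0.0.0"):
    """Pack the ip_mreq structure that is used to join a multicast group."""
    return struct.pack("4s4s",
                       socket.inet_aton(group),
                       socket.inet_aton(interface))


def make_receiver(group=GROUP, port=PORT):
    """Create a UDP socket that receives datagrams sent to the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def send_update(sock, payload, group=GROUP, port=PORT):
    """A single sendto() call reaches every receiver that joined the group."""
    return sock.sendto(payload, (group, port))
```

The point of the sketch is that the sender's effort does not depend on the number of receivers: each client joins the group on its own, and the server issues one datagram per update regardless of how many clients are listening.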

Regarding the maximum achievable throughput that can be measured at the client side, the following metrics show the advantage of multicast over unicast: For the unicast case, the maximum throughput observable by a client cl can be defined as

illustration not visible in this excerpt

In this metric T_cl, the expression T_p describes a concave metric that defines the maximum throughput limited by the characteristics of the network path from server to client: Let T(n_i, n_j) be a metric describing the achievable throughput between two network nodes n_i and n_j and let p(n_1, n_2, ..., n_m) be the path between server node n_1 and client node n_m. Then T_p can be expressed as

illustration not visible in this excerpt

Similarly, the expression T_sp describes the achievable throughput on the path sp which is the subset of p that the client cl shares with N_sp − 1 other clients. It can clearly be seen that T_cl decreases with an increasing N_sp.

However, when using multicast data transmission, the maximum client-observable throughput becomes independent of the number of clients that share the same path. The metric then evaluates to a rather simple

illustration not visible in this excerpt

showing that the maximum throughput observable by cl is now independent of the number of other clients it shares the network path to the server with.
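Since the formulas themselves are not visible in this excerpt, the following sketch uses one form that is consistent with the surrounding definitions; it is an assumption, not the original equations. The path metric is taken as the minimum over all links (the usual meaning of a concave metric), the unicast case as min(T_p, T_sp / N_sp) and the multicast case as T_p. All function names and the example link rates are hypothetical.

```python
def path_throughput(link_throughputs):
    """Concave path metric: the slowest link limits the whole path."""
    return min(link_throughputs)


def unicast_client_throughput(path_links, shared_links, n_sharing):
    """Assumed unicast form: the shared sub-path is divided among
    the n_sharing clients (the client itself included)."""
    t_p = path_throughput(path_links)
    t_sp = path_throughput(shared_links)
    return min(t_p, t_sp / n_sharing)


def multicast_client_throughput(path_links):
    """Assumed multicast form: independent of the number of clients."""
    return path_throughput(path_links)


if __name__ == "__main__":
    # Example path: 100 MBit/s wired link followed by a 54 MBit/s WLAN
    # that is shared by all clients.
    path = [100.0, 54.0]
    shared = [54.0]
    for n in (1, 2, 6, 20):
        print(n, unicast_client_throughput(path, shared, n),
              multicast_client_throughput(path))
```

With these assumed forms, six clients sharing the 54 MBit/s link are limited to 9 MBit/s each under unicast, while the multicast value stays at 54 MBit/s for any number of clients.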

Multicast data transmission can also be beneficial for the latency of communication taking place, because it can cut down on server answer time. First, for the unicast case, the delay or latency observed by a client cl can be defined as

illustration not visible in this excerpt

Within L_cl, the expression L_p describes an additive metric defining the latency of the client’s connection to the server: Let L(n_i, n_j) be a metric that describes the latency

illustration not visible in this excerpt

The term t_srv in L_cl describes the time the server needs to serve a single client. This includes preparing data, making calculations and transmitting data. N_sp is defined as in the throughput metric above. It can be seen that given a non-zero t_srv, L_cl increases with an increasing N_sp. The higher t_srv, the stronger the effect.

The benefit of multicast data transmission is that it eliminates the possible delay a client might encounter while waiting for others to be served: Because data is now sent only once instead of N_sp times, the latency observed by cl when using multicast data transmission is described by

illustration not visible in this excerpt
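As the latency formulas are not visible in this excerpt either, the sketch below again relies on assumed forms that match the description: the path latency is the sum of the link latencies (an additive metric), the unicast case is taken as the worst case L_p + N_sp * t_srv in which the client is served last, and the multicast case as L_p + t_srv. Function names and example values are hypothetical.

```python
def path_latency(link_latencies_ms):
    """Additive path metric: link latencies add up along the path."""
    return sum(link_latencies_ms)


def unicast_client_latency(link_latencies_ms, t_srv_ms, n_sharing):
    """Assumed worst case for unicast: the client is served last,
    after the server has handled all other clients."""
    return path_latency(link_latencies_ms) + n_sharing * t_srv_ms


def multicast_client_latency(link_latencies_ms, t_srv_ms):
    """Assumed multicast form: the data is prepared and sent only once."""
    return path_latency(link_latencies_ms) + t_srv_ms


if __name__ == "__main__":
    links = [2, 5]  # two hops with 2 ms and 5 ms latency
    t_srv = 10      # server needs 10 ms per served client
    for n in (1, 4, 8):
        print(n, unicast_client_latency(links, t_srv, n),
              multicast_client_latency(links, t_srv))
```

For a single client both forms give the same value; with eight clients the assumed unicast worst case rises to 87 ms while the multicast value remains at 17 ms.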

Related to Security. The first aspect of security, access control, is equally important in all considered use cases: be it in presentations, training courses or professional collaboration meetings, the fundamental fact is that there is a desktop being shared. Naturally not everybody, not even in the local subnet, should be granted access without basic authentication. Thus, the system should provide at least a simple access control feature to lock out unwanted users. Additionally, some form of tiered access control like allowing full or view-only access may be useful for the considered use cases as well.

The other aspect of security is confidentiality, availability and integrity of the communication that is taking place. This may not be much of an issue if data is sent and received only in a private, secured local area network, but it definitely becomes an issue when geographically remote users take part. Especially when data is transmitted over untrusted networks like the Internet, it has to be encrypted somehow to prevent eavesdropping or spoofing.

Related to Usability. A close look at the considered use cases made clear that different people with different levels of technical knowledge would use the system. Therefore the system should feature an intuitive and easy-to-use graphical user interface at the client side that provides users with all fundamental operations like connecting, interacting and disconnecting. The GUI should preferably adhere to the user interface conventions of the OS it is running on.

Furthermore, it was concluded that one of the most significant issues hampering usability is unnecessary and recurring configuration. For instance, requiring the user to know IP addresses and ports in order to connect to some service is considered harmful. The user application should rather support mechanisms like automatic service discovery, simply providing the user with a list to choose from. Preferably, users should be able to connect with a single click instead of having to care about technical details like IP addresses or ports.

Related to Portability. As stated in subsection 2.2.1, for the electronic classroom use case (for instance schools), platform independence may not be overly important since it can be expected that a homogeneous supply of computers is available, but especially for the presentation and professional collaboration use cases it is very likely that participants run different operating systems on their client computers. Thus the used software on the client side should be cross-platform or at least available for Unix, Mac OS X and Windows.

2.2.3 Summary

In conclusion, the requirements posed by the considered real-time collaboration use cases can be summarised as in figure 6: this diagram shows the identified functional and non-functional requirements and how they relate, i.e. which functional requirements help to satisfy which non-functional ones as described above.

illustration not visible in this excerpt

Figure 6: Summary of functional requirements and their relation to non-functional ones.

3 State of the Art

3.1 Basic Principles regarding Real-Time Collaboration Systems

3.1.1 Classification

Collaboration, when denoting working together with the help of computers, is commonly referred to as computer supported cooperative work (CSCW) in literature. CSCW can be defined as "computer assisted coordinated activity such as communication and problem solving carried out by a group of collaborating individuals" [78, p. 1], but the term is also used to name the multi-disciplinary field of research that deals with understanding of social processes and the design, implementation and evaluation of technical systems supporting social interaction [89].

illustration not visible in this excerpt

Figure 7: Classification of groupware by space and time.

Such multi-user software systems facilitating CSCW are commonly referred to as groupware [78], but this term is sometimes also used to describe the software together with the social group processes [99]. Since the aspect of social processes is contained within the term CSCW, this work uses the former definition and refers to groupware as the software system supporting CSCW. There are various possible classifications of groupware. The most common one is the space-time taxonomy [98] that classifies collaboration systems using these two aspects as in figure 7. Other possible classifications use relations between persons and artefacts [84], the interdependencies of communication, coordination, cooperation [120] or different application classes [86] to categorise collaboration systems. This work uses the space-time taxonomy and focuses on real-time collaboration systems that facilitate desktop conferencing and application sharing (quadrant III in figure 7).

Social Entities

An important insight into the nature of collaboration systems is that they are technical systems that are tightly interwoven with social systems. Therefore they can also be referred to as socio-technical systems. The social component is characterised by different forms of interaction between different social entities.

A possible classification of social entities distinguishes between dyads, groups, teams, social networks, communities and organisations [89, p. 16 ff]: Dyads are social entities that consist of exactly two persons, groups comprise more people. What characterises dyads and groups is their differentiation against other people, which can take the form of inward (e.g. through self-identification) or outward (e.g. through official club member- ship) differentiation. Teams are defined as groups that pursue a target. Social networks, on the other hand, are characterised by the social relations of the entities they consist of. Communities, in contrast to social networks, are defined as groups of people sharing a common culture. Like teams, communities also pursue a target, but they have a big- ger number of members that do not necessarily know each other. Finally, organisations are seen as social entities that pursue a target with the help of social structuring and coordination.

These categories of social entities are not disjoint. However, the real-time collaboration use cases considered in section 2 do not comprise all of them but only dyads, groups and teams. Real-time interaction of a larger group of people would be too noisy.

Forms of Social Interaction

Concerning forms of interaction between the mentioned social entities, communication, consensus building, coordination, awareness and cooperation can be listed as the most fundamental ways of interacting [89, p. 8]. Real-time collaboration systems perform particularly well with respect to communication, awareness and cooperation. Consensus building and coordination are less of a focal point.

Precisely because they allow users to interact concurrently, real-time collaboration systems do not necessarily have to provide special means to facilitate consensus building and coordination: users are able to do this by communicating in real-time. They can concurrently ask and answer questions, very much like in a real-world face-to-face setting. Synchronous communication furthermore allows for some imprecision when starting to convey ideas, since the correct meaning can be worked out by asking and answering questions. This way, ideas can be communicated more rapidly than with asynchronous communication, where thoughts have to be formulated as precisely as possible from the start because further clarification would take a lot of time.

The synchronous nature of real-time collaboration systems is also the reason they facilitate mutual awareness of users: When users communicate in real-time, they are necessarily aware of each other.

Finally, this mutual awareness and the ability to communicate are prerequisites for successful cooperation. Here support for joint handling of electronic artefacts is of great importance since cooperation almost always involves working with shared electronic documents or data.

3.1.2 Common Technical Realisation

Figure 7 lists shared view desktop conferencing, application sharing and video conferencing systems as examples of different-place real-time collaboration systems. Such systems are often realised by integrating different technologies [89, p. 131].

When looking at such systems from a multi-user-support perspective, they can be categorized into two classes, as already pointed out in section 1.1: On the one hand, there are special multi-user tools that support concurrent multi-user interaction but are confined to a single application. On the other hand, there are more general systems that allow sharing any kind of desktop application but are limited in terms of concurrent multi-user support.

The first group in this classification comprises real-time collaboration systems that provide a single application specifically designed for concurrent multi-user support. This includes collaborative text editors such as SubEthaEdit [45], Gobby [13], SynchroEdit [47] or EtherPad [10], as well as shared whiteboards and drawing applications such as Scriblink [39], Twiddla [55], iScribble [19] or PaintChat [35]. Most of the text editors are implemented as native applications, with the exception of EtherPad, which is a web application implemented in JavaScript. The mentioned shared whiteboards and drawing applications are all web-based and implemented using Java or Adobe Flash.

The other group of real-time collaboration systems according to the multi-user support classification are shared view desktop conferencing systems that allow users to operate any kind of standard desktop application. Some of them support sharing single applications instead of whole desktops as well. Commonly, such systems are based on a remote desktop technology such as the X Window System, VNC or RDP. Some are also realised using other, proprietary protocols or are built on web application technologies like Java or Flash.

Since one of the focal points of this work was to create a real-time collaboration system that allows its users to concurrently interact with any kind of application on a standard desktop, this second group, comprised of shared view desktop conferencing systems, is examined more closely in section 3.2.

3.2 Survey of Existing Real-Time Collaboration Systems

After identifying the requirements posed by the different considered use cases, the next step was to do a survey of related work to see what had already been done and what basic technology and tools were available to build upon.

A first obvious candidate was the X Window System version 11, which by design supports transmission of graphical user interfaces over a network. A promising alternative was found to be the desktop sharing system Virtual Network Computing, whose most notable feature is its simplicity and thus wide distribution among operating systems and devices. Software based on the Remote Desktop Protocol developed by Microsoft was also considered.

Thus, the results of the investigation are split up into four subsections. The first three contain the results for the three major remote desktop technologies X11, VNC and RDP. The last subsection covers software whose underlying protocol is different from these three or is unknown.

In each subsection, the individual works were examined with regard to their conformance to the most fundamental functional requirements identified in section 2.2. Additionally, all examined software products were checked for source code availability, which is a fundamental requirement if the system in question were to be chosen as a basis for extension.

Results are summed up in a table and a short textual roundup at the end of each subsection.

3.2.1 Based on the X Window System

The X Window System, also just called X or X11, is a windowing system and network protocol. It is the most commonly used software to display a GUI on Unix-like operating systems. Since the X Window System was designed from the ground up with network transparency in mind, all X-based applications can be displayed and controlled remotely or locally. The main catch here is that bare X11 only supports one receiver for each application instance: This way, applications cannot be shared. There are, however, some X11-based tools that facilitate sharing of applications between several participants.


x2x

One basic way to control an existing X session is the tool x2x [68]. With x2x, the keyboard and mouse of a local X display are able to control another, remote X display. Depending on its configuration, x2x creates an invisible, one pixel wide window at one edge of the local screen. If the mouse pointer moves over this window, x2x sends mouse and keyboard commands to the remotely controlled computer. If the cursor moves back, the desktop of the local machine is controlled as before. x2x thus is suitable for use as a basic remote control, but it only supports the X Window System, does not support transmission of display contents and has no collaboration support whatsoever.
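The edge-triggered switching described above can be modelled as a small state machine. The following is a minimal sketch under simplifying assumptions, not x2x's actual code: the class name `EdgeSwitcher`, the choice of the right screen edge as trigger and the event tuples are all hypothetical, and coordinates are not translated between the two displays.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeSwitcher:
    """Toy model of edge-triggered input redirection (hypothetical, not x2x code)."""
    screen_width: int
    remote_active: bool = False
    forwarded: list = field(default_factory=list)

    def on_pointer_motion(self, x: int, y: int) -> None:
        if not self.remote_active and x >= self.screen_width - 1:
            # Pointer entered the one pixel wide trigger strip on the right edge.
            self.remote_active = True
        elif self.remote_active and x <= 0:
            # Pointer moved back across the edge: the local desktop is in control again.
            self.remote_active = False
        if self.remote_active:
            self.forwarded.append(("motion", x, y))

    def on_key(self, keysym: int) -> None:
        # Keyboard input follows the pointer: it is forwarded only in remote mode.
        if self.remote_active:
            self.forwarded.append(("key", keysym))
```

In this model, input events that arrive while `remote_active` is false stay on the local machine and are never appended to `forwarded`.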


xtv

With xtv [72], it is possible to locally view a remote X display in a window. xtv does not allow the user to control the remote display and is tied to the X Window System. Because it transfers the entire screen content uncompressed via unicast, it is quite slow and barely usable in a wireless environment.

NoMachine NX

With NoMachine NX [33], a remote X display can be controlled through a window appearing on the local machine. NX client implementations exist for most operating systems, so users are not tied to operating systems that have an X Window System implementation.

NX acts as a proxy between the remote controlled X display and the client: the NX proxy compresses the data flow and creates a cache of already transmitted data (e. g. icons). This eliminates many unnecessary round-trips between X client and X server. The NX proxy architecture also makes an X connection somewhat stateless: If the network connection terminates unexpectedly while running a traditional X session, all applications terminate as well because state is stored at the server and the client. If it is only the client’s connection to the NX proxy that terminates, the applications keep on running, the client can eventually reconnect and continue working where it stopped.
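The effect of caching already transmitted data can be illustrated with a content-addressed cache on both ends of the link. This is a minimal sketch of the general idea only; it does not reproduce the NX protocol, and the class names, the message tuples and the use of SHA-256 as cache key are assumptions made for illustration.

```python
import hashlib


class CachingSender:
    """Sends full data once, then only a short reference (hypothetical scheme)."""

    def __init__(self):
        self._seen = set()

    def encode(self, payload: bytes):
        digest = hashlib.sha256(payload).digest()
        if digest in self._seen:
            # Already transmitted earlier: a reference is enough.
            return ("ref", digest)
        self._seen.add(digest)
        return ("data", digest, payload)


class CachingReceiver:
    """Stores received payloads so later references can be resolved locally."""

    def __init__(self):
        self._store = {}

    def decode(self, message):
        if message[0] == "data":
            _, digest, payload = message
            self._store[digest] = payload
            return payload
        # A reference: look the payload up in the local cache.
        return self._store[message[1]]
```

Sending the same icon twice thus costs one full transfer and one 32-byte digest instead of two full transfers.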

While the underlying libraries are open source, the client and server applications are not. There is a free implementation of the server called FreeNX [11], but it does not seem to be well maintained and could not be made to run. The proprietary server only supports three registered clients; on the other hand, its performance is relatively good: it is possible to play videos at a viewable frame rate over an 11 Mbps connection. However, only the proprietary client application exists; there are no free client implementations. Furthermore, NX makes no use of multicast: all display contents are transferred via unicast.

NX does not support collaborative features like annotations, application sharing or multi-user operation of the remote display.


XMX

XMX [70] is an »X Protocol Multiplexer«: it claims to allow an X session to be distributed among several other X displays, which can then view and control this session. Like NX, XMX acts as a proxy between communication partners, and the remote X session is displayed in a local window. What sets XMX apart from NX is that it incorporates some multi-user features like a basic floor control. It does not, however, support other collaboration features like annotations or window sharing.

Unfortunately the last release of XMX was made in 1999, so this software is rather outdated. It was impossible to obtain or produce a version that would run on today’s X Window System implementations.


MPX

While not exactly a remote display technology in itself, MPX [34] is a very useful extension of the X Window System in terms of multi-user collaboration support. MPX stands for Multi Pointer X and allows fully concurrent multi-user operation with several pointers and keyboards at the window system level. According to [96], MPX is the first incarnation of a real Groupware Windowing System: instead of modifying existing applications, multi-user support is built into the underlying windowing system. This way, multiple users can simultaneously interact with different applications on the same display. To allow control of a single application by several input devices at the same time, the application has to be modified to be made multi-device aware.


Figure 8: Several MPX pointers operating a shared scribble sheet.

There is an experimental multi-pointer window manager available [95] that demonstrates MPX features like simultaneous moving or resizing of windows with two or more mouse pointers.


Of the five examined X11 solutions, only NoMachine NX and XMX provide full remote desktop access including view and control. Multi-user operation is only provided by XMX, but it just allows turn-taking. Fully concurrent multi-user operation is only provided by MPX, but this system provides no remote desktop functionality at all and thus has to rely on other tools for it. Annotations, client-to-server window sharing, multicast and server discovery are provided by none of the tools examined. All of them have source code available (for NX, only the underlying libraries), which makes extensions possible.

Table 1: Comparison of X Window System Software.

(Table body not visible in this excerpt; its entries carry footnotes 1, 2 and 3.)

3.2.2 Based on VNC

VNC, or Virtual Network Computing, is a remote desktop technology that tries to adhere to the thin client paradigm: it keeps the client as simple as possible and concentrates most of the complexity at the server side. Therefore, the underlying protocol called RFB (for Remote Framebuffer Protocol [115]) is intentionally kept as simple as possible to ease client side implementation. VNC simply transmits (optionally compressed) image data to the client, which in turn sends back mouse and keyboard commands to the server. The image data sent by the server can be encoded in different ways, which are negotiated by server and client at session start-up. In contrast to the X Window System, a VNC connection is a stateless one: all state is stored at the server, so that when a connection dies unexpectedly, the application keeps on running on the server. The client can simply reconnect and continue working where it stopped.
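To make the simplicity of the client side concrete, the following sketch builds the four client-to-server messages a basic viewer needs: the encoding preference list, an update request, and pointer and keyboard events. The byte layouts and message type numbers follow the published RFB protocol specification; the function names and the selection of encoding constants are chosen here for illustration only.

```python
import struct

# Encoding identifiers as assigned in the RFB specification.
RAW, COPYRECT, HEXTILE, TIGHT, ZRLE = 0, 1, 5, 7, 16


def set_encodings(encodings):
    """SetEncodings (type 2): encodings the client accepts, in order of preference."""
    header = struct.pack(">BxH", 2, len(encodings))
    return header + struct.pack(">%di" % len(encodings), *encodings)


def framebuffer_update_request(incremental, x, y, width, height):
    """FramebufferUpdateRequest (type 3) for the given screen rectangle."""
    return struct.pack(">BBHHHH", 3, 1 if incremental else 0, x, y, width, height)


def pointer_event(button_mask, x, y):
    """PointerEvent (type 5): pointer position plus pressed-button bit mask."""
    return struct.pack(">BBHH", 5, button_mask, x, y)


def key_event(down, keysym):
    """KeyEvent (type 4): key press or release, identified by its X11 keysym."""
    return struct.pack(">BBxxI", 4, 1 if down else 0, keysym)
```

A viewer would write these byte strings to its TCP connection after the handshake, for example `set_encodings([TIGHT, HEXTILE, RAW])` followed by `framebuffer_update_request(False, 0, 0, 640, 480)`.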

Because of the simplicity of the RFB protocol, client implementations are widespread and exist for almost all major and minor operating systems [42].


RealVNC

RealVNC [37] is the direct descendant of the original VNC software suite, developed by the same team that created the first implementation of VNC. RealVNC is available in three different flavours: a free, open source edition with the basic VNC features (available for Unix); a personal, commercial edition featuring text chat, printing support and encryption (available for Windows); and an enterprise edition with enhanced authentication and encryption (available for Unix, Windows and Mac OS X). Probably because it is an offspring of the original VNC software, RealVNC supports only the original VNC encodings, although more performant ones have been developed by other projects [56, 53]. Besides text chat support, RealVNC offers neither multi-user support nor multicast.


UltraVNC

UltraVNC [56] is an open source Windows implementation of a VNC server and client. It features its own »ultra« compression scheme and a so-called »mirror video driver« that allows the server application to get notified about screen updates without constantly polling the framebuffer, resulting in lower CPU load. The bundled UltraVNC client application is able to view and control any VNC server and additionally provides basic file transfer functionality when connected to an UltraVNC server. Besides these features, server and client do not provide any multi-user collaboration features or multicast data transfer capabilities.


TightVNC

Unlike UltraVNC, TightVNC [53] is available for both Microsoft Windows and Unix-like operating systems. TightVNC was the first implementation to support the »tight« encoding for image data, which can apply JPEG compression, hence the name. When JPEG is used, the encoding is lossy and achieves very good compression ratios compared to lossless encodings. The Windows version of the server is able to share the whole desktop or just a single window. The TightVNC client application is able to connect to any VNC server, but it can use this encoding only when connected to tight-enabled servers. TightVNC supports neither multi-user features nor multicast.


xf4vnc

Xf4vnc [69] is a VNC server that is, like the Unix variants of RealVNC and TightVNC, also an X server. This mode of operation means that a new X11 session that is exported via VNC is spawned for every client. Like RealVNC and TightVNC, xf4vnc is not able to share an existing session this way. However, it does allow sharing of OpenGL applications, although not hardware accelerated (RealVNC and TightVNC provide no support for OpenGL applications at all). Xf4vnc also supports redirecting the OpenGL command stream to clients to be rendered there, as opposed to rendering the image on the server. Clients have to support this system, called Chromium [7], though.

Additionally, xf4vnc also provides an X server plug-in module that is able to share an existing X11 session. However, this requires changing the X server configuration in superuser mode. Furthermore, the module does not work with recent X server implementations.

Neither implementation of xf4vnc provides multicast data delivery or multi-user collaboration features.


x11vnc

X11vnc [116] is an open-source VNC server for Unix-like operating systems. Unlike traditional VNC servers for Unix systems, which create a new session for every client, it allows sharing of an existing user session by connecting to an already running X server. This way, x11vnc acts as a client of the running X server whose display it is exporting and as a server to VNC clients which receive that display. Because it basically works like a sophisticated screen scraper, x11vnc is also able to export accelerated OpenGL applications, which traditional Unix VNC servers cannot do because they are both X11 and VNC server in one application and lack the proper X11 OpenGL extensions (see footnote 4). X11vnc (through the LibVNCServer library [21]) furthermore supports all major VNC encodings including »tight« and »ultra«, can do server-side screen scaling, is able to share the whole desktop or just single windows and has support for automatic server discovery via Zeroconf [82]. It does not have any multi-user features or support for multicast data transfer, though.
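The screen-scraping approach amounts to repeatedly capturing the screen and comparing it with the previous capture to find the regions that need to be sent. The following is a minimal sketch of such a comparison, assuming frames are flat, row-major sequences of pixel values; the function name `changed_tiles` and the 16 pixel tile size are hypothetical and do not reflect x11vnc's actual polling strategy.

```python
def changed_tiles(old, new, width, height, tile=16):
    """Return (x, y, w, h) for every tile whose pixels differ between two frames.

    Both frames are flat sequences of width * height pixel values, row by row.
    """
    rects = []
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            # Tiles at the right and bottom border may be smaller than `tile`.
            w = min(tile, width - tx)
            h = min(tile, height - ty)
            for row in range(ty, ty + h):
                start = row * width + tx
                if old[start:start + w] != new[start:start + w]:
                    rects.append((tx, ty, w, h))
                    break  # one differing row is enough to mark the tile
    return rects
```

Only the returned rectangles would then have to be encoded and transmitted. A notification mechanism such as UltraVNC's mirror video driver avoids this comparison loop altogether.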


Vino

Vino [58] is the standard VNC server of the GNOME desktop environment. It works exactly like x11vnc in that it is able to share an existing session (possibly with OpenGL applications), with the exception that Vino can only share the whole desktop, not single applications. It also provides Zeroconf service advertisement and supports the same VNC encodings as x11vnc (by using LibVNCServer [21]), but unlike x11vnc, it is very tightly integrated into the GNOME environment.

Apple Remote Desktop

Apple Remote Desktop [4] is the default remote administration software suite used in Mac OS X and uses VNC for graphical remote desktop access. It also provides other features like distribution of software packages, remote batch jobs and remote monitoring. What is missing, though, is multi-user support and multicast data delivery.

Collaborative VNC

Collaborative VNC [8] is a patch to the Unix version of TightVNC 1.2.9 that extends the original software with some multi-user support: each new client gets a uniquely coloured and labelled mouse cursor that is drawn into the VNC server’s framebuffer. This has the advantage that it in principle works everywhere, but has the drawback that these mouse cursors are only known to the VNC server and its clients. The VNC server’s host operating system does not know about these multiple cursors. Collaborative VNC therefore implements a simple floor control mechanism that maps several client mouse pointers onto one. This way, it is not possible to interact simultaneously with applications on the server desktop. Furthermore, Collaborative VNC uses a modified RFB protocol, so it is not compatible with existing VNC viewers.
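Floor control of this kind can be reduced to a single token that at most one client holds at a time. The sketch below shows the principle only; it is not taken from Collaborative VNC, and the class name, the method names and the first-come-first-served policy are assumptions made for illustration.

```python
class FloorControl:
    """Lets exactly one client at a time drive the single system pointer."""

    def __init__(self):
        self.holder = None

    def request(self, client_id) -> bool:
        # The floor is granted only if nobody else currently holds it.
        if self.holder is None:
            self.holder = client_id
        return self.holder == client_id

    def release(self, client_id) -> None:
        if self.holder == client_id:
            self.holder = None

    def filter_event(self, client_id, event):
        """Pass an input event on to the desktop only for the floor holder."""
        return event if self.holder == client_id else None
```

Every client's cursor can still be drawn into the framebuffer, but only the events of the current floor holder reach the host operating system, which is why interaction is sequential rather than concurrent.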


Figure 9: Collaborative VNC showing two distinct client cursors.


DrawTop

The DrawTop project developed by the University of Sydney [119, 9] works similarly to Collaborative VNC in that it draws into the VNC framebuffer. However, DrawTop is not a modification of an existing server, but sits as a proxy between VNC server and clients and works with standard VNC clients. It offers a transparent framebuffer overlay that clients can draw into, making annotations on the original server desktop. Also, one client at a time can take control of the underlying desktop. It is not possible to have multiple clients control the server’s desktop at the same time. DrawTop uses the same VNC library as x11vnc and Vino [21], so it supports the same encodings.


SharedAppVNC

SharedAppVNC [40] is a tool for remote collaboration that allows its users to share individual applications between them. SharedAppVNC does not make a strict distinction between client and server: every participant can act as a server, sharing applications, and as a client, receiving applications. Shared applications can either be set to »view only« or can be controlled by the receivers, where every application appears in its own movable and resizable frame. The software is available for Linux, Mac OS X and Windows, but uses a modified RFB protocol, so it is not compatible with existing VNC servers or clients.


MetaVNC

Similar to SharedAppVNC, MetaVNC [28] is a window-aware VNC server and client, but it uses a different approach from the user perspective. First, the MetaVNC server always shares all windows; users are not able to select specific ones. Second, the MetaVNC viewer does not place each received window into its own manageable frame, but instead uses a single fullscreen window with a transparent background to draw received windows into. It is based on TightVNC 1.3.9, so it otherwise supports the same features as TightVNC.


TightProjector

TightProjector [52] is a commercial VNC server available for Windows that sends all data via multicast. A special client application is needed to receive this multicast VNC data. Users are only able to view the server’s desktop; it is not possible to control the remote side. Analysis of network traffic revealed that TightProjector seems to do no error recovery whatsoever: it simply sends fullscreen updates at regular time intervals.


MulticastVNC

MulticastVNC [31] is a discontinued Java VNC proxy that features transmission of VNC data via multicast. The MulticastVNC proxy connects to a VNC server as a normal client and multicasts the data it gets from the server. A modified Java VNC viewer is used to receive the multicast VNC data. Probably because the software was developed for tele-teaching, it is only possible to view the remote desktop; multicast clients are not able to control it. MulticastVNC does not do any multicast error recovery and only supports the Hextile [115] encoding, which does not compress data very well compared to the »tight« or »ultra« encodings.


TeleTeachingTool

TeleTeachingTool [51] is an extension of the original MulticastVNC into a feature-rich software suite for distance learning. The main application differentiates between two fundamental modes of use: in »lecturer« mode, the actions on the user’s desktop are recorded and sent to connected clients. These run the software in »student« mode. Only the lecturer can control the desktop or make annotations; clients are just able to view the lecturer’s desktop. Because it builds on MulticastVNC, TeleTeachingTool’s multicast mode also does no error detection and provides no way to deal with lost packets: it simply sends a fullscreen update at regular time intervals [124, p. 6]. There is also a new implementation that does not use VNC anymore but relies on RTP for multicasting a video stream [50].
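What the absence of error recovery means for a receiver can be shown with a small datagram framing sketch. It is not the wire format of TightProjector, MulticastVNC or TeleTeachingTool; the header layout (frame id, chunk index, chunk count), the 1400 byte payload limit and the function names are assumptions made for illustration, and `reassemble` expects datagrams of a single frame.

```python
import struct

# Hypothetical per-datagram header: frame id, chunk index, number of chunks.
HEADER = struct.Struct(">IHH")


def split_frame(frame_id, data, max_payload=1400):
    """Split one encoded fullscreen update into datagram-sized chunks."""
    chunks = [data[i:i + max_payload] for i in range(0, len(data), max_payload)] or [b""]
    return [HEADER.pack(frame_id, i, len(chunks)) + c for i, c in enumerate(chunks)]


def reassemble(datagrams):
    """Return the frame bytes, or None if any chunk of the frame is missing."""
    parts = {}
    total = None
    for d in datagrams:
        _, index, total = HEADER.unpack_from(d)
        parts[index] = d[HEADER.size:]
    if total is None or len(parts) != total:
        # Without retransmission the receiver can only discard the frame
        # and wait for the next periodic fullscreen update.
        return None
    return b"".join(parts[i] for i in range(total))
```

A single lost datagram therefore costs the receiver the whole update, and the display stays stale until the next interval.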

Win2VNC / x2vnc

Win2VNC [59] and its Unix counterpart x2vnc [94] are VNC clients that offer a unique mode of control. The main principle is the same as with x2x (section 3.2.1): they create an invisible, one pixel wide window at one edge of the local screen. If the mouse pointer moves over this window, they send mouse and keyboard input to the remote display. When the cursor is moved back, the desktop of the local machine is controlled as before. While this is definitely usable if the user is able to physically see the remote display (like in a multi-screen setup), this way of remote controlling is obviously quite useless if the remote machine is located somewhere else. Both tools provide remote control and nothing more, but are listed here for completeness’ sake and because they provide an interesting way of interacting with a VNC server.


The majority of the VNC products looked at provide full remote desktop access with view and control functionality. The exceptions are the tools that have some kind of multicast support: these are all view-only. Only two solutions, Collaborative VNC and DrawTop, feature multi-user operation, but again only in a sequential fashion without fully concurrent interaction. DrawTop is also the only VNC software to fully support on-screen annotations; TeleTeachingTool only provides annotation facilities for the user locally operating the shared desktop. Furthermore, only a single system, SharedAppVNC, provides client-to-server window sharing. None of the examined software products meets all of the posed functional requirements, though most of them are open source and could thus possibly be extended.

Table 2: Comparison of VNC Software.

(Table body not visible in this excerpt; one of its entries carries footnote 5.)


1 NX libraries are open source, but server and client applications are proprietary.

2 XMX supports basic floor control, but no real concurrent multi user operation.

3 MPX is not exactly a remote desktop technology, but a viable extension of X11 because it allows fully concurrent multi-user operation, so it is included here as well.

4 RealVNC and TightVNC provide no X11 OpenGL (GLX) extensions at all; xf4vnc does, but does not use hardware acceleration.

5 TeleTeachingTool offers annotations only for the lecturer, not for connected clients.


Institution: Humboldt-Universität zu Berlin
Keywords: remote desktop, collaboration, multicast, VNC, multi-user, real-time collaboration



