TableTops: worthwhile experiences of collocated and remote collaboration

A. Pauchet, F. Coldefy, L. Lefebvre, S. Louis Dit Picard, L. Perron, A. Bouguet, M. Collobert, J. Guerin, D. Corvaisier

Orange Labs: 2, Av. Pierre Marzin, 22300 Lannion, France
[email protected]

Abstract
Tabletops encourage people to collaborate around shared documents. We propose DigiTable, a platform for collocated and remote collaboration which attempts to preserve the fluidity of interaction and the mutual awareness of co-presence. DigiTable combines a multiuser tactile tabletop, a video-communication system and a robust computer vision module for the visualization of distant users' gestures. Through an experiment, we show that DigiTable improves the efficiency of a collaborative task in the remote configuration. We also show that remote gesture visualization facilitates coordination, as it provides local participants with important information such as intentionality and pointing. Thus, the collocated and remote configurations are both worthwhile experiences: remote collaboration is no longer seen as a poor substitute for collocated collaboration, although the feeling of presence is not uniformly perceived by participants.

1. Introduction
This paper addresses remote groupware collaboration and communication. We strive to design a remote collaboration platform which preserves, as far as possible, the fluidity of interaction and the mutual awareness provided by co-presence. Mutual awareness refers to the human ability to maintain and constantly update a sense of social and physical context. Being aware of others plays an important role in the fluidity and naturalness of collaboration. Our approach is founded on Gutwin's Workspace Awareness [7]. Gutwin provides valuable design guidelines, as he clearly identifies what perceptual information mutual awareness involves, how it is gathered and used by people, and finally how it may be conveyed and rendered at a remote site. This descriptive framework helps to decide which kind of display to choose and which type of perceptual information to transmit between distant sites. Among the perceptual information involved in mutual awareness, we focus on the visual channel, and especially on the visualization of distant users' gestures, as it provides local participants with information that facilitates remote coordination and communication, such as intentionality (who is intending to do what), action identity (who is doing what) and pointing [9, 10].

We present DigiTable, a collaborative platform which includes a computer vision system to render remote gestures on a shared workspace. We designed an experiment to evaluate how our collaborative platform performs in terms of task efficiency, coordination and user experience in collocated and remote use. We explore several gesture visualization modes to identify how the remote gesture embodiment and the orientation of the shared desktop affect the task. Future work will complete this study, which compels users to focus on the task domain [2], by evaluating DigiTable with a more social activity.

2. Related Work
VideoPlace [11], VideoDraw [17] and VideoWhiteboard [18] were among the very first video-based attempts to capture a participant's gesture and fuse it with graphic images. Later, ClearBoard [8] enabled eye contact between distant participants by using a half-mirror polarizing projection screen whose transparency allows rear video projection, while the user's reflected image is captured by a camera. In 2004, Takao proposed Tele-Graffiti [14], a remote sketching system for augmented-reality shared drawings on real paper sheets. It allows two distant users to contribute to a virtual page, fusing the marks each participant has written on his/her own sheet. A very accomplished work in the domain of remote gesture visualization is VideoArms [15]. The authors designed a collaborative system with effective distant mutual awareness by coupling a touch-sensitive SmartBoard with a video capture of the participants' gestures overlaid on the remote desktop image. Users can easily predict, understand and interpret distant users' actions or intents, as their arms are visible whenever they want to interact with the tactile surface or show anything on it. Escritoire [1] and VICAT [4] both associate a tabletop with a vertical screen for remote collaboration. VICAT is designed for remote and collocated group interaction, whereas Escritoire focuses on a personal desktop supporting bi-manual interaction. Both systems use conventional video-communication tools, and remote gesture is embodied in tele-pointers and traces. Experiments on platforms providing remote gesture visualization are still few [9, 10] and mostly focus on interaction with physical objects of the real world. Our main contribution is to give hints of how a collaborative platform combining remote gesture visualization and full-size video communication affects a collaborative task.

3. DigiTable platform
We propose DigiTable [5], a platform for collocated and remote collaboration combining a multiuser tactile tabletop, a video-communication system enabling eye contact and full-size visualization of the remote user, a computer vision module for remote gesture visualization, and a spatialized sound system (see Fig. 1 and Fig. 2).

Figure 1. DigiTable is a platform combining a DiamondTouch table, a video-communication system, a spatialized audio system and a computer vision module providing remote gesture visualization.

Figure 2. DigiTable architecture: the shared application, video communication and remote gesture analysis are implemented at each distant site on a single workstation.

We use the MERL DiamondTouch [6] tactile surface, onto which a desktop image is projected from a ceiling-mounted video projector (video projector 2 in Fig. 1). The DiamondTouch supports up to four simultaneous users. A collaborative application server manages the collaboration between the connected sites and replicates the events occurring on window containers at both sites in order to keep the shared desktop consistent. We use ad-hoc software based on Java and JOGL (Java Binding for OpenGL) to implement the interactive window containers.

The video-communication system uses a camera hidden behind a wooden screen, peeping through an almost unnoticeable 3 mm wide hole. A second video projector (video projector 1 in Fig. 1) beams onto the wall screen the video of the distant site captured by a symmetric remote spy camera (see Fig. 2). Eye contact is ensured by placing the camera peep-hole approximately at the estimated eye height of a seated person and beaming the distant video onto the screen so that the peep-hole and the distant user's eyes coincide. Fine tuning of the hole's design and of the video projector beam's orientation avoids dazzling the camera. Audio and video channels are compressed using the TDAC and H.263 codecs respectively, and are sent to the remote site using RTP (Real-time Transport Protocol) over UDP/IP. Note that the video quality is currently limited by the resolution of the spy camera behind the peep-hole (420 lines expanded to 4CIF - 704x576).

The computer vision module [5] uses a camera placed at the ceiling and pointing at the tabletop. The module consists of a segmentation process which detects any object above the table by comparing, at almost the frame rate, the captured image and the desktop image, up to a geometric and color distortion. The geometric deformation model between the camera images and the desktop images is estimated automatically with an off-line procedure. The color transfer functions are computed on-line to cope with external lighting changes. The computer vision module provides an image mask of the detected objects (hands, arms, or any object above the table) extracted from the camera image. The compressed mask is sent to the distant site and overlaid on the current desktop image. Semi-transparency is used to visualize the desktop "under" the remote partner's arms. Fig. 3 shows users' gestures and their visualization at the remote site: the bottom-left image shows the overlay of the detected hand of site 2 (top-right image); the bottom-right image shows the overlay of the detected hand of the user at site 1 (top-left image). The system runs at about 17 Hz on a dual-core Intel Xeon at 3.73 GHz with 2 GB of RAM.
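As a rough illustration of this per-pixel pipeline, the sketch below classifies a camera pixel as an object above the table when it differs sufficiently from the expected desktop pixel, and blends detected pixels semi-transparently over the remote desktop. The threshold, the alpha value and the simple per-channel difference are illustrative assumptions; the actual module handles geometric and color distortion as described above.

```java
// Illustrative sketch only: real DIGITABLE segmentation compensates
// geometric and color distortion before comparing images.
class GestureOverlay {
    /** Foreground test: camera pixel vs. expected desktop pixel. */
    static boolean isObject(int cameraRgb, int expectedRgb, int threshold) {
        int dr = Math.abs(((cameraRgb >> 16) & 0xFF) - ((expectedRgb >> 16) & 0xFF));
        int dg = Math.abs(((cameraRgb >> 8) & 0xFF) - ((expectedRgb >> 8) & 0xFF));
        int db = Math.abs((cameraRgb & 0xFF) - (expectedRgb & 0xFF));
        return dr + dg + db > threshold; // threshold is a free parameter
    }

    /** Semi-transparent blend of a detected arm pixel over the desktop. */
    static int blend(int desktopRgb, int armRgb, double alpha) {
        int r = (int) (alpha * ((armRgb >> 16) & 0xFF)
                + (1 - alpha) * ((desktopRgb >> 16) & 0xFF));
        int g = (int) (alpha * ((armRgb >> 8) & 0xFF)
                + (1 - alpha) * ((desktopRgb >> 8) & 0xFF));
        int b = (int) (alpha * (armRgb & 0xFF)
                + (1 - alpha) * (desktopRgb & 0xFF));
        return (r << 16) | (g << 8) | b;
    }
}
```

With alpha around 0.5, the desktop remains visible "under" the overlaid arms, as in Fig. 3.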

Figure 3. Remote gesture visualization: view of both distant tables in action (top row) and of both overlaid desktops (bottom row); the bottom-left image shows the overlay of the detected hand of site 1 (top-right image); the bottom-right image shows the overlay of the detected arms of site 2 (top-left image); the remote gesture is overlaid in transparency with the desktop image.

4. The present study: comparison between collocated and remote collaboration

4.1. Objectives
We aim at understanding how the DigiTable platform affects collaboration and interaction between users. Many parameters may impact a group activity: the type of task, which makes users focus more or less on the task and on personal spaces [3], the position of the participants around the table, and the functionalities of the application. More precisely, we focus on how remote gesture visualization influences a remote collaboration task on a tabletop.

4.2. Design

Six different conditions. In the remote configuration, the user communicates with his/her remote partner through the life-size video-communication system projected on the wall screen in front of him/her (Fig. 1 and 3). We evaluate different embodiments of the remote gesture. Gutwin [7] identifies two main axes for remote awareness representation on a shared workspace: the visual information may be literal (displayed in the same form as it is gathered) or symbolic (particular information is extracted and synthesized when shown), and situated within the workspace (placed at the same place in the shared workspace at both distant sites) or separated from it. We propose three implementations of remote gesture visualization, moving along the literal/symbolic axis in an original way, described below:

• Remote face-to-face: this configuration corresponds to the literal/situated embodiment of the collocated face-to-face situation (see top-right image in Fig. 4 and top image in Fig. 5). We mimic reality: it is as if the users were on either side of a table. They do not share the same view of the document on the table, as the shared workspace is upside down for one of them.

• Remote side-by-side: the users share the same point of view of the desktop. The remote gesture is rendered as if people were side-by-side (or even on each other's laps), in contradiction with the face-to-face visualization of the remote user on the wall screen (see bottom-left image in Fig. 4 and middle image in Fig. 5). This configuration corresponds to a literal and partly situated embodiment of the remote gesture.

• Reconstructed gesture: the users share the same point of view of the workspace, while the remote gesture of the distant user is rendered on the desktop so as to recreate a face-to-face configuration. The remote gesture image is reversed and overlaid on the distant desktop with respect to the contact point on the table (see bottom-right image in Fig. 4 and bottom image in Fig. 5). The discontinuity which exists in the side-by-side configuration is reduced, since the distant gesture is oriented as if it were coming from the wall screen. The reconstructed-gesture configuration has, however, two drawbacks: firstly, although people "feel" as if they were in a face-to-face configuration, the remote gesture image is mirrored (the distant user's right hand is seen as his left hand); secondly, because of the symmetry, information may be missing to complete the full gesture image at the remote site: this is especially noticeable in the bottom image of Fig. 5, which shows that the remote gesture is incomplete in the upper part of the table.
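Concretely, the reconstructed-gesture rendering can be pictured as a reflection of the gesture mask about the row of the contact point, which also accounts for both drawbacks noted above: the reflection flips handedness, and rows reflected outside the table are lost. The sketch below is a minimal illustration under that assumption; the exact transform used by DigiTable is not detailed here.

```java
// Illustrative sketch: reflect a gesture mask about the contact row so
// the arm appears to come from the wall-screen side. The reflection
// model is an assumption consistent with the mirrored-hand effect.
class ReconstructedGesture {
    /** Reflect a pixel row about the contact row. */
    static int reflectRow(int y, int yContact) {
        return 2 * yContact - y;
    }

    /** Reflect a whole mask; rows falling outside the table are lost,
     *  which explains the incomplete gesture near the far edge. */
    static boolean[][] reflectMask(boolean[][] mask, int yContact) {
        int h = mask.length, w = mask[0].length;
        boolean[][] out = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            int yr = reflectRow(y, yContact);
            if (yr >= 0 && yr < h) {
                for (int x = 0; x < w; x++) out[yr][x] = mask[y][x];
            }
        }
        return out;
    }
}
```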

Figure 4. The various configurations for representing the remote gesture; the arms in the top-right, bottom-left and bottom-right images stand for the gesture visualization of the user in the top-left image, whose full-size video is projected on the wall screen.

The three reference conditions to which we compare these previous implementations are:

• for the remote configuration: remote collaboration without gesture visualization, where the users share the same view of the shared desktop;
• the face-to-face and side-by-side collocated situations.

Figure 5. The various embodiments of the remote gesture. The top, middle and bottom images correspond to the face-to-face, side-by-side and reconstructed-gesture configurations.

A mosaic completion task. We chose a digital mosaic completion task for the experiment. Unlike jigsaw puzzles, mosaics are composed of square pieces. Puzzles and mosaics have the advantage of being friendly applications which need a very short learning stage: the task goal is obvious, and the learning process is limited to piece manipulation. Secondly, the task has the benefit of making users focus on the tactile interaction afforded by the tabletop; manipulating graphical objects may be one of the most pertinent interactions on a tabletop, compared to text editing for instance. Finally, mosaic completion is clearly identified as allowing users to concentrate on the task space [3]. The users feel challenged to perform the task as fast as possible, although no particular instruction was given to them in that direction.

The type of mosaic also has a significant effect on the completion task: whereas abstract images can be interpreted or understood from any viewing direction, figurative and especially textual mosaics induce a favored orientation for completion. The manipulation of pieces during a session, their orientation and their positioning give a lot of information about the implicitly private, shared and storage spaces created by participants [13, 16]. From preliminary tests performed before the present study [12], it appears that textual mosaics lead to a tighter collaboration between users than abstract and figurative ones. As textual pieces cannot easily be read upside down, participants first have to negotiate the general orientation of the mosaic. Moreover, whereas the orientation of a textual piece can obviously be deduced from its content, its positioning within the mosaic is difficult and needs more coordination and communication between users. To limit the number of sessions during the experiment, we chose textual mosaics (see Fig. 6) among the three different possibilities (textual, figurative or abstract).

Figure 6. Textual mosaic

Experiment design. To sum up, we want to evaluate:

1. how task completion is affected by the remote configuration,
2. what the remote gesture visualization contributes in terms of task efficiency and user experience.

We experimented with 6 conditions for pairs of users on a textual mosaic completion task: 4 in the remote situation (face-to-face and side-by-side, both with remote gesture visualization; side-by-side with reconstructed gesture visualization; and side-by-side without gesture visualization) and 2 in the collocated situation (face-to-face and side-by-side).

4.3. Participants
A total of 30 subjects participated in the study and were randomly put in pairs. There were one female pair, six male pairs and eight mixed pairs. Eight pairs gathered persons who knew each other well, two pairs a little, and the last five pairs not at all. All participants were postgraduates and had normal or corrected-to-normal vision. Participants were not paid for the experiment.

4.4. Procedure
The participants completed a series of 6 textual mosaics, 4 in the remote situation and 2 in the collocated situation. For all the remote configurations, the interactions were mediated by the DigiTable platform. During the experiment, the order of the mosaic completions was counterbalanced within the collocated situations (face-to-face and side-by-side), within the remote conditions (remote face-to-face, remote side-by-side, reconstructed gesture and no gesture visualization) and between situations (collocated and remote). The participants completed an individual training period before the 6 collaborative mosaic completions. At the end of the session, participants were asked to fill in a questionnaire and were briefly interviewed about how they experienced the session. The evaluation is guided by objective criteria (completion times, actions performed, collisions) and subjective criteria.

4.5. The mosaic application
We designed a dedicated Java application for mosaic completion, running locally and/or remotely on the DigiTable platform. All the mosaics are composed of a 5x5 grid of square pieces (see Fig. 6). Three types of action are allowed: rotating a single mosaic piece, moving one piece, or moving a group of pieces. A mosaic piece can be rotated, but only in 90° steps, as its edges have to remain parallel to the table sides: the user touches one of the 4 piece corners and performs a rotational motion. A single piece can also be moved along an invisible grid by touching it near its center and dragging it from one place to another. To move a group of pieces, a user defines an invisible bounding box with a multiple contact on the table and drags it from one place to another. Visual feedback lets users identify the currently performed action: a cross pointer for dragging a single piece (Fig. 7, top-left image), a circular arrow for rotation (Fig. 7, top-right image), and colored piece contours inside the selected bounding box (Fig. 7, bottom-row images).

Figure 7. Mosaic application: piece move (top-left image), rotation (top-right image) and bounding box for moving several pieces at once (bottom-row images).
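The piece-manipulation rules of the mosaic application can be sketched as follows; the class and method names, the pixel cell size and the snapping policy are illustrative assumptions, not taken from the actual application.

```java
// Illustrative sketch of the manipulation rules: 90-degree rotation
// steps and moves snapped to the invisible 5x5 grid.
class MosaicPiece {
    int row, col;    // position on the 5x5 grid
    int orientation; // 0, 90, 180 or 270 degrees

    /** Rotate by whole 90-degree steps (edges stay parallel to the table). */
    void rotate(int steps) {
        orientation = Math.floorMod(orientation + 90 * steps, 360);
    }

    /** Drop the piece at free pixel coordinates, snapping to the grid
     *  and clamping to the 5x5 board. cellPx is a hypothetical cell size. */
    void moveTo(double xPx, double yPx, double cellPx) {
        col = Math.min(4, Math.max(0, (int) Math.round(xPx / cellPx)));
        row = Math.min(4, Math.max(0, (int) Math.round(yPx / cellPx)));
    }
}
```

A bounding-box group move would apply the same snapping translation to every piece inside the box.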

4.6. Event recording and analysis
The DigiTable platform is particularly convenient for experiment recording. The ceiling camera used for remote gesture visualization also records an aerial view of the users' interactions above the table. The full-size video-communication system makes it possible to record all the communication between users in all configurations except the collocated face-to-face condition, which requires an additional camera. Furthermore, the position of the spy camera at the users' eye height in the wooden screen allows us to identify precisely when the participants are looking at each other in the remote configuration. All the users' actions on the mosaic pieces are saved in a log file by the Java application (piece number, placement and orientation on the table, and action type when touched: rotation, move or bounding box). The mosaic completion times are also collected.
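As an illustration of such logging, a per-action record might look as follows; the field set mirrors the list above, but the exact layout, separator and method name are hypothetical:

```java
// Hypothetical per-action log record: timestamp, piece id, grid
// placement, orientation and action type, separated by semicolons.
class ActionLog {
    static String record(long tMillis, int piece, int row, int col,
                         int orientation, String action) {
        return String.format("%d;%d;%d;%d;%d;%s",
                tMillis, piece, row, col, orientation, action);
    }
}
```

Such flat records make it straightforward to recompute completion times and action counts per condition afterwards.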

5. Results

5.1. Task efficiency
Table 1 sums up the mean completion times for all 6 configurations, i.e. collocated side-by-side, collocated face-to-face, and the remote conditions (side-by-side with and without gesture visualization, side-by-side with reconstructed gesture, and face-to-face with gesture visualization). The values in the cells correspond to the mean completion time M in seconds and the standard deviation SD. When comparing the 4 remote configurations as a whole with the 2 collocated configurations as a whole, mosaics are completed faster in the remote conditions (M=292s, SD=96s) than in the collocated conditions (M=339s, SD=149s), according to the significant difference obtained by a repeated-measures ANOVA (F(1,14) = 4.8, p