In this AAAI symposium, we aim to bring together researchers from multiple disciplines, including multimodal systems, human-robot interaction, embodied conversational agents, and spoken dialogue systems, to address a topic of common interest: the modeling, realization, and evaluation of turn-taking and real-time action coordination between humans and artificial interactive systems. The symposium will build common ground for researchers from these disparate backgrounds to share their perspectives, methodologies, and results on the problem of multimodal coordination, and will promote communication, collaboration, and discussion on how to advance the state of the art in this space.
Regulating human-computer coordination hinges critically on multimodal sensing, on making decisions under uncertainty and time constraints, and on synchronizing behaviors across different output modalities. On the sensing side, tracking conversational dynamics from multimodal data poses numerous challenges. For example, a turn-taking system must extract and integrate multiple audio-visual signals and knowledge sources to make inferences about user utterances, transition-relevant places, floor-control actions, and, in multiparty settings, speech sources and addressees. Coordination decisions often require reasoning under uncertainty, sometimes under strict time constraints on the order of a few hundred milliseconds. Reasoning on an utterance-by-utterance basis, or only about the present moment, is often not sufficient; processing needs to happen incrementally, and decisions may need to be supported by reasoning about possible futures. Finally, designing and rendering coordination behaviors (e.g., task-related actions, floor-taking actions, floor-releasing actions, and back-channels) appropriate to the affordances of a system's embodiment raises additional challenges.
Topics include, but are not limited to:
- models for coordinating linguistic and non-linguistic actions
- computational models for multi-party coordination and turn-taking
- multimodal inference for turn-taking (inferences about user utterances, transition relevant places, floor control actions, backchannels, etc.)
- incremental speech and audio-visual processing
- high-frequency, real-time decision making under uncertainty
- fusion of multiple information sources for making coordination decisions
- machine learning for multimodal inference and making coordination decisions
- communication dynamics in human-human action coordination and turn-taking
- listener feedback behavior, including back-channel generation
- turn-taking phenomena and affordances (e.g., linguistic and non-linguistic actions such as disfluencies, filled pauses, hedging, floor-holding, gestures, and gaze)
- generation of coordination and turn-taking behaviors (behavioral rendering)
- issues in coordination among parties with asymmetric roles, goals, or affordances
- effects of social factors and relationships on coordination behavior
- cross-linguistic and cross-cultural factors
- corpora and resources for action coordination and turn-taking research
- metrics and methodologies for assessing coordination competencies
- empirical evaluation of action coordination and turn-taking models
- comparisons across human-robot interaction, embodied conversational agents, and spoken dialogue systems
The symposium will include:
- oral presentation sessions for accepted full papers,
- poster sessions for accepted short papers and position papers,
- plenary presentations by invited speakers,
- open panel discussions on core challenges from the perspective of different fields,
- breakout discussion sessions on how to facilitate new collaborative research efforts,
- a video session presenting accepted videos that illustrate both successful and failed coordination in human-machine interactions, serving as drivers for an open plenary discussion of research challenges and opportunities in this area.
Prospective authors are invited to submit full technical papers (up to 8 pages) and short position papers (up to 4 pages). Accepted papers will be published in a technical report in the AAAI Digital Library.
In addition, we invite submissions of videos (up to 5 minutes, with a one-page accompanying description) that illustrate both successful coordination in human-machine interactions and failure cases, which we believe are as important, if not more so, in driving research and the field forward. Accepted videos will be presented during the video session and will serve as drivers for an open plenary discussion on research challenges and opportunities in this area. Video descriptions will be archived in the AAAI Digital Library, and the videos themselves will be hosted on the symposium website.
All submissions must be in AAAI format and made via the EasyChair site below; no email submissions will be accepted. Submissions should not be anonymized: author names and affiliations should appear on the first page.
Submission Site: https://easychair.org/conferences/?conf=aaaisss2015