Interactive evaluation of conversational agents : reflections on the impact of search task design

Dubiel, Mateusz and Halvey, Martin and Azzopardi, Leif and Daronnat, Sylvain; (2020) Interactive evaluation of conversational agents : reflections on the impact of search task design. In: ICTIR 2020 - Proceedings of the 2020 ACM SIGIR International Conference on Theory of Information Retrieval. ACM, NOR, 85–88. ISBN 9781450380676 (https://doi.org/10.1145/3409256.3409814)

Text. Filename: Dubiel_etal_SIGIR_2020_Interactive_evaluation_of_conversational_agents.pdf
Accepted Author Manuscript

Abstract

Undertaking an interactive evaluation of goal-oriented conversational agents (CAs) is challenging: it requires the search task to be realistic and relatable while accounting for the user’s cognitive limitations. In this paper, we discuss the findings of two Wizard of Oz studies and reflect on the impact of different interactive search task designs on participants’ performance, satisfaction and cognitive workload. In the first study, we tasked participants with finding the cheapest flight that met a certain departure time. In the second study, we added a further criterion, ‘travel time’, and asked participants to find a flight option that offered a good trade-off between price and travel time. We found that using search tasks where participants need to decide between several competing search criteria (price vs. time) led to higher search involvement and lower variance in usability and cognitive workload ratings between different CAs. We hope that our results will provoke discussion on how to make the evaluation of voice-only goal-oriented CAs more reliable and ecologically valid.