A Virginia Tech team of five computer science Ph.D. students at the Sanghani Center for Artificial Intelligence and Data Analytics is one of 10 university teams selected internationally to compete in the Alexa Prize TaskBot Challenge 2. The team will design multimodal task-oriented conversational assistants that help customers complete complex multistep tasks while adapting to resources and tools available to the user, such as ingredients or equipment.

“For example, if a task requires an item that is not available, the taskbot should be able to adjust the corresponding subtasks and overall plan and suggest appropriate substitutes or alternative steps,” said Ismini Lourentzou, assistant professor in the Department of Computer Science, core faculty at the Sanghani Center, and advisor for the team.

Sponsored by Amazon Science, this year’s challenge has been expanded to include more hobbies and at-home activities and incorporates screen-based interactions into the conversational experience. In addition to verbal instructions, customers with Echo screen devices or a Fire TV can receive step-by-step instructions accompanied by visual aids composed of images and/or videos that enhance task guidance.

Participating teams will address challenges in multimodal knowledge representation, cross-modal retrieval, personalized task adaptation, and dialogue generation. Innovative ideas for integrating visual aids into every turn of the conversation when a screen is available, as well as for coordinating the visual and verbal modalities, are part of the team selection criteria.

Each of the participating teams has received a $250,000 grant to fund its work. In addition, teams receive Alexa-enabled devices, Amazon Web Services cloud computing services, and support from the Alexa team for their research and development efforts.

The Virginia Tech team is composed of Ph.D. students with research experience in computer vision, self-supervised and multimodal machine learning, natural language understanding and generation, and human-computer interaction. They are Afrina Tabassum, who is serving as the team leader; Amarachi Mbakwe; Makanjuola Ogunleye; Muntasir Wahed; and Tianjiao Yu, all students at the Sanghani Center and members of the Perception and Language Lab.

The Alexa Prize TaskBot Challenge 2 launched last month with a boot camp at the Amazon headquarters in Seattle. 

“The boot camp was a great experience that allowed us to connect with Amazon Alexa scientists and network with other competing teams,” said Lourentzou. “We are also grateful to the Amazon team for organizing this event and for sharing information about various resources and models that Amazon and Alexa AI have contributed to the field of conversational AI [artificial intelligence].”

One of the most exciting aspects of the challenge, she said, is that customers will be able to interact with their taskbot by saying the prompt “Alexa, let’s work together.” At the end of the interaction, the customer will be asked to rate how helpful the taskbot was in guiding the interaction and will have the option of providing feedback to help teams improve their taskbots.

“Being part of this challenge is a fantastic opportunity to showcase our students’ research accomplishments and to contribute to the fields of conversational AI and multimodal machine learning,” said Lourentzou.

The first-place team, which will be announced in September, will take home a prize of $500,000.
