This project explores how people prefer robots to behave during social interactions, focusing on scenarios where a robot is interrupted while conversing with another individual.
The first step will be to create a dataset of interruptions in conversations by:
- Using a mathematical model of group-approach behavior to generate the walking trajectories that individuals follow when joining a conversational group.
- Employing an open-source large language model (LLM) to generate dialogue that represents interruptions in human-robot conversations, along with participants’ reactions.
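The trajectory-generation step above could be sketched as follows. This is a deliberately minimal illustration, not the project's actual model: it treats the group as a point attractor and walks an agent toward a fixed "joining distance" from the group center; the function name and parameters are hypothetical.

```python
import math

def approach_trajectory(start, group_center, join_radius=0.8,
                        speed=1.2, dt=0.1, max_steps=200):
    """Simplified sketch: walk straight toward the group center and
    stop at join_radius (a stand-in for the group's o-space boundary).
    A real group-approach model would also account for other members'
    positions, orientations, and social comfort distances."""
    x, y = start
    cx, cy = group_center
    traj = [(x, y)]
    for _ in range(max_steps):
        dx, dy = cx - x, cy - y
        dist = math.hypot(dx, dy)
        if dist <= join_radius:  # reached the joining distance
            break
        # step toward the center, but never overshoot the boundary
        step = min(speed * dt, dist - join_radius)
        x += step * dx / dist
        y += step * dy / dist
        traj.append((x, y))
    return traj
```

Each trajectory, paired with LLM-generated dialogue, would then form one interruption episode in the dataset.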
With this dataset, the student will apply human-in-the-loop methods such as Reinforcement Learning from Human Feedback (RLHF). Human annotators will compare pairs of candidate robot behaviors and select the more appropriate one. Using this feedback, the student will train reinforcement learning policies that guide how a robot should respond to conversational interruptions.
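The pairwise-comparison step is typically used to fit a reward model, which the RL policy then optimizes. A minimal sketch of that step, assuming a Bradley-Terry preference model where P(i preferred over j) = sigmoid(r_i - r_j), with one scalar reward per candidate behavior (a toy tabular version; the project would use a learned model over behavior features):

```python
import math

def train_reward_model(comparisons, n_behaviors, lr=0.1, epochs=200):
    """Fit one scalar reward per candidate behavior from pairwise
    human preferences via gradient ascent on the Bradley-Terry
    log-likelihood. `comparisons` is a list of (winner, loser)
    index pairs from annotators."""
    r = [0.0] * n_behaviors
    for _ in range(epochs):
        for win, lose in comparisons:
            # probability the model currently assigns to the
            # annotator's observed preference
            p = 1.0 / (1.0 + math.exp(-(r[win] - r[lose])))
            grad = 1.0 - p  # d log p / d (r_win - r_lose)
            r[win] += lr * grad
            r[lose] -= lr * grad
    return r
```

For example, given annotator judgments "behavior 0 beats 1", "0 beats 2", and "2 beats 1", the fitted rewards rank behavior 0 highest and behavior 1 lowest; an RL policy can then be trained against this reward signal.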
Prerequisites
- Programming experience in Python
- Experience with common ML/AI frameworks and libraries (PyTorch, transformers, etc.)
- Interest in human-robot interaction, machine learning, or reinforcement learning
- Familiarity with large language models is beneficial but not required
Contact
ronald.cumbal@it.uu.se
alessio.galatolo@it.uu.se