We have presented a range of machine learning techniques that we are using to explore the challenges of multi-modal, multi-user, social human-robot interaction. The models are trained on data collected from natural human-human interactions as well as on recordings of users interacting with the system. We have reported initial results from training and evaluating these models on real data, and have outlined how the models will be extended in the future.