In this paper we describe the computational
and architectural requirements for systems that support real-time multimodal
interaction with an embodied conversational character. We argue that the
three primary design drivers are real-time multithreaded entrainment, processing
of both interactional and propositional information, and an approach based
on a functional understanding of human face-to-face conversation. We then
present an architecture that meets these requirements, along with an initial
conversational character we have developed that is capable of increasingly
sophisticated multimodal input and output in a limited application domain.