It was only a matter of time before artificial intelligence moved into roles traditionally performed by humans. What is surprising is that these roles aren’t necessarily manual labor, and the robots filling them aren’t necessarily humanoid. Boston Dynamics has built a robot tour guide, housed in the familiar dog-shaped chassis of its Spot robot.
So, why a tour guide in the form of a robotic dog? Boston Dynamics explains its rationale as follows:
“In particular, we were interested in a demo of Spot using Foundation Models as autonomy tools—that is, making decisions in real-time based on the output of FMs. Large Language Models (LLMs) like ChatGPT are basically very big, very capable autocomplete algorithms; they take in a stream of text and predict the next bit of text. We were inspired by the apparent ability of LLMs to roleplay, replicate culture and nuance, form plans, and maintain coherence over time, as well as by recently released Visual Question Answering (VQA) models that can caption images and answer simple questions about them.
“A robot tour guide offered us a simple demo to test these concepts—the robot could walk around, look at objects in the environment, use a VQA or captioning model to describe them, and then elaborate on those descriptions using an LLM. Additionally, the LLM could answer questions from the tour audience, and plan what actions the robot should take next. In this way, the LLM can be thought of as an improv actor—we provide a broad strokes script, and the LLM fills in the blanks on the fly.
“This sort of demo plays to the strengths of the LLM—infamously, LLMs hallucinate and add plausible-sounding details without fact-checking; but in this case, we didn’t need the tour to be factually accurate, just entertaining, interactive, and nuanced. The bar for success is also quite low—the robot only needs to walk around and talk about things it sees.”
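The loop described in the quote (look at an object, caption it, let the LLM elaborate, then choose the next action) can be sketched roughly as follows. This is a minimal illustration, not Boston Dynamics’ implementation: the `captioner` and `llm` callables, the `SAY:`/`GO_TO:` reply format, and the location names are all hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical stand-ins: the article does not describe the real model APIs.
Captioner = Callable[[str], str]      # image identifier -> caption text
LanguageModel = Callable[[str], str]  # prompt text -> completion text


@dataclass
class TourStep:
    caption: str        # what the captioning model reported seeing
    speech: str         # what the robot should say aloud
    next_location: str  # where the robot should walk next


def build_prompt(script: str, caption: str, question: Optional[str],
                 locations: Sequence[str]) -> str:
    """Combine the broad-strokes script with what the robot currently sees."""
    lines = [
        script,
        f"You currently see: {caption}",
        f"Known locations: {', '.join(locations)}",
    ]
    if question:
        lines.append(f"A visitor asks: {question}")
    # Assumed reply format, invented for this sketch.
    lines.append("Reply with two lines: 'SAY: <text>' and 'GO_TO: <location>'.")
    return "\n".join(lines)


def parse_reply(reply: str, locations: Sequence[str],
                current: str) -> Tuple[str, str]:
    """Extract speech and target; stay put if the target is not a known location."""
    speech, target = "", current
    for line in reply.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip().upper(), value.strip()
        if key == "SAY":
            speech = value
        elif key == "GO_TO" and value in locations:
            target = value
    return speech, target


def tour_step(image: str, current: str, locations: Sequence[str],
              captioner: Captioner, llm: LanguageModel, script: str,
              question: Optional[str] = None) -> TourStep:
    """Run one look -> caption -> elaborate -> plan cycle."""
    caption = captioner(image)
    reply = llm(build_prompt(script, caption, question, locations))
    speech, target = parse_reply(reply, locations, current)
    return TourStep(caption, speech, target)


if __name__ == "__main__":
    # Canned models so the sketch runs without any external service.
    fake_captioner = lambda image: "a yellow four-legged robot on a charging dock"
    fake_llm = lambda prompt: "SAY: Behold, a colleague at rest!\nGO_TO: lobby"
    print(tour_step("frame_001", "lab", ["lab", "lobby"],
                    fake_captioner, fake_llm,
                    "You are a robot tour guide. Be entertaining and brief."))
```

Checking the requested destination against a list of known locations is one simple way to tolerate hallucinated speech while keeping the robot’s movement constrained; that guard is a design choice of this sketch and is not something the quote specifies.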
Boston Dynamics’ talking tour guide shows what becomes possible when AI models are combined with a capable robot. It offers an interactive experience and highlights the versatility of models like ChatGPT, which here improvise around a loose script to keep an audience engaged.
The demo combines visual recognition, question answering, and robot autonomy in a single system. The LLM may occasionally add unverified details, but in this context the goal is an entertaining and engaging tour, not a factually rigorous one. Boston Dynamics’ creation offers a glimpse of future AI-powered tours and experiences, in which robots blend technology, entertainment, and nuanced interaction with people.