Teach Your Robot to Listen, Think, and Act Using Language Models and ROS 2
What if you could tell a robot what to do in plain English, and it actually did it?
In this hands-on workshop, you’ll build a local ROS 2 system where a small language model turns natural language instructions into robot task sequences. The robot will plan the steps, send them through perception and action nodes, execute them, and use sensor feedback to check whether the task was completed correctly.
There are no cloud APIs and no internet dependency. Everything runs locally on your machine.
The architecture may sound futuristic, but the pattern is already proven in real robotics systems: structured instructions drive step-by-step actions, with feedback at every stage. The difference here is that the interface is natural language instead of a recipe file, inspection plan, or CSV.
The workshop is beginner-friendly for anyone with basic ROS 2 knowledge and some Python experience. We’ll explain the language model side from the ground up, so no prior ML experience is required.
By the end, you’ll have a complete working agentic robotics system that you can take home, adapt, and build on.
By the End You Walk Away With
• A working ROS 2 system that takes natural language instructions and converts them into real robot actions.
• Hands-on experience running a small language model locally, with no cloud dependency.
• Understanding of how to design prompts that produce structured, parseable task output instead of freeform text.
• A simple but functional task planner that maps language model output to ROS 2 action nodes.
• Experience grounding language commands in real sensor data from a camera and depth sensor.
• A feedback loop where the system uses perception to verify whether a task actually completed.
• Working knowledge of how to wire language models into ROS 2 nodes, topics, and services.
• A complete, modular codebase you can adapt for your own robot projects.
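To make the grounding step above concrete, here is a minimal sketch of matching a described object to perception output. It is an illustration only, not the workshop's code: the function name ground_object, the detection dictionary keys (label, confidence, position), and the word-overlap scoring are all assumptions.

```python
def ground_object(description, detections, min_confidence=0.5):
    """Pick the detection whose label shares the most words with the description.

    `detections` is a hypothetical list of dicts with "label", "confidence",
    and "position" (x, y, z in metres in the camera frame) keys.
    """
    wanted = set(description.lower().split())
    best, best_score = None, 0
    for det in detections:
        # Ignore detections the perception node is not confident about.
        if det["confidence"] < min_confidence:
            continue
        overlap = len(wanted & set(det["label"].lower().split()))
        if overlap > best_score:
            best, best_score = det, overlap
    return best  # None when nothing in view matches the description
```

In a real system, the returned position would still need to be transformed from the camera frame into the robot's frame before any motion is planned.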
Who Should Attend
• Robotics developers who want to add natural language interfaces to their ROS 2 projects.
• Students and hobbyists who have played with ROS 2 and want to explore how language models fit into the picture.
• ML and AI enthusiasts who are curious about how language models can interact with physical systems.
• Tech leads who want to understand the practical possibilities and limitations of language-driven robotics.
• Anyone who has seen the demos online and wants to understand how it actually works under the hood.
What You Will Learn
✅Run a small language model locally using open-source tools
✅Write robotics prompts that produce structured JSON task plans
✅Build a ROS 2 node that converts natural language into task commands
✅Break high-level instructions like "pick up the red cup and place it on the shelf" into executable steps
✅Integrate camera and depth sensor data into the workflow
✅Ground language in perception by matching objects to sensor input
✅Map task commands to robot actions or simulations
✅Use feedback to verify each step before continuing
✅Understand ROS 2 coordinate frames, TF trees, and common transform issues
✅Debug poor model outputs, perception failures, and stuck actions
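The "verify each step before continuing" idea from the list above can be sketched in a few lines. This is a hedged illustration under stated assumptions: run_plan, execute, verify, and max_retries are hypothetical names, and in a ROS 2 system execute and verify would typically wrap an action client and a perception check rather than plain functions.

```python
def run_plan(tasks, execute, verify, max_retries=1):
    """Run tasks in order, stopping at the first step that cannot be verified.

    `execute(task)` performs one step; `verify(task)` returns True when sensor
    feedback confirms the step succeeded. Both are supplied by the caller.
    """
    completed = []
    for task in tasks:
        for _attempt in range(max_retries + 1):
            execute(task)
            if verify(task):
                completed.append(task)
                break
        else:
            # Every attempt failed verification: report progress and stop.
            return False, completed
    return True, completed
```

Stopping at the first unverified step keeps the robot from acting on a world state it has not confirmed, for example trying to place a cup it never picked up.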
What You Will Need
• A modern laptop running Ubuntu (recommended), Windows (via WSL2), or macOS.
• ROS 2 Humble or later installed. We will send detailed setup instructions before the workshop so everyone is ready on day one.
• Basic proficiency in Python and a working understanding of ROS 2 (nodes, topics, publishers). We will explain everything else.
• No ML or language model experience required. We will walk through the language model setup step by step.
• No GPU required. The small models we use run fine on CPU. If you have a GPU, it will simply be faster.
Why Now
Language models are now small enough to run locally on a laptop and capable enough to produce structured outputs that robots can use. What felt like a research demo six months ago is now something you can build in an afternoon with the right tools and architecture.
Robotics companies working on autonomous vehicles, delivery robots, warehouse automation, inspection drones, and service robots are already exploring natural language interfaces. Engineers who can connect language models to real robotic systems, manage edge cases, and make the workflow reliable will be in high demand.
This workshop gives you that foundation.
Format & event details
- Live online, interactive build-along
- Date: Sat, 11th July
- Time: 9:00–13:30 EDT
- Location: Online (link after registration)
🎟️Reserve your seat now — Limited seats. Live support. Real builds.
*By signing up for this event, you agree to receive emails from Packt Publishing.
Lineup
Ashish Ghatge
Agenda
Open Networking
Setup
This setup section verifies that ROS 2 is installed correctly and that the workspace is configured and ready to use. Attendees then download and test the language model that will be used throughout the project. The section also includes a short introduction and Q&A to resolve setup issues before moving into the main robotics workflow.
Part 1 — The Language Brain
This section introduces language models in plain English: what they are, how they work, and why they are useful in robotics workflows. Attendees then set up a small open-source language model that runs locally on their machine. From there, the focus shifts to practical robot interaction, showing how to design prompts that return structured JSON output instead of free text. Attendees build a ROS 2 node that accepts a natural language instruction from the user, sends it to the model, and parses the response into a structured task sequence that the rest of the robot system can use. The section also covers how to handle bad or unexpected output, including cases where the model returns unparseable data. By the end of this milestone, attendees can type a natural language instruction and receive a clean sequence of robot tasks that can be passed into the robot control pipeline.
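As a rough illustration of the prompt-and-parse step described above, the sketch below builds a JSON-only prompt and validates whatever text the model returns. Everything named here is an assumption for illustration: the action names in ALLOWED_ACTIONS, the "tasks" schema, and the helper names build_prompt and parse_task_plan are not taken from the workshop material, and the call to the model itself is left out.

```python
import json

# Hypothetical action set; the workshop's real action names may differ.
ALLOWED_ACTIONS = {"locate", "pick", "place"}

# Hypothetical prompt template that asks the model for JSON only.
PROMPT_TEMPLATE = (
    "You are a robot task planner. Reply with JSON only, in the form "
    '{{"tasks": [{{"action": "...", "object": "..."}}]}}. '
    "Allowed actions: {actions}.\n"
    "Instruction: {instruction}"
)


def build_prompt(instruction):
    """Fill the template with the allowed actions and the user's instruction."""
    return PROMPT_TEMPLATE.format(
        actions=", ".join(sorted(ALLOWED_ACTIONS)), instruction=instruction
    )


def parse_task_plan(raw_output):
    """Return a list of task dicts, or None if the model output is unusable."""
    # Models often wrap JSON in extra text, so keep only the outermost braces.
    start, end = raw_output.find("{"), raw_output.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw_output[start : end + 1])
    except json.JSONDecodeError:
        return None
    tasks = data.get("tasks") if isinstance(data, dict) else None
    if not isinstance(tasks, list) or not tasks:
        return None
    for task in tasks:
        if not isinstance(task, dict):
            return None
        action = task.get("action")
        # Reject anything outside the known action set instead of guessing.
        if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
            return None
    return tasks
```

Returning None for any malformed or unknown plan lets the calling node decide whether to re-prompt the model or ask the user to rephrase, rather than passing a half-valid plan to the robot.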