Can a single reward model work zero-shot across robots, tasks, and environments?

We're excited to introduce Robometer, a general-purpose robotic reward model trained on 1M+ trajectories that enables:
- Online RL
- Offline RL
- Model-based RL
- Data retrieval + imitation learning
- Automatic failure detection

And more, all zero-shot across robots, tasks, and scenes.

Most existing reward models are trained only on successful or reward-labeled demonstrations, which limits their ability to distinguish successful attempts from failed ones. Robometer addresses this with trajectory comparisons, letting the reward model learn from reward-unlabeled and failed trajectories.

Robometer is a 4B-parameter video-language reward model that predicts:
- Dense reward (0-1 progress)
- Success
- Trajectory preferences (which attempt was better)

We train by pairing input trajectories of different expertise levels and tasks, and by using video rewind (a minimal sketch of the idea appears at the end of this post).

Robometer's ability to train on reward-unlabeled, failed robot trajectories lets us scale to our RBM-1M dataset:
- 21 embodiments: bimanual, humanoid, mobile, tabletop
- 1M+ trajectories
- 140k+ trajectories from mixed-expertise datasets

We evaluate Robometer both as a reward estimator and as a component for robot learning. On self-collected, unseen data from three universities, Robometer outperforms existing reward models. We further evaluate it in real-world experiments across four universities (USC, MIT, UTD, UW) on online RL, offline RL, model-based RL, data retrieval for imitation learning, and zero-shot failure detection. Across these settings, Robometer consistently improves robot learning performance compared to prior reward models.

We hope Robometer helps accelerate research on general-purpose robot learning. Try the model zero-shot in your own experiments, or fine-tune it on your own data!

Project: http://robometer.github.io
Paper: https://lnkd.in/gz7A78bD
Code: https://lnkd.in/gdNh2Qej

This project was co-led with Anthony Liang. Huge thanks to all our incredible collaborators + advisors: Jiahui Zhang, Minyoung Hwang, Abrar A., Sidhant Kaushik, Aditya Shah, Alex S. Huang, Luke Zettlemoyer, Dieter Fox, Yu Xiang, Anqi Li, Andreea Bobu, Abhishek Gupta, Stephen Tu, Erdem Bıyık, and Jesse Zhang.
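For readers who want intuition for the video-rewind trick mentioned above, here is a minimal, self-contained sketch of the general idea: playing a successful clip forward partway and then in reverse yields a trajectory that visibly regresses, which gives you a preference pair and dense 0-1 progress labels with no reward annotation at all. Everything here (function name, label scheme, array shapes) is an illustrative assumption, not Robometer's actual implementation; see the paper and code links above for the real recipe.

```python
import numpy as np

def make_rewind_pair(frames: np.ndarray, rewind_point: int):
    """Toy video-rewind augmentation (illustrative, not the paper's exact recipe).

    frames: (T, H, W, C) clip from a *successful* trajectory.
    Builds a 'rewound' clip that advances to rewind_point and then plays
    backward, so it approaches the goal and then regresses.
    """
    forward = frames[: rewind_point + 1]
    backward = frames[rewind_point - 1 :: -1]  # reverse back toward the start
    rewound = np.concatenate([forward, backward], axis=0)

    # Dense 0-1 progress labels: monotone for the original clip,
    # rising then falling for the rewound one.
    T = len(frames)
    progress_orig = np.linspace(0.0, 1.0, T)
    up = np.linspace(0.0, rewind_point / (T - 1), rewind_point + 1)
    progress_rewound = np.concatenate([up, up[-2::-1]])

    # preference = 0: the original (first) trajectory is the better attempt.
    return (frames, progress_orig), (rewound, progress_rewound), 0

# Usage with a dummy 50-frame clip:
clip = np.random.rand(50, 64, 64, 3)
(orig, p_orig), (rew, p_rew), pref = make_rewind_pair(clip, rewind_point=30)
```

A reward model trained on pairs like this has to score the rewound clip's second half lower than its first, which is exactly the kind of signal needed to separate failed attempts from successful ones at test time.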
