CVPR 2025 Highlights: Depth Estimation, Semantic Segmentation, and Model Refinement
This year, Keymakr CEO Michael Abramov once again attended CVPR 2025 — one of the most engaging conferences in the field of computer vision. He shares key insights and takeaways from the event: from the rise of depth estimation technologies and persistent segmentation challenges to emerging trends in generative data and the security of autonomous transport systems.
3D and Depth
Many are discussing how CVPR 2025 differs from previous conferences. Honestly, there are no fundamental differences. Everything remains at the same high level: new topics, striking innovations in computer vision, fresh ideas, and impressive demos. It’s still one of the world’s main stages for technological breakthroughs. The only real difference lies in which ideas took center stage this year.
One of the most noticeable trends was the sharp rise in activity around 3D. But it's not so much about 3D visualization in the traditional sense; rather, it’s about technologies related to perception, commonly called “depth.” I’ve never seen so many attempts to work specifically with depth in imaging.
A key technical focus was on stereoscopic and monocular vision. To illustrate: stereoscopic vision is how humans determine the distance to an object using two eyes. Our eyes are spaced apart and look at the same object from slightly different angles. The brain, without knowing trigonometry, nonetheless calculates the distance based on those angles, essentially using the geometry of a triangle.
Camera-based stereoscopic systems work the same way: two lenses allow the system to "understand" how far away an object is. However, some of the solutions presented at the conference were more unconventional. For example, one development used a single camera, but split the signal internally into two slightly offset streams. This allows for similar depth calculations using just one lens. The demo of this camera was genuinely impressive.
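The triangulation described above reduces to a simple formula for a rectified stereo pair: depth Z = f · B / d, where f is the focal length in pixels, B is the baseline between the two lenses, and d is the disparity, i.e. how far the object shifts horizontally between the two views. A minimal sketch, with made-up camera parameters:

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth Z = f * B / d for a rectified stereo pair.

    focal_px     -- focal length in pixels
    baseline_m   -- distance between the two lenses (meters)
    disparity_px -- horizontal shift of the object between the two views (pixels)
    """
    if disparity_px <= 0:
        raise ValueError("object must be visible in both views with positive disparity")
    return focal_px * baseline_m / disparity_px

# A nearby object shifts more between the views than a distant one
# (all numbers here are illustrative, not from any presented system):
near = depth_from_disparity(focal_px=700, baseline_m=0.12, disparity_px=42)  # 2.0 m
far = depth_from_disparity(focal_px=700, baseline_m=0.12, disparity_px=7)    # 12.0 m
```

The single-camera system with two offset internal streams can reuse the same formula, just with a much smaller effective baseline.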
Overall, depth became a major trend. But alongside it, there remains a steady and growing interest in segmentation, which comes as no surprise.
Segmentation and the “Last Mile”
Segmentation remains one of the central challenges in computer vision. It is the process of dividing an image into distinct logical elements, such as a person, a car, a tree, a traffic light, road signs, and so on.
Segmentation is essential in any scenario involving autonomous navigation, whether it's a self-driving car or a drone. It enables the system to make decisions: turn or stop, switch on headlights or a signal, keep moving, or slow down.
Today’s computer vision systems are increasingly facing tasks that until recently seemed like science fiction. In the past, detecting an object against a green background was considered an achievement. Now, that’s just the starting point. After all, the real world isn’t a studio — it’s streets, forests, crowds of people, and complex scenes where neural networks must recognize even the tiniest details.
This is where the annotation work begins: every leaf, flower, and detail in the image must be manually labeled. It’s hard, expensive, and requires enormous effort.
This challenge was actively discussed at the conference. It’s even referred to as the “last mile problem.” The first part of the work, the so-called 90%, can be completed quickly. But the final 10%, where precision is critical, consumes a disproportionate amount of time and resources. It's like writing a book: the draft may be done in a month, but the final editing can take a few months.
This is exactly what a large community of researchers is focused on today: refinement, polishing, and squeezing the maximum performance out of models. So rather than huge breakthroughs, it comes down to meticulous engineering.
Autonomous transport, robotrucks, and cyber threats
The automotive sector had a confident presence at the conference, especially in robotic transportation and autonomous freight systems.
For example, Aurora, a company showcased at CVPR 2025, has already deployed hundreds of fully autonomous trucks on Texas roads. No drivers. No escorts. These vehicles are transporting goods between Houston and Austin, and this is no longer a pilot project, but real logistics in action.
Many experts believe that robotrucks are an even more significant breakthrough than robotaxis. Freight transport is the backbone of the global supply chain, and any improvements in this sector have wide-reaching economic impacts.
However, these advancements come with new threats. Autonomous trucks can be hacked. Imagine: in the past, robbing a train required a gang of horseback riders. Now, all it takes is a skilled hacker. Breach the operating system, change the route, redirect the truck to a different warehouse — and just like that, the cargo is gone.
That’s why, alongside the development of autonomous transportation, a new need is emerging: cybersecurity for autonomous systems. This will become a new industry, with new startups and new challenges.
Generative data and model self-correction
Among the scientific advancements, data generation technologies stood out in particular. One team presented a platform that autonomously analyzes your dataset and identifies missing elements. For instance, say you’re training a model to distinguish basil from weeds. The platform assesses whether your sample set contains enough high-quality images, and if not, it generates the missing scenes itself.
It then fine-tunes the model and checks: has the accuracy improved? If so, the generation was successful. This approach turns generative AI from a mere image creation tool into a strategic method for enhancing models by filling in data gaps.
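The loop described above can be sketched roughly as follows. This is a toy sketch of the idea, not the presented platform's API; the helper functions for gap analysis, generation, fine-tuning, and evaluation are hypothetical stand-ins passed in by the caller:

```python
def refine_with_generated_data(dataset, model, accuracy_fn, gap_fn, gen_fn, tune_fn,
                               min_gain=0.0):
    """Generate samples for under-covered classes, fine-tune, and keep the
    new model only if held-out accuracy measurably improves.

    accuracy_fn -- evaluates a model on a held-out set
    gap_fn      -- inspects the dataset, e.g. returns {"basil": 12} if 12 more
                   basil images are needed (hypothetical helper)
    gen_fn      -- synthesizes the missing scenes (hypothetical helper)
    tune_fn     -- fine-tunes the model on the augmented dataset
    """
    baseline = accuracy_fn(model)
    gaps = gap_fn(dataset)
    if not gaps:
        return model, baseline          # coverage is already sufficient
    dataset = dataset + gen_fn(gaps)    # fill the gaps with generated data
    candidate = tune_fn(model, dataset)
    gained = accuracy_fn(candidate)
    # Accept the generated data only if it actually helped.
    return (candidate, gained) if gained > baseline + min_gain else (model, baseline)
```

The key design choice is the final check: generation is treated as successful only when it moves the evaluation metric, which keeps synthetic data from silently degrading the model.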
This is especially relevant when the cost of annotation and collecting new images becomes prohibitively high.
What’s next?
CVPR 2025 once again confirmed: computer vision has become a mature industry. Revolutionary breakthroughs are giving way to engineering refinements, fine-tuning, vulnerability protection, and improving model robustness. And yet, this doesn’t make the conference any less exciting.
A world where trucks drive themselves and cameras see through foliage is no longer science fiction. It’s already here. All that remains is to complete the last mile.