For the past decade, the autonomous vehicle industry has been laser-focused on one metric: annotation throughput. How many bounding boxes per hour? How fast can we label a million frames? The implicit assumption was that more labelled data would solve our problems.
But that assumption is breaking down.
Today's challenge isn't labelling faster—it's finding the right data to label in the first place.
As fleets grow and sensor data accumulates at petabyte scale, autonomy teams face a new reality: the vast majority of collected data is routine and redundant. Highway cruising in perfect weather doesn't teach your model anything new after the ten-thousandth example.
What matters are the edge cases buried in that mountain of data: the construction worker in a high-vis vest stepping between cones, the wheelchair user crossing at dusk, the delivery truck with non-standard markings. These rare scenarios—often representing less than 0.1% of collected data—are where models actually learn and improve.
The bottleneck has shifted from annotation capacity to data curation: the ability to rapidly identify, prioritise, and surface the specific scenarios that will move your model's performance forward.
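To make "identify and prioritise" concrete, here is a minimal sketch of one common curation signal: scoring unlabelled frames by how sparse their neighbourhood is in embedding space, so the long tail surfaces first. The embedding source, the choice of k, and the 0.1% budget are illustrative assumptions, not a prescribed stack.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def rarity_scores(embeddings: np.ndarray, k: int = 10) -> np.ndarray:
    """Score each frame by the mean distance to its k nearest neighbours.

    Frames in sparse regions of embedding space score high; those are
    the candidate edge cases. An illustrative heuristic only; real
    pipelines typically combine several ranking signals.
    """
    nn = NearestNeighbors(n_neighbors=k + 1).fit(embeddings)
    distances, _ = nn.kneighbors(embeddings)
    return distances[:, 1:].mean(axis=1)  # drop column 0: distance to self

# Hypothetical usage: embed frames with any visual backbone, then
# surface only the sparsest ~0.1% for review instead of labelling all.
embeddings = np.random.rand(10_000, 512)   # stand-in for real frame embeddings
scores = rarity_scores(embeddings)
budget = max(1, int(len(scores) * 0.001))  # the rare tail: ~0.1% of the data
candidates = np.argsort(scores)[-budget:]  # highest-rarity frames
```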
Many teams today rely on automatic ranking algorithms to filter data before annotation. These algorithms are useful but noisy, as they mix genuinely valuable edge cases with false positives and redundant samples. When teams send top-ranked data directly to annotation without validation, they end up paying full annotation prices for low-value data.
The result? Wasted budget, slower iteration cycles, and models that don't improve where it matters most.
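The economics are easy to sketch. Assuming, purely for illustration, that only 30% of top-ranked samples are genuinely valuable, that full annotation costs $1.00 per frame, and that a quick accept/reject triage judgment costs $0.05, the back-of-the-envelope below shows why skipping validation is expensive.

```python
# Back-of-the-envelope economics of annotating ranked data directly
# versus inserting a cheap triage pass first. All numbers are
# illustrative assumptions, not industry figures.

CANDIDATES = 1_000_000      # top-ranked frames from the automatic ranker
RANKER_PRECISION = 0.30     # fraction that are genuinely valuable (assumed)
ANNOTATION_COST = 1.00      # full annotation, per frame (assumed, USD)
TRIAGE_COST = 0.05          # quick accept/reject verdict, per frame (assumed)

# Option A: send everything straight to annotation.
direct_cost = CANDIDATES * ANNOTATION_COST
direct_waste = CANDIDATES * (1 - RANKER_PRECISION) * ANNOTATION_COST

# Option B: triage first, then annotate only confirmed samples
# (assumes triage verdicts are reliable).
triaged_cost = (CANDIDATES * TRIAGE_COST
                + CANDIDATES * RANKER_PRECISION * ANNOTATION_COST)

print(f"direct:  ${direct_cost:,.0f} (${direct_waste:,.0f} wasted)")
print(f"triaged: ${triaged_cost:,.0f}")
# direct:  $1,000,000 ($700,000 wasted)
# triaged: $350,000
```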
The solution isn't better ranking algorithms alone. It's treating curation itself as a scalable annotation problem: just as we built industrial-scale workflows to label millions of bounding boxes, we now need workflows to validate and triage millions of candidate scenarios.
This means making curation measurable, repeatable, and cost-efficient, so that annotation resources are spent only on confirmed high-value samples. The payoff is a dramatic improvement in both efficiency and model relevance.
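What "measurable and repeatable" might look like in practice, as a hedged sketch: cheap triage verdicts double as an audit of the ranker itself. The `Candidate` structure and field names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Candidate:
    frame_id: str
    rank_score: float                 # from the automatic ranking algorithm
    verdict: Optional[str] = None     # "valuable" or "redundant", set by triage

def curation_report(batch: list[Candidate]) -> dict:
    """Turn cheap triage verdicts into the two numbers that matter:
    how precise the ranker is, and which frames earn full annotation."""
    judged = [c for c in batch if c.verdict is not None]
    confirmed = [c for c in judged if c.verdict == "valuable"]
    return {
        "ranker_precision": len(confirmed) / max(len(judged), 1),
        "send_to_annotation": [c.frame_id for c in confirmed],
    }

# Hypothetical batch: three quick verdicts measure the ranker and gate spend.
batch = [
    Candidate("f001", 0.97, "valuable"),
    Candidate("f002", 0.95, "redundant"),
    Candidate("f003", 0.94, "valuable"),
]
report = curation_report(batch)  # precision ~0.67; two frames proceed
```

Tracking ranker precision batch over batch turns curation from a one-off filter into a feedback loop: when precision drops, you fix the ranker before spending on annotation.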
This transition from volume-first to value-first data operations marks a significant milestone in the industry's maturity. Teams that master data curation will stretch their annotation budgets further, iterate faster, and improve their models where it matters most.
As foundation models and end-to-end learning mature, the role of human-in-the-loop will continue to evolve—from drawing boxes to teaching judgment, from labelling everything to curating what matters.
The question isn't whether your team can annotate faster. It's whether you can find the right data to annotate in the first place.
That's where tomorrow's competitive advantage lies.