Creating data labeling guidelines for safe self-driving vehicles

The perception subsystem of an Autonomous Vehicle (AV) must enable the vehicle as a whole to drive safely. How? Most perception systems developed today use deep learning to perceive, recognize, and classify the surrounding environment based on sensor data from sources such as lidars, cameras, and radars. When training these systems, it is essential to have reliable ground truth with as few mistakes as possible and, even more importantly, ground truth without bias. Why? Simple: a small number of (random) errors in the ground truth data might not make much of a difference, but many errors, especially systematic ones, will.

Creating a dataset for supervised machine learning

As many know, the behavior of a Machine Learning (ML) function is essentially determined by the data it is trained on. For supervised ML algorithms, that data is defined by the labels assigned to it, which are in turn specified by the labeling guideline. Given this relationship between the labeling guideline and the final ML function, it follows that high-quality data requires a consistent and accurate guideline. In other words: if the labeling guideline is inconsistent, lacks detail, or omits descriptions of important objects or classes, the dataset will most likely be inconsistent too, with many systematic errors. Moreover, since labeling is often performed by a distributed workforce, the guideline needs to be understood and interpreted in the same way by potentially hundreds of people, often with different cultural backgrounds and education levels. This places additional requirements on the readability and consistency of the guideline: everyone should be able to interpret it in the same way.

Similarities between ML projects and traditional software projects

Comparing the life cycle of an ML project with that of a traditional software project reveals some interesting parallels.

Life cycle of an ML project vs traditional software project

It is common knowledge that writing necessary, specific, understandable, accurate, feasible, and testable requirements is important for the success of a software project. It is much cheaper to find mistakes in a requirement review than in production! There is an entire field within systems engineering devoted to writing and managing requirements, as well as a large number of tools that help you version control, trace, and identify them.

In a machine learning project, as shown in the picture above, the labeling guideline serves a similar purpose to the software design and requirement specification in a traditional software project. However, there is rarely a similarly rigorous process for assuring the quality of data labeling guidelines; in ML projects, the focus tends to be on designing the neural network instead.

Our approach

At Kognic we believe that labeling guidelines should be treated in the same way as software requirements. They, too, should be necessary, specific, understandable (even more so, since many more people typically need to understand the instructions), accurate, feasible, and testable. As in traditional software, it is much easier and cheaper to find mistakes during an early review stage than during full production, so it pays off to put significant effort into reviewing your guideline in a machine learning project as well. In addition, your instructions both can and should be tested. One such test is to let a number of annotators independently annotate a selection of inputs and then evaluate how well their annotations agree, to ensure that the guideline is interpreted consistently; a sketch of such an agreement check follows below.
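As a minimal illustration of such an agreement test, the sketch below computes Fleiss' kappa, a standard chance-corrected measure of agreement between multiple annotators, over per-object class labels. The object IDs, class names, and the `labels` structure are hypothetical examples for this post, not part of any specific labeling export format or Kognic tooling.

```python
from collections import Counter

# Hypothetical annotations: for each labeled object, the class chosen
# independently by each of four annotators. All names are made up for
# illustration; a real project would load this from annotation exports.
labels = {
    "frame_001_obj_3": ["car", "car", "car", "van"],
    "frame_001_obj_7": ["pedestrian", "pedestrian", "pedestrian", "pedestrian"],
    "frame_002_obj_1": ["cyclist", "pedestrian", "cyclist", "cyclist"],
    "frame_003_obj_2": ["van", "car", "van", "van"],
}


def fleiss_kappa(labels: dict[str, list[str]]) -> float:
    """Fleiss' kappa: chance-corrected agreement between multiple annotators.

    Assumes every item was labeled by the same number of annotators.
    """
    items = list(labels.values())
    n_raters = len(items[0])

    per_item_agreement = []
    category_totals = Counter()
    for item in items:
        counts = Counter(item)
        category_totals.update(counts)
        # Fraction of annotator pairs that agree on this item.
        p_i = (sum(c * c for c in counts.values()) - n_raters) / (n_raters * (n_raters - 1))
        per_item_agreement.append(p_i)

    p_bar = sum(per_item_agreement) / len(items)      # observed agreement
    total_labels = len(items) * n_raters
    p_e = sum((count / total_labels) ** 2 for count in category_totals.values())  # chance agreement
    return (p_bar - p_e) / (1 - p_e)


if __name__ == "__main__":
    print(f"Fleiss' kappa: {fleiss_kappa(labels):.2f}")
```

A kappa close to 1.0 suggests the guideline is being interpreted consistently, whereas a low value is a signal to revisit the instructions (or the selected inputs) before scaling up to full production.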

As experts in safe perception, we can supply you with the tools and knowledge to create and version control your instructions and, if needed, trace them to your test cases and their results. All this lets you create a high-quality guideline that captures all function needs and leaves no room for ambiguity. Further, you can even trust us with the analytics: through our Data Quality Analytics tool, we evaluate the quality of the annotated data in your dataset and connect the results back to your labeling guideline.

Do you want to know more? Watch my talk “It all starts with a consistent labeling guideline” held at the Gothenburg AI Alliance Conference to discover our way of working.