30/1 2023

Updating your annotated data made easy - with our new Dataset Refinement tool

You probably already have a large dataset, or you are in the process of creating one. Over the years, you have learned a lot about what data you need to succeed. In the beginning, you might have focused on getting some data through the pipes and training your models.

As your model improved, though, you realized that your data needed to improve. Software and machine learning models are complex systems that need an iterative way of working, and at Kognic we truly believe that you cannot specify your future data needs in advance: neither what you annotate, nor how well your labels need to be annotated. This is why we believe that updating data should be a key component in modern data engines.

There are multiple reasons why you would want to update annotated data. There could be annotation mistakes, such as missing objects or a mislabeled object property. Most likely, training models and acquiring more data has taught you a lot, and your annotations need to adapt to what you have learned. For example, you might want to add a tag to firetrucks, or start annotating objects that were not supposed to be annotated previously.

The brute-force method of updating data, meaning manually going through all scenes, is infeasible in the long run, especially since you are most likely updating your data in rare scenarios, like tagging a vehicle as an ambulance. We saw a clear need for a tool that would make it easy to fix mistakes and to update huge volumes of data with new information. With this knowledge at hand, we have created a tool to efficiently update your data in an iterative way.

Up next, let us introduce our Data Refinement tool. ✨

How it works

Refining your data is as effortless as it can be on our platform!

By default, all your annotated data will be available in the refinement tool, ready to be browsed in a gallery that has multiple filter possibilities. You can, for example, search for all your firetrucks and have them visualized in images and in point clouds. The superpower of our tools comes from you, as a customer, uploading your model predictions into our platform. Once that has been done, you can browse your model predictions alongside your annotations in the gallery view. By sorting the examples on the model's prediction confidence, you can easily see examples where the model is certain about one thing, whereas the annotation says the opposite. If your model is good, at least a few top-ranked examples are actual annotation mistakes. When you have spotted a few samples, you can easily adjust the labels right away or send them back for correction in the annotation tool.
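To make the sorting idea concrete, here is a minimal sketch in Python of how such a ranking could work. It is an illustration under our own assumptions, not the implementation behind the gallery: the MatchedObject record and its field names are hypothetical, and we assume each prediction has already been paired with its annotated counterpart.

```python
from dataclasses import dataclass


@dataclass
class MatchedObject:
    """A prediction already paired with its annotation (hypothetical schema)."""
    object_id: str
    annotated_class: str
    predicted_class: str
    confidence: float  # model confidence in predicted_class, between 0 and 1


def rank_disagreements(objects):
    """Keep objects where prediction and annotation differ, most confident first."""
    disagreements = [o for o in objects if o.predicted_class != o.annotated_class]
    return sorted(disagreements, key=lambda o: o.confidence, reverse=True)


# Toy data, made up for this example.
objects = [
    MatchedObject("a", "car", "car", 0.99),
    MatchedObject("b", "car", "firetruck", 0.97),
    MatchedObject("c", "truck", "car", 0.55),
    MatchedObject("d", "firetruck", "truck", 0.81),
]

ranked = rank_disagreements(objects)
for obj in ranked:
    print(obj.object_id, obj.annotated_class, obj.predicted_class, obj.confidence)
```

The objects at the top of this list are the ones worth inspecting first: the model is confident and the annotation disagrees, so either the label or the model is wrong.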

As a common example, you might want to browse potentially missed objects and add them to your dataset. In the app, you would select that you want to see objects that were found by your model but do not exist in the annotations. You would then see these objects ranked by the model's confidence that each one is an object, highest first. You can then easily select the objects you want to have annotated, and send them to the annotation app so they are included in the dataset. Once these objects have been annotated, you can retrain your model (which hopefully is now a bit better at detecting objects) and upload new and better predictions to see if you can find other missing objects. This ensures that you spend just the right amount of time and resources on fixing your dataset.
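The search for missed objects described above can be sketched in a few lines of Python. This is a simplified illustration, not how our platform does it internally: we assume 2D boxes given as (x1, y1, x2, y2), a prediction counts as "missing from the annotations" when it overlaps no annotated box by at least an IoU threshold, and the function names and the 0.5 threshold are choices made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def find_missed_candidates(predictions, annotations, iou_threshold=0.5):
    """Return predictions that match no annotated box, highest confidence first."""
    unmatched = [
        p for p in predictions
        if all(iou(p["box"], ann) < iou_threshold for ann in annotations)
    ]
    return sorted(unmatched, key=lambda p: p["confidence"], reverse=True)


# Toy scene, made up for this example: one annotated box, three predictions.
annotations = [(0, 0, 10, 10)]
predictions = [
    {"id": "p1", "box": (1, 1, 10, 10), "confidence": 0.90},    # overlaps the annotation
    {"id": "p2", "box": (20, 20, 30, 30), "confidence": 0.60},  # no annotation nearby
    {"id": "p3", "box": (50, 50, 60, 60), "confidence": 0.95},  # no annotation nearby
]

candidates = find_missed_candidates(predictions, annotations)
print([p["id"] for p in candidates])
```

After the selected candidates have been annotated and the model retrained, running the same search on the new predictions gives the next round of candidates, which is the iterative loop described above.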

In this way, the Data Refinement tool lets you fix mistakes immediately in clear and simple cases (e.g. deleting an object or changing a property that has no dependencies in other parts of the annotation), or send them back to us at Kognic for correction in more complicated situations (e.g. adding a new object). Continuing with our last example, you would build a collection of data, known as a “chunk”, by selecting some obvious mistakes. After that, you could name the chunk “missed cars” and write a description such as “These cars should be annotated”. From the chunk table, you could then send the chunk to us at Kognic for correction, with a message specifying that some objects are missing. And then, we would fix them!

What does Data Refinement include?

✅ A flexible object-centric image grid that allows for quick inspection of data.

✅ Viewing objects in-context, which allows you to also see objects in annotation tasks through view-links.

✅ Sorting objects, which allows you to sort objects based on loss / model confidence.

✅ Filtering, which allows you to find relevant data by filtering predicted and/or annotated objects using object filters.

✅ Creating chunks, which allows you to group data that is to be reviewed, corrected or investigated further.

✅ Sending to correction through the UI, which provides a simplified process to get from identified error to corrected annotation.

✅ Support for 3D cuboids in LiDAR and 2D images.

This is how we’re helping customers that are already using our tool

It makes us glad that we’re already helping some of our customers fix mistakes, and improve efficiency and model performance.

With the help of the gallery, which can sort objects by descending confidence and collect interesting objects in chunks, our customers have selected the most interesting objects and pressed the magic button “Send to Kognic for correction”. Correction tasks were then created, where the potentially missing boxes were indicated at the corresponding locations in the image. Our Data Delivery Team then allocated annotators to these tasks and provided the relevant guidelines. And voilà! Tasks corrected, and mission accomplished.

Try it now, improve efficiency and model performance today

Fixing obvious mistakes and updating any previously annotated data is not only possible, but also easy thanks to all the options that our Data Refinement tool supports.

Would you like to see the above-mentioned functionalities in action, and learn about everything our tool can do for you and your teams? Click on the “Schedule a demo” button in the top menu of our webpage. We would be very happy to show you!

Written by

Rocío Martínez Climent

Digital Content Producer

Isak Hjortgren

Product Area Lead

Jonathan Freer

Data Delivery Coordinator