This wiki dives deep into the
world of visionAI, and it's easy to get lost in the woods. This is why
we want to share two lessons we learned while working on hundreds of visionAI
projects together with our users here at Hasty.ai.
Most ML research focuses on coming up with new model
architectures and fancy hyperparameters. However, when you go into the
real world, you're operating under completely different conditions. The
most drastic change is probably that data in the real world is never
as clean as in the lab. Consequently, when you're building a model for
production, you should really understand the interplay between your
model and your data and work on getting the right data instead of
chasing after what's SOTA.
Let's make this a bit more tangible: Andrew Ng and his team got stuck
at an accuracy of 76%. Then they split up the team for the next two
weeks: one group tried to improve the data, the other worked on
the model. The data team increased the model's performance to
93.1%, whereas the model team couldn't improve it at all [source]. Results like this are common.
In traditional software development, the word "agile" is so widely
used that it has become inflated and part of the typical buzzword bingo.
Almost everyone knows it, and many teams worldwide have implemented it.
In ML, however, most teams still follow a linear approach. They
collect data, annotate it for weeks if not months, and only then train
their first model. More often than not, the first model performs poorly,
and the team notices that they should have collected different data,
chosen another annotation strategy (e.g., masks instead of bounding
boxes), and so on. The result: 60% of ML projects get killed in the
proof-of-concept stage because a lot of resources have been invested while
the results are disappointing.
I don't want to sound like a mediocre business consultant stating the
obvious, but the answer to this is a more agile approach. Instead of
annotating the whole dataset first, label only a subset of the data and
train the first model early in the process.
Then, you can add more data, and once you have that part right, start
tweaking the initial model.
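To make the loop concrete, here is a minimal sketch of such an iterative process in plain Python. Everything in it is a hypothetical stand-in and not how Hasty.ai or any particular tool works: `make_pool` generates toy 1-D data instead of images, `train` picks a simple threshold instead of fitting a vision model, and taking the next batch from the pool stands in for human annotation. The point is only the shape of the process: label a small batch, train, evaluate, and decide whether to continue.

```python
import random


def make_pool(n, seed=0):
    """Hypothetical toy data: 1-D points drawn from two overlapping classes."""
    rng = random.Random(seed)
    pool = []
    for _ in range(n):
        label = rng.randint(0, 1)
        pool.append((rng.gauss(2.0 * label, 1.0), label))
    return pool


def train(labeled):
    """Toy stand-in for model training: pick a decision threshold."""
    xs0 = [x for x, y in labeled if y == 0]
    xs1 = [x for x, y in labeled if y == 1]
    if not xs0 or not xs1:
        # Only one class seen so far: fall back to the mean of all points.
        return sum(x for x, _ in labeled) / len(labeled)
    # Midpoint between the two class means.
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2


def accuracy(threshold, data):
    """Fraction of points for which 'x > threshold' matches the label."""
    return sum((x > threshold) == (y == 1) for x, y in data) / len(data)


def iterate(pool, val, batch_size=20, target=0.8, max_rounds=10):
    """Label a small batch, train, evaluate, repeat until good enough."""
    labeled, history = [], []
    for _ in range(max_rounds):
        batch, pool = pool[:batch_size], pool[batch_size:]
        if not batch:
            break  # nothing left to label
        labeled.extend(batch)  # stands in for annotating the batch
        threshold = train(labeled)  # train early, on what is labeled so far
        acc = accuracy(threshold, val)
        history.append((len(labeled), acc))
        if acc >= target:
            break  # stop labeling once the model is good enough
    return history


if __name__ == "__main__":
    for n_labeled, acc in iterate(make_pool(200, seed=1), make_pool(100, seed=2)):
        print(f"{n_labeled} labeled samples -> validation accuracy {acc:.2f}")
```

In a real project, the check after each round is where you would also ask whether the annotation strategy or the data itself needs to change, before investing in more labels.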
There are a lot of new tools coming up that aim to enable such a process. Hasty.ai is one of them.
Check out our blog to read more practical guides around visionAI in production. To list a few of many posts: