Parallel Domain: Why Synthetic Data is Critical to the Future of ML
A driver assistance system stops a car when a child unexpectedly runs into the road. A drone drops a package on the doorstep of a customer. A robot picks the shirt you ordered online off a warehouse shelf and packs it for shipping.
How does the car recognize the child? Why does the drone drop the package on your porch and not in your swimming pool? How does the robot select the exact shirt you ordered?
These feats are all made possible by computer vision machine learning (ML) models, and the fuel for these models is data.
Machine learning models ingest tremendous amounts of data, which enables them to replicate human performance. Models need sufficient high-quality data to unlock their potential.
At March Capital, we are excited to announce our investment in Parallel Domain, a company that provides ML developers with an on-demand data generation platform to train and test computer vision models 180x faster than today’s methods.
Obtaining the real-world data needed to train computer vision models is a significant obstacle to their successful deployment. Generating the necessary volume of high-quality training data is capital- and time-intensive: enterprise ML projects allocate up to 50% of their budget just to data acquisition.
Once the data is collected, sorting and labeling it is time-consuming, costly, and error-prone. Real data is also often unbalanced and biased, with too few examples of the objects and situations the company needs, which leads to poor ML performance. As in the example of a child unexpectedly running into the road, the stakes of that performance can be tremendously high, sometimes even life or death.
Parallel Domain solves this data problem by generating synthetic sensor data (camera, lidar, and radar) at scale and without human intervention in an elastic data cloud, to supplement or even replace real data.
Gartner predicts that by 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated. The potential synthetic data market is estimated to be worth $57B by 2027.
Parallel Domain is well positioned to be a market leader in synthetic data for computer vision due to its proprietary procedural generation technology, ever-expanding asset catalog, strong value proposition to customers, and experienced team.
The company is led by CEO Kevin McNamara. Kevin is supported by an incredibly talented leadership team that has pioneered synthetic data, simulation, and computer graphics at Apple, Tesla, Pixar, EA, NVIDIA, and other world-class organizations.
March Capital is impressed with the company’s strong customer value proposition and traction with leading players in the mobility space, such as Toyota Research Institute, Google, and Continental. We are thrilled to lead the company’s $30 million Series B financing and look forward to working alongside the Parallel Domain team to accelerate the company’s growth and success.
To learn more about Parallel Domain, visit paralleldomain.com.