Unlocking New Capabilities with Unstructured Data through Generative AI and Computer Vision
In 1958, the New York Times featured an article about an “electronic computer” that would soon “be able to walk, talk, see, write, reproduce itself and be conscious of its existence.”
Now, in 2023, we are still far from building a conscious computer, but we’ve certainly achieved computer intelligence that can take us beyond the chihuahua / muffin memes. Computer Vision, a technology first brought to public attention by sci-fi media and internet memes, is now critical to the practical application of machine intelligence.
A Brief History of Computer Vision
Computer Vision is where technology meets human-like visual perception. The field emerged in the 1960s when technologists combined artificial intelligence (AI) and image processing to enable machines to understand and interpret visual data. Today, Computer Vision technology drives innovation across industries, from the software supporting medical imaging and augmented reality to the facial and object recognition capabilities of Amazon’s smart shopping carts or Alphabet’s Waymo autonomous vehicles.
Embraced by tech giants and startups alike, Computer Vision is paving the way for unprecedented advancements with unstructured data. At March Capital, we’re excited by Computer Vision as a category and especially by the enabling technologies and infrastructure supporting it.
How Generative AI Unlocks New Capabilities Within Computer Vision
The recent paradigm shift towards smart technologies powered by generative AI is only increasing the potential capabilities and value creation from Computer Vision.
Generative AI enhances traditional AI methods for anomaly detection and image restoration, two critical aspects of Computer Vision that have historically been highly challenging. Anomaly detection involves identifying rare or abnormal instances in a dataset, which is crucial in applications like defect detection in manufacturing or medical diagnostics. A generative model trained on normal data learns its underlying patterns, so inputs that the model explains poorly can be flagged as anomalies. In image restoration, the same learned patterns are used to restore and enhance degraded images.
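To make the anomaly detection idea concrete, here is a minimal sketch in plain Python. It assumes a very simple generative model (an independent Gaussian per pixel, fitted only on normal samples) standing in for the deep generative models used in practice. The function names, the toy "gray patch" data, and the threshold rule are all illustrative assumptions, not any company's actual method.

```python
import random
import statistics

def fit_normal_model(samples):
    """Learn a per-feature mean and standard deviation from normal samples."""
    columns = list(zip(*samples))
    means = [statistics.fmean(c) for c in columns]
    stds = [max(statistics.pstdev(c), 1e-6) for c in columns]
    return means, stds

def anomaly_score(model, x):
    """Average squared z-score: how poorly the learned model explains x."""
    means, stds = model
    return sum(((v - m) / s) ** 2 for v, m, s in zip(x, means, stds)) / len(x)

def make_normal_sample(rng, size=16):
    # Hypothetical "normal" part: a flat gray patch with slight sensor noise.
    return [0.5 + rng.gauss(0, 0.02) for _ in range(size)]

rng = random.Random(0)
train = [make_normal_sample(rng) for _ in range(200)]
model = fit_normal_model(train)

# Assumed threshold rule: twice the worst score seen on normal training data.
threshold = 2.0 * max(anomaly_score(model, x) for x in train)

good = make_normal_sample(rng)
defect = make_normal_sample(rng)
defect[5] = 0.95  # a bright spot, standing in for a scratch or defect

is_anomaly_good = anomaly_score(model, good) > threshold
is_anomaly_defect = anomaly_score(model, defect) > threshold
```

The model never sees a defect during training. It only learns what normal looks like, which is why this family of approaches suits settings where abnormal examples are rare.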
Another significant contribution of generative AI to the field is its ability to synthesize and augment unstructured data. Traditionally, generating realistic and diverse visual content from scratch or with limited data was a daunting task. However, Generative Adversarial Networks (GANs) and other generative model approaches have revolutionized this process, enabling the creation of high-quality images, videos, and even 3D objects. This breakthrough has far-reaching implications across industries, from generating lifelike virtual environments for gaming and simulation to enhancing data augmentation techniques for more robust training of Computer Vision models.
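For context, the sketch below shows classic, hand-written data augmentation: the baseline that generative approaches extend by synthesizing entirely new samples rather than perturbing existing ones. It is a toy illustration in plain Python on a tiny grayscale image stored as nested lists. The function names and parameter ranges are assumptions chosen for the example.

```python
import random

def horizontal_flip(image):
    """Mirror each row of the image left to right."""
    return [row[::-1] for row in image]

def adjust_brightness(image, delta):
    """Shift every pixel by delta, clamped to the [0, 1] range."""
    return [[min(1.0, max(0.0, p + delta)) for p in row] for row in image]

def add_noise(image, rng, sigma=0.05):
    """Add Gaussian noise to every pixel, clamped to the [0, 1] range."""
    return [[min(1.0, max(0.0, p + rng.gauss(0, sigma))) for p in row]
            for row in image]

def augment(image, rng):
    """Produce one randomized variant of the input image."""
    out = image
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    out = adjust_brightness(out, rng.uniform(-0.2, 0.2))
    return add_noise(out, rng)

rng = random.Random(42)
image = [[0.0, 0.2, 0.9],
         [0.1, 0.5, 1.0]]
variants = [augment(image, rng) for _ in range(4)]
```

Each variant keeps the original label, so one labeled image becomes several training examples. A generative model goes a step further by producing scenes that were never captured at all.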
The application of generative AI to these Computer Vision problems opens up exciting possibilities – from bringing new life to historical archives to enhancing image quality for surveillance, medical imaging, virtual worlds, and more. Two March Capital portfolio companies, SparkCognition and Parallel Domain, have been at the forefront of innovation in this category, employing the techniques mentioned above for years.
How SparkCognition is Leveraging Generative AI and Computer Vision
Founded in 2013, SparkCognition is a trailblazer in solving critical problems for the world’s largest and most impactful companies. It leverages computer vision, machine learning, natural language processing, and generative AI technologies to create an intelligent fabric around existing infrastructure, making physical assets more reliable, efficient, and secure.
For example, SparkCognition’s technology can predict turbine failures for renewable energy operators with greater than 90% accuracy as far as 180 days in advance. It can also analyze video data in real time to detect everything from trip hazards in a warehouse to weapons and other safety threats in schools, alerting key stakeholders immediately in scenarios where every second counts. Since inception, SparkCognition has helped global companies such as Boeing, Chevron, Airbus, Mondelez, Saudi Aramco, and countless others turn their physical machines into high-performing intelligent assets.
SparkCognition’s technical prowess was inspired by its Founder and CEO, Amir Husain, who is not only a serial entrepreneur and tech pioneer, but also an acclaimed author and AI thought leader. Over time, Amir’s leadership and track record helped him build a deep bench of technical team members, including Sridhar Sudarsan, current CTO of SparkCognition and former CTO of IBM Watson. The team includes over 50 PhDs, who have accumulated over 200 IP assets and patents for SparkCognition’s portfolio. With its deeply technical DNA, SparkCognition remains at the forefront of cutting-edge technology, consistently pushing boundaries and setting new standards in innovation.
Its most recent game-changing advancement in generative AI was announced in May. In collaboration with Shell, SparkCognition is working to accelerate the pace of subsurface imaging and oil and gas exploration. Using deep learning, they have been able to generate reliable seismic images with as little as 1% of the traditional data required, while preserving seismic image quality. The approach augments human capabilities and drives previously unattainable outcomes, leading to substantial workflow improvements, results in days rather than months, significant cost savings, and a market advantage for SparkCognition. SparkCognition has filed seven patents on this work and is working toward extending its collaboration with Shell to tackle many more challenges.
This project highlights the type and scale of problem that SparkCognition’s generative AI capabilities are built to solve, and its broader AI platform supports many more use cases. Stay tuned for their next groundbreaking innovation.
How Parallel Domain Enhances Computer Vision Infrastructure with Generative AI
Parallel Domain, a leader in the computer vision infrastructure and enablement space, is also using generative AI and Computer Vision to offer substantial workflow acceleration and cost savings for its customers.
Parallel Domain is the leading platform for unstructured synthetic data generation (i.e., the tech that builds the data behind our virtual worlds and simulation engines) to enable better and faster development of perception systems.
After leading computer graphics and systems simulation projects at pioneering companies like Apple, Microsoft Game Studios, and Pixar Animation Studios, Founder and CEO Kevin McNamara was inspired to develop a synthetic data platform to help computer vision and perception developers tackle the hardest problems in obtaining the data they need to train and test models for the real world. The platform accelerates the data acquisition, training, and testing cycle for any computer vision or perception model by 180x. World-class mobility companies such as Toyota Research Institute, Google, and Continental are all beneficiaries of the Parallel Domain platform.
The latest product from Parallel Domain is Data Lab, a self-serve Python API that enables users to generate exactly the data they need, when they need it. In addition to Data Lab, Parallel Domain recently introduced Reactor, which combines generative AI and 3D simulation within its platform. This integration produces highly detailed and realistic synthetic data: generative AI diversifies scenarios and objects, while 3D simulation adds physical accuracy for robust AI model training. Notably, the data Reactor generates is fully annotated for machine learning workflows, removing the need for manual labeling.
Reactor allows data generation through Python and natural language commands, empowering users to fill their datasets with diversity and realism with just a couple of sentences or lines of code, thereby redefining synthetic data creation. This capability is already gaining traction among Parallel Domain’s customers, showing strong potential for improving machine learning models in various applications.
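Why does synthetic data arrive fully annotated? Because the generator places every object itself, it already knows the ground truth. The toy sketch below illustrates that principle in plain Python. It does not use or imitate the Data Lab or Reactor APIs, and every name in it (generate_scene, the "box" label, the grid sizes) is a hypothetical chosen for illustration.

```python
import random

def generate_scene(rng, width=32, height=24, max_objects=3):
    """Render a toy scene and return it with its ground-truth annotations."""
    image = [[0] * width for _ in range(height)]  # 0 means background
    annotations = []
    for object_id in range(1, rng.randint(1, max_objects) + 1):
        w, h = rng.randint(3, 8), rng.randint(3, 8)
        x, y = rng.randint(0, width - w), rng.randint(0, height - h)
        for row in range(y, y + h):
            for col in range(x, x + w):
                image[row][col] = object_id  # later objects occlude earlier ones
        # The label comes for free: the generator chose x, y, w, h itself.
        annotations.append({"id": object_id, "label": "box", "bbox": (x, y, w, h)})
    return image, annotations

rng = random.Random(7)
dataset = [generate_scene(rng) for _ in range(5)]
image, annotations = dataset[0]
```

A real platform replaces the rectangles with physically accurate 3D rendering and richer labels, but the labeling logic is the same: annotations are a by-product of generation rather than a manual step afterwards.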
Looking Ahead
As models become more sophisticated, they will produce even more realistic and diverse content, indistinguishable from human-created visuals. The applications will expand further, unlocking new use cases and driving further innovation in Computer Vision – as we’ve seen from companies like Landing AI, Overjet, and Unitary AI that are addressing emerging applications in the field. We look forward to what’s to come and are excited to back companies leading the charge at the intersections of synthetic data, computer vision, and generative AI. If you’re building in these areas, or innovating in an adjacent space, we would love to hear from you.
—
To learn more about SparkCognition and Parallel Domain and their transformative AI technologies, visit their websites at www.sparkcognition.com and www.paralleldomain.com and stay tuned for future spotlights of exciting new product releases!