Iterative Lean Startup principles are so well understood today that a minimum viable product (MVP) is a prerequisite for institutional venture funding, yet few startups and investors have extended those principles to their data and AI strategy. They assume that validating their assumptions about data and AI can be deferred, to be handled later by people with skills they have yet to recruit.
But the best AI startups we’ve seen figured out as early as possible whether they were collecting the right data, whether there was a market for the AI models they planned to build, and whether the data was being collected appropriately. So we believe firmly that you must try to validate your data and machine learning strategy before your model reaches the minimal algorithmic performance (MAP) required by early customers. Without that validation — the data equivalent of iterative software beta testing — you may find that the model you spend so much time and money building is less valuable than you hoped.
So how do you validate your algorithms? Use three critical tests:
- Test the data for predictiveness
- Test for model-market fit
- Test for data and model shelf life
Let’s take a closer look at each of these.
Testing for predictiveness
Startups must make sure that the data powering their AI models is predictive of, rather than merely correlated with, the AI’s target output.
Because the human body is so complex, AI-powered diagnostic tools are one application particularly vulnerable to mistaking correlative signals for truly predictive ones. We have met many companies showing incredible gains in patient outcomes by applying AI to track subtle changes in weekly scans. A potential confounding factor is that patients undergoing these weekly scans also have their vitals recorded more regularly, and those vitals may hold subtle clues about disease progression; all of that additional data feeds into the algorithm. Could the AI be trained just as effectively on those less invasive vitals, at far less cost and stress to the patient?
To tease out confounding correlations from truly predictive inputs, you must run experiments early on to compare the performance of the AI model with and without the input in question. In extreme cases, AI systems built around a correlative relationship might be more expensive and may achieve lower margins than AI systems built around the predictive inputs. This test will also enable you to determine whether you are collecting the complete dataset you need for your AI.
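To make this concrete, here is a minimal sketch of such an ablation experiment using scikit-learn. It simply trains the same model with and without the suspect input group and compares cross-validated performance; the file name, column prefixes, and target variable are hypothetical placeholders, not a prescription.

```python
# Ablation sketch: compare cross-validated performance of the same model
# trained with and without the suspect (expensive) input group.
# Dataset and column names below are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("patients.csv")              # hypothetical dataset
target = df["disease_progressed"]             # hypothetical binary outcome

scan_features = [c for c in df.columns if c.startswith("scan_")]    # costly weekly scans
vital_features = [c for c in df.columns if c.startswith("vital_")]  # routine vitals

def score(feature_cols):
    """Cross-validated AUC for a model restricted to the given columns."""
    model = GradientBoostingClassifier(random_state=0)
    return cross_val_score(model, df[feature_cols], target,
                           cv=5, scoring="roc_auc").mean()

auc_all = score(scan_features + vital_features)
auc_without_scans = score(vital_features)

print(f"AUC with scans + vitals: {auc_all:.3f}")
print(f"AUC with vitals only:    {auc_without_scans:.3f}")
# If the gap is small, the costly input may add little beyond what the
# cheaper signal already captures.
```

Run early and cheaply, an experiment like this tells you whether the expensive input is earning its keep, and whether your data collection plan is complete.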
Testing for model-market fit
You should test for model-market fit separately from product-market fit. Some startups may first go to market with a “pre-AI” solution that is used to capture training data. Even though you may have established product-market fit for that pre-AI product, you can’t assume users of that pre-AI solution will also be interested in the AI model. Insights from model-market fit tests will guide how you should package the AI model and build the right team to bring that model to market.
Testing for model-market fit is more difficult than testing for product-market fit because user interfaces are easy to prototype but AI models are difficult to mock up. To answer model-market fit questions, you could simulate an AI model with a “person behind the curtain” to gauge end user response to automation. Virtual scheduling assistant startup X.ai famously used this approach to train its scheduler bot and find the appropriate modes and tones of interaction by observing tens of thousands of interactions conducted by human trainers. This approach may not be appropriate for applications where the content or data may hold sensitive or legally protected information, such as interactions between doctors and their patients or attorneys and their clients.
To test customer willingness to pay for an AI model, you could dedicate a data scientist to serve as a consultant to existing customers, providing personalized, data-driven prescriptive insights that demonstrate the ROI of the eventual AI. We’ve seen many startups in healthcare and in supply chain and logistics offer this service to convince customers to invest the time and manpower needed to build integrations with their tech stack.
Testing for data and model shelf life
Startups must understand early on how quickly their dataset and models become outdated in order to maintain the appropriate rate of data collection and model updates. Data and models become stale because of context drift, which occurs when the target variable that the AI model is trying to predict changes over time.
Contextual information can help explain the cause and rate of context drift, and it can help recalibrate datasets that have drifted. For example, retail purchases can be highly season-dependent: an AI model might see that wool hat sales increased over the winter and then unsuccessfully recommend them to customers in April. That crucial contextual information can be impossible to recover if it is not recorded while the data is being collected.
To gauge the rate of context drift, you can try to “mock up” a model and observe how quickly its performance degrades in real-life settings. You can do this without training data using some of the following strategies:
- Build a rules-based model with known frameworks where applicable
- Repurpose a model trained on a strongly related but separate domain, such as using book recommendation models to recommend movies
- Simulate customer data with mechanical turks
- Partner with industry incumbents to obtain historical data
- Scrape the Internet for publicly available data
If the mocked-up model degrades quickly, the AI model will be vulnerable to context drift. In that case, historical data may not be useful beyond a certain point in the past, because an AI model trained on outdated data will not be accurate.
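One way to run this shelf-life test, sketched below under assumptions, is to train a simple stand-in model on an early window of timestamped data and watch how its performance decays on later slices. The dataset, feature names, and model choice here are hypothetical placeholders; any of the mock-up strategies above could supply the stand-in model.

```python
# Shelf-life sketch: fit a simple stand-in model on an early window of
# labeled, timestamped data, then track how its accuracy decays month by
# month. Dataset, columns, and cutoff date are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

df = pd.read_csv("transactions.csv", parse_dates=["date"]).sort_values("date")
features = ["price", "num_prior_purchases", "days_since_last_visit"]  # hypothetical
target = "purchased"                                                   # hypothetical label

train = df[df["date"] < "2023-01-01"]          # early training window
model = LogisticRegression(max_iter=1000)
model.fit(train[features], train[target])

# Evaluate on each subsequent month to see how quickly performance drifts.
later = df[df["date"] >= "2023-01-01"]
for month, slice_ in later.groupby(later["date"].dt.to_period("M")):
    auc = roc_auc_score(slice_[target],
                        model.predict_proba(slice_[features])[:, 1])
    print(f"{month}: AUC = {auc:.3f}")
# A steep decline suggests a short shelf life: both the data and the models
# built on it will need frequent refreshing.
```

The steeper the decline in this curve, the more aggressively you will need to budget for ongoing data collection and model retraining.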
New era, new playbook
Enterprise customers and investors increasingly see data and AI as a necessary competitive advantage for startups, but AI-powered products still require a heavyweight development process. As with any business question, you must validate your data and AI strategies iteratively and as early as possible to avoid wasting valuable time and resources on projects that will not bear fruit. The three tests outlined here provide a way to validate your AI strategy before you build a working model. As more and more startups adopt them, these ideas will become part of the toolkit for building a Lean AI Startup and will change the bar for venture funding in the era of intelligence.