Spence Green: Making the World Multilingual with AI-Powered Translation
By: Terah Lyons
Spence Green was in his mid-twenties, living in the Middle East and learning Arabic, when Google Translate came out in 2006. Struck by the extent to which language barriers limited opportunities and access to information–especially on the Internet–for the people around him, he was inspired by the product and its mission at the time of its release. He later accepted a Google Research internship to work on Google Translate–and the rest, as they say, is history. After completing a Master's and a PhD in Computer Science at Stanford focused on machine translation, Spence co-founded Lilt in 2015. The company had a focused mission: to make the world's information accessible to everyone, regardless of where they are born or which language they speak. If the web is to be a truly global information resource, we need companies like Lilt helping to increase access to that information for everyone, in every language.
Lilt successfully built the world’s first adaptive machine translation product, with a solution that generates a virtuous cycle between statistical and neural machine translation methods and human translators in the loop. Their models adapt to their translators as they work, updating parameters automatically along the way. The value they are delivering to their customers is clear: Lilt helps companies like ASICS to sell shoes and Intel to sell chips – and even helps public school districts to enable their immigrant parent communities to read and understand their children’s school materials.
Today, Lilt’s product offering combines state-of-the-art natural language processing technology and professional human translators to help organizations scale their multilingual programs, accelerate their go-to-market, and improve global customer experience. The company just closed a $55 million Series C – and Zetta is proud to have been a partner to Lilt from the beginning. As in the creation of any company, the journey has not been without hard lessons and hard-won triumphs along the way, and Spence was generous enough to share plenty of the lessons he has drawn from his experience as CEO with us for the Founders’ Playbook.
Read on for some wisdom: Spence walks us through why you should only start a company as a very last resort; why delivering on AI’s value proposition often means providing a service, instead of merely a technological enablement; and why founding a startup–and especially an AI startup–is part hypothesis testing, part street fight.
Terah Lyons: Let’s start with the basics: What does Lilt do, and how do you use machine learning?
Spence Green: We do global experience, which means making digital products and services available in the language of the user's choice. That's really important. I think that historically, most products and services on the internet have been in one language–mostly in English. And if you want to use them in another language, you have to use Google Translate. So we give businesses the ability to distribute a great product experience in all languages.
Historically this is done entirely by human labor. So you take the strings out of a website or a software application and you send them to people and they type, and then it comes back. And the problem with that is that the amount of information in the world is increasing at a much faster rate than the number of people in the world. And so if you want to bridge that gap, you need to use technology.
TL: Walk us through Lilt’s founding story. Is the product idea the company is centered upon today the same as it was when you set out?
SG: [My Co-Founder] John and I are both researchers and we met at Google about 10 years ago, working on Google Translate. So both of us have been interested in information access for most of our professional lives. At that time, Google didn't use Google Translate for any of its products or services, and this seemed like a missed opportunity to make the internet more open and connected. The reason is that machine translation systems don't give you a quality guarantee or any sort of certificate of correctness, if you want to think about it in more machine learning terms. So the natural solution is to start building human-in-the-loop systems. The problem with that is people have been trying to do that since the late 60s, and they had never been able to make it work.
So that turned into work we did for about five years. And we got it to work–sort of. We never really set out to start a company. Both of us, like I said, were focused on this problem in the world. And John had a point of view that a company like Google would not do this because of the way the market was laid out. We didn't think we'd have much impact if we just made it like an open source research project. So the next thing to think about was starting a company. We decided we were the least bad people to do that even though we had never founded anything before.
TL: After investigating all other available options.
SG: That was exactly the process of elimination that we went through. But the problem was all of our go-to-market hypotheses were completely wrong. Our observation was correct–which was that you would use machine learning to change the unit economics of translation. You could think about it as a production cost. You're selling words to businesses and if you can achieve a factor improvement in the production cost of those words, then you can achieve mission success, but also build a really great business.
And the problem with that is that the way this market is organized, you basically have to vertically integrate, which means you have to become a technology-enabled service or a full-stack startup–or whatever you want to call it. I just call it a “solution”. You basically have to become a full solution.
TL: In other words, you realized you couldn’t sell your software. You had to sell translated words. Which meant you needed to hire the humans and deliver the service, not the technology. How did you reach that decision, and what tradeoffs did you consider?
SG: Exactly. The problem is that it becomes a lot more operationally intense. I love our business. I think it's really interesting. But it's just not the business we thought we were building from the outset, and it's far more complicated and operationally demanding than anything we thought we were getting ourselves into originally.
We went through five go-to-market hypotheses and the only reason we're sitting here having this conversation is because we didn't run out of cash in the middle of that. We gave ourselves enough time to take a bunch of shots.
So the first one was we tried to do a freemium. The observation was right: if you could change the unit economics, that turns into gross margin improvement. And so our initial idea was that we give this to every translator in the world, they improve their own personal gross margin. That'll be great. They make more money and we'll have like a $15, $20-a-month self-service signup tool, basically. A tool business turns out to be extremely challenging because behavior change is extremely challenging. So nobody wants to use your tool.
So then we thought, well, we'll go sell to these agencies that basically aggregate the supply side and get translators together to serve businesses.
And there were some smaller [iterations]. So we thought, well, that's kind of like an SMB sale and we'll sell it for like $500 a month or something like that. And then we found out that the way the supply chain works, each link up the supply chain dictates technology choices to each link further down. So even if they wanted to buy our tool, they couldn't because their upstream customers dictated what they were doing. We also found with these small deals–especially with a complex technical product like this–the support costs to even get them to try to understand it massively outweighed anything that you could possibly charge. So then we thought, okay, we'll go to the big enterprise vendors because they're big companies and they should be able to understand. The problem with that was that they're told what technology to use by their end enterprise customers. So even if they wanted to standardize on one product, they couldn't.
So we couldn't get any business done there. So then we thought, okay, well we'll try to sell the software to the end enterprise customer. And we did that. But now you have a structural misalignment because what our product fundamentally does is cannibalize the business of the vendor. So we'd sell it to the enterprise customer. And then the enterprise vendor would say, “Oh, it generates bad quality. It's not usable.”
After all that, we could just never get a proof of concept done. So then the last resort was either go out of business or consume the entire supply chain and not [just] a services component, which is exactly what we did.
TL: I'm curious if you can go one level deeper there and talk to us a little bit more about what and how you decided to monetize. How did you approach the issue early on, and how do you think about it today?
SG: This is specific to enterprise, but I wish somebody had just grabbed me, physically shoved me into a corner and said, I'm not letting you out until you understand that it's very difficult for businesses to change the way they buy things. And so we tried to change the pricing model and the packaging and all this stuff. And it was just a complete waste of time. Our customers were looking for simplicity: here's the offer from the incumbent and here's the offer from the startup. And I think we wasted a lot of time trying to innovate.
A friend once told me that you can innovate on distribution, you can innovate on product, and you can innovate on pricing and packaging–and that you really ought to just pick one of those things. If you pick all three, you're in for a tall order. And we picked all three, so that was stupid.
Ultimately, we zeroed in on innovating on product. We did have to innovate on distribution because this was a vertically integrated thing–how do you get at customers–but now we don't fiddle with pricing and packaging. So try to make it look as similar to what your buyer is used to buying as possible. That is true in our business, at least–I don't know if that applies to anybody else, but that's what I learned.
TL: Let’s talk a little more about the human-in-the-loop component. How have you seen the evolution of that part of the business over time, and what have the biggest challenges and benefits been for you?
SG: Well, you can think of translation as data labeling–at least, that’s one way to think about it. Think about labeling an English sentence with a Japanese sentence; that set of symbols that you label the English sentence with is what we call a “sentence pair”–and that's the training example that you use to train a machine translation system. So the thing that we developed at Stanford years ago was doing online learning in production, where as soon as the person doing the translation creates a label, you just train on it immediately. And then you get a kind of a self-training system, if you will.
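The loop Spence describes can be sketched in a few lines. This is a toy illustration, not Lilt’s implementation: the `AdaptiveTranslator` class, its `suggest`/`confirm` methods, and the phrase memory are all invented stand-ins for a neural model whose parameters update after each confirmed sentence pair.

```python
# Toy sketch of human-in-the-loop online learning for translation.
# A simple lookup memory stands in for the MT model; each
# human-confirmed sentence pair is "trained on" immediately,
# so the very next request can benefit from it.

class AdaptiveTranslator:
    def __init__(self):
        self.memory = {}  # source sentence -> confirmed target sentence

    def suggest(self, source):
        # The model's best guess, or None if it has no prediction yet.
        return self.memory.get(source)

    def confirm(self, source, target):
        # The translator's final output becomes a new training example
        # (a "sentence pair") and updates the model at once.
        self.memory[source] = target

mt = AdaptiveTranslator()
assert mt.suggest("Hello") is None       # cold start: no prediction
mt.confirm("Hello", "Bonjour")           # the human supplies the label
assert mt.suggest("Hello") == "Bonjour"  # the system has already adapted
```

The point of the sketch is only the ordering: label, train, predict, all inside one session, rather than batching labels up for a periodic retrain.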
That is what we built 10 years ago–now we're, I don't know, seven, eight generations into building it at this point. That's the human in the loop part, which actually doesn't really change what translators do. They just use a slightly different interface and the system passively observes what they do and it learns from that. And then it gives them predictions that help them produce more. And if you can get them to produce more per unit time, you can monetize that productivity, which is how the business works.
TL: Have you seen growth in the efficiency of that process over time?
SG: We have a key metric that we measure, which is called “word prediction accuracy”. It looks like predictive typing. So, for example, when somebody's typing and the system generates a next word, you can just calculate whether they accept that next word or type over it. And then you get an accuracy figure. And that number for us is about 80 percent these days. So one way you could think about that is that 80 percent of the words that we deliver to customers are machine generated. And that figure was around 50 percent, at least when we launched our first neural network system about four years ago.
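As described, the metric reduces to a simple acceptance rate. The function and data below are illustrative (not Lilt’s internal code): each event pairs the word the system suggested with the word the translator actually kept.

```python
# "Word prediction accuracy": the share of suggested next words that
# the translator accepts rather than types over.

def word_prediction_accuracy(events):
    # events: list of (suggested_word, word_the_translator_kept) pairs
    accepted = sum(1 for suggested, kept in events if suggested == kept)
    return accepted / len(events)

events = [("the", "the"), ("cat", "cat"), ("sat", "sits"),
          ("on", "on"), ("mat", "mat")]
print(word_prediction_accuracy(events))  # 4 of 5 suggestions accepted -> 0.8
```

A figure of 0.8 corresponds to the “about 80 percent” Spence quotes: four out of every five delivered words came from the machine rather than the translator’s keyboard.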
TL: Pretty impressive development. Would you say the goal is ever going to be trying to eliminate the human-in-the-loop component, or you see that as integral to the function of the business?
SG: Well, the interesting thing about language is that it's dynamic and productive. People are creating new words and phrases all the time. And the distribution of language usage is changing all the time. So that's observation number one. Observation number two: we're effectively in a paradigm right now where supervised learning is what works. Semi-supervised self-training works for these large language models, but for a lot of tasks, it's got to be supervised learning. So to train these systems, you have to get the data from somewhere. And so that's what humans increasingly do: they create the data that trains the systems. There are a lot of business workflows now where fully automatic translation solves the problem. So, for example, if you book something on Airbnb–for, I don't know, six, seven years now, it's been integrated with Google Translate–and that's perfectly sufficient for booking something on the Airbnb platform.
There are a lot of use cases like that that can be fully automatic, but then there are a lot of them that are not–for example, applications where people are making trading decisions, medical applications where people are making health decisions, marketing websites–where it's not just a correct translation you want, but on-brand translation. It's very difficult to articulate that as a set of constraints or a set of features or some sort of machine-readable construct that you could train a system on. And so those are the places where you still need world knowledge, common sense reasoning, the things that machine learning systems can't do very well...
TL: Maybe not quite yet. You just described several different use cases–this feels like a good segue into a discussion of customer discovery and acquisition. How did you begin your customer discovery process? What were some of the biggest early learnings?
SG: Well, I described those five different business models and each one of those was a very different customer. Honestly, revenue solves all problems. If you don't get any money from a given experiment, that's a bad experiment. So we tried going after each one of those customer sets for the business model. And then when we got to the enterprise–well, I think this is generally true of enterprise companies, but it was especially true in our case where our sale is more like a professional services sale, or historically it has been because that's what the business service has been historically. It hasn't been a technology sale. So it depends a lot on social proof. And getting the first bit of social proof doesn't feel like hypothesis testing. It feels like a street fight.
You just have to get ten customers. However you can possibly do it, you just do it. Then at least you can get some case studies and get some social proof. I think in other applications you can make an ROI argument or something like that, or maybe, better yet, you're selling a new capability that nobody has. And so there's no real change management that has to be done to try something. Whereas in our case, there's a lot of change management involved. And so you have to de-risk that. And one of the big ways to do that is with social proof. That part was the initial focus.
TL: You mean there's change management for your customers? Like switching translation vendors is more than just like, Hey, I used to send my file here and now I send it there. Why is that? Because you change their workflow too?
SG: That's right. Well, we actually hired an ethnographer to help us, who had training in psychology, to try to figure this out, because it was so baffling. For our particular industry, the way [our buyer’s] job is constructed is that it's a high-responsibility, low-agency job, and the rational response to being in a circumstance like that is to be a risk minimizer.
These are our enterprise buyers: for example, the localization manager at a tech company. And we're selling a product that you can't evaluate yourself. So, you send me a document in English, I send you a document in Korean, you have no idea whether what I gave you is good or not. So that just tends towards risk minimization.
TL: What was the story that persuaded these risk minimizers to take a risk, or made them feel like it was less of a risk?
SG: It was really down to the personality type. We found a couple of people who were basically early adopters, like to try new things. And we had a few cases in which it was somebody who had had a really bad experience with one of our competitors and wanted to try something new instead of the traditional way of doing things. That's the part where it was kind of a knife fight–where you just have to talk to as many people as possible to find these humans. There are five people somewhere in the world who will buy the thing. And the question is, can you get to those five people before you run out of money?
TL: Spence, this is insightful. We see technical founders chronically underestimate the importance and difficulty of lead generation and customer acquisition.
Obviously, generating a high volume of leads is easier to do once you're proven–in contrast to the situation where you're trying to search for those needles in the haystack, those true risk-taking early adopter types. What worked?
SG: If I were to do it over again, I would be a lot more systematic. At the early phase it's very difficult to figure out, is the product wrong? Is the positioning wrong? Are you just not very good at selling? These are all really consequential. These are big levers. And when you have a long sales cycle, you press one of the levers and you're not going to know the answer for like three months. So it's really about compressing and reducing the experimental cycle as much as you can.
I just went to a bunch of events and I talked to a lot of people. I kind of looked at what our main competitor did–which was a lot of event marketing. Every event they had some big booth with couches and stuff like that. And it was kind of frustrating because we couldn't possibly compete with a marketing budget like that.
I found out at one point they were spending more per month [in marketing] than we had raised in the history of our company. And that's really hard. You have to do the guerilla marketing thing, but that basically just amounts to talking to a lot of people.
We have customers speak at our executive offsite–usually it's customers that haven't had a good experience with us. So I get them to tell the executive team all the stuff we did wrong. And that seems to stiffen people's spines a little bit. We brought in our third-ever customer and she reminded me that she had met me at a conference in Montreal in 2016–two years before she bought our product (this was 2018).
Now that company is one of our top four customers, but two years is infinity in seed startup time.
TL: What made her buy in after two years? Do you know?
SG: I think they came to a point where they had a business need that they couldn't solve. They were basically working on their support site and the content volume was such that the traditional approach didn't match their budget. And she had seen me at a conference making these wild claims. And so she decided to give it a try.
TL: We think the vertically integrated business model is relevant to lots of Applied AI startups–though not every one. Do you have any mental framework for which companies are best suited to the vertically integrated model? Any aspects of the customer, the competition, or the nature of the product?
SG: Well, I think the issue is that most machine learning technologies are features, not businesses. Or they're features–not products. Or, even–they're features of a product. Consider sales and marketing technology companies: what do they do? They have the product and then they get traction and then they get to $50 million in revenue or whatever. And then they can start hiring machine learning people to study their data. It comes later. And so if it's going to come in the beginning, you really have to ask, Well, what product is there for this sentiment analysis thing or this large language model thing, or this object detection thing, because all of these technologies are, for the most part, just mimicking what people do.
It's easier to think of applications like security–cybersecurity applications are some of the most mature, scaled ML companies because you can use the product to simply detect intrusions. But outside of security, it's tough. And then you end up doing solution selling; you're solving a business use case. If you're doing solution selling, you start looking like what Oracle and HP do, not like what the average SaaS company does. At that point, you've got a pro serve component and you've got account managers and you've got solutions architects and all this stuff, and all of that is just to get the technology into the enterprise workflow. But that's tough because investors just hate all that. They like the SaaS company with the LTV to CAC under twelve.
TL: Spoken like a grizzled veteran of fundraising. But investors like those things because they reflect patterns that make it easier to build a successful business. Perhaps because they enable rapid iteration cycles, which you’ve mentioned several times as key to startup success.
SG: Yes. I mean, here's the way that I would characterize it from the investor point of view: what are they valuing? They're valuing the present value of future cash flows. Right? These businesses that are 85 percent gross margin that kick off a bunch of cash and have attractive sales efficiency have all this cash, which they can plow into sales and marketing. And that's all good. But I think the problem is that the products are much easier to reproduce. So you have twenty startups that are all trying to do the same thing, and there will eventually be two that win and take most of the profits, and hopefully you pick the two that win. I think these machine learning businesses are much harder operationally to get right.
So the bigger risk is execution risk. You just can't execute very well. So you can get the financials into a place where it makes sense as a venture-backable company. But if you can do that, well, then now you're not in a market with twenty other competitors. You're kind of alone. And that's the real prize.
Like [Michael] Porter talks about, there's product differentiation and cost differentiation–the two classic vectors of competitive differentiation–but there's a third that he added later on, which is operational excellence. And I think that's mostly where these vertically integrated machine learning companies come in: if you can actually master the operational part, that's very difficult to get right.
TL: I'm curious what you would pick if you had to pick one of those priorities that you just described in competitive differentiation. Is there one that you're really focused on right now? Or do you feel like you have to win all three of those categories to come out on top?
SG: We don't have to win pricing and packaging because we mimic what our competitors do or what the industry standard approach is. Actually, at this point, I'm spending more time on product than I have in years, which is interesting. I think it's because machine translation itself is maturing. We have more customers moving to fully automatic workflows and the way that our product has developed over time, our workflow integrations have gotten a lot more sophisticated and that is what ends up being a more compelling source of differentiation in the sales cycle than it was historically. And we always had this hypothesis: We had this piece of technology and then we had to wrap it in a service to sell it. And we always thought, because the technology's going to get better and better and better, at some point the technology is going to come back out from under this cloak that we put it under, and that has finally started to happen.
TL: So Spence, you were an academic PhD as a founder-CEO. A lot of AI companies have technical founders–as they should, because if you don't understand the tech, you're going to have a hard time in this business. But what we find working with technical CEOs, and especially academics, is that there is often a sense that they are the product person and that it’s someone else’s responsibility to go to events and conferences, to make the first sales. And I wonder if you would've gotten the same result if you hadn't put yourself out there. What made you realize it should be you? And what gave you the courage to do it?
SG: Well, I think there's John and my working relationship. When we decided to do this together, he was going to manage the research and the technology, and I was going to figure the business out. And that just seemed like the right division of labor. My conception of the job was just to do whatever it takes. If you're the CEO, your job is all of the jobs. And hopefully, you can pick the job that's the biggest problem to solve.
TL: What is one piece of advice you would give to your past self? Because your past self is really our audience here.
SG: I was speaking recently with a researcher at a tech company; he is thinking about leaving and going to a startup. And he asked me this question. I think what I came up with was: If we knew Lilt was going to be an enterprise company, I would've spent a lot more time thinking about our customers in a different way. At the end of the day, big businesses are big systems. They have procurement departments and they have legal departments and they have these other functions and you really need to understand how that system works before you build anything or try to sell anything. And I was completely oblivious to that. And I think we probably would've saved ourselves a lot of time if I had just spent some time thinking about that like an engineer and trying to understand how that worked. Because, like I said earlier, you can't change all that machinery. It doesn't change. So you have to build something that conforms to it.
You can bring a systems mindset to sales. I thought it was like suspenders and cigars and stuff like that. It's not.
TL: Last question: You are that rare founder who, as a new PhD, made the transition from academia to CEO. I wonder if you have words of wisdom for people walking in those shoes, because now you are the role model that you were looking for once upon a time.
SG: I'm still working on that. I don't know what my persona is, but I think the hardest thing to learn has been the group dynamics. To some extent it’s also a systems problem, but one that is primarily driven by the primitive parts of our brain: organizational dynamics. I think the hardest thing in the world is to motivate and organize people around a common goal. The system you're building is so much more than the product–it's the company. And it took me a long time to understand that.
TL: Thanks, Spence. You have provided so much honest, valuable advice and we can’t thank you enough for it.