Linear #178.5: All Things Voice AI with Mike Droesch of Bessemer Venture Partners
The Voice AI Manual
Mike Droesch of Bessemer Venture Partners sits down with Nic to discuss where Voice AI is working, where it’s overcrowded, where the blue oceans still are, and why voice may become the next great application layer in vertical software, not just "AI that answers the phone."
Voice is an interface shift.
Voice isn’t a slicker chatbot with audio. It’s a much higher-bandwidth interface than typing, clicking, or filling out forms — and when you increase the bandwidth of data flowing in, the quality of the output changes dramatically too.
Mike’s argument: voice becomes interesting when it stops being a feature and starts becoming a serious way to gather context and take action. We’re not talking about AI that answers the phone — we’re talking about a new layer in software that captures more nuance, makes better decisions, and eventually owns entire workflows that used to require humans.
There is a lot changing in AI right now. New models. New wrappers. New infrastructure. New noise. But if you forced Mike to pick one category with the potential to reshape software more than people realize, it would be this one. Not because it’s trendy. Not because the demos are cool. Because voice changes the amount and quality of information that can flow into a machine — and that quietly changes everything downstream of it.
That’s the lens Mike brought into this conversation. Not “AI that answers the phone.” Not a better IVR. A new context layer for software, and eventually a new action layer on top of it. If he’s right, voice ends up being one of the most important new wedges into vertical software we’ve seen in a long time.
This Week's Vertical Titan: Mike Droesch (Partner @ Bessemer)
Mike came up as an engineer before venture, joined Bessemer close to nine years ago, and got pulled into vertical software at exactly the moment the category started proving itself. Since then he’s built investment roadmaps across marketplaces, supply chain, vertical SaaS, and now vertical AI.
He’s had a front-row seat to the rise of companies like Abridge, Rilla, and Vapi — meaning he’s thought about vertical voice AI as much as almost anyone in the market. The difference in his framing is that he doesn’t treat voice as a narrow subcategory. He treats it as an interface shift.
One of his biggest surprises of the AI era: the fastest-adopting sectors have been regulated ones — healthcare, insurance, financial services, legal. They care deeply about workflow structure and control, and when voice gives them tight gates, identity validation, and rigorous agent monitoring, they actually move faster than the markets you’d expect to.
And the line every founder in this category should write on a wall: “The infrastructure stuff is table stakes. Distribution matters more than it ever has because it’s so easy to get to feature parity with other products.” Latency and reliability are the price of admission. Distribution, workflow ownership, trust and evals are the moat.
Voice AI becomes massive when it stops being a conversation and starts being a complete workflow.
— Mike Droesch, Bessemer Venture Partners
Voice AI changes the economics of software when it actually works. Traditional SaaS made labor more productive. The best voice companies start to replace it — and that moves the budget conversation from the IT line item to payroll and broader opex.
When voice actually works, the economics of software change.
A few observations from Mike on where Voice AI stops looking like a demo and starts looking like a category:
From IT line item to payroll line item.
In the traditional SaaS world, a lot of companies sold tools that made labor more productive. In the Voice AI world, the best companies don’t just support labor — they start replacing parts of it. That changes where the budget comes from. Instead of selling into an IT line item, you’re selling against payroll and broader operating expense.
Mike gave the example of a workflow that might support a roughly $30K ACV in its SaaS form but something closer to $150K when an AI agent actually performs the job end to end. That’s a massive unlock if it proves durable across categories — and it’s why the comp set for voice companies starts to look more like BPO and labor spend than next-seat-license SaaS.
Don’t build voice unless you can close the loop.
Mike said it directly: he would not want to spend his time building a voice agent company if he didn’t think it could do an entire job end to end. There are going to be plenty of products that sound great in a demo, have pleasant conversations, and then hand off to a human the moment things get complicated. Those might become features.
The monster companies are far more likely to be built by owning a full workflow, not a moment inside it. The frame for founders is simple: if the realistic ceiling of your product is “assistive in a slice of the job,” you’re building a feature for someone else’s platform.
The limit isn’t capability anymore. It’s permission.
Mike made a subtle but important distinction: the limiting factor is increasingly not whether the model can handle the interaction. It’s whether buyers trust the model enough to let it actually act.
We’re approaching a world where agents understand more than people expect. But connecting them to core systems, letting them move money, resolve a claim, complete a booking, or change a mission-critical record requires a different level of confidence. That confidence is not built by vibes. It’s built by governance, observability, and repetition. The buyer’s CISO is on your roadmap whether you invited them or not.
Every call is product data.
Mike’s view is that the compounding advantage comes from seeing real-time evals and tests run in production. Every interaction becomes more than a service event — it becomes product data. Every conversation generates signals. Every signal can be measured against expected performance.
Over time, the companies best at instrumenting, interpreting, and improving from those signals build a very real flywheel around quality. A lot of founders still talk as if the moat is the model. Increasingly, it’s the eval system wrapped around the model.
Escalation to a human isn’t failure.
Some people still talk about handoff like it represents failure. Mike’s framing is more practical. In real-world workflows, people usually call when something has already gone wrong, or when they’re in an edge case. The job isn’t to fantasize about 100% automation on day one.
The job is to know exactly where the boundaries are, solve what the system can solve confidently, and fail over elegantly when it can’t. In many verticals, that handoff is the difference between trust being built and trust being destroyed.
Platform + usage today. Outcomes when they’re measurable.
Outcome-based pricing gets attention, but the market today still mostly looks like some version of platform fee plus usage bundle. It’s legible to buyers, maps well to current products, and doesn’t require perfect agreement on what a “successful outcome” actually means. Outcomes will arrive first in use cases where they’re standardized and easy to verify.
One other line that stuck with us: what “fast” looks like today is crazy relative to five or ten years ago. Categories are forming in real time. Feature parity emerges in weeks. Speed itself becomes a meaningful — if temporary — moat. Teams that learn fastest, instrument fastest, and improve fastest buy themselves the time needed to build something deeper.
Nine lessons for building in voice AI.
Frameworks for founders and investors deciding what to build, what to back, and what to walk away from.
#1. Don’t build voice unless you can own the whole job
Mike said it directly: he would not want to spend his time building a voice agent company if he didn’t think it could do an entire job end to end. Pleasant-sounding demos that hand off to a human the second things get hard become features. Companies are built by owning a full workflow, not a moment inside it.
Action item: Map the job from first ring to final action. If your product can’t credibly close the loop, you’re building a feature for someone else’s platform.
#2. Sell against payroll, not the IT line item
Traditional SaaS made labor more productive. Voice AI, when it works, replaces parts of it. That changes where the budget comes from. A workflow that supported a $30K SaaS ACV can support something closer to $150K when an AI agent actually performs the job end to end.
Action item: Price against the loaded cost of the human doing the work today, not against the seat-license comp you would have benchmarked to in 2021.
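To make the arithmetic concrete, here is a rough sketch of labor-anchored pricing. Only the $30K and $150K ACV figures come from Mike's example. The salary, overhead rate, headcount, and price share are hypothetical inputs chosen for illustration, and the function names are ours.

```python
def loaded_annual_cost(base_salary: float, overhead_rate: float) -> float:
    """Salary plus benefits, taxes, and overhead (overhead as a fraction of salary)."""
    return base_salary * (1 + overhead_rate)


def labor_anchored_price(headcount: float, loaded_cost: float, price_share: float) -> float:
    """Annual price set as a share of the labor cost the agent takes over."""
    return headcount * loaded_cost * price_share


# Figures from Mike's example.
SAAS_ACV = 30_000
AGENT_ACV = 150_000
uplift = AGENT_ACV / SAAS_ACV  # 5.0

# Hypothetical inputs: four roles at a $60K salary with 25% overhead,
# priced at half of the loaded cost they represent.
cost_per_role = loaded_annual_cost(60_000, 0.25)       # 75000.0
price = labor_anchored_price(4, cost_per_role, 0.5)    # 150000.0
```

Under these assumed inputs, a price at half the loaded labor cost lands at the $150K end of Mike's range, a 5x uplift over the $30K SaaS version. Real numbers will vary by vertical and by how much of the job the agent actually completes.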
#3. Infrastructure is table stakes. Distribution is the moat.
Low latency, reliability, model quality — all required, none differentiating. Feature parity is reached in weeks now, not years. The deeper moat comes from distribution, workflow ownership, trust, evals, and the feedback loops generated by real calls happening in production.
Action item: Audit how much of your roadmap is infra plumbing vs. distribution and workflow depth. Rebalance accordingly.
#4. The limiting factor is trust to act, not model capability
The models can already understand more than buyers expect. The real question is whether buyers trust the agent enough to let it move money, resolve a claim, complete a booking, or change a mission-critical record. That confidence isn’t built by vibes — it’s built by governance, observability, and repetition.
Action item: Treat permissions, audit trails, and rollback as product surface area.
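One way to picture permissions and audit trails as product surface area is a small gate that every agent action passes through. This is a minimal sketch under our own assumptions: the `ActionGate` class, the action names, and the dollar limits are hypothetical, and rollback is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class ActionGate:
    # Hypothetical policy: action name -> max amount (in cents) the agent may handle alone.
    limits: dict
    audit_log: list = field(default_factory=list)

    def authorize(self, action: str, amount_cents: int, caller_verified: bool) -> str:
        """Return 'allow', 'needs_approval', or 'deny', and record the decision."""
        if action not in self.limits:
            decision = "deny"                 # action was never granted to the agent
        elif not caller_verified:
            decision = "deny"                 # identity must be validated first
        elif amount_cents > self.limits[action]:
            decision = "needs_approval"       # over the limit, so a human signs off
        else:
            decision = "allow"
        # Every decision is logged, including denials, so it can be reviewed later.
        self.audit_log.append(
            {"action": action, "amount_cents": amount_cents,
             "caller_verified": caller_verified, "decision": decision}
        )
        return decision


gate = ActionGate(limits={"issue_refund": 5_000, "reschedule_booking": 0})
small_refund = gate.authorize("issue_refund", 2_500, caller_verified=True)
large_refund = gate.authorize("issue_refund", 50_000, caller_verified=True)
unverified = gate.authorize("issue_refund", 2_500, caller_verified=False)
```

The design choice worth noting is that the log is written on every path. Buyers tend to ask what the agent was prevented from doing as often as what it did.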
#5. Evals are the compounding moat
Mike’s view: the compounding advantage comes from real-time evals and tests running in production. Every interaction becomes product data. Every signal can be measured against expected performance. The companies best at instrumenting, interpreting, and improving from those signals build a real flywheel around quality.
Action item: Ship eval infrastructure on day one. The team that learns fastest from production calls compounds faster than the team with the shinier model.
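As a rough illustration of "every signal can be measured against expected performance," the sketch below scores production calls against an expected outcome and a latency budget. The `CallRecord` fields, the 800 ms threshold, and the outcome labels are assumptions made for the example, not anything Mike specified.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    call_id: str
    expected_outcome: str   # what the workflow should have produced
    actual_outcome: str     # what the agent actually did
    latency_ms: int         # response latency observed on the call


def evaluate(calls, max_latency_ms=800):
    """Score each call against expected performance and aggregate the results."""
    failures = []
    for call in calls:
        reasons = []
        if call.actual_outcome != call.expected_outcome:
            reasons.append("wrong_outcome")
        if call.latency_ms > max_latency_ms:
            reasons.append("too_slow")
        if reasons:
            failures.append((call.call_id, reasons))
    total = len(calls)
    pass_rate = (total - len(failures)) / total if total else 0.0
    return {"pass_rate": pass_rate, "failures": failures}


calls = [
    CallRecord("c1", "booked", "booked", 420),
    CallRecord("c2", "booked", "escalated", 510),   # wrong outcome
    CallRecord("c3", "refunded", "refunded", 1300), # right outcome, too slow
    CallRecord("c4", "booked", "booked", 390),
]
report = evaluate(calls)
```

Here `report["pass_rate"]` is 0.5, and the failure list says which calls to review and why. The flywheel comes from running something like this continuously and feeding the failures back into the product.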
#6. Graceful handoff is part of the product
Escalation to a human isn’t failure. People usually call when something has already gone wrong or they’re in an edge case. The job isn’t 100% automation on day one. It’s to know where the boundaries are, solve what the system can solve confidently, and fail over elegantly when it can’t.
Action item: Design the handoff like a feature, not an apology. The smoothness of the human transfer is where trust is built — or destroyed.
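A sketch of what "know where the boundaries are" could look like in practice: route a turn to a human when the intent is out of scope or the agent's confidence is low, and pass along enough context that the caller does not have to start over. The intent names, the 0.85 threshold, and the shape of the handoff packet are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AGENT = "agent"
    HUMAN = "human"


@dataclass
class Turn:
    intent: str        # what the agent believes the caller wants
    confidence: float  # agent's confidence in that reading, from 0.0 to 1.0


# Hypothetical boundaries: the jobs the agent is trusted to finish on its own.
SUPPORTED_INTENTS = {"reschedule", "check_status"}
MIN_CONFIDENCE = 0.85


def route(turn: Turn) -> Route:
    """Keep the call only when the request is in scope and confidently understood."""
    if turn.intent not in SUPPORTED_INTENTS:
        return Route.HUMAN
    if turn.confidence < MIN_CONFIDENCE:
        return Route.HUMAN
    return Route.AGENT


def handoff_packet(turn: Turn, transcript: list) -> dict:
    """Context handed to the human so the caller does not have to repeat themselves."""
    return {
        "suspected_intent": turn.intent,
        "confidence": turn.confidence,
        "last_caller_lines": transcript[-3:],
    }


in_scope = route(Turn("reschedule", 0.93))
unsure = route(Turn("reschedule", 0.60))
out_of_scope = route(Turn("dispute_charge", 0.97))
packet = handoff_packet(Turn("dispute_charge", 0.97), ["hi", "I was charged twice"])
```

The packet is the part that makes the handoff feel like a feature. The routing rule alone only decides when to give up.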
#7. Pricing is platform + usage — for now
Outcome-based pricing gets the attention, but today’s market mostly looks like a platform fee plus usage bundle. It’s legible to buyers, maps to current products, and doesn’t require perfect agreement on what a ‘successful outcome’ is. Outcome pricing arrives first in use cases where the outcome is standardized and easy to verify.
Action item: Don’t force outcome pricing before the outcome is measurable. Use platform + usage as the bridge from the SaaS era into the agent era.
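For readers who want the mechanics, a platform-plus-usage bill reduces to a flat fee plus overage beyond a bundled allowance. The sketch below assumes usage is metered in minutes and uses made-up prices. Amounts are kept in integer cents to avoid floating-point rounding.

```python
def monthly_invoice_cents(platform_fee_cents: int,
                          included_minutes: int,
                          minutes_used: int,
                          overage_cents_per_minute: int) -> int:
    """Flat platform fee plus per-minute overage beyond the bundled allowance."""
    overage_minutes = max(0, minutes_used - included_minutes)
    return platform_fee_cents + overage_minutes * overage_cents_per_minute


# Hypothetical plan: $2,000 per month, 10,000 minutes included, $0.10 per extra minute.
within_bundle = monthly_invoice_cents(200_000, 10_000, 8_000, 10)
over_bundle = monthly_invoice_cents(200_000, 10_000, 12_500, 10)
```

Under this assumed plan, a month inside the bundle costs $2,000 and a month at 12,500 minutes costs $2,250. Both parties can verify the bill from a call log, which is much of why this structure is easy for buyers to accept.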
#8. Regulated industries adopt faster than you’d guess
Counter-intuitive but true: healthcare, insurance, and financial services have been among the fastest-adopting sectors. They care deeply about workflow structure, control, and compliance. If voice lets them gate conversations, validate identity before releasing information, and monitor agent behavior rigorously, they move faster than less-regulated markets.
Action item: Lead with controls, not capability. In regulated verticals, the compliance story is the sales story.
#9. Hunt the messy categories, not the obvious ones
Scheduling, debt collection, and recruiting are already crowded. The interesting opportunities are where voice enables something previously too nuanced, messy, or expensive to scale: legal intake, expert-network conversations, mixed-modality workflows combining voice with screen or imagery, and field inspections in industrial settings, roofing, and construction.
Action item: If your category has a clean ROI deck and ten well-funded competitors, you picked the easy wedge. Go find the operationally dense one.
Where voice is already a knife fight — and where the real opportunity still hides.
One of our favorite parts of the conversation. Mike’s read on where founders are piling in, and where the next category-defining companies are most likely to be quietly built.
Mike’s most provocative line:
That sounds extreme until you think about how bad most current service experiences are. If the agent is faster, more accurate, always available, and actually resolves the issue, the old assumption that humans are always preferred quietly breaks down.
Voice AI isn’t a feature added to software.
It’s a new chapter in software itself.
If you’re building — or backing — voice AI today, Mike’s worldview comes down to four commitments:
1. Pick an economically meaningful job. Not a demo, not a feature. A workflow with real labor cost attached, where ownership of the job rewrites the budget.
2. Build for trust to act. Governance, observability, permissions, audit trails. The buyer’s confidence to hand over the keys is the actual product.
3. Instrument great evals. Treat every call as product data. The compounding advantage is the eval system, not the underlying model.
4. Earn the right to own the workflow. Start with a wedge you can credibly close, expand into adjacent steps, and let graceful handoff carry you until the system can carry itself.
Not every voice company will matter. Not every demo will become a business. But the founders who pick economically meaningful jobs, build real trust, instrument great evals, and earn the right to own the workflow end to end — those are the ones with a shot at building very big companies.