
GitHub Copilot competitor Codeium raises $150M at a $1.25B valuation | TechCrunch


A startup whose product competes with GitHub Copilot and other AI-powered coding assistants has achieved unicorn status.

On Thursday, Codeium said it closed a $150 million Series C round led by General Catalyst that values the company at $1.25 billion post-money. The round, which also saw participation from existing investors Kleiner Perkins and Greenoaks, brings the company’s total funding raised to nearly a quarter-billion dollars ($243 million) a mere three years since its launch.

Codeium’s co-founder and CEO, Varun Mohan, told TechCrunch that Codeium has yet to touch the $65 million Series B tranche it raised in January. At that point, just eight months ago, Codeium was valued at half a billion dollars.

“Even though we’ve barely made a dent in our existing funding, we believe that this injection of capital will allow us to significantly ramp up R&D and growth while making even larger strategic bets,” he said.

Codeium was founded in 2021 by Mohan and his childhood friend and fellow MIT grad, Douglas Chen. Prior to Codeium, Chen was at Meta, where he helped to build software tools for VR headsets like the Oculus Quest. Mohan was a tech lead at Nuro, the autonomous delivery startup, responsible for managing the autonomy infrastructure team.

The startup began as a radically different company called Exafunction, focused on GPU optimization and virtualization for AI workloads. But in 2022, Mohan and Chen sensed a bigger opportunity in generative coding and decided to rebrand — and pivot.

“Despite the influx of generative AI tools, developers are still struggling with time-consuming coding tasks,” Mohan said. “Many of the AI-driven solutions provide generic code snippets that require significant manual work to integrate and secure within existing codebases. That’s where our AI coding assistance comes in.”

Codeium’s platform, powered by generative AI models trained on public code, serves up suggestions in the context of an app’s entire codebase. It supports around 70 programming languages and integrates with a number of popular development environments, including Microsoft Visual Studio and JetBrains.

Codeium’s AI-generated coding suggestions.
Image Credits: Codeium

To attract devs away from Copilot and other rivals, Codeium started out by offering a generous free tier. The strategy seems to have worked: Today, the startup has more than 700,000 users and over 1,000 enterprise customers, including Anduril, Zillow and Dell.

Quentin Clark, managing director at General Catalyst, implied that Codeium won some of its larger contracts by embracing a steadfastly client-centric approach to product research.

“The team’s approach has always been to follow its customers, leading the company to build solutions on their terms — deployable in any environment and supporting more languages than anyone else,” Clark said in a statement. “What Codeium has created isn’t just a demo, an announcement, or an idea — this is a fully scaling business, with large enterprises adopting the product across their entire organizations.”

Businesses are often wary of exposing proprietary code to a third party; Apple, for instance, reportedly banned staff from using Copilot last year, citing concerns about confidential data leakage. To allay such fears, Codeium began offering a self-hosted installation option alongside its standard software-as-a-service plan.


Companies can now deploy Codeium’s service on their own hardware if they wish. Or they can adopt a hybrid setup, keeping their data on their own devices while using Codeium’s servers for computing needs.

There’s always some risk involved in data transfers to the cloud, but Mohan claimed that Codeium leverages strong encryption. “We never train our proprietary generative autocomplete model on user data, never sell data and ensure all data transmission is encrypted,” he added.

Codeium has also taken steps to remove “non-permissively” licensed code (e.g., code under copyright) from the datasets it used to train its AI models. Some code-generating tools trained on restrictively licensed or copyrighted code have been shown to regurgitate that code when prompted in a certain way, posing a liability risk (i.e., developers who incorporate the code could be sued). Mohan said that’s not the case with Codeium, thanks to its training data prep and filtering approach.

“We also remove any remaining data that looks similar to code that is explicitly non-permissively licensed just in case other people copied code without providing the proper attribution and licensing,” he added. “On top of this, we have state-of-the-art, post-generation attribution filtering and logging in the case that these large probabilistic models produce code that is similar to public code, whether permissively or non-permissively licensed.”

But what about hallucinations? Most AI coding tools are notorious for making stuff up, which can be quite destructive in an enterprise environment.

An analysis by developer tooling startup GitClear found that generative AI tools have resulted in more mistaken code being pushed to codebases over the past few years. And a Purdue study found that over half the answers that OpenAI’s ChatGPT gives to programming questions are incorrect. Security researchers have warned of the potential for such tools to amplify existing bugs in software.

A recent survey from cybersecurity firm Snyk found that nine in ten developers worry about the broader security implications of using AI coding platforms. But Mohan claimed that Codeium’s supposedly superior, deep context-rich tech yields more trustworthy results than most.

“Our context awareness engine is able to ground results in what is already existing in a user’s codebase, leading to suggestions with fewer hallucinations and more adherence to existing syntax, semantics and standards,” he said.

Whether benchmarks back that up or not, Codeium’s sales pitch seems to be resonating with the right execs: Revenue hit eight figures this year. Mohan said the 80-person, Mountain View-based startup plans to expand headcount to 120 by 2025 as it aims to make a bigger dent in a market with formidable competitors like Tabnine, Anysphere and Poolside.

Catching up to Copilot, which had over 1.8 million paying users as of April, probably isn’t in the cards for Codeium — at least not imminently. It doesn’t have to be. As Mohan rightly noted, given the widespread adoption of AI coding tools among developers (despite their reservations), even a small slice of the nascent segment is bound to be lucrative.

Polaris Research projects that the AI code tools market will be worth $27.17 billion by 2032.

“An overabundance of hype is a challenge the industry faces,” Mohan said. “This will make it harder for every company to truly convince end users that they are at the forefront of possibility. But we believe that truth-seeking and realistic AI companies like Codeium will eventually cut through this noise.”


