Databricks: the $100 billion startup most people have never heard of

Among the ten highest-valued startups in the world, there is exactly one you've probably never heard of — it's called Databricks.

-

For perspective: the company is currently valued at $100–130 billion, and it sits on that list alongside household names like OpenAI and SpaceX.

Despite annual revenues exceeding $3.5 billion and a workforce of 8,000 employees, it has remained largely anonymous — for two reasons.

-

The first reason is that it operates in big data and AI model training on enterprise data — domains that don't reach the average consumer the way OpenAI does, and don't carry the spectacle of SpaceX's rockets.

The second reason is that its customers are governments and large corporations, keeping it deep in the shadows of the B2B world.

-

The company's technology is somewhat complex. To understand it, it helps to look at how data is stored at different scales.

Structured data at small scale — say, usernames for an e-commerce site — is typically stored in a database built like an Excel spreadsheet, with columns and rows. That kind of database is easy to manage, and data can be retrieved from it relatively simply using queries.
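
To make the spreadsheet comparison concrete, here is a minimal sketch using SQLite, the small relational database that ships with Python. The table name, column names, and sample users are all made up for illustration.

```python
import sqlite3

# In-memory relational database. The "users" table and its columns
# are hypothetical, chosen only to mirror the e-commerce example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, country TEXT)"
)
conn.executemany(
    "INSERT INTO users (username, country) VALUES (?, ?)",
    [("dana", "IL"), ("marco", "IT"), ("noa", "IL")],
)

# A query asks for specific columns from the rows that match a condition.
rows = conn.execute(
    "SELECT username FROM users WHERE country = ? ORDER BY username",
    ("IL",),
).fetchall()
usernames = [row[0] for row in rows]
print(usernames)
```

The point is that the structure (rows and columns) is known in advance, so one short query is enough to pull out exactly the data you want.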

Very large and diverse datasets, such as detailed medical records for millions of patients, are often stored in document databases like MongoDB, where data isn't organized in tables but as JSON-like documents saved in a binary format (BSON).

The world of big data is entirely different. Instead of an organized database, files of every type are piled into one massive heap in object storage, inside a container called a bucket (on Amazon's S3, for instance).

-

Managing files in big data is challenging and prone to errors. Attempting to update a file simultaneously from two different sources can cause corruption, and exporting cleanly structured data via queries is difficult.

Databricks bridges this gap with a management layer in which all metadata about the files, and every operation performed on them, is first recorded in a dedicated log file.
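
The idea of recording every operation in a log first can be sketched in a few lines of Python. This is a toy illustration only, not Databricks' actual implementation: the class `ToyTableLog`, the `_log` folder, and the file names are all hypothetical, and a local temp folder stands in for a bucket. Each commit is a numbered log entry, and a writer can only create the next number if nobody else already has, which is one simple way to stop two simultaneous updates from corrupting each other.

```python
import json
import os
import tempfile


class ToyTableLog:
    """Toy append-only log: each commit is a numbered JSON file
    listing which data files were added or removed."""

    def __init__(self, root):
        self.log_dir = os.path.join(root, "_log")
        os.makedirs(self.log_dir, exist_ok=True)

    def latest_version(self):
        versions = [int(name.split(".")[0]) for name in os.listdir(self.log_dir)]
        return max(versions, default=-1)

    def commit(self, read_version, add=(), remove=()):
        """Try to record an operation as version read_version + 1.
        Returns False if another writer already took that version."""
        path = os.path.join(self.log_dir, f"{read_version + 1:08d}.json")
        try:
            # Mode "x" creates the file only if it does not exist yet.
            with open(path, "x") as f:
                json.dump({"add": list(add), "remove": list(remove)}, f)
        except FileExistsError:
            return False
        return True

    def live_files(self):
        """Replay the log in order to find the files that make up the table now."""
        files = set()
        for name in sorted(os.listdir(self.log_dir)):
            with open(os.path.join(self.log_dir, name)) as f:
                entry = json.load(f)
            files |= set(entry["add"])
            files -= set(entry["remove"])
        return files


root = tempfile.mkdtemp()  # stand-in for a bucket
log = ToyTableLog(root)

v = log.latest_version()                         # empty table, so -1
first = log.commit(v, add=["part-0.parquet"])    # writer A gets version 0
second = log.commit(v, add=["part-x.parquet"])   # writer B read the same version: rejected
third = log.commit(
    log.latest_version(), add=["part-1.parquet"], remove=["part-0.parquet"]
)
print(first, second, third, log.live_files())
```

Because the log is the single source of truth, a reader never has to guess which files in the heap are current: it replays the log and gets a consistent answer, and the rejected writer can simply re-read the latest version and try again.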

-

The AI revolution has sent the company's valuation soaring, because Databricks offers businesses and governments the ability to build a custom AI model trained directly on their own enterprise data.

This approach delivers high model accuracy — the model is intimately familiar with all of the organization's data — along with full data security, since the model remains private.

An additional advantage is that all data stays entirely within the company's ownership, on its own storage servers, while the Databricks system trains the model via an external connection. This is especially important for government entities that handle sensitive information.

The company is expected to go public soon, and although competition already exists in the market — primarily from Snowflake — its technology is widely regarded as superior, and it continues to generate rapidly growing revenues.

--

*Pictured: Databricks' London office.*

👋 Hi, I'm Shlomo Strauss — follow me for more content on science and technology.
