AI Businesses Cloud Open Source

$28B Startup Says Companies Were Refusing Their Free Open-Source Code as 'Not Enterprise-Ready' (forbesindia.com) 49

"Ali Ghodsi was happily researching AI at Berkeley when he helped invent a revolutionary bit of code — and he wanted to give it away for free," remembers Forbes India. "But few would take it unless he charged for it.

"Now his startup is worth $28 billion, and the career academic is a billionaire with a reputation as one of the best CEOs in the Valley." (Literally. VC Ben Horowitz of Andreessen Horowitz calls him the best CEO in Andreessen Horowitz's portfolio of hundreds of companies.)

Inside a 13th-floor boardroom in downtown San Francisco, the atmosphere was tense. It was November 2015, and Databricks, a two-year-old software company started by a group of seven Berkeley researchers, was long on buzz but short on revenue. The directors awkwardly broached subjects that had been rehashed time and again. The startup had been trying to raise funds for five months, but venture capitalists were keeping it at arm's length, wary of its paltry sales. Seeing no other option, NEA partner Pete Sonsini, an existing investor, raised his hand to save the company with an emergency $30 million injection...

Many of the original founders, Ghodsi in particular, were so engrossed with their academic work that they were reluctant to start a company — or charge for their technology, a best-of-breed piece of future-predicting code called Spark, at all. But when the researchers offered it to companies as an open-source tool, they were told it wasn't "enterprise ready". In other words, Databricks needed to commercialise. "We were a bunch of Berkeley hippies, and we just wanted to change the world," Ghodsi says. "We would tell them, 'Just take the software for free', and they would say 'No, we have to give you $1 million'."

Databricks' cutting-edge software uses artificial intelligence to fuse costly data warehouses (structured data used for analytics) with data lakes (cheap, raw data repositories) to create what it has coined data "lakehouses" (no space between the words, in the finest geekspeak tradition). Users feed in their data and the AI makes predictions about the future. John Deere, for example, installs sensors in its farm equipment to measure things like engine temperature and hours of use. Databricks uses this raw data to predict when a tractor is likely to break down. Ecommerce companies use the software to suggest changes to their websites that boost sales. It's used to detect malicious actors — both on stock exchanges and on social networks.

Ghodsi says Databricks is ready to go public soon. It's on track to near $1 billion in revenue next year, Sonsini notes. Down the line, $100 billion is not out of the question, Ghodsi says — and even that could be a conservative figure. It's simple math: Enterprise AI is already a trillion-dollar market, and it's certain to grow much larger. If the category leader grabs just 10 percent of the market, Ghodsi says, that's revenues of "many, many hundred billions."

Later in the article Ghodsi offers this succinct summary of the market they entered.

"It turns out that if you dust off the neural network algorithms from the '70s, but you use way more data than ever before and modern hardware, the results start becoming superhuman."

  • by 93 Escort Wagon ( 326346 ) on Saturday October 23, 2021 @03:41PM (#61920733)

    Does this “reluctant” company just happen to be heading towards another round of attempting to raise capital - or an IPO?

    Oh, hey, there it is in the last paragraph!

    • by parityshrimp ( 6342140 ) on Saturday October 23, 2021 @05:47PM (#61920967)

      I was done after this sentence:

      Databricks' cutting-edge software uses artificial intelligence to fuse costly data warehouses (structured data used for analytics) with data lakes (cheap, raw data repositories) to create what it has coined data "lakehouses" (no space between the words, in the finest geekspeak tradition).

      • Yup. It had nothing to do with being open source. It wasn't Enterprise Ready(tm) until it had insulted the English language,

        • Oooh this is cool. Thank you. And I just googled how to make it stop showing me release notes when there's an update. I think I ran into the safe mode thing when I was tired and working late at night and it pissed me off enough that I didn't look for the settings... I really don't like things I didn't ask for popping up in my face.
  • by retchdog ( 1319261 ) on Saturday October 23, 2021 @03:57PM (#61920787) Journal

    a clumsy service wrapped around a clumsy frontend for spark. you'd have to pay me to use it.

    they should call it data-outhouse.

    • Got any better alternatives?
      • Got any better alternatives?

        A monkey and a whiteboard for it to fling poo at. Predictive accuracy is probably about the same.

      • by sfcat ( 872532 ) on Sunday October 24, 2021 @11:52AM (#61922351)

        There are several. The funny part is that in the 70s, technology was invented at Berkeley that would process data at 10X the speed of Databricks' technology. Specifically, it was the datapath architecture at the heart of databases. Today, there is a product that uses this to make a Spark-like system with 10X the throughput. It is called SQLStream Blaze [sqlstream.com]. Disclaimer: I was one of the architects. The problem... you have to program it in SQL (instead of Scala or some other GP language). It's not just a DB; it does streaming queries at 16MB/core second. Basically it does inserts from selects between streams (which are like tables but without the persistence).

        It could have a Scala front end to it on top of the SQL one but we never had the resources to write it. The reason: the VCs threw all their money at companies reselling Spark. The even worse part of all of this is that Spark was created at the same place where the right way to do an analytics architecture was invented. Literally, all they had to do to write Spark correctly was listen to the most famous Software Professor that UC Berkeley had ever produced. But they were too good for that and instead invented their own way to do it that is 10X slower. So now, all of our analytics clusters take 10X the power and time to do the same thing as if we used datapath architectures instead. People like Ali Ghodsi are complete charlatans and if someone wants to know why SV is going down the drain, I point at people like him. He was a full professor and should know that what he is slinging is BS. But the money causes him to sling his BS to the detriment of everyone. In the past, VCs would employ tech due diligence experts who would have told them this guy isn't an expert in systems programming and his code is slow and reinvents the wheel badly. But his success shows that if you throw enough BS and marketing, even SV can be fooled quite easily.
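
For readers unfamiliar with the "inserts from selects between streams" idea described above, here is a minimal sketch in Python rather than SQL. It is not SQLstream's API or implementation, and it says nothing about throughput. The record fields, the `select_where` helper, the 100-degree threshold, and the sample readings are all hypothetical; the sketch only shows a continuous query that reads one unpersisted stream and feeds its matching rows onward.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, float]


def sensor_stream() -> Iterator[Record]:
    """Hypothetical source stream: rows are yielded once and never stored."""
    readings = [
        {"engine_id": 1, "temp_c": 88.0},
        {"engine_id": 2, "temp_c": 121.5},
        {"engine_id": 1, "temp_c": 93.0},
        {"engine_id": 3, "temp_c": 130.0},
    ]
    for row in readings:
        yield row


def select_where(source: Iterable[Record],
                 predicate: Callable[[Record], bool]) -> Iterator[Record]:
    """Rough analogue of SELECT ... WHERE over a stream, evaluated lazily."""
    for row in source:
        if predicate(row):
            yield row


def hot_engine_ids(source: Iterable[Record]) -> List[int]:
    """Rough analogue of INSERT INTO hot SELECT engine_id FROM source WHERE temp_c > 100."""
    hot = select_where(source, lambda r: r["temp_c"] > 100.0)
    # Collected into a list here only so the result can be printed;
    # a real streaming system would keep passing rows downstream.
    return [int(row["engine_id"]) for row in hot]


if __name__ == "__main__":
    print(hot_engine_ids(sensor_stream()))
```

Because both stages are generators, each row flows through the filter as it arrives and nothing is held in a table, which is the "without the persistence" part of the description.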

      • i think sfcat made the point clearly, but not completely.

        the key to avoiding databricks is to know what you're doing in the first place. databricks is trying to position itself as the microsoft word of whatever it does, but if you look at the testimonials, it's generally being used for glorified ETL. nothing wrong with ETL, but 1) databricks promotes itself as a viable AI platform and it's essentially fraudulent. the spark ML built-ins are terrible, and the spark parallelism engine is terrible, and 2) if yo

    • by narcc ( 412956 ) on Saturday October 23, 2021 @04:30PM (#61920841) Journal

      a clumsy service wrapped around a clumsy frontend for spark

      Sounds enterprise ready to me. What else is on the check list ...

      Does it require at least two different, highly specific, versions of crystal reports and the full installation of the quickbooks SDK for no discernible reason?

      • by Junta ( 36770 ) on Saturday October 23, 2021 @06:28PM (#61921005)

        Despite decades in the industry, I've never managed to understand how 'enterprises' end up paying crazy amounts of money for terrible software that works so poorly and generally even when 'working' doesn't seem to provide benefit to... anyone in particular.

        • I agree. The company I work for used a kludge of accounting software to handle their distribution.

          Half of their issues could be cleaned up by using modern current software designed for distribution first and accountants second.

          Nope, instead they have spent tens of thousands customizing the whole thing around their business.

        • by ShanghaiBill ( 739463 ) on Saturday October 23, 2021 @09:48PM (#61921267)

          Despite decades in the industry, I've never managed to understand how 'enterprises' end up paying crazy amounts of money for terrible software

          Let me explain:

          Let's say you use free software. Who gets the blame when things go wrong? Answer: You do.

          Let's say you buy cheap software. Who gets the blame? Answer: Still you, because you cheaped out and the guy who recommended the expensive option will be pointing the finger at you and saying "I told you so!"

          Now, let's say you spend millions on Oracle Enterprise crapware. Of course, that is going to be a disaster. But you paid them MILLIONS, so you can smugly point the finger at them while waving around the contract full of broken promises, and avoid all culpability.

          • by sfcat ( 872532 )
            Never tried to use Oracle support I see. I assure you that in the case where their software breaks and your infrastructure fails, the problem is still yours to fix. And you are still on the hook to try to fix it. You might be able to get your boss to see that it isn't your fault. But when the division is downsized due to poor performance, you are the one shown the door all the same. Blaming doesn't fix the problem. And support often doesn't know why the bug is happening in the first place.
        • by sfcat ( 872532 )
          So sometimes people will use phrases to hide meaning. In this case, what they were telling him is that his code was architected incorrectly and was 10X slower than other implementations. Sometimes free means terrible. Hadoop is free and total dogshit. There's a lot of shit opensource projects. There are amazing ones too. So there is certainly a precedent for free being awful sometimes. Giving something away for free doesn't mean it was good in the first place. Spark certainly isn't. Spark is what ha
      • Well we can offer you three contractors at $300/h each. Does that make it feel enterprise now?

        Truth is if an expense is saving the company money or making it money, then every piece of that expense is worthwhile.

    • by crunchygranola ( 1954152 ) on Saturday October 23, 2021 @05:45PM (#61920961)

      Says someone who knows nothing about it.

      I am using Databricks on a huge enterprise project and it is awesome.

      But dismissing something as crap just sounds so cool!

      • by ChatHuant ( 801522 ) on Saturday October 23, 2021 @09:13PM (#61921221)

        Well, if you did use it and know what you're talking about you should be disqualified from posting in this thread. How can others feel smug trashing it when you show up with actual user experience?

        • by sfcat ( 872532 )
          Do you work for Databricks? Because I've literally heard 100s of people talk about it and you are the first to say anything nice at all about it. Seriously, it's a sport in multiple places I've consulted. Kinda like "your momma" jokes.
      • Would you mind telling us what's great about it?

  • by Anonymous Coward on Saturday October 23, 2021 @04:19PM (#61920819)

    How fucking hard is that to understand? A business enterprise isn't interested in a pile of open source software getting tossed at it by some academics. They want to know that you'll fix bugs, update features, provide training and basically stick around long enough to continue doing that sort of thing. And they'll pay for it.

    I have to admit to being jealous of all the bank they're pulling, especially because the article has nothing about any actual value their software provides other than waving an AI magic wand.

    • I have to admit to being jealous of all the bank they're pulling, especially because the article has nothing about any actual value their software provides other than waving an AI magic wand.

      That's how the late stage capitalism game is played. The only way you make more money is by starting with enough money to pitch your bullshit ideas to other rich suckers and getting them to fall for it.

    • by Junta ( 36770 ) on Saturday October 23, 2021 @06:42PM (#61921023)

      Often 'Enterprise-Ready' means "gives the vendor lots of money, but provide crap for support anyway"

      I remember even early in my career, we had several major issues with various 'enterprise-ready' products in a short time:
      -IBM support was unable to figure out why the software was crashing, and told us we'd have to pay more to debug the software that we already paid for that wouldn't even run at all. Ultimately I figured it out from their stack traces and guessing what they might have been doing and what about our data might trigger it. I even offered a test case and they said we'd have to pay more to send the test case to their developers.
      -A commercial SMB file serving application for Solaris was failing and the vendor was three days into a support ticket open about why updated Windows workstations were unable to access the share. I did a search and found the solution in a samba mailing list because samba had the same thing.
      -Two different hardware appliances from different hardware vendors with 'premium' upgraded warranties failed, one had a parts shortage and said there'd be a month before we could get a replacement part, and the other the vendor said not only were they out of parts, but they were never going to restock, and told us to buy a newer model at our own expense.

      • by khchung ( 462899 )

        Often 'Enterprise-Ready' means "gives the vendor lots of money, but provide crap for support anyway"

        I see you never understood what "support" really meant to an enterprise.

        You thought it meant putting in skilled people to fix any of your problems, but to an enterprise, "support" really meant "taking the blame" and follow up with whatever action is needed to pacify the stakeholders. The key is so that the managers of the enterprise won't get blamed.

        • by sjames ( 1099 )

          But keep in mind, that pacification is generally in the form of presenting an invincible armor of lawyers using the pages of fine print as a bludgeon.

    • by Bert64 ( 520050 )

      They want to *think* that you'll do those things, in reality most "enterprise" software is full of bugs that never get fixed, and instead you end up paying even more money for consultants to work around the bugs and training your users how to work around the bugs.

      Generally the more it costs the more buggy it is and the less likely anyone will ever fix the bugs, but because you spent so much on it you don't want to admit how buggy it is and end up doubling down on it.

      • They are happy to pay someone to take care of problems as they arise, and who will have an interest in keeping them happy enough that they will help (for a fee) to resolve the issues. The zero cost of the software is not that much of an attraction if you will have to spend even more trying to support the software internally.

    • by sjames ( 1099 )

      Since when has enterprise software had a bug fixed (usually badly) without billing the time to the customer anyway? Same for training. Might as well hire a 3rd party or even do it internally.

    • I wonder about Slashdot sometimes...

      Whenever there's an article about the low adoption of Linux in the consumer market, people around here can only look at things from an enterprise point of view, and are totally stumped as to why Linux isn't more popular in the mainstream. Here we have an article which raises a support issue very obvious to anyone familiar with the enterprise market, and people around here are stumped once again.

    • I see this all the time. PHBs will more readily believe a product, service, or advice if it comes at a cost. The higher the cost, the more believable it is. It's a version of the fallacy of Appeal to Authority. I've seen this in my own organization, it goes something like this:

      Sr. Network Engineer to PHB: If we're going to grow at the scale you are stating, I'll need two more engineers, some new hardware, and about six months to pull it all together. The first year total should be no more than $500-750k (total

  • by oldgraybeard ( 2939809 ) on Saturday October 23, 2021 @04:36PM (#61920859)
    to buy their service/product why are they having issues at all? This looks like a PR announcement for their next funding round (of other people's money).
  • It looks like the summary of this article is trying to ridicule companies that were reluctant to accept the product this startup was offering for free. But "free" is sometimes too expensive. The product they develop is some sort of data warehouse. What kind of guarantees did this startup offer regarding security, protection of personal data, protection against data loss, etc.? Was the product easy to implement? Did it have the right integrations? Was it easy to administer and scale? If the "free" product l
    • by sfcat ( 872532 )
      It is rebranded Spark with IBM level prices for consulting services which will "implement" it for you. It is by a wide margin the most expensive way to get the least quality for a modern analytics system. It's not a data warehouse and this business exists solely because most businesses have no idea how to build an analytics infrastructure (and even less reason to have one at all) but recently have decided they need one (even though they don't really know what it does).
  • [The software uses] data warehouses with data lakes to create what it has coined data "lakehouses"... Users feed in their data and the AI makes predictions about the future.

    They missed the most important part, the stuff hidden in the ellipsis.

  • "It turns out that if you dust off the neural network algorithms from the '70s, but you use way more data"
    • by sfcat ( 872532 )
      It's funny and sad because the neural net technology is what he is supposed to be an expert in. And the current generation of neural networks look almost nothing like the papers from the 70s. If you give this company your money, you deserve to lose it.
  • There was a useful open-source library that an engineer wanted to use, where I work. We had to reject it because the open-source license had been amended to include prohibition against being used by "greedy capitalists." While we don't consider ourselves "greedy capitalists" (and, in fact, we are occasionally accused of being liberal socialists by people on the internet), it wasn't clear what the exact legal implications of that statement were. So, we rejected the use of the library as "not having an enter

  • Back around 2000, Linux wasn't being accepted, and Linus was near bankrupt. Then Slashdot told him to found Red Hat... and all the problems were solved.

    • by ebvwfbw ( 864834 )

      Back around 2000, Linux wasn't being accepted, and Linus was near bankrupt. Then Slashdot told him to found Red Hat... and all the problems were solved.

      Linux has nothing to do with RedHat. RedHat was founded in 1994 and I met the guys in 1995 in Washington DC pitching the company. I was in on the IPO I think that was in 2000 and the rest is history. They were never near being bankrupt. In fact they were the only ones making money for years and proved it could be done. If it weren't for redhat things like Ubuntu/debian/suse wouldn't be around. Especially debian with how much stuff they stole from redhat.

      These things predate slashdot and Cmdr taco.

  • You don't need AI to graph a correlation between engine temperature and engine breakdown over time.

    Everything with AI is an overhyped scam right now.
    • the impact of overheating on an engine is a classical example where AI (machine learning) definitely helps. The reason is that you can have an infinite number of scenarios for overheating. It's only a small fraction of them (long-term exposure to a constant heat) where it MIGHT be true that you don't need AI to graph the correlation.

    • You don't need AI to know that you are overheating, or that failure is imminent. It might however be handy for figuring out why you are overheating, or even if they are at risk of same. Powertrain warranty terms have tended to increase. If you're going to have longer warranty periods you're going to want to predictively detect failures and prevent them if possible, and also to have leisure to schedule service so that you can maintain the minimum repair resources while still providing the maximum uptime.
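
For anyone who wants to see what the "no AI needed" baseline in this sub-thread looks like, here is a minimal sketch that computes a plain Pearson correlation between peak engine temperature and a failure flag. The readings are made up for illustration and are not John Deere or Databricks data, and `pearson` is a hand-rolled helper, not anyone's production method. It covers only the simple single-variable case; it does not settle whether machine learning helps with the more varied overheating scenarios described in the replies.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: peak temperature per engine (degrees C) and whether
# that engine failed within the following month (1 = failed, 0 = did not).
peak_temp_c = [85.0, 90.0, 95.0, 105.0, 110.0, 120.0]
failed_within_month = [0, 0, 0, 1, 0, 1]

r = pearson(peak_temp_c, failed_within_month)

if __name__ == "__main__":
    print(f"correlation between peak temperature and failure: {r:.2f}")
```

A single coefficient like this summarizes one linear relationship between one sensor and one outcome, which is roughly the case where the thread agrees nothing fancier is required.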

  • by msauve ( 701917 ) on Saturday October 23, 2021 @09:59PM (#61921289)
    "It turns out that if you dust off the neural network algorithms from the '70s, but you use way more data than ever before and modern hardware, the results start becoming superhuman."

    ELIZA, is that you?
  • by Casandro ( 751346 ) on Sunday October 24, 2021 @04:46AM (#61921653)

    I mean first of all "Enterprise Ready" usually is code for "so many secret bugs we will never fix, you'll need to hire consultants to use it".
    Then it seems to be the usual "AI"-bullshit which uses decades old ideas to somehow avoid thinking about statistics.
    To top it all, it seems to argue that, just because they are valued by investors, the company is somehow important or actually worth a lot. We have seen over and over with companies like "WeWork" or "Theranos" that this simply isn't the case.
