AI Open Source

Databricks Claims Its Open Source Foundational LLM Outsmarts GPT-3.5 (theregister.com) 17

Lindsay Clark reports via The Register: Analytics platform Databricks has launched an open source foundational large language model, hoping enterprises will opt to use its tools to jump on the LLM bandwagon. The biz, founded around Apache Spark, published a slew of benchmarks claiming its general-purpose LLM -- dubbed DBRX -- beat open source rivals on language understanding, programming, and math. The developer also claimed it beat OpenAI's proprietary GPT-3.5 across the same measures.

DBRX was developed by Mosaic AI, which Databricks acquired for $1.3 billion, and trained on Nvidia DGX Cloud. Databricks claims it optimized DBRX for efficiency with what it calls a mixture-of-experts (MoE) architecture -- where multiple expert networks or learners divide up a problem. Databricks explained that the model possesses 132 billion parameters, but only 36 billion are active on any one input. Joel Minnick, Databricks marketing vice president, told The Register: "That is a big reason why the model is able to run as efficiently as it does, but also runs blazingly fast. In practical terms, if you use any kind of major chatbots that are out there today, you're probably used to waiting and watching the answer get generated. With DBRX it is near instantaneous."
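
As a rough illustration of what those two parameter counts imply, here is a minimal back-of-the-envelope sketch in Python. The parameter counts come from the summary above; the 2 bytes per parameter (16-bit weights) is an assumption for illustration, not something the article states, and the function names are hypothetical.

```python
# Back-of-the-envelope arithmetic for the parameter counts quoted in the summary.
# Nothing here measures the real model; it only multiplies and divides.

TOTAL_PARAMS = 132_000_000_000   # total parameters, per the summary
ACTIVE_PARAMS = 36_000_000_000   # parameters active on any one input, per the summary
BYTES_PER_PARAM = 2              # ASSUMPTION: 16-bit weights (not stated in the article)


def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in decimal gigabytes."""
    return num_params * bytes_per_param / 10**9


def active_fraction(active: int, total: int) -> float:
    """Share of the parameters that take part in processing a single input."""
    return active / total


if __name__ == "__main__":
    all_gb = weight_memory_gb(TOTAL_PARAMS, BYTES_PER_PARAM)
    active_gb = weight_memory_gb(ACTIVE_PARAMS, BYTES_PER_PARAM)
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"all weights:    {all_gb:.0f} GB")
    print(f"active weights: {active_gb:.0f} GB")
    print(f"active share:   {share:.1%}")
```

Under that 16-bit assumption the full set of weights comes to 264 GB, while the active subset is a little over a quarter of the parameters. The smaller active count reduces compute per input, but it does not by itself shrink the memory needed to hold the model.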

But the performance of the model itself is not the point for Databricks. The biz is, after all, making DBRX available for free on GitHub and Hugging Face. Databricks is hoping customers use the model as the basis for their own LLMs. If that happens it might improve customer chatbots or internal question answering, while also showing how DBRX was built using Databricks's proprietary tools. Databricks put together the dataset from which DBRX was developed using Apache Spark and Databricks notebooks for data processing, Unity Catalog for data management and governance, and MLflow for experiment tracking.


  • but I don't brag about it.

  • by Tom ( 822 )

    The model requires ~264GB of RAM

    (from the github link)

    Bit much for my local setup, so we'll have to wait a bit before we can test this properly.

    • ... should be in quotes - if anything, it's even more restrictive than the LLaMA license. In particular banning the use of its outputs to train anything that's not DBRX related.

      This is beyond, as you note, that it's too large to run locally. Saying "only 36 billion are only active on any input" isn't useful with MOEs because the decision on gating isn't taken until you get to that layer / token, based on the inputs leading up to it, and loading on-demand is simply not practical, speed-wise. You have to ha

      • by Rei ( 128717 )

        ** The "in particular" is a commonality with the LLaMA license, not an example of a new restriction

  • by zmollusc ( 763634 ) on Tuesday April 02, 2024 @04:22AM (#64363174)

    I am running a LLM instance on drastically underclocked hardware and it is outperforming everything else, in that it produces fewer errors and hallucinations per hour than any other chatbot.

  • Emacs doctor [emacswiki.org] was ahead of its time.
    Still better than GPT, because it asks you to think by yourself and find the answer.
  • We see constant headlines about better performance, but they offer only a foggy notion of what that means. We have a need for speed. Real-world users aren't going to wait 15 to 60 seconds to get a response from a prompt, even if it is a good answer. And if the current crop of AI really gains traction, quotas for GPU time will become a very real problem, and time to response will stunt adoption of the technology.
  • It may be source-available, but this is a shitty closed license resembling Facebook's Llama license.

    • by Rei ( 128717 )

      Yep. If anything, it's even worse.

      Mixtral is still the best that's truly open (and also not e.g. something that claims to be open but was clearly trained on closed inputs and which the ancestral model(s)' author(s) could make claims to).

      • I thought Stable Diffusion was open source?
        • by Rei ( 128717 )

          This discussion is about LLMs, not diffusion models.

          But for the record, Stability uses a wide range of different licenses for their different products, some of which are entirely proprietary, and others of which are quite open. They've been trending towards increasingly closed, though.

  • Define your niche, find a good 7B-30B model and you outsmart GPT-3.5 (beginning with 70B, sometimes even GPT-4).
    Most claims to outsmart GPT-3.5 in all disciplines with a single model are false.
