Open Source AI

Grok AI Goes Open Source (venturebeat.com) 38

xAI has open sourced its large language model Grok. From a report: The move, which Musk had previously proclaimed would happen this week, enables any other entrepreneur, programmer, company, or individual to take Grok's weights -- the strengths of the connections between the model's artificial "neurons," the software modules that let the model accept inputs, make decisions, and produce outputs as text -- along with the associated documentation, and use a copy of the model however they'd like, including for commercial applications.

"We are releasing the base model weights and network architecture of Grok-1, our large language model," the company announced in a blog post. "Grok-1 is a 314 billion parameter Mixture-of-Experts model trained from scratch by xAI." Those interested can download the code for Grok on its Github page or via a torrent link. Parameters refers to the weights and biases that govern the model -- the more parameters, generally the more advanced, complex and performant the model is. At 314 billion parameters, Grok is well ahead of open source competitors such as Meta's Llama 2 (70 billion parameters) and Mistral 8x7B (12 billion parameters). Grok was open sourced under an Apache License 2.0, which enables commercial use, modifications, and distribution, though it cannot be trademarked and there is no liability or warranty that users receive with it. In addition, they must reproduce the original license and copyright notice, and state the changes they've made.

This discussion has been archived. No new comments can be posted.

Grok AI Goes Open Source

Comments Filter:
  • Comment removed (Score:5, Informative)

    by account_deleted ( 4530225 ) on Monday March 18, 2024 @11:04AM (#64325141)
    Comment removed based on user account deletion
  • by Press2ToContinue ( 2424598 ) on Monday March 18, 2024 @11:09AM (#64325161)
    giving the keys to the matrix to everyone.
    • by alvinrod ( 889928 ) on Monday March 18, 2024 @11:40AM (#64325233)
      That's the whole point of OSS. Why let a few governments or corporations control it? Not everyone will want it or even be capable of using it, but the opportunity is there. Otherwise you may as well say the same about an operating system or a web server.
    • More like giving everyone a couple of wooden blocks to play with.
    • by Rei ( 128717 )

      I dunno, I'm going to wait for some actual benchmarks.

      It's a gigantic model, by far the largest truly open-source (i.e. Apache, MIT, or similarly licensed) model out there. But I strongly suspect it's severely undertrained relative to its size.

      Would probably be a great base for any research on increasing information density in models via downscaling, though.
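
      For context on the "undertrained" suspicion: the Chinchilla scaling result (Hoffmann et al., 2022) suggests roughly 20 training tokens per parameter for a compute-optimal dense model. Applying that heuristic naively to Grok-1's total parameter count (MoE models complicate the math, and xAI hasn't disclosed the training token count) gives a sense of the scale involved:

          # Back-of-the-envelope Chinchilla check; the 20 tokens/param figure
          # is a dense-model heuristic, applied naively to an MoE's total count.
          params = 314e9
          tokens_per_param = 20
          print(f"~{params * tokens_per_param / 1e12:.1f}T training tokens")  # ~6.3T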

  • Comment removed (Score:4, Informative)

    by account_deleted ( 4530225 ) on Monday March 18, 2024 @11:22AM (#64325203)
    Comment removed based on user account deletion
    • Deep Thought [wikipedia.org] was named after the computer from the Hitchhiker’s Guide to the Galaxy [wikipedia.org].
    • by Rei ( 128717 )

      Elon himself has made the same mistake ;)

  • by Rosco P. Coltrane ( 209368 ) on Monday March 18, 2024 @11:38AM (#64325223)

    is fucking debilitating. Even Soviet propaganda didn't come that thick and fast.

    • by Anonymous Coward

      Would you like some bitcoin stories instead?

    • by Rei ( 128717 ) on Monday March 18, 2024 @11:51AM (#64325281) Homepage

      Man, you seem stressed. Why not vent to ChatGPT about your problems and ask it for some advice on how to deal with the situation?

      • by mjwx ( 966435 )

        Man, you seem stressed. Why not vent to ChatGPT about your problems and ask it for some advice on how to deal with the situation?

        Oddly enough, I can see that being a genuinely useful application of ChatGPT or another "AI" platform, if it can be programmed to provide useful suggestions to people venting their problems at it. A FreudBot, if you will. A lot of people are reluctant to use mental health services, even when they're abundant and free, because they're scared of being judged when talking to an actual person. That barrier is much more surmountable when talking to a machine.

        OTOH, you'd want it to be managed and programmed by someone reputable.
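
        For what it's worth, the mechanics of a "FreudBot" are only a few lines against any chat-completion API; the safety and oversight concerns the commenter raises are exactly the hard part this glosses over. A minimal sketch using the OpenAI Python client, where the model name and system prompt are placeholders:

            # Hypothetical "FreudBot" sketch; assumes OPENAI_API_KEY is set.
            from openai import OpenAI

            client = OpenAI()
            system = ("You are a supportive, non-judgmental listener. Reflect the "
                      "user's feelings and suggest small practical steps. You are "
                      "not a therapist; for serious issues, recommend seeing one.")

            resp = client.chat.completions.create(
                model="gpt-3.5-turbo",  # placeholder model name
                messages=[{"role": "system", "content": system},
                          {"role": "user", "content": "I'm stressed about work."}],
            )
            print(resp.choices[0].message.content)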

    • Be the ignore you want to see in the world.

    • Soviet propaganda was rate limited by scarcity of typewriter ribbons and carbon paper, and by KGB review consequences.
  • by Dan East ( 318230 ) on Monday March 18, 2024 @12:05PM (#64325333) Journal

    The model is too big for casual use (i.e. running inference at home). Llama can just barely fit on either one 48 GB GPU or two 24 GB GPUs. Grok has vastly more parameters, placing it well outside the realistic reach of hobbyists who want to run inference on their own hardware. Sure, you can always run it on a normal CPU with a ton of RAM, but you'd be looking at multiple seconds per token.
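
    The arithmetic behind that: weight storage alone is parameter count times bytes per weight, before activations and KV cache. A quick estimate, assuming standard precisions (rough figures, not measured numbers):

        # Rough weight-only memory footprint for Grok-1 at various precisions.
        def weight_gib(params_billions, bytes_per_weight):
            return params_billions * 1e9 * bytes_per_weight / 2**30

        for name, bpw in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
            print(f"{name}: ~{weight_gib(314, bpw):,.0f} GiB")
        # fp16: ~585 GiB, int8: ~292 GiB, int4: ~146 GiB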

    • Sure, "more" sounds better, but those need to live somewhere. And I haven't seen a GPU with an SSD yet to use as virtual memory (or whatever way you'd get crazy high amounts of parameters local).

      Scaling by the same factor gives 215GB of VRAM necessary, right? And me still sitting here with a 1080 ti...

      Hmm... 11GB dedicated, and 16GB shared? That's probably just CPU RAM that it can address (doesn't really speed things along).

  • Grok-1 doesn't seem like a terrible deal for something halfway between GPT-3.5 and GPT-4: 314B parameters, 8 experts, 2 experts per token... If that translates to something like 70B active parameters per token, mere mortals with lots of RAM should be able to get a few tokens/s on a CPU.
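
    A rough sanity check on that, assuming almost all parameters live in the experts (the real split between shared and expert weights differs) and that CPU inference is memory-bandwidth bound:

        # Active parameters per token for a 2-of-8 MoE, plus a crude CPU
        # speed estimate; bandwidth and quantization figures are assumptions.
        total, n_experts, active_experts = 314e9, 8, 2
        active = total * active_experts / n_experts   # ~78.5B touched per token
        bandwidth = 100e9                             # bytes/s, high-end DDR5 desktop
        bytes_per_weight = 1                          # int8 quantization
        print(f"~{bandwidth / (active * bytes_per_weight):.1f} tokens/s")  # ~1.3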

  • They start the trademark claim yet?

    • by CAIMLAS ( 41445 )

      On what basis?

      The word 'grok' has existed since 1961, and has been in common colloquial use (in some demographics) for decades. It's in the New Hacker's Dictionary, descended from The Jargon File (Jargon-1), which goes back to 1975. I was using the term in the '90s in high school.

      Their misspelling is trademarked, not the word, which has clear prior art. (That's also probably why the model is called grok-1, I assume.)

      As near as I can tell, there's no overlap between Groq and grok.

      Furthermore, grok's e

  • Can I run this locally somehow, ask it things and get the same results back as I would talking to the real Grok AI? Does this include all the training data?

  • The move ... now enables any other entrepreneur, programmer, company, or individual to take Grok's weights ... and use a copy of the model for whatever they'd like...

    That should probably read "any[one] who not only has 0.65TB of spare storage but also access to enormous amounts of GPU processing." The model will just about fit in an AWS P5 instance, with a current on-demand price of $98.32 per hour, or roughly $860K per year before bulk discounts. I'm sure there will be plenty of takers, but it's a bit of a stretch to say "any other entrepreneur, programmer, company, or individual" just yet.
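
    The yearly figure checks out as straight multiplication of the quoted hourly rate:

        # On-demand cost at the quoted P5 hourly rate, before discounts.
        hourly = 98.32
        print(f"${hourly * 24 * 365:,.0f}/year")  # $861,283/year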

  • If it says anything even remotely negative about Musk?
