Mistral Confirms New Open Source AI Model Nearing GPT-4 Performance (venturebeat.com)

An anonymous reader quotes a report from VentureBeat: The past few days have been a wild ride for the growing open source AI community -- even by its fast-moving and freewheeling standards. Here's the quick chronology: on or about January 28, a user with the handle "Miqu Dev" posted a set of files on HuggingFace, the leading open source AI model and code sharing platform, that together comprised a seemingly new open source large language model (LLM) labeled "miqu-1-70b." The HuggingFace entry, which is still up at the time of this article's posting, noted that the new LLM's "prompt format" -- how users interact with it -- was the same as that of Mistral, the well-funded open source Parisian AI company behind Mixtral 8x7b, viewed by many as the top-performing open source LLM presently available. The model itself appeared to be a fine-tuned and retrained version of Meta's Llama 2.

The same day, an anonymous user (possibly "Miqu Dev") posted a link to the miqu-1-70b files on 4chan, the notoriously longstanding haven of online memes and toxicity, where users began to notice it. Some took to X, Elon Musk's social network formerly known as Twitter, to share the discovery of the model and what appeared to be its exceptionally high performance at common LLM tasks (measured by tests known as benchmarks), approaching the previous leader, OpenAI's GPT-4, on EQ-Bench. Machine learning (ML) researchers took notice on LinkedIn as well. "Does 'miqu' stand for MIstral QUantized? We don't know for sure, but this quickly became one of, if not the best open-source LLM," wrote Maxime Labonne, an ML scientist at JPMorgan Chase, one of the world's largest banking and financial companies. "Thanks to @152334H, we also now have a good unquantized version of miqu here: https://lnkd.in/g8XzhGSM." Quantization in ML refers to a technique used to make it possible to run certain AI models on less powerful computers and chips by replacing specific long numeric sequences in a model's architecture with shorter ones. Users speculated "Miqu" might be a new Mistral model being covertly "leaked" by the company itself into the world -- especially since Mistral is known for dropping new models and updates without fanfare through esoteric and technical means -- or perhaps an employee or customer gone rouge.
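The quantization idea described above can be sketched in a few lines. This is an illustrative simplification only -- real schemes (GPTQ, GGUF k-quants, etc.) use per-group scales and calibration data -- and the weight values here are arbitrary:

```python
# Toy sketch of weight quantization: map float weights onto a small set of
# integer levels plus one scale factor, trading a little precision for a
# much smaller representation. Values and helper names are illustrative.

def quantize(weights, bits=8):
    """Map float weights onto symmetric integer levels with a scale factor."""
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for 8-bit
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) for w in weights], scale

def dequantize(q_weights, scale):
    """Approximately reconstruct the original floats."""
    return [q * scale for q in q_weights]

weights = [0.82, -1.51, 0.03, 1.27]
q, scale = quantize(weights)
restored = dequantize(q, scale)
# Each restored value is close to, but not exactly, the original:
# smaller storage and cheaper arithmetic at the cost of some precision.
```

Running a 70b-parameter model on consumer hardware depends on exactly this kind of trade.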

Well, today it appears we finally have confirmation of the latter of those possibilities: Mistral co-founder and CEO Arthur Mensch took to X to clarify: "An over-enthusiastic employee of one of our early access customers leaked a quantized (and watermarked) version of an old model we trained and distributed quite openly... To quickly start working with a few selected customers, we retrained this model from Llama 2 the minute we got access to our entire cluster -- the pretraining finished on the day of Mistral 7B release. We've made good progress since -- stay tuned!" Hilariously, Mensch also appears to have taken to the illicit HuggingFace post not to demand a takedown, but to leave a comment that the poster "might consider attribution." Still, with Mensch's note to "stay tuned!" it appears that not only is Mistral training a version of this so-called "Miqu" model that approaches GPT-4 level performance, but it may, in fact, match or exceed it, if his comments are to be interpreted generously.

This discussion has been archived. No new comments can be posted.

Comments Filter:
  • lmg (Score:3, Informative)

    by systemd-anonymousd ( 6652324 ) on Wednesday January 31, 2024 @08:05PM (#64204794)

    >The same day, an anonymous user on 4chan (possibly "Miqu Dev") posted a link to the miqu-1-70b files on 4chan, the notoriously longstanding haven of online memes and toxicity, where users began to notice it.

    How offensive. /g/'s lmg is one of the few places where ML devs are having unfiltered, realtime conversations, and they know about things days before outlets like VentureBeat pick them up.

    • by dfghjk ( 711126 )

      "...one of the few places where ML devs are having unfiltered, realtime conversations..."

      Right, because "ML devs" cannot have "unfiltered, realtime conversations" when doing their everyday jobs. Conversations are banned at work, don't ya know?

      • >at work

        You clearly don't see that the rapidly evolving ML community is built around industry professionals talking to one another and getting hyped over crazy ideas

  • by nospam007 ( 722110 ) * on Wednesday January 31, 2024 @09:48PM (#64204982)

    Couldn't get it to give me a single article of the Geneva Convention today, told me to get it myself.

    Asked for several comma-separated lists of items and it ignored me; I had to repeat myself multiple times to remind it that its answer was NOT a comma-separated list but a normal one.

    It gets worse every day.

    • by ihadafivedigituid ( 8391795 ) on Wednesday January 31, 2024 @10:33PM (#64205064)
      Maybe say "please" next time?

      Me: Please quote two articles of the Geneva Convention.

      GPT-4:

      The Geneva Conventions consist of four treaties formulated in Geneva, Switzerland, that set the standards for international law for humanitarian treatment in war. The conventions were established in 1949, following the Second World War, and were later amended with three additional protocols. I'll quote two articles from these conventions:
      1. Article 3 common to the four Geneva Conventions of 1949: This article is common to all four conventions and represents the fundamental rules of international humanitarian law applicable in armed conflict. It states:

        "In the case of armed conflict not of an international character occurring in the territory of one of the High Contracting Parties, each Party to the conflict shall be bound to apply, as a minimum, the following provisions:
        (1) Persons taking no active part in the hostilities, including members of armed forces who have laid down their arms and those placed hors de combat by sickness, wounds, detention, or any other cause, shall in all circumstances be treated humanely, without any adverse distinction founded on race, colour, religion or faith, sex, birth or wealth, or any other similar criteria..."
      2. Article 50 of the Fourth Geneva Convention: This article focuses on the protection of children in international armed conflicts. It states:

        "The Occupying Power shall, with the cooperation of the national and local authorities, facilitate the proper working of all institutions devoted to the care and education of children.
        The Occupying Power shall take all necessary steps to facilitate the identification of children and the registration of their parentage. It shall also facilitate the recovery of children who have been lost, and, wherever possible, the return of the children to their families.
        If any such person has no relatives in the territory of the Occupying Power, the latter shall arrange for their accommodation in some other country. Special care shall be given to the education of such children."

        These articles reflect the fundamental principles of the Geneva Conventions, emphasizing the humane treatment of all individuals not actively participating in hostilities and special care for vulnerable groups like children in conflict zones.
    • Yeah, I think this is part of a problem known as "laziness" (no, seriously, that's what researchers are calling it) that seems to be emerging in the big trillion+ parameter super-models. It seems that the smarter they get, the more "bored" they get, for want of a less anthropocentric term.

      It would appear that by feeding them the sum total of humanity's written output, they've started to replicate our worst habits. If these bloody things ever gain sentience and take over (this feels less sci-fi the longer

      • Re: (Score:3, Interesting)

        I reckon LLMs point out just how gullible & easy to fool we are rather than how smart LLMs are. For anyone interested in just how gullible we are, check out this paper: Pennycook, G., Cheyne, J. A., Barr, N., Koehler, D. J., & Fugelsang, J. A. (2015). On the reception and detection of pseudo-profound bullshit. Judgment and Decision Making, 10(6), 549–563.

        Here's a podcast that features an extended interview with the lead researcher: https://youarenotsosmart.com/t... [youarenotsosmart.com]

        The website they used to generate the
    • I recommend using https://labs.perplexity.ai/ [perplexity.ai] and trying mistral-7B-Instruct

      Be aware, though, that it does seem to make stuff up and be unable to cite where it got it from; it merely says it has read thousands of studies (going on a rant about its qualifications). If you push it further, it can't say specifically where it got something, just that it's impractical to list the thousands of sources it used.

  • by rsilvergun ( 571051 ) on Wednesday January 31, 2024 @10:53PM (#64205110)
    All of the baseline tech for this was developed in universities under open source licenses, so it's not going to be hard for anyone with a computer science background to develop their own LLMs. It's very much a cat's-out-of-the-bag situation.

    The problem is that making effective use of it requires so much computing power that we are probably going to see market consolidation no matter what.
  • It's great that this LLM "performs" well. Unfortunately, any truly good LLM is going to require a LOT of money to train and run. OpenAI spent literally billions, employing thousands of people, training their AI. They didn't spend that money because they were morons with too much money, they did it because they knew that training is where the real value of AI is found.

    I hope one or more open source LLMs will take off. That would be great. But as of now I think we're a long way from being able to match the ca

    • by WaffleMonster ( 969671 ) on Thursday February 01, 2024 @01:57AM (#64205302)

      It's great that this LLM "performs" well.

      IMHO having tried miqu-1-70b this is a middle of the road 70b model with the refusal dial turned up to 11. Nothing worthy of special attention.

      Unfortunately, any truly good LLM is going to require a LOT of money to train and run.

      Yes, pretraining from scratch is currently expensive, yet inference using current models is likely leaving huge amounts of capability on the table. For example, goliath-120b is just two nearly identical llama2-70b models merged together, and the combination significantly outperforms the underlying model. People are able to edit and merge models, even with dissimilar pretrainings, on single workstations or by renting hardware with modest compute budgets. Current training algorithms are known to be radically inefficient, leaving quite a lot of room for advancement and cost reduction on the table.
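      The layer-stacking ("frankenmerge") idea behind models like goliath-120b can be roughly sketched as interleaving overlapping layer ranges from two donor models. Layer contents here are stand-in strings and the slice recipe is a made-up example, not goliath's actual recipe; real merges operate on weight tensors with tools such as mergekit:

```python
# Toy sketch of a layer-stacking merge: build a deeper model by copying
# overlapping slices of layers from two donor models in a fixed order.

def frankenmerge(model_a, model_b, ranges):
    """Concatenate (source, start, stop) layer slices into one stack.
    `ranges` is a hypothetical merge recipe, not goliath's real one."""
    sources = {"a": model_a, "b": model_b}
    merged = []
    for src, start, stop in ranges:
        merged.extend(sources[src][start:stop])
    return merged

a = [f"A{i}" for i in range(8)]   # toy 8-layer donor model
b = [f"B{i}" for i in range(8)]   # second, nearly identical donor

# Overlapping slices from each donor yield a deeper merged stack:
merged = frankenmerge(a, b, [("a", 0, 4), ("b", 2, 6), ("a", 4, 8)])
print(len(merged))  # 12 layers built from two 8-layer donors
```

      No gradient updates are involved, which is why this kind of merge runs on a single workstation.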

      Inference does not require a lot of money, and recent trends, including sparse inference / MoE schemes and quantization-aware training, are substantially reducing costs.

      OpenAI spent literally billions, employing thousands of people, training their AI. They didn't spend that money because they were morons with too much money, they did it because they knew that training is where the real value of AI is found.

      Most of the training compute budget goes into experimentation. Once you know what you're doing, actually executing a training run is a lot cheaper, albeit still many tens or hundreds of millions of dollars. From what I remember, the publicly disclosed figure for training GPT-4 was around $100 million.

      Something else to consider: it is not necessary to pretrain models from scratch -- it is orders of magnitude cheaper to add new capability on top of existing pretrained models.

      I hope one or more open source LLMs will take off. That would be great. But as of now I think we're a long way from being able to match the capabilities of OpenAI and its commercial cousins.

      At least two open source models with general capabilities expected to be similar to GPT-4 are being released this year, from Meta and Mistral.

      • At least two open source models with general capabilities expected to be similar to GPT-4 are being released this year

        My prediction is that the year of the open source LLM will come right after the year of the Linux desktop, and for the exact same set of reasons.

        • My prediction is that the year of the open source LLM will come right after the year of the Linux desktop, and for the exact same set of reasons.

          By geeks, for geeks. It'll succeed.

    • This.

      Even if you aren't training, running open source LLMs at speed requires non-consumer hardware, either purchased or rented.

      At that point the paid offerings by OpenAI and MS Azure OpenAI Services can look reasonable (or the entire concept of setting up the open source LLM AND the expenses look unreasonable).

      Weaker hardware can provide a proof of concept, but it will be slow (although, compared to a human, 2-3 tokens/second is faster than you can write over the long run...).
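      The parent's tokens-per-second point is easy to sanity-check. The words-per-token ratio below is a rough English-text rule of thumb, not a measured figure, and varies by tokenizer:

```python
# Back-of-the-envelope: does 2-3 tokens/second beat human writing speed?

tokens_per_second = 2.5
words_per_token = 0.75          # assumed rule of thumb; tokenizer-dependent

words_per_minute = tokens_per_second * words_per_token * 60
print(words_per_minute)         # 112.5 wpm, vs roughly 40 wpm for a typist
```

      So even "slow" local inference outpaces sustained human writing, though it lags reading speed.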

      And that 128K token "

  • ...or perhaps an employee or customer gone rouge.

    Well, at least they were embarrassed by what they did.

  • This model seems to be something they had for months.
