AI Open Source

China's DeepSeek Coder Becomes First Open-Source Coding Model To Beat GPT-4 Turbo (venturebeat.com) 108

Shubham Sharma reports via VentureBeat: Chinese AI startup DeepSeek, which previously made headlines with a ChatGPT competitor trained on 2 trillion English and Chinese tokens, has announced the release of DeepSeek Coder V2, an open-source mixture of experts (MoE) code language model. Built upon DeepSeek-V2, an MoE model that debuted last month, DeepSeek Coder V2 excels at both coding and math tasks. It supports more than 300 programming languages and outperforms state-of-the-art closed-source models, including GPT-4 Turbo, Claude 3 Opus and Gemini 1.5 Pro. The company claims this is the first time an open model has achieved this feat, sitting way ahead of Llama 3-70B and other models in the category. It also notes that DeepSeek Coder V2 maintains comparable performance in terms of general reasoning and language capabilities.

Founded last year with a mission to "unravel the mystery of AGI with curiosity," DeepSeek has been a notable Chinese player in the AI race, joining the likes of Qwen, 01.AI and Baidu. In fact, within a year of its launch, the company has already open-sourced a bunch of models, including the DeepSeek Coder family. The original DeepSeek Coder, with up to 33 billion parameters, did decently on benchmarks with capabilities like project-level code completion and infilling, but only supported 86 programming languages and a context window of 16K. The new V2 offering builds on that work, expanding language support to 338 and context window to 128K -- enabling it to handle more complex and extensive coding tasks. When tested on MBPP+, HumanEval, and Aider benchmarks, designed to evaluate code generation, editing and problem-solving capabilities of LLMs, DeepSeek Coder V2 scored 76.2, 90.2, and 73.7, respectively -- sitting ahead of most closed and open-source models, including GPT-4 Turbo, Claude 3 Opus, Gemini 1.5 Pro, Codestral and Llama-3 70B. Similar performance was seen across benchmarks designed to assess the model's mathematical capabilities (MATH and GSM8K). The only model that managed to outperform DeepSeek's offering across multiple benchmarks was GPT-4o, which obtained marginally higher scores in HumanEval, LiveCode Bench, MATH and GSM8K. [...]

As of now, DeepSeek Coder V2 is being offered under an MIT license, which allows for both research and unrestricted commercial use. Users can download both 16B and 236B sizes in instruct and base avatars via Hugging Face. Alternatively, the company is also providing access to the models via API through its platform under a pay-as-you-go model. For those who want to test out the capabilities of the models first, the company is offering the option to interact with DeepSeek Coder V2 via chatbot.

This discussion has been archived. No new comments can be posted.

  • Model link? (Score:5, Informative)

    by DogFoodBuss ( 9006703 ) on Wednesday June 19, 2024 @08:06AM (#64560595)
    Can we at least get the HF download link so that we could all try this out? Sheesh... https://huggingface.co/deepsee... [huggingface.co]
  • China at its finest (Score:2, Interesting)

    by boulat ( 216724 )

    A lot of the answers are identical to ChatGPT, word-for-word, even same Latex font and notation.

    They have gotten so good at stealing, it's kind of impressive.

    An entire country dedicated to breeding and stealing.

    • by cascadingstylesheet ( 140919 ) on Wednesday June 19, 2024 @08:28AM (#64560651) Journal

      An entire country dedicated to breeding and stealing.

      Chicago? (ducks)

      • I've only ever done layovers in Chicago. Just endless rectangular blocks of usually damp, brown houses. Is there actual stuff there worth stealing?

    • by Anonymous Coward

      They have gotten so good at stealing, it's kind of impressive.

      I immediately wondered how long before Americans will accuse Chinese of stealing, turned out it was almost the first post.

      They have even open sourced their model, go ahead and tell us which parts were stolen, genius. You are no better than the trolls that accused open source databases of stealing from commercial products. Find the parts that were copied or shut up.

    • by AmiMoJo ( 196126 )

      Do you have a link to the evidence? The only thing mentioned in TFA are screenshots of test results, not the bot itself. Those tests do look like Latex output, but they aren't the AI itself.

      I have a feeling this is just another example of us dismissing everything China does, until suddenly China moves ahead of us and we move on to tariffs and bans. The Economist has an interesting article on how US attempts to kill Huawei have backfired: https://archive.ph/ykPGX [archive.ph] (paywall bypass)

      • by gweihir ( 88907 )

        China has a lot of problems. But industrially, they are not only catching up, in some areas they are now far ahead. And that is not good. Mindlessly dismissing them is a large part of the reason. If you dismiss somebody often enough it becomes a habit and you do not notice anymore what they actually do and can do. And then it is too late.

        The article you link is a nice example: Yes, China is still behind in the chips business. But they found they can survive the sanctions and this lag in tech skills will not g

        • by AmiMoJo ( 196126 )

          More than just survive. They have done really well, while the damage to the US is extensive.

        • > And that is not good

          Why not?
          USA being #1 is only good for the USA. China has shown it's more interested in lifting the world out of poverty than enforcing its will on the world, unlike the USA which insists on having its military bases/presence everywhere (even in China!).

    • When the models are trained on a lot of the same data, they're going to regurgitate similar things. There's only so many massive datasets around.

      • by gweihir ( 88907 )

        In fact, there is basically one. And OpenAI stole it just the same as China apparently has done now.

    • by Njovich ( 553857 )

      I had a similar situation where the response had a similar peculiarity (the example data chosen). I wonder if they are simply using chatgpt API in the background lol?

      • by Njovich ( 553857 )

        After testing some more, seems like most definitely not. Perhaps just trained on chatgpt data or a coincidence.

    • by AmiMoJo ( 196126 )

      Okay, I downloaded it (https://huggingface.co/deepseek-ai) and can confirm that this is not true. It's not ripped off from ChatGPT, it doesn't produce identical results.

    • Sounds like China is better at practicing capitalism than the USA.

    • Breeding is something they don't do as well anymore. But stealing? Yeah they got that down to a T.

    • I wondered how long it was going to take for suckers buying into propaganda playing "Let's you and him fight".
  • by iAmWaySmarterThanYou ( 10095012 ) on Wednesday June 19, 2024 @08:48AM (#64560719)

    LLM are really good "next word guessers" which literally do no work between input queries.

    Zero capacity to think, imagine, or anything else that would look like real AGI.

    I wish the LLM people would be loudly proud of the excellent work they have done in their field and stop spewing nonsense about AGI which is a completely unrelated concept.

    • by Viol8 ( 599362 ) on Wednesday June 19, 2024 @08:53AM (#64560727) Homepage

      "LLM are really good "next word guessers" which literally do no work between input queries."

      A common misconception. They're a damn sight more than next word guessers, otherwise they'd be nothing more than fancy Markov models, which have been around for half a century, and having written the latter I can tell you they don't come close to outputting the same kind of intelligent responses as GPT.

      If you don't believe me ask GPT to solve a random equation you give it - you think that's just next word guessing?

      • Ok, fine, they are next word guessers that can 'solve' problems they've been fed similar training material before.

        They can not solve anything that wasn't fed and have zero ability to generate original content.

        A human has inner thoughts, view points, opinions, feelings. LLM has cold dead nothing. Then you query it. It computes a response based on its training model. Then does nothing. It is not anticipating your next question or wondering why you asked or suggesting your project is in the wrong track b

        • by Bumbul ( 7920730 )

          Ok, fine, they are next word guessers that can 'solve' problems they've been fed similar training material before.

          They can not solve anything that wasn't fed and have zero ability to generate original content.

          Of course they can. Been there, done that. You just told us that you have not really tried one yourself - I would refrain from commenting on this topic if I were you.

          • by taustin ( 171655 )

            All I've ever seen from ChatGPT (and yes, I have played around with it) is regurgitated, generic summaries of the crap that the internet is built from. Nothing new, nothing original, nothing even with any depth. Just generic, regurgitated crap.

            • by Bumbul ( 7920730 )

              All I've ever seen from ChatGPT (and yes, I have played around with it) is regurgitated, generic summaries of the crap that the internet is built from. Nothing new, nothing original, nothing even with any depth. Just generic, regurgitated crap.

              The discussion was about problem solving ability. I have thrown quite complex and original physics and math problems at it, and it excels in those. Sure, if the issue of the GP is that GPT has learned laws of physics (and mathematical equations governing those) and uses those to solve those problems, then yes, it is "just utilizing what it has learned from the internet".

              • by gweihir ( 88907 )

                Are you sure the problems were original? Because when I did the same for some original CS _easy_ problems, it failed abysmally with a total score of zero.

          • by gweihir ( 88907 ) on Wednesday June 19, 2024 @06:41PM (#64562307)

            Ok, fine, they are next word guessers that can 'solve' problems they've been fed similar training material before.

            They can not solve anything that wasn't fed and have zero ability to generate original content.

            Of course they can. Been there, done that. You just told us that you have not really tried one yourself - I would refrain from commenting on this topic if I were you.

            No, they cannot. Anybody claiming that is simply clueless or in denial. First, the math does not allow them to be original. Not at all. Second, whenever you compare what an LLM and a careful web-search deliver you, the LLM is inferior. It is just a lot faster. And then, LLMs are very easy to mislead on elementary stuff. For example, I recently developed some "LLM-safe" exam questions for an "open Internet" CS exam. Turns out that for everything that is easy to look up, this is next to impossible. But for anything where you need to think a bit or understand context, it is actually quite easy. It is not that the LLM does not deliver results. It is that it misses important details, especially if these details are understood by experts but usually not discussed because experts think they are trivially obvious and hence not very interesting.

            • by Bumbul ( 7920730 )

              Ok, fine, they are next word guessers that can 'solve' problems they've been fed similar training material before.

              They can not solve anything that wasn't fed and have zero ability to generate original content.

              Of course they can. Been there, done that. You just told us that you have not really tried one yourself - I would refrain from commenting on this topic if I were you.

              No, they cannot. Anybody claiming that is simply clueless or in denial.

              There is plenty of actual science written on this topic already. Start e.g. here: https://arxiv.org/abs/2310.087... [arxiv.org]

      • by Saffaya ( 702234 )

        "If you don't believe me ask GPT to solve a random equation you give it - you think that's just next word guessing?"

        Yes it is.
        Because next I will ask the reasoning behind the solve. You know, like a mathematics teacher would ask a student.

        • by Viol8 ( 599362 )

          Sorry, your point is? It can explain the reasoning behind it if you ask it. Perhaps you should try actually using it sometime.

        • Because next I will ask the reasoning behind the solve. You know, like a mathematics teacher would ask a student.

          There is no way I'm explaining to a mathematics teacher how obvious 1+1=2 is. Especially not a 300 page book [storyofmathematics.com].

          Nobody got time for that.

          • by gweihir ( 88907 )

            1+1 = 2 is not obvious at all. It happens to be a _definition_. It could easily be something completely different.

      • Oh yes, I tried using ChatGPT as a math / science tutor (on the advice of my college professors). It's really good at convincing you its bullshit random guesses are correct. Humanity is doomed because so many clueless people are completely ignorant about how these systems work and what they are actually capable of, despite people in the know screaming it at them at the top of their lungs.
        • by jacks smirking reven ( 909048 ) on Wednesday June 19, 2024 @10:42AM (#64561059)

          Last year I was trying to cram a formula into a spreadsheet, pretty basic, just needed to formulaically determine the minor area of a circle based on an angle and a radius, both of which I provided to the AI.

          Now I knew the answer because I drew it up in CAD first but it could never actually give me the correct answer. I knew the mistakes it was making and I would even try to tell it exactly what to do and the answer I knew I needed but it basically could not contextualize, it would just keep apologizing and circulating the same 3 or so wrong answers with different "proofs" (and even they were wrong). I went back and forth both with GPT and Gemini (it was still Bard at the time) and neither of them could do it. It feels like it determines the answer somewhere behind the scenes from training data and then works backwards to justify itself.

          At the end of the day the tutor site "Math Is Fun" led me to the correct answer and I learned a bit of high school trig all over again.
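For readers who hit the same problem, here is a minimal sketch of the kind of formula involved. It assumes "minor area" means the minor circular segment (the region between a chord and the arc it cuts off) and that the angle is the central angle in degrees; the comment above does not confirm either, and the function name is made up for illustration.

```python
import math


def minor_segment_area(radius: float, angle_deg: float) -> float:
    """Area of the minor circular segment for a given central angle.

    Assumption: angle_deg is the central angle subtended by the chord,
    in degrees, between 0 and 180 (larger angles give the major segment).
    """
    if radius < 0:
        raise ValueError("radius must be non-negative")
    if not 0 <= angle_deg <= 180:
        raise ValueError("angle_deg must be between 0 and 180 for the minor segment")
    theta = math.radians(angle_deg)
    # Sector area (r^2 * theta / 2) minus the triangle formed by the two
    # radii and the chord (r^2 * sin(theta) / 2).
    return 0.5 * radius ** 2 * (theta - math.sin(theta))


half_disc = minor_segment_area(1.0, 180.0)      # a 180 degree segment is half the disc
quarter_case = minor_segment_area(2.0, 90.0)    # equals pi - 2 for r = 2
```

Under the same assumptions, a spreadsheet cell could use `=0.5*R^2*(RADIANS(A)-SIN(RADIANS(A)))`, with R and A standing in for the radius and angle cells.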

        • by gweihir ( 88907 )

          This. ChatGPT is good for anything you could have just looked up. It is a complete failure for anything that requires the tiniest bit of thinking or insight. I tested it on my last Software Security lecture exam. Those things you can easily find with a search engine, it got 100% score on (the exam is on paper). Those questions you need to think a small bit (I like to make those about half of the point value), it got a flat 0%. No success at all. And it always failed the same way: It presented some related s

        • by narcc ( 412956 )

          on the advice of my college professors

          You should ask for a refund.

      • I have yet to see any that aren't easily misled by sampling the wrong question. For example, I had one recommend I use tr instead of sed because tr was more efficient for another person in another answer. It really had no idea whether the answer was right for me or not, it just knew that it had strong hits on a few key words and assumed it was correct.
      • Re: (Score:2, Insightful)

        by taustin ( 171655 )

        "LLM are really good "next word guessers" which literally do no work between input queries."

        A common misconception. They're a damn sight more than next word guessers, otherwise they'd be nothing more than fancy Markov models, which have been around for half a century, and having written the latter I can tell you they don't come close to outputting the same kind of intelligent responses as GPT.

        Intelligent responses like putting glue on your pizza and eating a small rock every day for your health?

        At best, giving them the benefit of the doubt that they're actually more than next word guessers, if you train it on the internet (and what else is there) and most of the internet is crap, then most of your LLM is crap. And that's exactly what we're seeing.

        • by Viol8 ( 599362 )

          Well tell you what mate, why don't you go and write a procedural (or even declarative) program with similar comprehension of language and data and semantics. You know, like AI researchers had been trying to do since the 50s without much success.

          • by taustin ( 171655 )

            I'm not the one claiming magic black boxes that will cure all the world's ills, if you only give me enough money.

            They are. Time to put up or shut up is long gone.

            (And your response of "if you think it's impossible, then do it yourself" makes you look stupid.)

            • by Viol8 ( 599362 )

              "I'm not the one claiming magic black boxes that will cure all the world's ills, if you only give me enough money."

              Neither did I. Perhaps give the straw men a rest and try thinking of a sensible response next time.

              • by taustin ( 171655 )

                "I'm not the one claiming magic black boxes that will cure all the world's ills, if you only give me enough money."

                I didn't claim you did. You did claim I did, when you told me to do something I clearly believe is impossible.

                Neither did I. Perhaps give the straw men a rest and try thinking of a sensible response next time.

                You should, yes. Or maybe you get off on public humiliation. Not that there's anything wrong with that.

          • by narcc ( 412956 )

            with similar comprehension of language and data and semantics.

            You know they don't actually understand things, right? That's not how they work. It's just statistics and probability. There is, objectively, no comprehension, reason, or analysis.

            • by Viol8 ( 599362 )

              It doesn't need to so long as it appears to.

              https://en.wikipedia.org/wiki/... [wikipedia.org]

              • by narcc ( 412956 )

                Um... That's not what Searle is saying ... at all. Not even a little bit.

                Where do I even begin? First, Searle is arguing against computationalist approaches to Strong AI (a term he coined, by the way). Computationalists believe that by writing the right kind of program not only will it model a mind, it will be a mind. They consider the Turing test to be a kind of "scientific" test for minds. Searle concludes that even if such a machine could be built that it would be insufficient.

                His argument isn't com

      • by narcc ( 412956 )

        A common misconception.

        It's not a misconception, it's a perfectly accurate description.

        They're a damn sight more than next word guessers

        Nope. That's precisely what they are. This is an indisputable fact.

        otherwise they'd be nothing more than fancy markov models

        LLMs are not Markovian, but they're similar in that they produce sequences one token at a time on the basis of learned probabilities. What makes LLMs unique is that they can generate next-token probabilities for novel sequences, something that a Markov model can't do, using sequences that are longer than would be practical.
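The contrast drawn here can be illustrated with a toy first-order Markov (bigram) model. This is only a sketch with a made-up corpus and hypothetical function names: a count table has no probabilities at all for a context it never saw, whereas an LLM computes next-token probabilities with a learned function and so returns a distribution for any input sequence.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, context):
    """Return P(next | last token of context), or {} for an unseen context."""
    if not context:
        return {}
    followers = counts.get(context[-1])  # .get() does not create a new entry
    if not followers:
        return {}
    total = sum(followers.values())
    return {tok: n / total for tok, n in followers.items()}


corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)

seen = next_token_probs(model, ["the"])    # "the" was followed by cat, mat, cat
unseen = next_token_probs(model, ["dog"])  # "dog" never appeared in the corpus
```

Longer contexts make the table approach worse, since the number of possible contexts grows exponentially with their length; that is the "longer than would be practical" part of the comment.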

        If you don't believe me

        I do not, because I know how they work. LLM chatbot

        • by Viol8 ( 599362 )

          To paraphrase an old saying - if you don't know what you're talking about it's best to just shut up. You might want to take that advice.

          • by narcc ( 412956 )

            You do know that I have an actual background in this right? Fancy degrees and everything. I absolutely do know what I'm talking about.

            You, in contrast, fancy yourself an expert because you played with a chatbot.

            Maybe take your own advice there, kid.

    • by AmiMoJo ( 196126 )

      While entirely true, it doesn't matter as long as they are doing useful work. How useful that work is can be debated, but it seems pretty clear that we will soon reach a point where AI can do useful coding on behalf of non-coders.

      • That's great. LLM can do some useful work. Absolutely. LLM just isn't AGI, anything like it, and never will be. They need to stop talking about and making promises they can literally never keep.

        • by gweihir ( 88907 )

          That would be nice. But they are now all deep into derangement from greed and do not care about truth even one bit. I remember statements by OpenAI that no, ChatGPT was not AGI and would never be and that yet people would mistake it for AGI. I have seen no such statements anymore for quite some time. Too much money is involved by now.

      • by narcc ( 412956 )

        we will soon reach a point where AI can do useful coding

        I disagree. It seems obvious to me that LLMs will never reach that point, given the way they function.

        We can make larger models, more specialized models, models that tinker with context in various ways, etc. None of that, however, will change the fundamental nature of the system. LLMs are incapable of things like deliberation, reason, and analysis. This is an objective fact. LLMs write code the same way they produce any other text: one token at a time, without retaining any internal state between tokens,

        • LLMs write code the same way they produce any other text: one token at a time, without retaining any internal state between tokens

          Context IS the state between tokens.

          giving no consideration at all to any potential future tokens.

          No consideration of future would mean they wouldn't be able to tell a coherent story by applying common story elements. They wouldn't be able to organize output coherently. I wouldn't be able to ask this very model to do things that rely on such consideration.

          DeepSeek v2: Please say ten random words followed by the word autocomplete.

          1. Pomegranate
          2. Quasar
          3. Zephyr
          4. Nomenclature
          5. Haphazard
          6. Juxtaposition
          7. Pneumatic
          8. Cryptic
          9. Oblivion
          10. Synergy
          11. Autocomplete

          • by narcc ( 412956 )

            You don't have a clue, but here you are, rambling about things you clearly know nothing about... Ugh...

            Context IS the state between tokens.

            First, I said no internal state. Learn how to read. Also, the context is text, there is no guarantee that the last input token will match the last output token.

            No consideration of future would mean they wouldn't be able to

            OMG you're stupid. Your ignorant assumptions are obviously incorrect.

            It is an objective fact that each output token is produced without consideration of any potential future token. That this confuses you doesn't change reality.

            Maybe you should act

            • First, I said no internal state. Learn how to read.

              Context is internal to the execution of the model and it is maintained and updated as each token is reevaluated. It is the model's short-term memory.

              Also, the context is text, there is no guarantee that the last input token will match the last output token.

              It is only represented to the user as text. In practice the key-value cache is being updated as new tokens are added to context for evaluation. Processing is necessarily serial. You have to wait for the last token before you can process the next one after last token was added so there is in fact a guarantee on the sequencing. There are various prediction sc
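A toy sketch of the key-value cache idea mentioned above, under heavy simplifying assumptions: one attention head, no learned weights, and a made-up `embed` function standing in for the model's projections. None of this is DeepSeek's implementation. It only shows that each step appends to the cache and attends over everything stored so far, which is why decoding is serial.

```python
import math


def embed(token_id, dim=4):
    """Hypothetical deterministic stand-in for a learned key/value projection."""
    return [math.sin(token_id * (i + 1)) for i in range(dim)]


def attend(query, keys, values):
    """Softmax attention of a single query over all stored keys and values."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    norm = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / norm for i in range(dim)]


class KVCache:
    """State carried from one decoding step to the next."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, token_id):
        x = embed(token_id)
        self.keys.append(x)    # the cache grows by one entry per token
        self.values.append(x)
        return attend(x, self.keys, self.values)


def recompute(token_ids):
    """Same result for the last token, rebuilt from scratch without a cache."""
    xs = [embed(t) for t in token_ids]
    return attend(xs[-1], xs, xs)


tokens = [3, 1, 4, 1, 5]
cache = KVCache()
outputs = [cache.step(t) for t in tokens]  # must run in order, one token at a time
```

The cached and recomputed outputs for the final token agree, so the cache is purely an optimization: it stores what would otherwise be recalculated from the full context at every step.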

    • It's not exactly the right approach, but you're looking at it the wrong way. The brain self-forms more than one of this type of model. We don't know how similar, but I really think it's on the right track. In the brain, there's nothing telling them to start and stop. They are constantly receiving input queries in the form of the five senses and also feeding inputs and outputs into each other. The brain having "regions" could just be an emergent effect of how close the neurons are to the major nerve bun

      • Humans make shit up on the spot based on their average statistical experiences (training/learning) then reverse reason it the same way GPTs do. *Current* implementations of LLMs behave partially like a frozen snapshot of a brain, occasionally unfrozen, given input, have their output read, then are frozen again, but there is nothing saying we can't hook up a network of them, feeding into each other with constant input from the outside and giving them a 'train of thought' and access to a shared 'working me
        • The human brain is self actualized and has agency.

          If you run new data constantly through an LLM all you're doing is tweaking node connections. There is still no active 'thought' or anything that could be confused with thought going on. The LLM won't suddenly have original thoughts or make a decision about what its next input source should be or anything else.

          It only runs "thoughts" through its internal network in direct response to a human input query.

          You could hit it with API calls all day from an automa

          • The LLM won't suddenly have original thoughts or make a decision about what its next input source should be or anything else.

            It can certainly make decisions about input sources, there are a number of agent framework suites that enable just that.

            It only runs "thoughts" through its internal network in direct response to a human input query.

            All you have to do is change the end sequence to something the model was not trained on and it will talk to itself forever.

    • LLM are really good "next word guessers" which literally do no work between input queries.

      Zero capacity to think, imagine, or anything else that would look like real AGI.

      I wish the LLM people would be loudly proud of the excellent work they have done in their field and stop spewing nonsense about AGI which is a completely unrelated concept.

      Call it whatever you want, I use them every day with programming and it's very impressive.

      Short of impractically copying huge examples here, not sure how to "prove" that to the "LLM's can't program!" crowd, but yeah, actually, they can, to a really impressive degree.

    • by taustin ( 171655 )

      I wish the LLM people would be loudly proud of the excellent work they have done in their field and stop spewing nonsense about AGI which is a completely unrelated concept.

      What they've done doesn't attract nearly as many gullible investors as what they claim to have done.

      There's a word for companies that don't actually sell anything other than their stock.

    • Zero capacity to think, imagine, or anything else that would look like real AGI.

      Nobody is talking about AGI.

      I wish the LLM people would be loudly proud of the excellent work they have done in their field and stop spewing nonsense about AGI which is a completely unrelated concept.

      See above, nobody is asserting DeepSeek v2 is an AGI.

      • by gweihir ( 88907 )

        Have you seen some of the comments here? Does not look like "nobody" to me.

        • Have you seen some of the comments here? Does not look like "nobody" to me.

          Which comments? Care to point out any?

          • by gweihir ( 88907 )

            There are enough. Not only for this story. A simple web search for "slashdot AGI" yields tons of examples and that is only for the stories.

            • There are enough. Not only for this story. A simple web search for "slashdot AGI" yields tons of examples and that is only for the stories.

              So you can't name any? I figured that would be the case.

      • AGI is in the summary. There are plenty of people both on slashdot and at AI based companies talking about LLM = AGI.

        • AGI is in the summary. There are plenty of people both on slashdot and at AI based companies talking about LLM = AGI.

          The only mention of AGI in the summary is an aspirational mission statement having nothing stated or implied to do with the model at hand.

    • The pre-training of foundation models does lead to next word guessers at first. That's what you get when you only train on next word prediction on a corpus of text. If you directly use one of those models that were only trained this way, you'll find that they don't do anything useful, just spit out text that continues whatever sentences you input.
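For reference, next word prediction as a training objective is usually written as a sum of log-probabilities over a token sequence. This is the standard textbook form, not something taken from DeepSeek's documentation, and the symbols are notation introduced here: \theta for the model parameters, x_t for the t-th token, and T for the sequence length.

```latex
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_1, \dots, x_{t-1}\right)
```

Minimizing this loss pushes the model to assign high probability to each actual next token given everything before it.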

      But that's not what modern LLMs are: pre-training is just the first step. After that, they are fine-tuned to follow instructions, and reinforcement learning is
      • They are powerful tools, yes. No doubt.

        But how is that fine tuning any different than a focused version of the initial training?

      • by narcc ( 412956 )

        does lead to next word guessers at first. [...] After that, they are fine-tuned to follow instructions [...] At that point saying the models are "predicting the next word" isn't technically accurate any more.

        Oh, you are deeply confused.

        They are next word guessers from start to finish. Fine-tuning doesn't change anything fundamental about the system, it only narrows the range of possible outputs. It doesn't magically give them new abilities.

        • Oh, you are deeply confused.

          They are next word guessers from start to finish. Fine-tuning doesn't change anything fundamental about the system, it only narrows the range of possible outputs. It doesn't magically give them new abilities.

          The point of these systems and why anyone cares at all is that they are able to generalize applying what they've learned in useful ways.

          The autocomplete/next word guessing thing is a meaningless distraction. If I said "The alpha sticking problem can be solved by..." and asked a computer to complete the sentence even if it actually provided a valid answer the underlying autocomplete statements would be no more or less applicable. What actually matters is the capabilities of the system not unfalsifiable wor

          • by narcc ( 412956 )

            You've shown time and again that you don't know anything at all about LLMs. Your thoughts on the subject are completely worthless.

            Go away.

              You've shown time and again that you don't know anything at all about LLMs. Your thoughts on the subject are completely worthless.
              Go away.

              Classic narcc. No receipts, no arguments just derisive commentary and table pounding.

    • by gweihir ( 88907 )

      Quite true. LLMs are in some way "better search" and have limited capabilities to summarize (limited rather badly by the lack of insight).

      But they have a number of severe problems besides absolutely no potential for AGI:
      - Training them on LLM output is a disaster due to model collapse. This can only be avoided by not training them on such output. Nobody can reliably identify such output though.
      - LLMs do not link enough back to the original training data. Hence nobody is motivated to give them more. People w

      • - Training them on LLM output is a disaster due to model collapse. This can only be avoided by not training them on such output. Nobody can reliably identify such output though.

        If it were a disaster people wouldn't be intentionally doing it to improve the quality of their models.

        Hence, what we see now may well be close to "peak capability" for LLMs and all future ones will be weaker. At the same time, the current ones age and also get weaker. I have no idea how fast this will go, but it is quite possible that LLMs are essentially over in 10 or 20 years because no new ones can be trained.

        All of this nonsense started with the old "Curse of Recursion" paper. It used a toy 125M-parameter model trained recursively in the most ridiculous manner, to blatantly obvious effect. The paper has nothing to contribute about the real-world impact on real-world model quality. There is no objective basis to support your extraordinary claims.
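For readers unfamiliar with the effect being argued over here, the following is a minimal sketch of what "trained recursively" means in the simplest possible setting. It is not the paper's experiment and involves no language model: it repeatedly fits a Gaussian to a small sample drawn from the previous fit. The function name `recursive_refit` and all parameter values are illustrative assumptions. It shows only that this naive loop loses variance over generations; it says nothing about whether real training pipelines that mix in curated synthetic data behave the same way, which is exactly the point in dispute.

```python
import random
import statistics


def recursive_refit(generations=200, sample_size=10, seed=0):
    """Refit a Gaussian to its own samples, generation after generation.

    Returns the list of fitted standard deviations, starting with the
    original distribution's value (1.0). Hypothetical toy setup.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0: the "real" data distribution
    history = [std]
    for _ in range(generations):
        # Draw a small sample from the current model only (no fresh real data).
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        # Maximum-likelihood refit: sample mean and population std.
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)
        history.append(std)
    return history


history = recursive_refit()
print(f"std at generation 0:   {history[0]:.3f}")
print(f"std at generation {len(history) - 1}: {history[-1]:.3e}")
```

The shrinkage comes from refitting on a small sample with no fresh data each round; increasing `sample_size` slows the drift considerably.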

  • Why are we reposting to /. from a website that just reposts press releases as news?

  • It's not bad. I asked it "Write a function that takes an image and a zernike polynomial description of an optical system aberration and returns the image after passing through the system."

    and it returned a decent description of how to install some relevant modules in Python, the function, and how to use it. The function isn't correct, but it's pretty close.

    Less trivial questions get pretty useless results. If it can find a library that does the job it seems pretty good at writing some structure around it. I

    • by dargaud ( 518470 )
      I asked a question about a little-used communication I knew nothing about and couldn't find code samples (just the specs). The first answer was hilariously wrong. After insisting and trying to coax it in the right direction, the next 6 were increasingly less stupid but still completely wrong. Only the 8th one was in the right direction (but still wouldn't compile).
      • by ceoyoyo ( 59147 )

        Yup. They're not able to write non-trivial code. They do seem to be capable of a lot of the grunt work that's the bread and butter of the corporate web app world though. They might even be able to turn out Google's next messaging app even faster than russiancoders.com.

        I do think it's hilarious that GPT 4 basically told me to go fuck myself. My friend, who insisted I try his GPT4 account after I mentioned my results with 3.5, pointed out that it's an advantage not to have to go through and delete wrong answe

        • by gweihir ( 88907 )

          Actually, they are not able to _find_ non-trivial code. They cannot write code at all. And with those two key insights, the observed behavior becomes obvious.

          This fact becomes more important when a model ages. And age they will because training gets harder and harder due to model collapse.

          • by ceoyoyo ( 59147 )

            They write code. Even if you don't care to understand how they work, there are countless examples where the model comes up with something that isn't copied.

            I was a bit disappointed in their ability to copy actually. I've asked questions where I knew the solution was freely available online in many-year-old public GitHub repos, well labelled with hints in the prompt even, and they failed utterly. Also, none of them have read my papers.

            • by gweihir ( 88907 )

              Nope. They copy code and can adapt and mesh copies of code together to a degree. They cannot write code at all.

    • by gweihir ( 88907 )

      They're good at writing boilerplate and glue. GPT 4 seems better at recognizing its limits. So basically the performance of your average code monkey.

      That seems to be about it. But keep in mind that this here applies to the average "code monkey": https://blog.codinghorror.com/... [codinghorror.com]

  • Ollama version:

    quantization Q4_0
    8.9GB

    OFC it needs more VRAM than I have... foiled again.
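A rough back-of-the-envelope check on that figure, as a sketch rather than anything authoritative: the Q4_0 format stores weights in blocks of 32, each block holding 16 bytes of 4-bit values plus a 2-byte fp16 scale, so about 4.5 bits per weight. The parameter count below (a nominal 16 billion) is an assumption, since the comment does not say which model variant was pulled, and real files differ somewhat because some tensors are kept at other precisions. The helper names are hypothetical.

```python
# Rough size estimate for a Q4_0-quantized model file.
Q4_0_BLOCK_WEIGHTS = 32  # weights per quantization block
Q4_0_BLOCK_BYTES = 18    # 16 bytes of 4-bit values + 2-byte fp16 scale


def q4_0_size_bytes(n_params: float) -> float:
    """Approximate bytes needed to store n_params weights in Q4_0."""
    return n_params / Q4_0_BLOCK_WEIGHTS * Q4_0_BLOCK_BYTES


def fits_in_vram(n_params: float, vram_bytes: float, overhead_bytes: float = 0.0) -> bool:
    """True if the weights plus a caller-supplied overhead fit in VRAM.

    overhead_bytes stands in for KV cache and activations; its value is
    a guess the caller must supply, not something this sketch computes.
    """
    return q4_0_size_bytes(n_params) + overhead_bytes <= vram_bytes


ASSUMED_PARAMS = 16e9  # assumption: nominal "16B" model, not stated in the comment
print(f"{q4_0_size_bytes(ASSUMED_PARAMS) / 1e9:.1f} GB")  # prints "9.0 GB"
print(fits_in_vram(ASSUMED_PARAMS, vram_bytes=8e9))        # prints "False"
```

The estimate lands close to the 8.9GB listed above, and the weights alone are a lower bound: the runtime also needs room for the KV cache and activations, so a card with exactly that much memory would still come up short.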

  • I mean, it is China. The possibility of this actually being 500 Chinese students behind the interface is real. Can it do things that require some minimal thinking? Then it is fake.

  • Now that is going to set the AI world on fire!
    https://www.youtube.com/watch?... [youtube.com]
