Stability AI Launches StableLM, an Open Source ChatGPT Alternative
An anonymous reader quotes a report from Ars Technica: On Wednesday, Stability AI released a new family of open source AI language models called StableLM. Stability hopes to repeat the catalyzing effects of its Stable Diffusion open source image synthesis model, launched in 2022. With refinement, StableLM could be used to build an open source alternative to ChatGPT. StableLM is currently available in alpha form on GitHub in 3 billion and 7 billion parameter model sizes, with 15 billion and 65 billion parameter models to follow, according to Stability. The company is releasing the models under the Creative Commons BY-SA-4.0 license, which requires that adaptations must credit the original creator and share the same license.
Stability AI Ltd. is a London-based firm that has positioned itself as an open source rival to OpenAI, which, despite its "open" name, rarely releases open source models and keeps its neural network weights -- the mass of numbers that defines the core functionality of an AI model -- proprietary. "Language models will form the backbone of our digital economy, and we want everyone to have a voice in their design," writes Stability in an introductory blog post. "Models like StableLM demonstrate our commitment to AI technology that is transparent, accessible, and supportive." Like GPT-4 -- the large language model (LLM) that powers the most powerful version of ChatGPT -- StableLM generates text by predicting the next token (word fragment) in a sequence. That sequence starts with information provided by a human in the form of a "prompt." As a result, StableLM can compose human-like text and write programs.
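The "predict the next token, append it, repeat" loop described above can be illustrated with a toy sketch. This is not StableLM's actual architecture (real LLMs use transformer networks over subword tokens); it is a deliberately tiny bigram model that shows only the shape of the generation loop.

```python
from collections import Counter, defaultdict

# Toy illustration of next-token prediction: given the current token, predict
# the token that most often followed it in the "training" text. Real LLMs
# replace this frequency table with a neural network, but the generation
# loop -- predict, append, repeat -- is the same idea.

corpus = "the cat sat on the mat and the cat slept on the mat".split()

# Count which token follows which.
following = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    following[cur][nxt] += 1

def generate(prompt_token, n_tokens):
    out = [prompt_token]
    for _ in range(n_tokens):
        next_counts = following.get(out[-1])
        if not next_counts:
            break  # token never seen mid-sequence; nothing to predict
        out.append(next_counts.most_common(1)[0][0])  # greedy: most likely
    return " ".join(out)

print(generate("the", 4))
```

Greedy selection is the simplest decoding strategy; production chatbots typically sample from the predicted distribution instead, which is why the same prompt can yield different answers on each run.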
Like other recent "small" LLMs such as Meta's LLaMA, Stanford Alpaca, Cerebras-GPT, and Dolly 2.0, StableLM purports to achieve similar performance to OpenAI's benchmark GPT-3 model while using far fewer parameters -- 7 billion for StableLM versus 175 billion for GPT-3. Parameters are variables that a language model uses to learn from training data. Having fewer parameters makes a language model smaller and more efficient, which can make it easier to run on local devices like smartphones and laptops. However, achieving high performance with fewer parameters requires careful engineering, which is a significant challenge in the field of AI. According to Stability AI, StableLM has been trained on "a new experimental data set" based on an open source data set called The Pile, but three times larger. Stability claims that the "richness" of this data set, the details of which it promises to release later, accounts for the "surprisingly high performance" of the model at smaller parameter sizes on conversational and coding tasks. According to Ars' "informal experiments," they found StableLM's 7B model "to perform better (in terms of outputs you would expect given the prompt) than Meta's raw 7B parameter LLaMA model, but not at the level of GPT-3." They added: "Larger-parameter versions of StableLM may prove more flexible and capable."
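The parameter counts above translate directly into memory requirements, which is why model size matters for local use. A back-of-envelope sketch (the bits-per-parameter figures are common precisions, not StableLM's actual training precision):

```python
# Rough memory needed just to hold a model's weights at common numeric
# precisions. This ignores activations and runtime overhead, but it shows
# why a 7B-parameter model can fit on a laptop while 175B cannot.

def weight_memory_gb(n_params, bits_per_param):
    return n_params * bits_per_param / 8 / 1e9  # bits -> bytes -> GB

for name, n in [("StableLM 7B", 7e9), ("GPT-3-class 175B", 175e9)]:
    for bits in (32, 16, 8, 4):
        print(f"{name} @ {bits}-bit: {weight_memory_gb(n, bits):.1f} GB")
```

At 16-bit precision, 7 billion parameters is about 14 GB of weights; 175 billion is about 350 GB, far beyond any consumer device.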
What sort of machine ... (Score:2)
would you need to run this ?
Available online, but AI is not very good yet. (Score:5, Informative)
It's available online here [huggingface.co] if you just want to play with it.
Modern chat AIs are not very good at all, but most people won't notice because the output looks so good. Enter "Give me a 3-day agenda for a vacation in Miami" and you'll get a list of reasonable-sounding things to see and do... but unless you are a travel agent familiar with Miami, you won't be able to tell if the answers are legit. And if you go to a place that doesn't exist, you might think it closed down after the chatbot was trained or something.
ChatGPT gives bad answers a *lot*, and unless you are an expert in the subject to begin with, you wouldn't notice the problems; for example, confusing "dielectric constant" with "loss tangent". I always imagine my conversation with ChatGPT as talking to an HS student finding out information for an essay -- and so far it's never been completely right.
I've been trying to come up with an easy way to show people how bad the AI chatbots are (*), and I think asking about popular movies is the answer. People know what happened in movies pretty well, and so can be "subject matter experts" in what actually happened. It's also highly unlikely that online reviews and synopses would have enough bad info to skew the results.
Here's an example from the Stability AI link cited above:
Me: Which characters died in the movie “Avengers Endgame”?
Stability AI:
There were several characters who died in the movie “Avengers Endgame”. These are:
1. Thanos: This is the villain who ultimately defeats the Avengers in the battle against Thanos at the end of the movie.
2. Captain America: This is the main character who sacrifices himself to save the world from Thanos.
3. Iron Man: Tony Stark is the protagonist of the movie who turns against Thanos after his plan to use a powerful weapon against Earth is foiled.
4. Star-Lord: The main antagonist of the movie, who turns against Thanos after he is captured by the Eternals and enslaved.
5. Ant-Man: This is a supporting character who is a brilliant scientist who assists Thanos in his plan.
6. Captain America: Tony Stark’s friend and ally who turns against Thanos after he discovers that his army of robots is being shut down by Ant-Man.
7. Ant-Man: A remote control creature that is transformed into a giant flying insect, capable of fighting on two legs and can crawl through walls.
8. Wasp: The protector of Earth, who is brainwashed into joining Thanos’ army against her own people.
9. Squirrel-paws: The scientist who becomes the best android to serve Thanos.
10. War Machine: This is the mechanic who is instrumental in the fight against Thanos.
I have a similar set showing bad output from ChatGPT. Actually, I have a *lot* of bad outputs from ChatGPT, but the movie questions are the easiest way to show up the problems.
Ask it about a movie you've seen, and see what hilarity it gives you.
(*) I'm an AI researcher, working on strong AI as my day job.
Hilariously wrong (Score:3)
Are those wrong? I'm not familiar with the movie.
Not only are they wrong, they are wrong in an increasing level of hilarity all the way down.(*)
Additionally, the system missed the character "Black Widow", who died in the movie and was easily the third most important character death, after Thanos and Iron Man.
But the point is: ask it about a movie that *you* have seen and know something about. The link is available, you must have seen some movies, see what it comes up with.
(*) A microphone in my office after I typed that question would have recorded: "what
Re: (Score:3)
Re: (Score:2)
If it can generate only some generic statements then it is not very useful. Maybe can be used for marketing or generating online comments; some art at best.
Re: (Score:2)
Not really. Let me give you three examples of things where they have been useful to me. First example: I had to construct a survey on a topic I do not normally do surveys on. I had come up with eight questions I wanted on my survey. Then I told ChatGPT the survey topic and asked it to suggest other questions. It gave me five more questions. Two of those were nearly identical to ones I had, one was just a bad question, one was one I had considered but rejected, but one of them was really good. I had already
Re: (Score:2)
Do you want to say that they are good at producing soup of words with very little factual information?
I prefer to think of it as word salad. Lots of filler, not much actual nutrition.
Re: (Score:2)
Just curious, who is paying people to work on strong AI? The last problem you mentioned working on was counting colors. Your affiliation with LW led me to believe that that was just a hobby project.
Re: (Score:1)
gives an answer that begins
whereas ChatGP
Re: (Score:2)
Being Actually Intelligent (maybe), I avoided Avengers, so I couldn't judge this list. But you make a good and simple observation: the AI hasn't actually watched the film, or even read the script. It's just scraped umpteen web pages that talk about the film. As such, it's only as good as they are. If they're (collectively) wrong, then so is it.
No one considers Google Search to be "AI", and so if you ask for "dead characters in avengers movie" or whatever, you just get a load of pages to go look a
Re: (Score:2)
https://github.com/Stability-A... [github.com]
Running the 7B parameter model on a 16 GB card requires converting the params down to 8 bit, which will impact the results somewhat.
Re: (Score:2)
These smallish LLMs run on a PC with 8 GB or so of RAM, GPU not required, if you use something like llama.cpp with INT4 quantization.
An RPi 4 could work too, though perhaps a little slowly.
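The quantization mentioned in these comments can be sketched in a few lines. This is a generic symmetric 8-bit scheme for illustration, not llama.cpp's actual format (which quantizes in blocks with per-block scales): store each weight as a signed byte plus one shared scale factor, trading a little precision for a 4x size reduction versus float32.

```python
# Minimal sketch of symmetric INT8 weight quantization: map the largest
# absolute weight to +/-127 and round everything else to the nearest step.

def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127  # one scale for the group
    q = [round(w / scale) for w in weights]     # signed bytes in [-127, 127]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.82, -1.27, 0.003, 0.5]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, approx))
print(q, f"max error {max_err:.4f}")
```

The rounding error is bounded by half a quantization step, which is why 8-bit models lose only a little quality; INT4 halves the memory again but with coarser steps and correspondingly larger error.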
Model size (Score:2)
Re: (Score:1)
The real question is why anybody would run an LLM on a phone when the cloud seems much more convenient when it comes to upgrades, power usage, etc.
This issue is not about running it on your phone. The issue is about running an LLM on your PC for privacy reasons, running an LLM on your robot unaffected by your internet connection, or running it on an embedded device that you wish to control with a simple conversational AI.
Having 16 graphics cards and 1 TB of memory on your robot is not a real solution.
Alternative ChatGPT assistant (Score:1)