IBM Open-Sources Its Granite AI Models (zdnet.com)
An anonymous reader quotes a report from ZDNet: IBM managed the open sourcing of Granite code by using pretraining data from publicly available datasets, such as GitHub Code Clean, StarCoder data, public code repositories, and GitHub issues. In short, IBM has gone to great lengths to avoid copyright or legal issues. The Granite Code Base models are trained on 3 to 4 trillion tokens of code data and natural-language, code-related datasets. All these models are licensed under the Apache 2.0 license for research and commercial use. It's that last word -- commercial -- that stopped the other major LLMs from being open-sourced. No one else wanted to share their LLM goodies.
But, as IBM Research chief scientist Ruchir Puri said, "We are transforming the generative AI landscape for software by releasing the highest performing, cost-efficient code LLMs, empowering the open community to innovate without restrictions." Without restrictions, perhaps, but not without specific applications in mind. The Granite models, as IBM ecosystem general manager Kate Woolley said last year, are not "about trying to be everything to everybody. This is not about writing poems about your dog. This is about curated models that can be tuned and are very targeted for the business use cases we want the enterprise to use. Specifically, they're for programming."
These decoder-only models, trained on code from 116 programming languages, range from 3 to 34 billion parameters. They support many developer uses, from complex application modernization to on-device memory-constrained tasks. IBM has already used these LLMs internally in IBM Watsonx Code Assistant (WCA) products, such as WCA for Ansible Lightspeed for IT Automation and WCA for IBM Z for modernizing COBOL applications. Not everyone can afford Watsonx, but now, anyone can work with the Granite LLMs using IBM and Red Hat's InstructLab.
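For readers curious what "work with the Granite LLMs" looks like in practice, here is a minimal sketch using the Hugging Face transformers library. The repository name ibm-granite/granite-3b-code-base is an assumption based on IBM's published naming and is not given in the article; check the ibm-granite organization on Hugging Face for the exact model IDs.

    # Minimal sketch: generate a code completion with a Granite code model.
    # The model ID below is an assumption; verify it on Hugging Face.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "ibm-granite/granite-3b-code-base"  # assumed repository name
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = "def fibonacci(n):"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))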
Interesting to see Red Hat's influence (Score:2)
It looks like Red Hat is managing to turn Big Blue a shade of purple - I don't think this would have happened without Red Hat's influence.
Re: (Score:2)
Perhaps you don't know IBM's history. There is irrefutable evidence, and lots of it, that they actively assisted the Nazis with the Holocaust.
What was said above might have been uncharitable, but it certainly isn't stupid.
Re:Interesting to see Red Hat's influence (Score:5, Informative)
Red Hat is a completely independent business within IBM. No one at Red Hat had anything to do with this whatsoever.
Also, for the record, IBM has a longer history in open source than Red Hat itself does. https://www.ibm.com/opensource... [ibm.com]
Young folks seem blissfully unaware that Linux would not even exist without IBM; it would have died a long time ago at the hands of the SCO lawsuits.
Re: Interesting to see Red Hat's influence (Score:1)
Pretty sure Matt Hicks would disagree with this statement
Their fascinating paper (Score:4, Informative)
I read through their InstructLab paper and it is surprisingly educational. For example:
"The majority of the cost of training an LLM comes from the pre-training phase. During this phase, a model is trained in an auto-regressive manner to predict the next token in the target language using trillions of tokens worth of unlabeled data, requiring thousands of GPUs training for months at a time. Alignment tuning, typically happens in two stages: instruction tuning, followed by preference tuning. Instruction tuning is more akin to the traditional model training approach in machine learning, where the model is trained directly on tasks of interest. "
https://arxiv.org/html/2403.01... [arxiv.org]
Preference tuning is where humans rate responses as preferred or unpreferred. "Meta’s LLaMA 2 models were trained with just tens of thousands of high quality human-generated instruction/response data pairs".
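To make the quoted description of pre-training concrete, here is a rough sketch of the auto-regressive next-token objective; this is my own illustration, not code from the paper. Any causal LM that returns logits of shape (batch, sequence, vocab) would plug in here.

    import torch.nn.functional as F

    def next_token_loss(model, token_ids):
        # token_ids: (batch, seq_len) integer tensor of training text
        logits = model(token_ids)                         # (B, T, V)
        # predict token i+1 from positions 0..i, so shift targets by one
        pred = logits[:, :-1, :].reshape(-1, logits.size(-1))
        target = token_ids[:, 1:].reshape(-1)
        return F.cross_entropy(pred, target)              # mean over positions

Pre-training is essentially this objective run over trillions of tokens of raw text; instruction tuning typically reuses the same next-token loss, but on curated instruction/response pairs rather than unlabeled data.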