
Meta releases code for massive language model to AI researchers

Now they can experiment with the algorithms even if they don't have hundreds of GPUs

Meta is releasing a giant language model to academics, in the hope that a better understanding of how these systems work will make them less toxic and biased.

The Open Pretrained Transformer (OPT-175B) has 175 billion parameters, matching commercial language models like OpenAI's GPT-3. Systems of this type have given developers capabilities to build on, such as automated copywriting, content moderation, and even code generation. But they can also produce text that's biased, toxic, and inaccurate, making them risky to use.

As Meta knows only too well from some of the human-generated texts it struggles to manage.

Proprietary tools are often out of reach for academic researchers who want to investigate the technology's issues – both in terms of access to a model's underlying code and the resources needed to build and train their own language models. Meta's latest code release, however, can help them study these systems in more detail.

"We are sharing Open Pretrained Transformer, a language model with 175 billion parameters trained on publicly available data sets, to allow for more community engagement in understanding this foundational new technology," researchers at the social media biz said on Tuesday. "For the first time for a language technology system of this size, the release includes both the pretrained models and the code needed to train and use them."

Meta has also released smaller versions of the model – up to 66 billion parameters – for anyone to use. The complete OPT-175B system, however, is only available to researchers on request for noncommercial applications. It was trained on 992 Nvidia 80GB A100 GPUs, achieving 147 TFLOPS per chip. Researchers won't need to build and train the model from scratch, because Meta is providing the code to deploy it on 16 Nvidia V100 GPUs.
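Those smaller checkpoints can run on far more modest hardware. As a rough sketch – assuming the weights are pulled through the Hugging Face transformers library rather than Meta's own release tooling, and with an illustrative model name and generation settings that aren't part of Meta's announcement – generating text from one of the smaller sizes looks something like this:

# Minimal sketch, not from Meta's release: load one of the smaller,
# openly downloadable OPT checkpoints and generate a short completion.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"  # illustrative choice of one of the smaller sizes
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt")
# Sample up to 30 new tokens with nucleus sampling
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))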

Training such large models is tricky. Meta's team of researchers said they experienced numerous failures, and had to restart the whole process 35 times over a two-month period, according to a paper [PDF] on arXiv.

A Meta spokesperson told The Register that releasing OPT-175B will help academics reproduce results from large language model (LLM) papers.

"It is important to improve transparency and openness around large-scale research so that the future we build with this technology is more equitable and fair. The future of LLM work cannot solely live in the hands of those with financial interests in keeping this research behind closed doors," the spokesperson stated.

The code for Meta's smaller pre-trained models can be found here. If you're an academic and want the full version you can request it by completing this form. ®
