Awesome Open (Source) Language Models

Friends of OLMo and their links. Built for the 2024 NeurIPS tutorial on opening the language modeling pipeline by Ai2 (slides here).

Language models (LMs) have become a critical technology for tackling a wide range of natural language processing tasks, making them ubiquitous in both AI research and commercial products. As their commercial importance has surged, the most powerful models have become more secretive, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed. Given the importance of these details in scientifically studying these models, including their biases and potential risks, we believe it is essential for the research community to have access to powerful, truly open LMs. In this tutorial, we provide a detailed walkthrough of the language model development pipeline, including pretraining data, model architecture and training, and adaptation (e.g., instruction tuning, RLHF). For each of these development stages, we provide examples using open software and data, and discuss tips, tricks, pitfalls, and details that are otherwise often inaccessible about the full language model pipeline, which we've uncovered in our own efforts to develop open models. We have opted not to have the optional panel, given the extensive technical details and examples we need to include to cover this topic exhaustively.

This list focuses on language models where more than just the model weights are open: we are looking for training code, data, and more! The best are fully open-source language models with the entire pipeline released, but individual pieces are super valuable too.

🚧 Missed something? Open a PR to add it! 🚧


OLMo 2 (Nov. 2024)

AMD OLMo (Oct. 2024)

HuggingFace SmolLM (v2 Oct. 2024)

DataComp (Jun. 2024)

Databricks (formerly MosaicML)

LLM 360

EleutherAI

Cerebras

RWKV

M.A.P.

Zyphra

Together.AI

NVIDIA

PyTorch / Meta