Open-Source Code Generation Models Catalog
Author: Emmanuel Secretaria
Published Nov 12, 2025
Open-Weight AI Models: These models release their trained weights publicly, but often under custom “source-available” licenses rather than standard open-source licenses.
Open-Source AI Models: These models are released under OSI-approved open-source licenses (e.g., MIT License, Apache 2.0).
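The two definitions above amount to a simple decision rule, which can be sketched in Python. This is a minimal illustration, not an authoritative classifier: the function name `classify_release` is hypothetical, and `OSI_APPROVED` is only a small illustrative subset of the licenses the OSI has approved.

```python
# Illustrative subset of OSI-approved licenses (SPDX identifiers).
# Assumption: a real check would consult the full OSI list instead.
OSI_APPROVED = {"mit", "apache-2.0", "bsd-2-clause", "bsd-3-clause", "mpl-2.0"}


def classify_release(license_id: str, weights_public: bool = True) -> str:
    """Label a model release using the definitions given above.

    Returns "open-source" for publicly released weights under an
    OSI-approved license, "open-weight" for publicly released weights
    under any other license, and "closed" when weights are not public.
    """
    if not weights_public:
        return "closed"
    if license_id.strip().lower() in OSI_APPROVED:
        return "open-source"
    return "open-weight"


print(classify_release("Apache-2.0"))
print(classify_release("Llama Community License"))
```

A custom license name that is not in the set falls through to "open-weight", which mirrors how the catalog treats community and source-available licenses.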
This document lists open-source and open-weight code-generation models available as of 2025, giving each model's size, context window, and license.
| Model Name | Organization | Size & Context | License / Notes | Link |
|---|---|---|---|---|
| CodeLlama‑7B‑Instruct | Meta AI | 7B params, ~16K tokens context | Community License (Llama family) | https://theorionai.com/blog/top-open-llms |
| CodeLlama‑13B‑Instruct | Meta AI | 13B params, ~16K tokens | Community License (Llama family) | https://theorionai.com/blog/top-open-llms |
| CodeLlama-34B-Instruct | Meta AI | 34B params, ~16K tokens | Community License (Llama family) | https://e2enetworks.com/blog/top-8-open-source-llms-for-coding |
| CodeLlama-70B-Instruct | Meta AI | 70B params, context unconfirmed (possibly below 16K tokens) | Community License (Llama family) | https://e2enetworks.com/blog/top-8-open-source-llms-for-coding |
| StarCoder-15B | BigCode | ~15B params | BigCode OpenRAIL-M (open weights with use restrictions, not Apache 2.0) | https://theorionai.com/blog/top-open-llms |
| CodeGen2.5‑7B‑Instruct | Salesforce AI Research | ~7B params | Open code generation focus | https://github.com/prthm786/awesome-genai-models |
| DeepSeek-Coder-6.7B-Instruct | DeepSeek AI | ~6.7B params, ~16K tokens context | Open weights (check repo for license terms) | https://www.reddit.com/r/LangChain/comments/1ite411 |
| Qwen2.5‑Coder‑7B‑Instruct | Alibaba Cloud | 7B params, context ~128K tokens | Open‑weight under Apache 2.0 | https://www.reddit.com/r/ChatGPTCoding/comments/1ite5bb |
| Qwen3-Coder | Alibaba Cloud | 480B total / 35B active params (mixture-of-experts flagship) | Announced July 2025, open-source model | https://www.reuters.com/world/china/alibaba-launches-open-source-ai-coding-model-touted-its-most-advanced-date-2025-07-23/ |
| CodeGeeX-13B | Tsinghua University (THUDM) | ~13B params | Open model for code generation | https://www.edenai.co/post/top-free-code-generation-tools-apis-and-open-source-models |
| CodeT5+-16B | Salesforce AI Research | ~16B params | Encoder-decoder model for code tasks | https://github.com/prthm786/awesome-genai-models |
| Granite-Code-3B/8B/20B/34B | IBM Research | Multiple sizes: 3B, 8B, 20B, 34B params | Decoder-only code model family | https://github.com/prthm786/awesome-genai-models |
| WaveCoder‑Ultra‑6.7B | Microsoft Research | ~6.7B params | Code generation & repair specialty | https://www.reddit.com/r/LangChain/comments/1ite411 |
| Free-GPT-Engineer | Community | Smaller model tuned for project generation | MIT license | https://www.edenai.co/post/top-free-code-generation-tools-apis-and-open-source-models |
| Duckargs | Community | Small model for CLI argument generation | Open license | https://www.edenai.co/post/top-free-code-generation-tools-apis-and-open-source-models |
| CodeBERT | Microsoft Research | ~125M params | Encoder-only model for code understanding across multiple programming languages | https://www.edenai.co/post/top-free-code-generation-tools-apis-and-open-source-models |
| Llama 3 (code fine-tune) | Meta AI | 8B / 70B params | Code fine-tuned variant of Llama 3 | https://en.wikipedia.org/wiki/Llama_%28language_model%29 |
| BLOOM (multilingual + code) | BigScience | 176B params | Multilingual, includes code languages | https://en.wikipedia.org/wiki/BLOOM_%28language_model%29 |
| Gemma family (open-weight relatives of Gemini) | Google DeepMind | Multiple sizes | Open-weight; multimodal variants, some code tasks | https://en.wikipedia.org/wiki/Gemma_%28language_model%29 |
| GPT-OSS-120B / 20B | OpenAI | 120B / 20B params | Open-weight (released 2025), general tasks including code | https://timesofindia.indiatimes.com/technology/tech-news/openai-launches-new-open-source-ai-models-gpt-oss-120b-and-gpt-oss-20b |
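
To order or filter these entries by parameter count, the free-text "Size & Context" cells need to be parsed first. The following is a minimal sketch under stated assumptions: the helper names `params_in_billions` and `sort_by_size` are hypothetical, the four rows are copied from the table for illustration, and cells without a single "<number>B params" figure are simply placed last.

```python
import re
from typing import List, Optional, Tuple

# A few rows copied from the table above: (model name, "Size & Context" cell).
ROWS: List[Tuple[str, str]] = [
    ("CodeLlama-34B-Instruct", "34B params, ~16K tokens"),
    ("WaveCoder-Ultra-6.7B", "~6.7B params"),
    ("BLOOM", "176B params"),
    ("CodeLlama-7B-Instruct", "7B params, ~16K tokens context"),
]


def params_in_billions(size_cell: str) -> Optional[float]:
    """Return the first '<number>B params' figure in a cell, or None."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*B\s+params", size_cell)
    return float(match.group(1)) if match else None


def sort_by_size(rows: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Sort rows by ascending parameter count; unparseable cells go last."""
    def key(row: Tuple[str, str]) -> float:
        size = params_in_billions(row[1])
        return float("inf") if size is None else size
    return sorted(rows, key=key)


for name, cell in sort_by_size(ROWS):
    print(f"{name}: {params_in_billions(cell)}B")
```

Context windows could be extracted the same way with a second pattern for "K tokens", though several rows in the catalog do not state one.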