Llama: Inference code for language models (deprecated)
PyTorch inference for Meta's Llama language models.
Llama provides minimal PyTorch inference code for Meta's Llama language models, which range from 7 billion to 70 billion parameters. It supports distributed inference through torchrun, enabling parallel execution across multiple GPUs. The codebase includes model loading utilities, tokenizer integration, and example scripts for text and chat completion. The repository has been deprecated in favor of specialized downstream projects that handle model distribution, safety, tooling, and agentic systems separately.
Distributed inference support
Uses torchrun for multi-GPU inference with configurable model parallelism, allowing users to adjust nproc_per_node based on model size requirements.
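A sketch of the launch pattern the repository documents: example scripts run under torchrun, with nproc_per_node matching the checkpoint's model-parallel degree (the README uses 1 process for 7B, 2 for 13B, and 8 for 70B). The paths below are illustrative.

torchrun --nproc_per_node 1 example_text_completion.py \
    --ckpt_dir llama-2-7b/ \
    --tokenizer_path tokenizer.model \
    --max_seq_len 128 --max_batch_size 4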
Minimal reference implementation
Designed as a lightweight example rather than a comprehensive framework, with basic utilities for model loading and tokenization that can be extended or integrated into other systems.
Direct model access
Provides download scripts and integration with Hugging Face for accessing model weights and tokenizers after license approval, with support for multiple model variants.
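As an illustrative alternative to the bundled download scripts, weights can be fetched with the huggingface_hub client once the gated meta-llama listing has approved your license request; the repo id and token setup here are assumptions about your environment, not part of this codebase.

from huggingface_hub import snapshot_download

# Assumes license approval on the meta-llama model page and a
# logged-in token (huggingface-cli login).
local_dir = snapshot_download(repo_id="meta-llama/Llama-2-7b")
print(local_dir)  # directory containing the weights and tokenizer.model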
from llama import Llama

# Build the generator from a local checkpoint directory and tokenizer.
# Scripts that call Llama.build are launched via torchrun (see above),
# even on a single GPU.
generator = Llama.build(
    ckpt_dir="llama-2-7b/",
    tokenizer_path="tokenizer.model",
    max_seq_len=128,
    max_batch_size=4,
)

prompts = ["The future of AI is"]
results = generator.text_completion(prompts, max_gen_len=64, temperature=0.6)
print(results[0]['generation'])
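The repository's chat examples use a separate chat_completion API with the dialog format from example_chat_completion.py; a minimal sketch reusing the generator above (the chat-tuned llama-2-7b-chat checkpoints are the intended fit for this API):

# Dialogs are lists of {"role", "content"} messages.
dialogs = [[{"role": "user", "content": "Explain model parallelism briefly."}]]
chat_results = generator.chat_completion(dialogs, max_gen_len=64, temperature=0.6)
print(chat_results[0]['generation']['content'])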