Groq Pricing Overview

Explore token-based, subscription, credit, and compute pricing for Groq. Data snapshot: 2025-12-14.

Text models: 0 · Curated plans: 3

Marketplace & Subscription Offerings

Llama 3 8B (Groq)

Text tokens

Groq

Token • per 1M tokens

Input $0.050 · Output $0.080

Ultra-low latency tier

Latency guarantees typically <200ms for 8B; pricing captured 2024-09.

low-latency

Llama 3 70B (Groq)

Text tokens

Groq

Token • per 1M tokens

Input $0.590 · Output $0.790

High-accuracy, hardware-accelerated tier

Compute • $0.000 per second

Effective per-second compute with 500 tok/s target

Pricing captured 2024-09; compute estimate derived from Groq docs.

latency-sla · reasoning
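
The "$0.000 per second" figure above is plausibly a small value rounded to three decimals. The page does not state how the estimate was derived, so the following is only a sketch under one assumption: that the per-second figure is the listed output price multiplied by the 500 tok/s target. The helper name `per_second_cost` is hypothetical.

```python
# Sketch: convert a per-1M-token price into an effective per-second cost.
# ASSUMPTION: the estimate uses the output price and the 500 tok/s target;
# the source page does not document the actual derivation.

def per_second_cost(price_per_million: float, tokens_per_second: float) -> float:
    """Cost in dollars of generating tokens for one second at a given rate."""
    return price_per_million / 1_000_000 * tokens_per_second

OUTPUT_PRICE_70B = 0.790   # dollars per 1M output tokens, from the listing
TARGET_TOK_PER_S = 500     # throughput target, from the listing

cost = per_second_cost(OUTPUT_PRICE_70B, TARGET_TOK_PER_S)
print(f"full precision: ${cost:.6f} per second")
print(f"three decimals: ${cost:.3f} per second")
```

Under this assumption the cost is a fraction of a tenth of a cent per second, which would display as $0.000 at three-decimal precision.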

Mixtral 8x7B (Groq)

Text tokens

Groq

Token • per 1M tokens

Input $0.270 · Output $0.400

Mixture-of-experts with deterministic throughput

Pricing captured 2024-09.

mixture-of-experts
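
Since all three plans are priced per 1M tokens with separate input and output rates, the cost of a workload can be estimated from its token counts. The sketch below uses the prices listed above (captured 2024-09, so possibly out of date). The dictionary keys are illustrative labels, not Groq API model identifiers, and `request_cost` is a hypothetical helper.

```python
# Sketch: estimate workload cost from the per-1M-token prices in this listing.
# Keys are illustrative labels only (ASSUMPTION), not official model IDs.
PRICES_PER_MILLION = {
    "llama-3-8b": (0.050, 0.080),    # (input, output) in dollars
    "llama-3-70b": (0.590, 0.790),
    "mixtral-8x7b": (0.270, 0.400),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost for the given token counts at the listed rates."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Example workload: 2M input tokens and 500k output tokens on each plan.
for name in PRICES_PER_MILLION:
    print(f"{name}: ${request_cost(name, 2_000_000, 500_000):.3f}")
```

Keeping input and output rates separate matters because output tokens cost more than input tokens on every plan listed here.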

Curated data is maintained manually; upstream plan changes may require confirmation before automation catches them.
