Fireworks is currently available in three tiers.

| Tier specifications | Developer Free | Developer PRO | Enterprise |
| --- | --- | --- | --- |
| Text pricing | $1 free credits | API usage: $/1M tokens | Custom |
| Image pricing | $1 free credits | API usage: $/step | Custom |
| Request rate limit | 10 requests/min | 100 requests/min | Unlimited |
| Max # deployed models | 5 | 100 | Unlimited |
| Custom PEFT addons | | | |
| Custom base models | | | |
| Dedicated deployments | | | |
| Self hosted deployments | | | |
| Community support | | | |
| Dedicated enterprise support | | | |
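The request rate limits in the tier specifications (10 requests/min for Developer Free, 100 requests/min for Developer PRO) can be respected on the client side with a small throttle. The following is a minimal sketch, not part of any Fireworks SDK: the class name `SlidingWindowThrottle`, the tier keys, and the choice of a sliding 60-second window are all assumptions, since the page does not say how the service measures the limit.

```python
import time
from collections import deque

# Requests per minute quoted in the tier specifications.
# The dictionary keys are hypothetical labels, not official identifiers.
RATE_LIMITS = {"developer_free": 10, "developer_pro": 100}


class SlidingWindowThrottle:
    """Client-side limiter: at most `limit` requests per `window` seconds.

    Assumption: the limit is modeled as a sliding window; the service may
    count requests differently.
    """

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.sent = deque()  # timestamps of recently recorded requests

    def wait_time(self):
        """Return seconds to wait before the next request (0.0 if allowed now)."""
        now = self.clock()
        # Forget requests that have left the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        return self.window - (now - self.sent[0])

    def record(self):
        """Note that a request was just sent."""
        self.sent.append(self.clock())
```

A caller would check `wait_time()`, sleep for that long if it is non-zero, send the request, and then call `record()`.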

Developer pricing

Text (language, chat, code) models

| Base model parameter count | $/1M input tokens | $/1M output tokens |
| --- | --- | --- |
| up to 16B | $0.20 | $0.80 |
| 16.1B - 80B | $0.70 | $2.80 |
| Mixtral 8x7B | $0.40 | $1.60 |

Per-token pricing applies only to non-dedicated deployments. Contact us for dedicated deployment pricing options.

Input tokens are counted from the prompt you supply in the request. Output tokens are counted from the completion generated by the model. See the Tokens section for more details.
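As a worked illustration of per-token billing, the sketch below computes the cost of a single request from its input and output token counts. The prices are the ones listed for text models above; the bucket keys and the function name `text_cost` are hypothetical, and `Decimal` is used only to avoid floating-point rounding in the arithmetic.

```python
from decimal import Decimal

# (input $/1M tokens, output $/1M tokens) from the text model pricing rows.
# The keys are illustrative labels, not official model identifiers.
TEXT_PRICES = {
    "up_to_16b": (Decimal("0.20"), Decimal("0.80")),
    "16.1b_to_80b": (Decimal("0.70"), Decimal("2.80")),
    "mixtral_8x7b": (Decimal("0.40"), Decimal("1.60")),
}


def text_cost(bucket, input_tokens, output_tokens):
    """Return the dollar cost of one request in the given pricing bucket."""
    in_price, out_price = TEXT_PRICES[bucket]
    total = in_price * input_tokens + out_price * output_tokens
    return total / Decimal(1_000_000)


# 1,500 prompt tokens and 500 completion tokens on Mixtral 8x7B:
# (0.40 * 1500 + 1.60 * 500) / 1,000,000 = $0.0014
print(text_cost("mixtral_8x7b", 1500, 500))
```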

Image models

For image generation models like SDXL, we charge based on the number of inference steps (denoising iterations).

| SDXL, $/step | SDXL w/ ControlNet, $/step |
| --- | --- |
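Step-based billing can be sketched as a simple product. The per-step prices are not shown in the text here, so the price is a required argument rather than a constant. The function name `image_generation_cost` is hypothetical, and multiplying by the number of images assumes each generated image is billed for its own steps.

```python
from decimal import Decimal


def image_generation_cost(price_per_step, steps, num_images=1):
    """Return the dollar cost of generating `num_images` images.

    price_per_step: the $/step rate for the model (caller-supplied).
    steps: inference steps (denoising iterations) per image.
    Assumption: every image is billed separately for its steps.
    """
    if steps < 0 or num_images < 0:
        raise ValueError("steps and num_images must be non-negative")
    return Decimal(price_per_step) * steps * num_images
```

Halving the step count halves the cost under this model, which is the main lever for controlling spend on image generation.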

Multi-modal models

For multi-modal models like LLaVA, each image is treated as 576 prompt tokens. The pricing is otherwise identical to text models.
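The 576-token rule can be sketched as follows. Because the text does not say which pricing bucket a given multi-modal model falls into, the per-million prices are passed in by the caller; the function names are hypothetical.

```python
from decimal import Decimal

IMAGE_PROMPT_TOKENS = 576  # each image is treated as this many prompt tokens


def billable_prompt_tokens(text_prompt_tokens, num_images):
    """Prompt tokens billed for a request containing text and images."""
    return text_prompt_tokens + num_images * IMAGE_PROMPT_TOKENS


def multimodal_cost(text_prompt_tokens, num_images, output_tokens,
                    input_price_per_million, output_price_per_million):
    """Dollar cost of one multi-modal request at caller-supplied text rates."""
    prompt_tokens = billable_prompt_tokens(text_prompt_tokens, num_images)
    total = (Decimal(input_price_per_million) * prompt_tokens
             + Decimal(output_price_per_million) * output_tokens)
    return total / Decimal(1_000_000)


# A 100-token text prompt with two images is billed as 100 + 2 * 576 tokens.
print(billable_prompt_tokens(100, 2))  # 1252
```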


Embedding models

Embedding models are free of charge until March. After that, the following pricing applies:

| Base model parameter count | $/1M input tokens |
| --- | --- |
| up to 150M | $0.008 |
| 150M - 350M | $0.016 |
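Embedding models are billed on input tokens only, with the rate chosen by parameter count. The sketch below picks the rate from the two rows above and computes a cost. The function names are hypothetical, and placing a model of exactly 150M parameters in the lower bucket is an assumption, since the two ranges share that boundary.

```python
from decimal import Decimal

# (upper bound on parameter count, $ per 1M input tokens) from the rows above.
EMBEDDING_PRICES = [
    (150_000_000, Decimal("0.008")),
    (350_000_000, Decimal("0.016")),
]


def embedding_price_per_million(param_count):
    """Return the $/1M-token rate for an embedding model of the given size."""
    for upper_bound, price in EMBEDDING_PRICES:
        if param_count <= upper_bound:
            return price
    # No price is listed for larger models, so refuse to guess.
    raise ValueError("no listed price for models above 350M parameters")


def embedding_cost(param_count, input_tokens):
    """Dollar cost of embedding `input_tokens` tokens."""
    rate = embedding_price_per_million(param_count)
    return rate * input_tokens / Decimal(1_000_000)
```

For example, embedding 2,000,000 tokens with a 110M-parameter model comes to 0.008 * 2 = $0.016 under these rates.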

Fine-tuning jobs

The Fireworks fine-tuning service is currently in an experimental alpha stage, so usage is free.

Developer PRO

Your account is automatically enrolled in the Developer PRO tier when you add a valid payment method.

The Developer PRO tier is currently prepaid only. You pay in advance for credits that can be used anywhere on the Fireworks platform, and credits are deducted as you use our services.

An automatic top-up option will be available soon.

Enterprise pricing

Please contact us at [email protected] for a custom quote.