Does Fireworks offer a fine-tuning service?

Yes. We offer a fine-tuning service and also let you upload externally fine-tuned models for inference. See the docs for our fine-tuning service and the docs for uploading external fine-tuned models. Fireworks lets you deploy 100 fine-tuned models simultaneously for fast, serverless inference.

What models are supported for fine-tuning? Is Llama 3 supported for fine-tuning?

Yes, Llama 3 (8B and 70B) is supported: you can fine-tune LoRA adapters and serve them for inference via our serverless and dedicated deployments. You can see the full list of models available for fine-tuning in our docs.

How much does FireFunction cost?

FireFunction function calling is currently in an experimental alpha stage, and usage is 100% free.


Are there extra fees for serving fine-tuned models?

There are no extra costs for fine-tuned models beyond the initial tuning cost. Fine-tuned models are served at the same price as base text models. See the pricing page for details.

How much does Fireworks cost?

Fireworks AI is pay-as-you-go for all non-Enterprise usage, and new users automatically receive free credits. You pay per token for serverless inference, per GPU usage time for on-demand deployments, and per token of training data for tuning. For customers that require Enterprise-grade security and reliability, please reach out to us at [email protected] to discuss options.

Head over to Pricing to see more details.
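As a rough illustration of how the three pay-as-you-go components add up, here is a minimal Python sketch. The `Rates` class, the `estimate_monthly_cost` function, and every rate value you pass in are hypothetical placeholders, not Fireworks prices or APIs; take real rates from the Pricing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical placeholder rates; real values are on the Pricing page."""
    usd_per_million_inference_tokens: float
    usd_per_gpu_hour: float
    usd_per_million_training_tokens: float


def estimate_monthly_cost(rates: Rates,
                          inference_tokens: int,
                          gpu_hours: float,
                          training_tokens: int) -> float:
    """Sum the three usage-based components described in the answer above."""
    # Serverless inference is billed per token.
    serverless = inference_tokens / 1_000_000 * rates.usd_per_million_inference_tokens
    # On-demand deployments are billed per GPU usage time.
    on_demand = gpu_hours * rates.usd_per_gpu_hour
    # Tuning is billed per token of training data.
    tuning = training_tokens / 1_000_000 * rates.usd_per_million_training_tokens
    return serverless + on_demand + tuning
```

For example, with made-up rates of $1 per million inference tokens, $2 per GPU hour, and $0.50 per million training tokens, 2M inference tokens plus 10 GPU hours plus 4M training tokens comes to 2 + 20 + 2 = $24.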

What are spend limits? How do I increase my limits?

Spend limits (a.k.a. usage limits) cap how much usage you can accrue on Fireworks each month. API requests are rejected once your account's usage exceeds that limit. This helps prevent customers from getting unexpectedly high bills if their app goes viral.

We enforce different spend limits based on usage tiers and automatically raise your spend limit to the next tier as your historic spend on the Fireworks API goes up. Historic spend includes payments for both credits and past invoices. Head over to Pricing to see how much you need to spend in order to move to the next tier.

To increase your usage limit, you can buy prepaid credits at https://fireworks.ai/billing, which raises your historic spend. For example, if your account is on tier 1 with a $50/mo spend limit, you can buy $100+ in credits and your spend limit will automatically increase to $500/mo. Note: there may be a propagation delay after the credit payment completes. You may still see the "monthly usage exceeded" error for a few minutes after topping up; if so, please retry later.
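The tier mechanics can be sketched as a simple lookup from historic spend to monthly limit. This is only an illustration: the table contains just the two data points from the example above, the $0 starting threshold for tier 1 and the reading of "$100+" as a $100 threshold are assumptions, and `ILLUSTRATIVE_TIERS` and `spend_limit_for` are hypothetical names rather than anything Fireworks provides. The actual tier table is on the Pricing page.

```python
# (minimum historic spend in USD, monthly spend limit in USD)
# Only the $50 and $500 limits come from the FAQ example; the thresholds
# are assumptions, and further tiers are listed on the Pricing page.
ILLUSTRATIVE_TIERS = [(0, 50), (100, 500)]


def spend_limit_for(historic_spend_usd: float, tiers=ILLUSTRATIVE_TIERS) -> int:
    """Return the monthly limit of the highest tier whose threshold is met."""
    ordered = sorted(tiers)
    limit = ordered[0][1]
    for min_spend, monthly_limit in ordered:
        if historic_spend_usd >= min_spend:
            limit = monthly_limit
    return limit
```

Under these assumptions, an account with $0 of historic spend stays at $50/mo, and buying $100 in credits moves it to $500/mo.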

Why am I getting a "monthly usage exceeded error"? Do credits count against spend limits?

Yes, credits count against spend/usage limits. For example, on tier 1, you can purchase $60 in credits but still have your usage stopped after $50 in usage. You'll need to purchase a larger volume of prepaid credits to advance tiers. If you exceed your account's usage limit, API requests will be rejected. Visit https://fireworks.ai/billing to add your payment method and monitor your usage and invoices.

If you do not have a payment method on file, your account will be suspended after your credits are depleted. Failure to pay a past invoice may also result in account suspension. Your usage limit will be set to $0/month in both cases.
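The interaction between credits and the monthly limit is easy to get wrong, so here is a small worked sketch. It is a simplified model for a credit-only account with no payment method on file, and `usable_budget` is a hypothetical helper, not part of any Fireworks API.

```python
def usable_budget(monthly_limit_usd: float,
                  usage_this_month_usd: float,
                  credit_balance_usd: float) -> float:
    """How much more can be spent this month under the simplified model.

    Spending is bounded by whichever runs out first: the headroom left
    under the monthly limit, or the remaining credit balance.
    """
    headroom = max(monthly_limit_usd - usage_this_month_usd, 0.0)
    return min(headroom, max(credit_balance_usd, 0.0))
```

With the tier 1 example above, a $50 limit and $60 in credits allow only $50 of usage this month; the remaining $10 of credit cannot be used until the next month or until the limit rises. A $0/month limit leaves no usable budget regardless of credits.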

Serverless Inference

Are there discounts for bulk spend on serverless deployments?

Our publicly accessible services have standard rates for all customers.

Do you offer SLAs for Serverless usage?

Our multi-tenant Serverless offering does not come with Service Level Agreements (SLAs).

Other Questions

Does Fireworks support custom base models?

Custom base models are currently available only to enterprise accounts.

If there's an open-source base model that you would like to use, let us know on Discord and we'll see if we can deploy it for you!

There's a model I would like to use that isn't available on Fireworks.

We are actively taking feature requests for new, popular models to deploy on our platform. Head over to our Discord server and let us know which models you would like to see us deploy next!

If it's a PEFT add-on of a base model already available on our platform, you can deploy it yourself today!

What’s logged by Fireworks?

No prompt or generated data is logged on Fireworks; we log only metadata, such as the number of tokens in a request, as required to deliver the service. The one exception to this rule is our proprietary FireFunction model, where input/output data is logged so that we can view bulk analytics such as the number of functions provided to the model.

I have another question or an issue.

We have an active Discord community where you can post questions, request features, and file bug reports.

Do you host your deployments in EU or Asia?

We are currently deployed in multiple US-based locations. However, we would like to hear more about your specific requirements. Please join our Discord community or write to us at [email protected].

If you're an Enterprise customer, please contact your dedicated customer support representative to ensure a timely response.

What support options exist?

Enterprise accounts receive dedicated support. For the developer tier, please go to Discord to interact directly with the team and community.