Fine-tuning models

This guide walks you through the steps to fine-tune a Fireworks-supported base model.

The fine-tuning service is currently in its pre-alpha phase and is 100% free to use. Please give it a try and share your feedback in the #fine-tuning Discord channel.

We utilize LoRA (Low-Rank Adaptation) for efficient and effective fine-tuning of large language models. Take advantage of this opportunity to enhance your models with our cutting-edge technology!


Fine-tuning a model with a targeted dataset is crucial for several key reasons:

  1. Enhanced Precision: It allows the model to adapt to the unique attributes and trends within the dataset, leading to significantly improved precision and effectiveness.
  2. Domain Adaptation: While many models are developed with general data, fine-tuning them with specialized, domain-specific datasets ensures they are finely attuned to the specific requirements of that field.
  3. Bias Reduction: General models may carry inherent biases. Fine-tuning with a well-curated, diverse dataset aids in reducing these biases, fostering fairer and more balanced outcomes.
  4. Contemporary Relevance: Information evolves rapidly, and fine-tuning with the latest data keeps the model current and relevant.
  5. Customization for Specific Applications: This process allows for the tailoring of the model to meet unique objectives and needs, an aspect not achievable with standard models.

In essence, fine-tuning a model with a specific dataset is a pivotal step in ensuring its accuracy, relevance, and suitability for specific applications. Let's embark on the journey of fine-tuning a model!

Installing firectl

The firectl command-line interface (CLI) will be used to manage your LLM models.

curl -o firectl.gz
gzip -d firectl.gz && chmod a+x firectl
sudo mv firectl /usr/local/bin/firectl
sudo chown root: /usr/local/bin/firectl
wget -O firectl.gz
gunzip firectl.gz
sudo install -o root -g root -m 0755 firectl /usr/local/bin/firectl

Signing in

Run the following command to sign into Fireworks:

firectl signin

Confirm that you have successfully signed in by listing your account:

firectl list accounts

You should see your account ID.

Preparing your dataset

To fine-tune a model, we need to first upload a dataset. Once uploaded, this dataset can be used to create one or more fine-tuning jobs. A dataset consists of a single JSONL file, where each line is a separate training example.

  1. Create a new directory, e.g. /tmp/my-dataset/
  2. Create a single .jsonl file inside the directory containing the training examples.


  • Minimum number of examples is 1
  • Maximum number of examples is 20,000

To create a dataset named by an identifier and upload the file:

firectl create dataset <DATASET_ID> /path/to/dataset.jsonl

You can then inspect the dataset with:

firectl get dataset <DATASET_ID>

To use an existing HuggingFace dataset, please refer to the script below for conversion. Datasets are private and cannot be viewed by other accounts.
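Before uploading, it can help to sanity-check the file locally. The snippet below is a minimal sketch (not part of firectl; the function name is our own) that verifies every line parses as a JSON object and that the example count falls within the 1 to 20,000 limit:

```python
import json

def validate_jsonl(path, max_examples=20_000):
    """Check that every non-blank line is a JSON object and that the
    example count is within the service limits (1 to 20,000)."""
    count = 0
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            record = json.loads(line)  # raises ValueError on malformed JSON
            if not isinstance(record, dict):
                raise ValueError(f"line {lineno}: expected a JSON object")
            count += 1
    if not 1 <= count <= max_examples:
        raise ValueError(f"{count} examples; must be between 1 and {max_examples}")
    return count
```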

Preparing your fine-tuning job settings

To kick off a fine-tuning job, you need to specify the settings and save them locally in a new directory.

Define a YAML file with the following fields:

  • recipe: specifies the fine-tuning mode; currently text_completion for text completion and classification for classification.
  • When the recipe is text_completion, specify the following fields:
    • input_template: the text to serve as the prompt to the model.
    • output_template: the text for completions.
  • When the recipe is classification, specify the following fields:
    • text: the field in the JSON object to use as the input prompt.
    • label: the field in the JSON object to use as the classification label.
  • model: the base model identifier that the tuning job is based on. The list of supported base models is specified below.
  • epochs: the total number of epochs to run over your dataset. Note that the total number of examples used in each fine-tuning job (examples in dataset * epochs) must be less than 20,000.
  • learning_rate: the learning rate for the fine-tuning process.
  • wandb_project: your Weights and Biases (wandb) project name.
  • wandb_key: your Weights and Biases (wandb) API key.
  • wandb_entity: the entity under which your Weights and Biases (wandb) project lives.
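As a worked example of the epochs constraint above, the small helper below (hypothetical, not part of the service) computes the largest epoch count that keeps dataset size * epochs under 20,000:

```python
def max_epochs(num_examples, limit=20_000):
    """Largest epoch count such that num_examples * epochs stays
    strictly below the limit (assumption: mirrors the documented
    20K total-example cap)."""
    if num_examples < 1:
        raise ValueError("dataset must contain at least one example")
    return (limit - 1) // num_examples
```

For instance, a 5,000-example dataset can run at most 3 epochs (5,000 * 4 = 20,000 would hit the cap).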




For this example, we'll use the databricks/databricks-dolly-15k dataset, which is focused on instruction following. Each record in this JSONL dataset consists of a category, an instruction, an optional context, and the expected response. Here are a few sample records:

{"instruction": "When did Virgin Australia start operating?", "context": "Virgin Australia, the trading name of Virgin Australia Airlines Pty Ltd, is an Australian-based airline. It is the largest airline by fleet size to use the Virgin brand. It commenced services on 31 August 2000 as Virgin Blue, with two aircraft on a single route. It suddenly found itself as a major airline in Australia's domestic market after the collapse of Ansett Australia in September 2001. The airline has since grown to directly serve 32 cities in Australia, from hubs in Brisbane, Melbourne and Sydney.", "response": "Virgin Australia commenced services on 31 August 2000 as Virgin Blue, with two aircraft on a single route.", "category": "closed_qa"}
{"instruction": "Which is a species of fish? Tope or Rope", "context": "", "response": "Tope", "category": "classification"}
{"instruction": "Why can camels survive for long without water?", "context": "", "response": "Camels use the fat in their humps to keep them filled with energy and hydration for long periods of time.", "category": "open_qa"}
{"instruction": "Alice's parents have three daughters: Amy, Jessy, and what\u2019s the name of the third daughter?", "context": "", "response": "The name of the third daughter is Alice", "category": "open_qa"}
{"instruction": "When was Tomoaki Komorida born?", "context": "Komorida was born in Kumamoto Prefecture on July 10, 1981. After graduating from high school, he joined the J1 League club Avispa Fukuoka in 2000. Although he debuted as a midfielder in 2001, he did not play much and the club was relegated to the J2 League at the end of the 2001 season. In 2002, he moved to the J2 club Oita Trinita. He became a regular player as a defensive midfielder and the club won the championship in 2002 and was promoted in 2003. He played many matches until 2005. In September 2005, he moved to the J2 club Montedio Yamagata. In 2006, he moved to the J2 club Vissel Kobe. Although he became a regular player as a defensive midfielder, his gradually was played less during the summer. In 2007, he moved to the Japan Football League club Rosso Kumamoto (later Roasso Kumamoto) based in his local region. He played as a regular player and the club was promoted to J2 in 2008. Although he did not play as much, he still played in many matches. In 2010, he moved to Indonesia and joined Persela Lamongan. In July 2010, he returned to Japan and joined the J2 club Giravanz Kitakyushu. He played often as a defensive midfielder and center back until 2012 when he retired.", "response": "Tomoaki Komorida was born on July 10,1981.", "category": "closed_qa"}

Setting Up Fine-Tuning

Next, to set up how you'd like to use the dataset for fine-tuning, configure the text completion tuning settings with the following fields:

  1. recipe: text_completion - this sets the tuning goal to text completion, i.e. generic fine-tuning.
  2. model: set the base model identifier to llama2-7b, which is LLaMA 2 7B.
  3. input_template: craft an input template for the model, integrating the JSON object fields using {} placeholders.
  4. output_template: define the expected response format, again using {} to inject JSON object fields.

A typical settings.yaml would look like:

recipe: text_completion

model: llama2-7b

input_template: "### GIVEN THE CONTEXT:{context}  ### INSTRUCTION: {instruction}  ### RESPONSE IS: "
output_template: "ANSWER: {response}"
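To see how the two templates shape the training text, here is a sketch (not the service's internal code) of how each dataset record is rendered into a prompt/completion pair, assuming placeholders are filled by field name as in Python's str.format and must match keys present in the record (e.g. {instruction}, {context}, {response}):

```python
# Templates from settings.yaml (placeholder names must match record fields)
input_template = "### GIVEN THE CONTEXT:{context}  ### INSTRUCTION: {instruction}  ### RESPONSE IS: "
output_template = "ANSWER: {response}"

# One record from the dolly-15k dataset
record = {
    "instruction": "Which is a species of fish? Tope or Rope",
    "context": "",
    "response": "Tope",
    "category": "classification",
}

# Unused fields (like "category") are simply ignored by str.format
prompt = input_template.format(**record)
completion = output_template.format(**record)
print(prompt + completion)
```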


Now, let's try a different scenario: tuning the model specifically for multi-class classification.


We will utilize the "symptom_to_diagnosis" dataset available on Hugging Face. This dataset is structured to associate medical symptoms with potential diagnoses. Records in the dataset might look like this:

{"output_text": "cervical spondylosis", "input_text": "I've been having a lot of pain in my neck and back. I've also been having trouble with my balance and coordination. I've been coughing a lot and my limbs feel weak."}
{"output_text": "impetigo", "input_text": "I have a rash on my face that is getting worse. It is red, inflamed, and has blisters that are bleeding clear pus. It is really painful."}
{"output_text": "urinary tract infection", "input_text": "I have been urinating blood. I sometimes feel sick to my stomach when I urinate. I often feel like I have a fever."}
{"output_text": "arthritis", "input_text": "I have been having trouble with my muscles and joints. My neck is really tight and my muscles feel weak. I have swollen joints and it is hard to move around without becoming stiff. It is also really uncomfortable to walk."}
{"output_text": "dengue", "input_text": "I have been feeling really sick. My body hurts a lot and I have no appetite. I have also developed rashes on my arms and face. The back of my eyes hurt a lot."}

Setting Up Fine-Tuning

To fine-tune a model for classifying symptoms to diagnoses, configure your settings.yaml file as follows:

  1. recipe: classification - this sets the tuning goal to multi-class classification.
  2. model: set the base model identifier to llama2-7b, which is LLaMA 2 7B.
  3. text: the field in each JSON record to use as the input prompt.
  4. label: the field in each JSON record to use as the classification label.

A typical settings.yaml for this dataset would be:

recipe: classification

model: llama2-7b

text: input_text
label: output_text
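Conceptually, the text and label settings just select fields from each record. The sketch below (our own illustration, assuming a simple key lookup) shows the mapping for the symptom_to_diagnosis records above:

```python
# text/label from settings.yaml name fields present in every record
settings = {"text": "input_text", "label": "output_text"}

record = {
    "output_text": "dengue",
    "input_text": "I have been feeling really sick. My body hurts a lot and I have no appetite.",
}

# The selected fields become the classification prompt and target label
prompt = record[settings["text"]]
label = record[settings["label"]]
```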

Kick-off fine-tuning job

You can kick off a fine-tuning job via the firectl create fine-tuning-job command, passing in the settings.yaml and the dataset, following the example below:

firectl create fine-tuning-job <path-to-settings-dir>/settings.yaml --dataset-name accounts/<account-name>/datasets/<dataset-name>

Optionally, you can also pass a unique model ID via the following flag to override the model name:

--model-id my-fine-tuned-model

If not specified, the model will be named <recipe-job-id>-<recipe-name> by default.

After kick-off, you can check the status of the recipe job with:

firectl get fine-tuning-job <recipe-job-id>

After fine-tuning finishes, the tuned model is uploaded automatically, and you can see it via:

firectl list models

Deploy for inference/Clean up

To deploy the model for inference, or to delete it, please refer to the deploy fine-tuned models guide.

Supported models

  • mistral-7b: accounts/fireworks/models/mistral-7b
  • mixtral-8x7b: accounts/fireworks/models/mixtral-8x7b
  • mixtral-8x7b-instruct: accounts/fireworks/models/mixtral-8x7b-instruct
  • llama2-7b: accounts/fireworks/models/llama-v2-7b
  • llama2-7b-chat: accounts/fireworks/models/llama-v2-7b-chat
  • llama2-13b: accounts/fireworks/models/llama-v2-13b
  • llama2-13b-chat: accounts/fireworks/models/llama-v2-13b-chat
  • llama2-70b: accounts/fireworks/models/llama-v2-70b
  • llama2-70b-chat: accounts/fireworks/models/llama-v2-70b-chat
  • codellama-34b: accounts/fireworks/models/llama-v2-34b-code

HuggingFace to JSONL

If you'd like to use a dataset from Hugging Face in our fine-tuning service, the code snippet below converts any dataset to a JSONL file:

import json
from datasets import load_dataset

# Load your dataset from the Hugging Face Hub
dataset = load_dataset('<your_dataset_name>')

# Replace '<dataset_split_name>' with the split you want to export,
# e.g. 'train', 'test', etc.
split_data = dataset['<dataset_split_name>']

# Write each record as one JSON object per line (JSONL)
counter = 0
with open('<your_output_file>.jsonl', 'w') as f:
    for item in split_data:
        json.dump(item, f)
        f.write('\n')
        counter += 1
print(f"{counter} lines converted")