To upload a custom base model from your local computer, create a fireworks.json file so that Fireworks can understand your model, then run the following command:
firectl create base-model <model-id> "/path/to/model"
The Fireworks container requires a few bits of metadata about the checkpoint. These must be stored in a fireworks.json file in the model checkpoint directory (next to config.json).
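For illustration, a checkpoint directory might look like the following. The directory and file names here are hypothetical; the exact weight and tokenizer files depend on how the model was saved:

```
my-model/
├── config.json            # HuggingFace model config
├── fireworks.json         # Fireworks metadata (added by you)
├── tokenizer.json
├── tokenizer_config.json
└── model.safetensors
```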
Define a JSON file with the following fields:
model_arch: the model architecture family, e.g. “llama”
model_config_name: the concrete instantiation of the model config, e.g. “v2-7b” for “llama”
- weight-quantized variants of the model have a “-w8a16” suffix. This architecture may be specified even if the weights are stored in 16 bits. If this option is provided, the model weights will be converted to int8 at load time (similar to HuggingFace’s load_in_8bit)
checkpoint_format: must be “huggingface”
use_huggingface_tokenizer: can be set to true to use the HuggingFace tokenizer instead of the Fireworks implementation. It may be slightly slower, but is helpful if the model was fine-tuned with additional tokens.
vocab_size: the container infers the model’s vocabulary size from model_arch and model_config_name. However, if the model was fine-tuned with additional tokens, vocab_size may need to be increased. In that case, specify the value matching HuggingFace’s config.json.
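Putting the fields together, a fireworks.json for a hypothetical Llama 2 7B fine-tune with added tokens might look like this. All values are illustrative; match them to your own checkpoint’s config.json:

```json
{
  "model_arch": "llama",
  "model_config_name": "v2-7b",
  "checkpoint_format": "huggingface",
  "use_huggingface_tokenizer": true,
  "vocab_size": 32032
}
```

In this sketch, use_huggingface_tokenizer and vocab_size are set only because the hypothetical fine-tune added tokens; for an unmodified base checkpoint, they can be omitted.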
If your model is prohibitively large or otherwise not conveniently accessible from your local computer, please contact your Fireworks representative for alternative upload methods.