This configuration file provides a structured way to define and expose large language models for use with Fireworks.

Schema

  • base_model - (required) The name of the base model. Either the Fireworks model name (e.g. accounts/fireworks/models/falcon-7b) or Hugging Face model ID (e.g. tiiuae/falcon-7b) may be provided.
  • conversation_config - (optional) The conversation config of the model. If specified, the /chat/completions API becomes enabled for this model.
    • style (required) - The conversation style. See Enabling chat completions API for details. Must be one of ["alpaca", "llama-chat", "llama-infill", "mistral-chat", "passthrough", "vicuna", "jinja"].
    • args(optional) - additional arguments to our conversation template
      • system: custom default system prompt for your model

Example

{
  "base_model": "meta-llama/Llama-2-70b-chat-hf", 
  "conversation_config": {
    "style": "llama-chat"
  }
}

Example for vicuna models with custom system message

{
  "base_model": "meta-llama/Llama-2-70b-chat-hf", 
  "conversation_config": {
    "style": "vicuna",
    "args": {
      "system": "your custom system message here"
    }
  }
}

Hugging Face to Fireworks model name

We map the Hugging Face model IDs to their corresponding Fireworks base model names.

Hugging Face Model IDFireworks Model Name
meta-llama/Llama-2-7b-hfaccounts/fireworks/models/llama-v2-7b
meta-llama/Llama-2-13b-hfaccounts/fireworks/models/llama-v2-13b
meta-llama/Llama-2-7b-chat-hfaccounts/fireworks/models/llama-v2-7b-chat
meta-llama/Llama-2-13b-chat-hfaccounts/fireworks/models/llama-v2-13b-chat
meta-llama/Llama-2-70b-hfaccounts/fireworks/models/llama-v2-70b
meta-llama/Llama-2-70b-chat-hfaccounts/fireworks/models/llama-v2-70b-chat
mistralai/Mistral-7B-v0.1accounts/fireworks/models/mistral-7b
mistralai/Mistral-7B-Instruct-v0.1accounts/fireworks/models/mistral-7b-instruct-4k
mistralai/Mixtral-8x7B-v0.1accounts/fireworks/models/mixtral-8x7b
mistralai/Mixtral-8x7B-Instruct-v0.1accounts/fireworks/models/mixtral-8x7b-instruct-hf
tiiuae/falcon-7baccounts/fireworks/models/falcon-7b
tiiuae/falcon-40baccounts/fireworks/models/falcon-40b

Advanced: define your own jinja template

We also allow for arbitrary formatting, like jinja template, for you to be able to handle advanced templating techniques. This is the full example fireworks itself can support, which handles

  • Uses vicuna template as the baseline
  • All inputs will be OpenAI messages and tools, and you can treat them as dictionaries
  • Remember to use the mode=generate if you want to share the same template across training and inference.
  • Example to raise exception, e.g. raise_exception('Expected non-empty messages')
  • function calling
  • trailing ASSISTANT: message, so that the model will generate the content
      {%- set _mode = mode | default('generate', true) -%}
      {%- set message_roles = ['SYSTEM', 'USER', 'ASSISTANT', 'TOOL'] -%}
      {%- set ns = namespace(seen_non_system=false, messages=messages, content='', system_suffix='') -%}
      {%- if _mode == 'generate' -%}
        {{ bos_token }}
        {%- set ns.system_suffix=' Today is ' + datetime.now().strftime('%Y-%m-%d %H:%M:%S') + '.' -%}
      {%- endif -%}
      {#- Basic consistency checks -#}
      {%- if not messages -%}
        {{ raise_exception('Expected non-empty messages') }}
      {%- endif -%}
      {%- if messages[0]['role'] | upper != 'SYSTEM' -%}
        {%- set ns.messages = [{'role': 'SYSTEM', 'content': 'You are a helpful assistant with access to functions. Use them if required.' + ns.system_suffix}] + messages -%}
      {%- endif -%}
      {%- for message in ns.messages -%}
        {%- set role = message['role'] | upper -%}
        {%- set ns.content = message['content'] if message.get('content') else '' -%}
        {%- if _mode == 'generate' -%}
          {#- Move tool calls inside the content -#}
          {%- if 'tool_calls' in message -%}
            {%- for call in message['tool_calls'] -%}
              {%- if not loop.first -%}
                {%- set ns.content = ns.content + ' ' -%}
              {%- endif -%}
              {%- set ns.content = ns.content + '<functioncall>{"name": "' + call['function']['name'] + '", "arguments": ' + call['function']['arguments'] + '}' -%}
            {%- endfor -%}
          {%- endif -%}
        {%- endif -%}
        {#- Validation -#}
        {%- if role not in message_roles -%}
          {{ raise_exception('Invalid role ' + message['role'] + '. Only ' + message_roles + ' are supported.') }}
        {%- endif -%}
        {%- if role == 'SYSTEM' and ns.seen_non_system -%}
          {{ raise_exception('SYSTEM messages have to be at the front') }}
        {%- endif -%}
        {#- First message is guaranteed to be a SYSTEM message per earlier checks -#}
        {%- if loop.first -%}
          SYSTEM: {{ ns.content }}
          {%- continue -%}
        {%- endif -%}
        {%- if role == ns.messages[loop.index0 - 1]['role'] | upper -%}
          {{ ns.content }}
          {%- if role == 'ASSISTANT' and (loop.last or ns.messages[loop.index0 + 1]['role'] | upper != 'ASSISTANT') -%}
            {{ eos_token }}
          {%- endif -%}
          {%- continue -%}
        {%- endif -%}
        {%- if role == 'ASSISTANT' and '<functioncall>' not in ns.content -%}
          {%- set ns.content = '<plain>' + ns.content -%}
        {%- endif -%}
        {#- First message after the SYSTEM section -#}
        {%- if not ns.seen_non_system and role != 'SYSTEM' -%}
          {%- set ns.seen_non_system = true -%}
          {{ '\n\n' }}FUNCTIONS: {{ functions }}{{ '\n\n' }}
          {#- Prompt masking separator -#}
          {%- if _mode == 'train' -%}
            {{ unk_token }}
          {%- endif -%}
          {{ role }}: {{ ns.content }}
          {%- if role == 'ASSISTANT' and (loop.last or ns.messages[loop.index0 + 1]['role'] | upper != 'ASSISTANT') -%}
            {{ eos_token }}
          {%- endif -%}
          {%- continue -%}
        {%- endif -%}
        {{ '\n\n' }}{{ role }}: {{ ns.content }}
        {%- if role == 'ASSISTANT' and (loop.last or ns.messages[loop.index0 + 1]['role'] | upper != 'ASSISTANT') -%}
          {{ eos_token }}
        {%- endif -%}
      {%- endfor -%}
      {%- if _mode == 'generate' -%}
        {{ '\n\n' }}ASSISTANT:{{ ' ' }}
      {%- endif -%}

And then you can put the template into fireworks.json like so (I recommend any yaml to json tool you like):

{
    "base_model": "accounts/fireworks/models/mixtral-8x7b-instruct-hf",
    "conversation_config": {
        "style": "jinja",
        "args": {
            "special_tokens_map": {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>"},
            "template": "{%- set _mode = mode | default('generate', true) -%}\n{%- set message_roles = ['SYSTEM', 'USER', 'ASSISTANT', 'TOOL'] -%}\n{%- set ns = namespace(seen_non_system=false, messages=messages, content='', system_suffix='') -%}\n{%- if _mode == 'generate' -%}\n  {{ bos_token }}\n  {%- set ns.system_suffix=' Today is ' + datetime.now().strftime('%Y-%m-%d %H:%M:%S') + '.' -%}\n{%- endif -%}\n{#- Basic consistency checks -#}\n{%- if not messages -%}\n  {{ raise_exception('Expected non-empty messages') }}\n{%- endif -%}\n{%- if messages[0]['role'] | upper != 'SYSTEM' -%}\n  {%- set ns.messages = [{'role': 'SYSTEM', 'content': 'You are a helpful assistant with access to functions. Use them if required.' + ns.system_suffix}] + messages -%}\n{%- endif -%}\n{%- for message in ns.messages -%}\n  {%- set role = message['role'] | upper -%}\n  {%- set ns.content = message['content'] if message.get('content') else '' -%}\n  {%- if _mode == 'generate' -%}\n    {#- Move tool calls inside the content -#}\n    {%- if 'tool_calls' in message -%}\n      {%- for call in message['tool_calls'] -%}\n        {%- if not loop.first -%}\n          {%- set ns.content = ns.content + ' ' -%}\n        {%- endif -%}\n        {%- set ns.content = ns.content + '<functioncall>{\"name\": \"' + call['function']['name'] + '\", \"arguments\": ' + call['function']['arguments'] + '}' -%}\n      {%- endfor -%}\n    {%- endif -%}\n  {%- endif -%}\n  {#- Validation -#}\n  {%- if role not in message_roles -%}\n    {{ raise_exception('Invalid role ' + message['role'] + '. Only ' + message_roles + ' are supported.') }}\n  {%- endif -%}\n  {%- if role == 'SYSTEM' and ns.seen_non_system -%}\n    {{ raise_exception('SYSTEM messages have to be at the front') }}\n  {%- endif -%}\n  {#- First message is guaranteed to be a SYSTEM message per earlier checks -#}\n  {%- if loop.first -%}\n    SYSTEM: {{ ns.content }}\n    {%- continue -%}\n  {%- endif -%}\n  {%- if role == ns.messages[loop.index0 - 1]['role'] | upper -%}\n    {{ ns.content }}\n    {%- if role == 'ASSISTANT' and (loop.last or ns.messages[loop.index0 + 1]['role'] | upper != 'ASSISTANT') -%}\n      {{ eos_token }}\n    {%- endif -%}\n    {%- continue -%}\n  {%- endif -%}\n  {%- if role == 'ASSISTANT' and '<functioncall>' not in ns.content -%}\n    {#- TODO: make it work for duplicate messages where function call is in a subsequent message -#}\n    {%- set ns.content = '<plain>' + ns.content -%}\n  {%- endif -%}\n  {#- First message after the SYSTEM section -#}\n  {%- if not ns.seen_non_system and role != 'SYSTEM' -%}\n    {%- set ns.seen_non_system = true -%}\n    {{ '\\n\\n' }}FUNCTIONS: {{ functions }}{{ '\\n\\n' }}\n    {#- Prompt masking separator -#}\n    {%- if _mode == 'train' -%}\n      {{ unk_token }}\n    {%- endif -%}\n    {{ role }}: {{ ns.content }}\n    {%- if role == 'ASSISTANT' and (loop.last or ns.messages[loop.index0 + 1]['role'] | upper != 'ASSISTANT') -%}\n      {{ eos_token }}\n    {%- endif -%}\n    {%- continue -%}\n  {%- endif -%}\n  {{ '\\n\\n' }}{{ role }}: {{ ns.content }}\n  {%- if role == 'ASSISTANT' and (loop.last or ns.messages[loop.index0 + 1]['role'] | upper != 'ASSISTANT') -%}\n    {{ eos_token }}\n  {%- endif -%}\n{%- endfor -%}\n{%- if _mode == 'generate' -%}\n  {{ '\\n\\n' }}ASSISTANT:{{ ' ' }}\n{%- endif -%}\n"
        }
    },
    "has_teft": true
}

We understand that it is not the easiest thing to test these advanced jinja template right now, we are working hard to open source these code and then provide more testing code around this feature.