Ollama

Ollama is an easy-to-use tool for downloading and running popular large language models. It is available from https://ollama.com.

Ollama Server

There are multiple ways to run the Ollama server (the ollama serve command), including through Docker, as a systemd service, or by other means. Please refer to the Ollama documentation to download, install, and configure Ollama.

Once Ollama is running as a server, it typically listens on port 11434 of its host. If Ollama runs on the same device as Musebot, you can most likely use http://localhost:11434/ as an entry in the ollama.hosts array in your config.jsonc file. If that's not the case, specify the correct host and port instead.

{
  "bots": [
    {
      // ...
      "ollama": {
        "hosts": ["http://localhost:11434/"]
      }
    }
  ]
}

If Ollama runs on a different machine, substitute that machine's hostname or IP address for localhost, and make sure it is reachable from your Musebot instance(s).
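Before pointing Musebot at a host, it can help to confirm that the Ollama server answers from the machine Musebot runs on. The following is a minimal sketch rather than part of Musebot: the function names ollama_tags_url and check_ollama are hypothetical helpers, and the check relies on Ollama's /api/tags endpoint, which lists the models available locally on the server.

```shell
# Hypothetical helper: build the model-listing URL for a host,
# tolerating a single trailing slash in the host value.
ollama_tags_url() {
  printf '%s/api/tags\n' "${1%/}"
}

# Hypothetical helper: query the server and print its JSON reply.
# Returns a non-zero exit status if the host is unreachable, the
# request takes longer than 5 seconds, or the server reports an error.
check_ollama() {
  curl --fail --silent --show-error --max-time 5 "$(ollama_tags_url "$1")"
}
```

For example, running check_ollama http://localhost:11434/ should print a JSON document listing the models pulled on that server. Pass the same value you intend to put in ollama.hosts.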

Models

You can browse models for Ollama at https://ollama.com/search. If you're not certain which model is best for your use case, we recommend trying out mistral-nemo. It should perform well on most hardware and is flexible enough to answer most questions and adopt most personas. If mistral-nemo is too slow on your hardware, consider using the regular mistral model instead. You can, of course, use any large language model that you prefer.

Ollama provides a CLI for downloading these models:

ollama pull mistral-nemo

Once Ollama downloads your preferred model, you can specify it in your config.jsonc file by adding it to the ollama.models array.
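As a rough illustration, the fragment below shows where such an entry might go. It is a sketch under two assumptions that this page does not confirm: that ollama.models sits beside ollama.hosts in the same object, and that its entries are plain model names as passed to ollama pull. Check the Musebot configuration reference for the exact shape.

```jsonc
{
  "bots": [
    {
      // ...
      "ollama": {
        "hosts": ["http://localhost:11434/"],
        // Assumption: entries are plain model names, as passed to `ollama pull`.
        "models": ["mistral-nemo"]
      }
    }
  ]
}
```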

Image Attachment Support

If you also integrate Musebot with a ComfyUI instance whose mode is set to "chat", Musebot uses the large language model's response as the prompt for an image and attaches the generated image to its reply asynchronously, giving the response a visual.