We’ve spent a lot of effort making Skald easy to self-host, but we’re still very early. New features land multiple times a day and our documentation can’t always keep up. We also don’t have a self-hosted release schedule yet; all changes simply become available automatically. If you’re interested in self-hosting, we highly recommend you talk to us on Slack so we can help you out and work together to get you a solid Skald deployment. We’re happy to help as much as possible.
Deployment types
The only supported way to deploy Skald today is with Docker Compose, but the instructions differ depending on the type of deploy you want to do. Refer to the link that matches your use case:
- Local testing: use the Quickstart setup below
- Production self-hosted deploy
- Deployment with no third-party services (experimental)
Quickstart
The best way to try out the Skald self-hosted version is the Quickstart Docker Compose setup. Once it’s up, you can reach the web app at http://localhost:3000 and the API at http://localhost:8000.
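As a rough sketch of what the Quickstart looks like in practice (the .env.example file and the OPENAI_API_KEY variable name are assumptions based on the description below, not the exact commands, so check the repository for the real steps):

```bash
# From the root of your Skald checkout (hypothetical file and variable names)
cp .env.example .env                    # assumed example env file
echo "OPENAI_API_KEY=sk-..." >> .env    # the single external API key this setup needs
docker compose up -d                    # spins up Postgres, RabbitMQ and the Skald services
# Web app: http://localhost:3000
# API:     http://localhost:8000
```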
This setup will get you started quickly while only requiring one API key for an external service. We’ll spin up and configure all other services for you, including RabbitMQ and Postgres.
The one caveat of this deploy is that it relies entirely on OpenAI for the whole stack, which makes for a slightly slower API. The reason for this is that OpenAI doesn’t provide a reranking API, so we use a slower mechanism that uses LLM calls to rerank chunks. If you don’t understand what this means, that’s ok — you don’t have to. But just know that your API will be slower if you use OpenAI exclusively.
In our Cloud version, we use Voyage AI for both embeddings and reranking, and that’s what we recommend you do as well for the best performance (reranking is faster and better and the embedding models are arguably better too). That means also setting VOYAGE_API_KEY=<your_key> and EMBEDDING_PROVIDER=voyage (this will also apply to re-ranking).
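For instance, a minimal sketch of that recommended configuration (VOYAGE_API_KEY, EMBEDDING_PROVIDER, and LLM_PROVIDER come from this guide; the OPENAI_API_KEY name and the openai provider value are assumptions):

```bash
export LLM_PROVIDER=openai           # provider value is an assumption; OpenAI still handles chat
export OPENAI_API_KEY=sk-...         # key name is an assumption
export VOYAGE_API_KEY=pa-...         # enables Voyage embeddings and its fast re-ranker
export EMBEDDING_PROVIDER=voyage     # also applies to re-ranking, as noted above
docker compose up -d
```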
Configuration
LLM
You can configure Skald to use multiple LLM providers, but you still need to set an LLM_PROVIDER environment variable. This is likely to change in the future, or transform into DEFAULT_LLM_PROVIDER, but as of today the provider defined by this env var will be used for:
- Chat responses, as the default provider if not overridden in the rag_config
- Extracting the summary and tags for new memos
- LLM-as-a-Judge feature in Experiments
If you set API keys for providers other than the one in LLM_PROVIDER, those will be available for use in chat.
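As an illustrative sketch of a multi-provider setup (only LLM_PROVIDER is confirmed above; the key names and provider values are assumptions):

```bash
# Default provider: used for memo summaries/tags, LLM-as-a-Judge, and chat unless overridden
export LLM_PROVIDER=openai             # provider value is an assumption
export OPENAI_API_KEY=sk-...           # key name is an assumption
# Additional provider keys you set become selectable in chat
export ANTHROPIC_API_KEY=sk-ant-...    # hypothetical second provider
```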
Embeddings
Configuration for embeddings works similarly to the LLM config, driven by the EMBEDDING_PROVIDER env var plus the corresponding provider API key. While the default is openai, we actually recommend using voyage. The reason this is not the default is simply that most people already have an OpenAI account these days, while Voyage AI is not as widely used. However, we use Voyage on our Cloud deployment and strongly recommend it — the embedding models are great.
The additional benefit of using Voyage embeddings is that you also get the Voyage re-ranker, which is both really good and really fast. We currently don’t support configuring a re-ranker separately from the embedding provider, but may do so in the future.
Document extraction
If you want to use document extraction features, you need to set the appropriate environment variables for connecting to S3 or an S3-compatible object storage service; this is where documents will be stored. For the extraction itself, you have two options:
- You can set DATALAB_API_KEY (from https://datalab.to)
- You can set DOCUMENT_EXTRACTION_PROVIDER=docling and run the stack with the local profile. This will spin up a local Docling server, which works very well for document extraction, though not as well as Datalab. Docling is MIT-licensed, however, and runs on your own infrastructure.
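A rough sketch of the Docling option (DOCUMENT_EXTRACTION_PROVIDER and the local profile come from this guide; the S3 variable names and credentials are assumptions):

```bash
# S3-compatible storage for uploaded documents (hypothetical variable names)
export S3_BUCKET=skald-documents
export S3_ACCESS_KEY_ID=...
export S3_SECRET_ACCESS_KEY=...
# Use the locally hosted Docling extractor instead of Datalab
export DOCUMENT_EXTRACTION_PROVIDER=docling
docker compose --profile local up -d   # the local profile also starts the Docling server
```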
Postgres
By default we will spin up a Postgres instance as part of the Docker Compose stack for you, and we will install pgvector on it. If you’re running a production deploy, you should ideally host and manage Postgres yourself. If you do so, you just need to set the DATABASE_URL env var to point to your instance and run the stack without starting the Postgres service.
If you do host Postgres elsewhere, the one thing you need to remember is to install the pgvector extension on the instance.
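For example (DATABASE_URL comes from this guide and CREATE EXTENSION vector is the standard pgvector install command; the connection string below is hypothetical):

```bash
# Point Skald at your managed Postgres instance (hypothetical host and credentials)
export DATABASE_URL=postgresql://skald:password@db.example.com:5432/skald
# Install the pgvector extension on that instance
psql "$DATABASE_URL" -c "CREATE EXTENSION IF NOT EXISTS vector;"
# Then bring up the stack without the bundled Postgres service
# (e.g. remove or disable it in your compose file before running):
docker compose up -d
```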