Docker

Learn how to deploy your Haystack pipelines through Docker, from a basic Docker container to a complex application using Hayhooks.

Running Haystack in Docker

The most basic form of Haystack deployment happens through Docker containers. Becoming familiar with running and customizing Haystack Docker images is useful as they form the basis for more advanced deployment.

Haystack releases are officially distributed through the deepset/haystack Docker image. Haystack images come in different flavors depending on the specific components they ship and the Haystack version.

Note: At the moment, the only flavor available for Haystack is base, which ships exactly what you would get by installing Haystack locally with pip install haystack-ai.

You can pull a specific Haystack flavor using Docker tags: for example, to pull the image containing Haystack 2.12.1, you can run the command:

docker pull deepset/haystack:base-v2.12.1
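The tag in this example appears to combine the flavor and the version as <flavor>-v<version>. As a minimal sketch under that assumption, a hypothetical helper (haystack_image is an illustrative name, not part of Haystack) could build the image reference for a script that automates pulls:

```python
def haystack_image(version: str, flavor: str = "base") -> str:
    """Build an image reference such as deepset/haystack:base-v2.12.1.

    Assumes the <flavor>-v<version> tag scheme inferred from the example above.
    """
    if not version or version.startswith("v"):
        raise ValueError("pass the bare version number, e.g. '2.12.1'")
    return f"deepset/haystack:{flavor}-v{version}"


print(haystack_image("2.12.1"))  # deepset/haystack:base-v2.12.1
```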

Although the base flavor is meant to be customized, it can also be used to quickly run Haystack scripts locally without the need to set up a Python environment and its dependencies. For example, this is how you would print Haystack’s version running a Docker container:

docker run -it --rm deepset/haystack:base-v2.12.1 python -c "from haystack.version import __version__; print(__version__)"

Customizing the Haystack Docker Image

Chances are your application will be more complex than a simple script, and you’ll need to install additional dependencies inside the Docker image along with Haystack.

For example, you might want to run a simple indexing pipeline in a Docker container, with Chroma as your Document Store. The base image contains only a basic install of Haystack, so you also need to install the Chroma integration package (chroma-haystack). The best approach is to create a custom Docker image that ships the extra dependency.

Assuming you have a main.py script in your current folder, the Dockerfile would look like this:

FROM deepset/haystack:base-v2.12.1

RUN pip install chroma-haystack

COPY ./main.py /usr/src/myapp/main.py

ENTRYPOINT ["python", "/usr/src/myapp/main.py"]

Then you can create your custom Haystack image with:

docker build . -t my-haystack-image
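For reference, the main.py script that the Dockerfile copies could look something like the sketch below. It is only an illustration, not the script from the original example: the sample texts are made up, and it assumes the import path exposed by the chroma-haystack package and a Chroma store created with default settings, which is expected to keep data only for the lifetime of the process. On first use, Chroma may also need to download its default embedding model.

```python
"""main.py: index a few documents into Chroma (illustrative sketch)."""

# Hypothetical sample data; a real script would load its own documents.
SAMPLE_TEXTS = [
    "Haystack pipelines can run inside Docker containers.",
    "Chroma can be used as a Document Store.",
]


def main() -> bool:
    """Run the indexing pipeline; return False if a dependency is missing."""
    try:
        from haystack import Document, Pipeline
        from haystack.components.writers import DocumentWriter
        # Provided by the chroma-haystack package installed in the Dockerfile.
        from haystack_integrations.document_stores.chroma import ChromaDocumentStore
    except ImportError as exc:
        print(f"Missing dependency ({exc}); is chroma-haystack installed?")
        return False

    store = ChromaDocumentStore()  # default settings, assumed non-persistent
    pipeline = Pipeline()
    pipeline.add_component("writer", DocumentWriter(document_store=store))

    documents = [Document(content=text) for text in SAMPLE_TEXTS]
    result = pipeline.run({"writer": {"documents": documents}})
    print("Documents written:", result["writer"]["documents_written"])
    return True


if __name__ == "__main__":
    main()
```

Once the image is built, docker run --rm my-haystack-image starts a container that executes this script through the ENTRYPOINT. Running the same script on the plain base image would print the missing-dependency hint, which is the reason for building the custom image.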

Complex Application with Docker Compose

A Haystack application running in Docker can go pretty far: with an internet connection, the container can reach external services providing vector databases, inference endpoints, and observability features.

Still, you might want to orchestrate additional services for your Haystack container locally, for example, to reduce costs or increase performance. When your application runtime depends on more than one Docker container, Docker Compose is a great tool to keep everything together.

As an example, let’s say your application wraps two pipelines: one to index documents into a Qdrant instance and the other to query those documents at a later time. This setup would require two Docker containers: one to run the pipelines as REST APIs using Hayhooks and a second to run a Qdrant instance.

To build the Hayhooks image, we can customize the base image of a recent Hayhooks version, adding the dependencies required by QdrantDocumentStore. The Dockerfile would look like this:

FROM deepset/hayhooks:v0.6.0

RUN pip install qdrant-haystack sentence-transformers

CMD ["hayhooks", "run", "--host", "0.0.0.0"]

Qdrant needs no customization, so its official Docker image works as is. The docker-compose.yml file would then look like this:

services:
  qdrant:
    image: qdrant/qdrant:latest
    restart: always
    container_name: qdrant
    ports:
      - 6333:6333
      - 6334:6334
    expose:
      - 6333
      - 6334
      - 6335
    configs:
      - source: qdrant_config
        target: /qdrant/config/production.yaml
    volumes:
      - ./qdrant_data:/qdrant/storage # Qdrant's default storage directory

  hayhooks:
    build: . # Build from local Dockerfile
    container_name: hayhooks
    ports:
      - "1416:1416"
    volumes:
      - ./pipelines:/pipelines
    environment:
      - HAYHOOKS_PIPELINES_DIR=/pipelines
      - LOG=DEBUG
    depends_on:
      - qdrant

configs:
  qdrant_config:
    content: |
      log_level: INFO

For a functional example of a Docker Compose deployment, check out the “Qdrant Indexing” demo from GitHub.
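Note that the short form of depends_on only controls start order: Compose starts the qdrant container before hayhooks, but does not wait for Qdrant to accept connections. As a rough sketch, the following standard-library helpers (port_is_open and wait_for_port are hypothetical names, not part of Haystack, Hayhooks, or Qdrant) check from the host whether the ports published in the docker-compose.yml above are reachable:

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


def wait_for_port(host: str, port: int, attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll host:port until it accepts connections or the attempts run out."""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False
```

After docker compose up, calling wait_for_port("localhost", 6333) and wait_for_port("localhost", 1416) from the host checks Qdrant and Hayhooks respectively. An open port only shows that the service is listening, not that the pipelines are deployed. Inside the Compose network, containers reach each other by service name, so pipeline code running in the hayhooks container would address Qdrant as qdrant:6333 rather than localhost.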