# Simplifying Local Prefect 2 Testing

One thing I don't like about a lot of modern orchestration systems is that they do not have easy local testing setup instructions. [Prefect 2](https://www.prefect.io/) is no exception. One thing that Prefect does have going for it is a nice [Docker Hub image](https://hub.docker.com/r/prefecthq/prefect). Though this image is not very well documented, at least it gives us a starting point for a local Prefect Docker setup.

Recently, I was working on a [FastAPI](https://fastapi.tiangolo.com/) app that uses [prefect-client](https://pypi.org/project/prefect-client/) to talk to the Prefect Cloud API, and I needed a way to test locally that my app could kick off a Flow Run, wait for the Flow to finish, and then retrieve the results back from the Prefect API. The best way to test this would be to have a local Prefect 2 server running with a dummy flow deployed to it that I could use for testing.

Usually when there is a lack of documentation online for a solution, everyone comes up with their own bespoke way of doing it. That was true here as well. I have some colleagues who also needed a local Prefect setup, and they created a nice README on how to set up a local Prefect server in a Python venv. Their tutorial had me spinning up a local [Kubernetes](https://kubernetes.io/) cluster inside [Docker Desktop](https://www.docker.com/products/docker-desktop/), with Prefect running the flows on it just like it does in production. It was a really cool local setup, and I'm sure it is worth it when you need to test compute-heavy, complicated workflows.

That wasn't the case for me, though, and the Kubernetes approach was highly complicated. It also had the drawback of not being containerized: you had to run a bunch of different commands, and if something got messed up, you had to run more commands to try to reset it. This didn't jibe with how I like to work. When I write an app, I want my fellow developers to be able to just pull my repo, `docker compose up`, and boom, working local app. So I set out to make my own bespoke way of running Prefect 2 locally, with the intention of making it as simple as possible.

The approach I describe below is available on my GitLab page [here](https://gitlab.com/fizzizist/toy-local-prefect).

## Docker Compose

I love docker compose. It's the ultimate way to set it and forget it. So, the first thing I did when looking to fix this problem was to Google whether there were any existing Prefect 2 docker compose setups online. Indeed there are, and I found [this one](https://github.com/rpeden/prefect-docker-compose) particularly helpful. You'll find that the components I use in my `docker-compose.yml` are almost identical to that example, but there are some notable differences that I should explain.

### The FastAPI App

```yaml
web:
  build:
    context: .
    target: web
  command: uvicorn main:app --host 0.0.0.0 --port 8000 --reload
  ports:
    - 8001:8000
  env_file:
    - .env
  volumes:
    - ./:/app
    - ./.prefect:/root/.prefect
```

This looks pretty much like your typical FastAPI docker compose service. We use the `Dockerfile` to install our Python dependencies, and [uvicorn](https://www.uvicorn.org/) to host the server. The only thing to note here is that we are mounting the `.prefect` folder into this container; you will see that we mount this same folder into all of the Prefect containers as well. Since we want to keep this setup as simple as possible, I opted to just use the local disk for all storage. If you have bucket storage readily available, that would work too, but this tutorial assumes that you are working with nothing other than your local system. The other option would be to use an S3-compatible container like [MinIO](https://min.io/), as you'll see in the [rpeden example](https://github.com/rpeden/prefect-docker-compose), but again, that adds complexity to our setup that we don't really need.
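
To make it concrete why the app needs that shared folder, here is a minimal sketch of the kind of endpoint I'm talking about. This is an illustration rather than the actual code in the repo: the route, the response shape, and the deployment name (which assumes the dummy flow deployed later in this post) are all my own.

```python
# Illustrative sketch, not the repo's actual code.
from fastapi import FastAPI
from prefect.deployments import run_deployment

app = FastAPI()


@app.post("/run-test-flow")
async def run_test_flow():
    # Trigger the dummy deployment and block until the flow run reaches a
    # terminal state. The name format is "<flow name>/<deployment name>";
    # adjust it if your flow or deployment is named differently.
    flow_run = await run_deployment(name="return-a-df/return a df")
    # The persisted result lives in .prefect/storage, which is why this
    # container mounts the same .prefect folder as the worker.
    df = await flow_run.state.result(fetch=True)
    return {"rows": len(df)}
```
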

### The Prefect 2 Server

```yaml
prefect-server:
  image: prefecthq/prefect:2-python3.11
  ports:
    - 4200:4200
  command: prefect server start
  volumes:
    - ./.prefect:/root/.prefect
  environment:
    - PREFECT_UI_URL=http://127.0.0.1:4200/api
    - PREFECT_API_URL=http://127.0.0.1:4200/api
    - PREFECT_SERVER_API_HOST=0.0.0.0
```

The [prefecthq Docker Hub](https://hub.docker.com/u/prefecthq) has a bunch of images, but most of them have gone years without updates. There is now only one image that can be used as a server, a worker, and even a client, and that's `prefecthq/prefect`. Be sure to choose the tag wisely though, as `latest` may not point to what you're looking for. Here we are using `2-python3.11`, which will pick up any `2.*` Prefect version but pins the Python version, since everything else in our project is on Python 3.11.

Port 4200 lets us access the Prefect 2 dashboard while the app is running. Again, we are mounting the same `.prefect` volume, and we have extra environment variables for the server. This configuration is pretty much a copy of the [rpeden example](https://github.com/rpeden/prefect-docker-compose).

### The Prefect 2 Worker

```yaml
prefect-worker:
  build:
    context: .
    target: worker
  command: prefect worker start --pool test-process-pool
  depends_on:
    - prefect-server
  environment:
    - PREFECT_API_URL=http://prefect-server:4200/api
  volumes:
    - ./flows:/root/flows
    - ./.prefect:/root/.prefect
  restart: always
```

Same volume mount here, but we also mount the flow code, because we are using the local disk as storage for the flow code as well as the output. The flow code lives in the `flows` directory, and the flow output gets dumped into `.prefect/storage`. That's why `.prefect` needs to be available to the FastAPI app, so that the output can be read by `prefect-client`. The reason we have a build step here is just to install any Python dependencies that are needed inside the flow.
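
For reference, the dummy flow itself can be tiny. Here is a minimal sketch of what `flows/test_flows.py` could look like; the real file in the repo may differ, but the key detail is `persist_result=True`, which makes Prefect write the return value to local result storage (`.prefect/storage` by default) so the FastAPI container can read it back.

```python
# flows/test_flows.py (illustrative sketch, not necessarily the repo's code)
import pandas as pd
from prefect import flow


@flow(persist_result=True)
def return_a_df() -> pd.DataFrame:
    # A trivial DataFrame is enough to prove the end-to-end round trip:
    # deploy, trigger via the API, persist the result, read it back.
    return pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
```
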

## Two Commands and That's It

I wanted to be able to just `docker compose up`, and maybe that is still possible, but for now two commands will suffice. Much like provisioning a database, you don't really want to redeploy your flow every time you `docker compose up`, so I wrote a short `bash` script that spins up the worker pool and deploys the flow.

```bash
echo creating worker pool
prefect work-pool create -t process test-process-pool

echo deploying test flow
prefect --no-prompt deploy --name "return a df" --pool test-process-pool /root/flows/test_flows.py:return_a_df
```

The `prefect deploy` command is pretty nice. It will infer most stuff, so you don't have to explicitly define too much, especially for a basic setup like this.

So then, all one needs to do to run this whole thing is:

```bash
docker compose run prefect-worker bash /root/flows/deploy.sh
docker compose up
```

That's just how I like it. Nice and simple. No Kubernetes, no giant YAML files, and no "it works on my machine" issues.

## Conclusion

I'm not saying that this is the right setup for everybody. If you are testing a large system that spins up a bunch of nested flows and tasks on k8s, then you probably want to replicate that locally. But if you are testing something that just talks to Prefect 2, then I would argue that this is all you really need. Just add those two Prefect containers, make sure they and your app share that `.prefect` directory, and you're pretty much good to go.

Feel free to reach out if you found this helpful, have questions, or want to share an even better setup :)