User:Kosta Harlan/Add Link Deployment Charts Notes
Notes on testing out a patch for the link recommendation service with operations/deployment-charts. (https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/660394)
Local charts
I used local-charts (https://gerrit.wikimedia.org/g/releng/local-charts). You could maybe avoid using this tool and use helm + kubectl directly, but using local-charts probably simplifies setting up the link recommendation service (python app + mariadb).
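Note that you need to install Helm version 2, not version 3: the commands used below (`helm init`, `helm install --name`, `helm del --purge`) are Helm 2 syntax and were removed in Helm 3. (Until recently the script for installing prerequisites installed v3.)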
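A quick way to guard against the wrong Helm version is to parse the version string before doing anything else. This is a minimal sketch: `helm_major_version` and `is_helm2` are hypothetical helper names, and the sample version strings in the comments are illustrative of what `helm version --client --short` typically prints, not copied from my setup.

```shell
# Extract the major version number from a Helm version string.
# Handles strings such as "Client: v2.16.12+g47f0b88" (Helm 2 style)
# and "v3.5.2+g167aac7" (Helm 3 style); both samples are illustrative.
helm_major_version() {
  printf '%s\n' "$1" | sed -n 's/^[^v]*v\([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Succeeds only when the given version string reports major version 2.
is_helm2() {
  [ "$(helm_major_version "$1")" = "2" ]
}

# Usage (requires helm on PATH):
#   is_helm2 "$(helm version --client --short)" || echo "Helm 2 is required" >&2
```

The parsing is kept separate from the call to `helm` so it can be checked without Helm installed.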
I ran into problems with Minikube so I used the Kubernetes support in the Docker for Mac application. YMMV. The instructions below assume Docker for Mac with its Kubernetes support switched on.
Build your image
The patch I wanted to test adds a cron job to load datasets. The job calls the `load-datasets.py` script in the research/mwaddlink repo. Because the patch that introduces load-datasets.py is not yet merged, there is no published Docker image I can pull, so I need to build the image on my laptop to make it available to my laptop's Kubernetes instance.
cd ~/src/research/mwaddlink
git review -d https://gerrit.wikimedia.org/r/c/research/mwaddlink/+/660334
# Download blubber from https://wikitech.wikimedia.org/wiki/Blubber/Download
blubber .pipeline/blubber.yaml production > Dockerfile
docker build -t mw/mwaddlink:stable .
Set up local-charts
- Clone the local-charts repo, if you haven't already.
- Clone the operations/deployment-charts repo.
Take note of the relative paths of these two repositories, as local-charts is going to need to access files in the operations/deployment-charts repo. On my laptop, I have them cloned to these two paths:
- ~/src/operations/deployment-charts
- ~/src/releng/local-charts
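Since local-charts needs to reach files in the deployment-charts clone, it can help to confirm both checkouts are where you expect before going further. The following is a sketch only: `check_chart_repos` is a hypothetical helper, and it assumes both repos sit under one base directory (`~/src` in my case).

```shell
# Check that both clones exist under a common base directory.
# The layout (<base>/operations/deployment-charts and
# <base>/releng/local-charts) mirrors the paths listed above;
# pass a different base directory if your clones live elsewhere.
check_chart_repos() {
  base=${1:-"$HOME/src"}
  status=0
  for repo in operations/deployment-charts releng/local-charts; do
    if [ -d "$base/$repo" ]; then
      echo "found: $base/$repo"
    else
      echo "missing: $base/$repo" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage:
#   check_chart_repos           # checks under ~/src
#   check_chart_repos /opt/src  # checks under another base directory
```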
Now, in the local-charts repo, check out the link recommendation service example patch (git review -d https://gerrit.wikimedia.org/r/c/releng/local-charts/+/639347).
Then, in the operations/deployment-charts repo, check out the cron job patch (git review -d https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/660394).
Deploy chart
Now, from the local-charts repo, run:
- helm init
- helm dependency update ./helm
- helm install --name "default" -f "values.example.yaml" ./helm
That's it. The cronjob should now be deployed.
Note that I changed the schedule in cronjob.yaml from `@hourly` to every minute, so I could get faster feedback:
spec:
  # schedule: @hourly
  schedule: "*/1 * * * *"
Note that with an every-minute schedule, the first few container creation attempts will fail because the mariadb service is not yet initialized. The third or fourth attempt should succeed.
If you want to change the cronjob spec, edit the cronjob.yaml file in the operations/deployment-charts repo, then go back to the local-charts repo and run:
- helm del --purge default
- helm dependency update ./helm
- helm install --name "default" -f "values.example.yaml" ./helm
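Because this tear-down-and-reinstall cycle gets repeated a lot while iterating on the chart, the three commands above can be wrapped in a small function. This is a sketch, not part of local-charts: `run`, `redeploy`, and the `DRY_RUN` variable are hypothetical names I am introducing here, and the helm invocations are the Helm 2 commands listed above.

```shell
# Run a command, or just print it when DRY_RUN is set (hypothetical helper).
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Tear down and reinstall a release. The defaults match the commands above:
# release "default" and values file "values.example.yaml".
# Stops at the first failing step (for example if the release does not exist).
redeploy() {
  release=${1:-default}
  values=${2:-values.example.yaml}
  run helm del --purge "$release" || return 1
  run helm dependency update ./helm || return 1
  run helm install --name "$release" -f "$values" ./helm
}

# Usage, from the local-charts repo:
#   DRY_RUN=1 redeploy    # print the commands without running them
#   redeploy              # actually redeploy the "default" release
```

The dry-run mode is there so you can check what would be executed before purging a release.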
Debugging
If you want to see what is going on, you can run `kubectl get pods`. Helm also prints a summary of the resources it created when you run `helm install` (and again with `helm status default`):
==> v1/Service
NAME                        TYPE       CLUSTER-IP      EXTERNAL-IP  PORT(S)         AGE
default-mariadb             ClusterIP  10.106.200.98   <none>       3306/TCP        1s
linkrecommendation-default  NodePort   10.110.104.205  <none>       8000:31074/TCP  1s

==> v1/StatefulSet
NAME             READY  AGE
default-mariadb  0/1    1s

==> v1beta1/CronJob
NAME                                      SCHEDULE  SUSPEND  ACTIVE  LAST SCHEDULE  AGE
linkrecommendation-default-load-datasets  @hourly   False    0       <none>         1s
You can also look at just the cronjob:
❯ kubectl get cronjob --watch
NAME                                      SCHEDULE  SUSPEND  ACTIVE  LAST SCHEDULE  AGE
linkrecommendation-default-load-datasets  @hourly   False    0       <none>         102s
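Once the cronjob has spawned a job, you can take the job's ID (the numeric part of the pod name, e.g. 1613384640) and inspect it with `kubectl describe job/linkrecommendation-default-load-datasets-{id}`. To inspect the cronjob itself, use `kubectl describe cronjob/linkrecommendation-default-load-datasets`.
You can also view the logs of the job's pod to figure out why it's not working: `kubectl logs -f linkrecommendation-default-load-datasets-{id}-{suffix}`, using the full pod name shown by `kubectl get pods`.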
❯ kubectl get pods
NAME                                                       READY  STATUS   RESTARTS  AGE
default-mariadb-0                                          1/1    Running  0         59s
linkrecommendation-default-76d574bcf4-9wcq9                1/1    Running  0         59s
linkrecommendation-default-load-datasets-1613384640-lz9c8  1/1    Running  0         19s
❯ kubectl logs -f linkrecommendation-default-load-datasets-1613384640-lz9c8
Ensuring checksum table exists...[OK]
Ensuring model table exists...[OK]
Beginning process to load datasets for cswiki, simplewiki
Initializing
Ensuring anchors table exists...[OK]
No checksum found for anchors in local database, will attempt to download
Downloading dataset https://analytics.wikimedia.org/published/datasets/one-off/research-mwaddlink/cswiki/lr_cswiki_anchors.sql.gz...[OK]
[...]
Attempting to download datasets (anchors, redirects, pageids, w2vfiltered, model) for cswiki
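With a job firing every minute, the pod names change constantly, so copying the ID by hand gets tedious. The sketch below picks the newest load-datasets pod from a list of pod names. `latest_load_datasets_pod` is a hypothetical helper; it assumes pods are named `<cronjob>-<timestamp>-<suffix>` as in the listing above, and relies on the timestamps all having the same number of digits so that a plain sort orders them by time.

```shell
# Read pod names on stdin (one per line) and print the newest pod created
# by the load-datasets cronjob. A plain sort works because the timestamp
# segment has a fixed number of digits. If one job has several pods
# (retries), the pick among those is by suffix, i.e. arbitrary.
latest_load_datasets_pod() {
  prefix=${1:-linkrecommendation-default-load-datasets}
  grep "^$prefix-[0-9]" | sort | tail -n 1
}

# Usage (requires kubectl and a running cluster):
#   pod=$(kubectl get pods --no-headers -o custom-columns=NAME:.metadata.name \
#         | latest_load_datasets_pod)
#   kubectl logs -f "$pod"
```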
Finally, you can view the application by using kubectl port-forward.
❯ kubectl get pods
NAME                                                       READY  STATUS     RESTARTS  AGE
default-mariadb-0                                          1/1    Running    0         22m
linkrecommendation-default-76d574bcf4-9wcq9                1/1    Running    0         22m
linkrecommendation-default-load-datasets-1613384640-8j4tg  0/1    Completed  0         4m44s
linkrecommendation-default-load-datasets-1613384640-lz9c8  0/1    Error      0         22m
linkrecommendation-default-load-datasets-1613385960-ngphk  0/1    Completed  0         12s
❯ kubectl port-forward linkrecommendation-default-76d574bcf4-9wcq9 8000:8000
Forwarding from 127.0.0.1:8000 -> 8000
Forwarding from [::1]:8000 -> 8000
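The application pod's name changes on every redeploy, so it may be more convenient to port-forward to the Deployment instead of a specific pod. As a sketch: `deployment_from_pod` is a hypothetical helper that assumes the usual `<deployment>-<hash>-<suffix>` pod naming, which matches the pod name above but is not something I verified against the chart.

```shell
# Derive the owning Deployment's name from a pod name by dropping the
# last two dash-separated segments (ReplicaSet hash and pod suffix).
# Assumes the usual <deployment>-<hash>-<suffix> pod naming.
deployment_from_pod() {
  printf '%s\n' "${1%-*-*}"
}

# Usage (requires kubectl and a running cluster):
#   name=$(deployment_from_pod linkrecommendation-default-76d574bcf4-9wcq9)
#   kubectl port-forward "deployment/$name" 8000:8000
```

With the forward running, the service should be reachable on port 8000 of localhost, as the "Forwarding from" lines indicate.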