Calico
Calico is the virtual networking infrastructure we use for Kubernetes. It provides IPAM for Kubernetes workloads (Pods) and manages iptables rules, routing tables, and BGP peering for the Kubernetes nodes.
IPAM
We configure IP pools per cluster (via the calico helm-chart) which Calico splits into blocks (CRD resource: ipamblocks.crd.projectcalico.org). One node can have zero or more IPAM blocks assigned (the first one is assigned as soon as the first Pod is scheduled on the node).
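The configured pools can be listed with calicoctl or via the corresponding CRD (a quick check, assuming a working calicoctl setup on the host):
calicoctl get ippool -o wide
kubectl get ippools.crd.projectcalico.org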
On the nodes, the networks of the assigned IPAM blocks are blackholed and specific (/32) routes are added for every Pod running on the node:
kubestage1003:~# ip route
default via 10.64.16.1 dev eno1 onlink
10.64.16.0/22 dev eno1 proto kernel scope link src 10.64.16.55
10.64.75.64 dev caliabad5f15937 scope link
blackhole 10.64.75.64/26 proto bird
10.64.75.65 dev cali13b43f910f6 scope link
10.64.75.66 dev cali8bc45095644 scope link
...
This way, each node is authoritative for, and announces, the networks of its assigned IPAM blocks to its BGP peers.
The IPAM blocks and affinities are stored in Kubernetes CRD objects and can be viewed and modified using the Kubernetes API, kubectl or calicoctl:
calicoctl ipam show --show-blocks
kubectl get ipamblocks.crd.projectcalico.org,blockaffinities.crd.projectcalico.org
Calico IPAM also supports borrowing IPs from the IP blocks of foreign nodes in case a node has used up all its attached IP blocks and can't get another one from the IP pool. We disable this feature by configuring Calico IPAM with StrictAffinity (see task T296303), as borrowing only works in a node-to-node mesh configuration.
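If you need to verify or change that setting, the IPAM configuration can be inspected and modified with calicoctl (the flags below exist in recent calicoctl versions; double check with --help on the installed version):
calicoctl ipam show --show-configuration
calicoctl ipam configure --strictaffinity=true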
Operations
Calico should be running via a Daemonset on every node of a Kubernetes cluster, establishing a BGP peering with the core routers (see IP and AS allocations#Private AS).
Unfortunately, Calico currently does not set the NetworkUnavailable condition to true on nodes where it is not running or is failing, even though that will ultimately render the node unusable. Therefore, a Prometheus alert fires if Prometheus fails to scrape Calico metrics from a node.
If you are reading this page because you've seen such an alert:
- Check the node's state with:
kubectl describe node <node fqdn>
- Take a look at the latest events in the cluster: https://logstash.wikimedia.org/app/dashboards#/view/d43f9bf0-17b5-11eb-b848-090a7444f26c
- Check the logs of calico components (use the component filter at the right): https://logstash.wikimedia.org/app/dashboards#/view/f6a5b090-0020-11ec-81e9-e1226573bad4
- On the node itself:
sudo calicoctl node status
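It can also be useful to confirm that the calico-node Pods themselves are up (the DaemonSet and label names below follow the usual upstream conventions and may differ slightly per chart version):
kubectl -n kube-system get ds calico-node
kubectl -n kube-system get po -l k8s-app=calico-node -o wide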
Typha
Calico Typha can be considered a "smart proxy" between calico-node (Felix) and the Kubernetes API. Its purpose is to maintain a single connection to the Kubernetes API while serving multiple calico-node instances with relevant data and (filtered) events. In large clusters this reduces the load on the Kubernetes API as well as on the calico-node instances (which only receive the events relevant to them, filtered by Typha). Unfortunately, this makes Typha a hard dependency: when it is not available, the whole cluster networking goes down.
There usually are 3 replicas per cluster (1 for small clusters), running in the kube-system namespace.
- Check the state with:
kubectl -n kube-system get po -l k8s-app=calico-typha
- Grafana dashboard: https://grafana-rw.wikimedia.org/d/p8RgaNXGk
- For logs, see the generic logstash link above
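If logstash is not reachable, the same logs can be pulled directly from the Pods:
kubectl -n kube-system logs -l k8s-app=calico-typha --tail=50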
Kube Controllers
The Calico Kubernetes Controllers are a set of control loops (all in one container/binary) that monitor objects in the Kubernetes API (network policies, endpoints, nodes, etc.) and perform the necessary actions.
There usually is one replica per cluster (a maximum of one can be active at any given time anyway), running in the kube-system namespace.
- Check the state with:
kubectl -n kube-system get po -l k8s-app=calico-kube-controllers
- Grafana dashboard: https://grafana-rw.wikimedia.org/d/-OQgQZOSk
- For logs, see the generic logstash link above
Resource Usage
We have had multiple incidents in the past that originated from calico components reaching their resource limits (and being OOM killed or throttled).
Calico resource usage does increase organically due to events like:
- Pods being added (new deployments with a high number of replicas, or the like)
- Nodes being added
- Network Policies being added
The above can be verified via the "etcd object" panels on the Kubernetes API Grafana dashboard.
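A quick way to check current usage and spot recent OOM kills (assuming metrics-server is available in the cluster) is:
kubectl -n kube-system top pods -l k8s-app=calico-typha
kubectl -n kube-system describe pod -l k8s-app=calico-typha | grep -A 3 'Last State'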
See also:
- mwscript-k8s creates too many resources
- 2024-04-03 calico/typha down
- Incidents/2021-09-29 eqiad-kubernetes
Packaging
We don't actually build calico but package its components from upstream binary releases.
Because of that, you will need to set HTTP proxy variables for internet access on the build host.
The general process to follow is:
- Check out operations/debs/calico on your workstation
- Decide if you want to package a new master (production) or future (potential next production) version
- Create a patch to bump the debian changelog
export NEW_VERSION=3.16.5 # Calico version you want to package
dch -v ${NEW_VERSION}-1 -D unstable "Update to v${NEW_VERSION}"
git commit debian/changelog
# Make sure to submit the patch to the correct version branch
git review vX.Y
- Merge
- Check out operations/debs/calico on the build host
- Build the packages:
# If you want to build a specific version
git checkout vX.Y
# Ensure you allow networking in pbuilder
# This option needs to be in the file, an environment variable will *not* work!
echo "USENETWORK=yes" >> ~/.pbuilderrc
# Build the package
https_proxy=http://webproxy.$(hostname -d):8080 DIST=<dist> pdebuild
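The resulting packages end up in pbuilder's result directory on the build host:
ls /var/cache/pbuilder/result/<dist>-amd64/calico*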
Updating helm charts
There are two helm charts that might need updating, depending on the changes in a newly packaged calico version.
Publishing
The Debian Packages
# On apt1001, copy the packages from the build host
rsync -vaz build2001.codfw.wmnet::pbuilder-result/<dist>-amd64/calico*<PACKAGE VERSION>* .
# If you want to import a new production version, import to component main
sudo -i reprepro -C main --ignore=wrongdistribution include <dist>-wikimedia /path/to/<PACKAGE>.changes
# If you want to import a test/pre-production version, import to the version-specific component (component/calicoXY)
sudo -i reprepro -C component/calicoXY --ignore=wrongdistribution include <dist>-wikimedia /path/to/<PACKAGE>.changes
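To verify which component/distribution a package ended up in, reprepro can list it (the package names below are examples):
sudo -i reprepro ls calicoctl
sudo -i reprepro ls calico-cni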
The Docker Images
Calico also includes a bunch of docker images which need to be published into our docker registry. To simplify the process, the packaging generates a debian package named "calico-images" that includes the images as well as a script to publish them:
# On the build host, extract the calico-images debian package
tmpd=$(mktemp -d)
dpkg -x /var/cache/pbuilder/result/<dist>-amd64/calico-images_<PACKAGE_VERSION>_amd64.deb $tmpd
# Load and push the images
sudo -i CALICO_IMAGE_DIR=${tmpd}/usr/share/calico ${tmpd}/usr/share/calico/push-calico-images.sh
rm -rf $tmpd
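If you want to double check what the script loaded and pushed (assuming it uses the local docker daemon), the loaded images can be listed with:
sudo docker images | grep calico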
Updating
- Update debian packages calicoctl and calico-cni on kubernetes nodes using Debdeploy
- Update the image.tag version in helmfile.d/admin_ng/values/<Cluster>/calico-values.yaml (see the snippet below)
- Deploy to the cluster(s) that you want updated
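As a sketch, the relevant part of the values file looks roughly like this (the exact structure depends on the chart version; the tag below is only an example):
grep -A 1 '^image:' helmfile.d/admin_ng/values/<Cluster>/calico-values.yaml
image:
  tag: "v3.16.5"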