Kubernetes/Add a new service
All steps below assume you want to deploy a new service named service-foo to the clusters of the main (wikikube) group.
The ml-serve and dse-k8s groups are modelled closely on the main clusters, so many of the steps below are also applicable to these groups. Where there are specific differences they may be highlighted in the steps below.
Accessibility of service-foo from outside of Kubernetes can be achieved via Kubernetes Ingress or LVS. The method of choice will have some impact on the following steps.
Prepare the clusters for the new service
Tell the deployment server how to set up the kubeconfig files.
This is done by modifying the profile::kubernetes::deployment_server::services hiera key (hieradata/common/profile/kubernetes/deployment_server.yaml) as in the example below:
profile::kubernetes::deployment_server::services:
  main:
    mathoid:
      usernames:
        - name: mathoid
        - name: mathoid-deploy
    ...
+    service-foo:
+      usernames:
+        - name: service-foo
+        - name: service-foo-deploy
Please note that the file permissions of your kubeconfig file (/etc/kubernetes/service-foo-<cluster_name>.config) are inherited from the defaults at profile::kubernetes::deployment_server::user_defaults. Typically you won't need to override them. If you do need to, you can specify the keys owner, group and mode for each element in the usernames array.
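To check which owner, group and mode a kubeconfig file actually ended up with, a small helper along these lines may be handy. This is only a sketch: it assumes GNU stat is available, and the KUBECONFIG_DIR override and the cluster name used in the usage comment are placeholders introduced here for illustration. Only the /etc/kubernetes/service-foo-<cluster_name>.config pattern comes from the text above.

```shell
#!/usr/bin/env bash
# Sketch only. KUBECONFIG_DIR is a hypothetical override for testing;
# by default the path pattern described above is used.

# Print the kubeconfig path for a service ($1) and cluster name ($2).
kubeconfig_path() {
  printf '%s/%s-%s.config\n' "${KUBECONFIG_DIR:-/etc/kubernetes}" "$1" "$2"
}

# Print "owner group octal-mode path" for that kubeconfig (GNU stat).
show_kubeconfig_perms() {
  local file
  file="$(kubeconfig_path "$1" "$2")"
  stat -c '%U %G %a %n' "$file"
}

# Usage (the cluster name is an assumption, substitute your own):
#   show_kubeconfig_perms service-foo eqiad
```

If the output does not match what you expect, compare it with the defaults and with any owner, group or mode keys set on the usernames entries.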
Make sure the change above is merged and applied (puppet-merge and then the appropriate run-puppet-agent, as detailed below) before running helm/helmfile. Otherwise things might look okay at first sight but end up in a broken state.
Add a Kubernetes namespace
Namespaces are used to isolate Kubernetes services from each other.
In order to create a new namespace, prepare a change to the relevant values file in the deployment-charts repo. For the wikikube clusters this is helmfile.d/admin_ng/values/main.yaml, but namespaces for the ml-serve, dse-k8s, and aux-k8s cluster groups are managed in their own files. Here is an example commit for adding a namespace to the wikikube clusters.
At this point, you can safely merge the changes (after somebody from SRE/Service_Operations validates).
After merging, it is important to deploy your changes to avoid impacting other people rolling out changes later on.
Deploy changes to helmfile.d/admin_ng
The following example shows how to deploy these changes to the wikikube clusters. If you are working with a different cluster group, substitute the relevant environment names.
ssh to deploy1002 and then run the following:
sudo run-puppet-agent
sudo -i
cd /srv/deployment-charts/helmfile.d/admin_ng/
helmfile -e staging-codfw -i apply
# if that went fine
helmfile -e staging-eqiad -i apply
helmfile -e codfw -i apply
helmfile -e eqiad -i apply
The commands above should show you a diff in namespaces, quotas, etc. related to your new service. If you don't see a diff, ping somebody from the Service Ops team!
Check that everything is ok:
sudo -i
kube_env admin staging-codfw
kubectl describe ns service-foo
You should be able to see info about your namespace.
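The rollout order used above (staging first, then production, moving on only "if that went fine") can be sketched as a small shell function. This is an illustration rather than an official tool: apply_in_order and the HELMFILE_BIN override are names made up here, and the function simply wraps the same helmfile -e <environment> -i apply command shown above.

```shell
#!/usr/bin/env bash
# Sketch only: apply each environment in the given order and stop at the
# first failure. HELMFILE_BIN is a hypothetical override (e.g. for dry runs).
apply_in_order() {
  local helmfile_bin="${HELMFILE_BIN:-helmfile}"
  local env
  for env in "$@"; do
    echo "==> helmfile -e ${env} -i apply"
    if ! "$helmfile_bin" -e "$env" -i apply; then
      echo "apply failed for ${env}; not continuing" >&2
      return 1
    fi
  done
}

# Usage, from /srv/deployment-charts/helmfile.d/admin_ng/ :
#   apply_in_order staging-codfw staging-eqiad codfw eqiad
```

Because the -i flag is kept, each environment still shows its diff and asks for confirmation; the function only enforces the ordering and stops when a step fails.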
Leaving changes undeployed will impede further operations by other people.
If a change is left undeployed for too long, SRE will receive an alert.
Create certificates (for the services proxy)
Manual creation/management of certificates is no longer required as of task T300033. Automatic cert management is enabled by default.
Add private data/secrets (optional)
Ask Service Ops to add the private data for your service.
This is done by adding an entry for service-foo under profile::kubernetes::deployment_server_secrets::services in the private repository (hieradata/role/common/deployment_server/kubernetes.yaml). Secrets will most likely be needed for all clusters, including staging.
Setting up Ingress
This is only needed if service-foo should be accessed via Ingress.
Follow Ingress#Add a new service under Ingress to create Ingress related config, DNS records etc.
Set resource requests and limits for your service/containers
Your chart probably comes with some default settings regarding resource usage. Please benchmark your application and change the defaults accordingly. See Resource requests and limits for some background.
Deploy the service
At this point you should have a Chart for your service (Creating a Helm Chart), and will need to set up a helmfile.d/services directory in the operations/deployment-charts repository for the deployment. You can copy the structure (helmfile.yaml, values.yaml, values-staging.yaml, etc.) from helmfile.d/services/_example_ and customize as needed.
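Copying the example structure can be sketched as follows. The function name scaffold_service and the checkout path in the usage comment are assumptions made for illustration; the only paths taken from the text above are helmfile.d/services/_example_ and the per-service directory under helmfile.d/services.

```shell
#!/usr/bin/env bash
# Sketch only: copy helmfile.d/services/_example_ to a new service directory
# inside a local checkout of operations/deployment-charts.
# $1 = path to the checkout, $2 = service name.
scaffold_service() {
  local repo="$1" service="$2"
  local src="$repo/helmfile.d/services/_example_"
  local dst="$repo/helmfile.d/services/$service"
  if [ ! -d "$src" ]; then
    echo "example directory not found: $src" >&2
    return 1
  fi
  if [ -e "$dst" ]; then
    echo "refusing to overwrite existing $dst" >&2
    return 1
  fi
  cp -r "$src" "$dst"
  echo "$dst"
}

# Usage (the checkout location is hypothetical):
#   scaffold_service ~/src/deployment-charts service-foo
```

The copied files still need to be edited for your service before they are sent for review.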
If this service will be accessed directly via LVS (no Ingress), ensure the service has its ports registered at Service ports ($SERVICE-FOO-PORT).
You can proceed to deploy the new service to staging for real.
On deploy1002:
cd /srv/deployment-charts/helmfile.d/services/service-foo
helmfile -e staging -i apply
The command above will show a diff related to the new service. Make sure that everything looks fine and then hit Yes to proceed.
Testing a service
- Now we can test the service in staging. Use the very handy endpoint https://staging.svc.eqiad.wmnet:$SERVICE-FOO-PORT to quickly test if everything works as expected.
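A quick probe of the staging endpoint could look like the sketch below. The port 4242 and the /healthz path in the usage comment are placeholders, not values from this page, and CURL_BIN is a hypothetical override. It is also assumed that curl on the host you run this from already trusts the certificate presented by the endpoint.

```shell
#!/usr/bin/env bash
# Sketch only: build the staging URL for a port and print the HTTP status.

# $1 = service port, $2 = optional request path (defaults to /).
staging_url() {
  local port="$1" path="${2:-/}"
  printf 'https://staging.svc.eqiad.wmnet:%s%s\n' "$port" "$path"
}

# Print only the HTTP status code returned by the staging endpoint.
probe_staging() {
  local url
  url="$(staging_url "$@")"
  "${CURL_BIN:-curl}" -sS -o /dev/null -w '%{http_code}\n' "$url"
}

# Usage (port and path are placeholders):
#   probe_staging 4242 /healthz
```

A status code alone only shows that the service answers; checking the response body of a known endpoint is still worthwhile.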
Deploy a service to production
- Ensure you have enabled TLS support via tls.enabled in your values.yaml.
- Then the final step, namely deploying the new service. On deploy1002:
cd /srv/deployment-charts/helmfile.d/services/service-foo
helmfile -e codfw -i apply
# if that went fine
helmfile -e eqiad -i apply
The service can now be accessed via the registered port on any of the Kubernetes nodes (for manual testing).
Monitor the Service
Copy this template to a new one named after your new service, and edit accordingly. Please do not edit the original template!
Setting up LVS
This is only needed if service-foo should be accessed via LVS.
Follow LVS#Add_a_new_load_balanced_service to create a new LVS service on $SERVICE-FOO-PORT.
Add in Service Mesh
If other services will be reaching your new service via the service mesh (aka via envoy), then this service will need an entry in the services proxy listeners list in https://gerrit.wikimedia.org/g/operations/puppet:
hieradata/common/profile/services_proxy/envoy.yaml:
profile::services_proxy::envoy::listeners:
  # First, the discovery enabled services
  - name: parsoid-php
    port: 6002
    timeout: "30s"
    service: parsoid-php
    keepalive: "4s"
    retry:
      retry_on: "5xx"
      num_retries: 1
  # A service behind ingress
  - name: image-suggestion
    port: 6030
    service: image-suggestion
    timeout: "10s"
    keepalive: "4s"
    sets_sni: true
<snip>
# default listeners list used by the MW installations
profile::services_proxy::envoy::enabled_listeners:
  - parsoid-php
  - image-suggestion