User:Effie Mouzeli (WMF)/Howtos/New LVS Kubernetes Service
Before adding a new LVS service, the service should be:
- running and responding properly to healthcheck
- listening on TLS (tls.enabled)
Puppet Private - Certs
Certificates
Follow the process in Kubernetes/Enabling_TLS#Create_and_place_certificates
DNS and Netbox Patch #1
Make an allocation [DNS/Netbox#How_to_manually_allocate_a_special_purpose_IP_address_in_Netbox]
Note: change the netmask to /32.
Add svc records in operations/dns
templates/10.in-addr.arpa
templates/wmnet
- Review and merge
- Log in to ns0.wikimedia.org and run
sudo authdns-update
This will pull from operations/dns and generate zonefiles and gdnsd configs on each nameserver.
- Verify:
for i in 0 1 2 ; do dig @ns${i}.wikimedia.org -t any my-changed-record.wikimedia.org ; done
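The svc records come in pairs: an A record in templates/wmnet and the matching PTR record in templates/10.in-addr.arpa. The sketch below derives the PTR name from the VIP and queries both records on each nameserver. The function names ptr_name and check_svc_records are made up for illustration, and the example IP is the one used in the service.yaml snippet on this page; substitute your own values.

```shell
#!/bin/sh
# Build the reverse-lookup (PTR) name for an IPv4 address by reversing its octets.
ptr_name() {
    echo "$1" | awk -F. '{ print $4 "." $3 "." $2 "." $1 ".in-addr.arpa" }'
}

# Ask each authoritative nameserver for the forward (A) and reverse (PTR) record.
# Requires dig and network access to the nameservers; not called automatically.
check_svc_records() {
    name=$1
    ip=$2
    for i in 0 1 2; do
        ns="ns${i}.wikimedia.org"
        echo "== ${ns}"
        dig +short "@${ns}" -t A "${name}"
        dig +short "@${ns}" -t PTR "$(ptr_name "${ip}")"
    done
}

# Pure helper, safe to run anywhere:
ptr_name 10.2.2.49
```

To query the nameservers, call for example check_svc_records echostore.svc.eqiad.wmnet 10.2.2.49 and compare the two answers on every nameserver.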
Finish up DNS
cumin1001:~$ sudo cookbook sre.dns.netbox "Add VIPs for X services"
Puppet Patch #1 - LVS prep
hieradata/common/service.yaml:
service::catalog:
  echostore:
    description: Echo store, echostore.svc.%{::site}.wmnet # <-- change
    encryption: true
    ip:
      codfw:
        default: 10.2.1.49 # <-- change
      eqiad:
        default: 10.2.2.49 # <-- change
    lvs: # Properties that are related to LVS setup.
      class: low-traffic
      conftool:
        cluster: kubernetes
        service: kubesvc
      depool_threshold: '.5'
      enabled: true
      monitors:
        IdleConnection:
          max-delay: 300
          timeout-clean-reconnect: 3
      scheduler: wrr
      protocol: tcp
    monitoring:
      check_command: check_https_port_status!8082!200!/healthz # <-- change PORT; command for the check in icinga
      critical: false # True in Prod
      sites:
        codfw:
          hostname: echostore.svc.codfw.wmnet # <-- change
        eqiad:
          hostname: echostore.svc.eqiad.wmnet # <-- change
    port: 8082 # <-- change, see https://wikitech.wikimedia.org/wiki/Service_ports
    sites:
    - eqiad
    - codfw
    state: service_setup
    discovery:
    - dnsdisc: echostore # <-- change
      active_active: true
hieradata/role/common/kubernetes/worker.yaml:
profile::lvs::realserver::pools:
  echostore: {}
conftool-data/discovery/services.yaml:
echostore: [eqiad, codfw] # <-- change
sudo cumin 'O:lvs::balancer' 'run-puppet-agent'
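The port appears twice in the catalog entry, once in port and once inside check_command, and the two are easy to leave out of sync when copying an existing service. A minimal sketch of a pre-review check, assuming the port is the first "!"-separated argument of check_https_port_status as in the example above; the helper names are hypothetical.

```shell
#!/bin/sh
# Extract the port from an Icinga check_command string such as
# "check_https_port_status!8082!200!/healthz" (arguments are separated by "!").
check_command_port() {
    echo "$1" | cut -d'!' -f2
}

# Succeed only if the check_command probes the same port the service listens on.
ports_match() {
    [ "$(check_command_port "$1")" = "$2" ]
}

if ports_match 'check_https_port_status!8082!200!/healthz' 8082; then
    echo "ports match"
else
    echo "port mismatch" >&2
fi
```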
Puppet Patch #2 - LVS setup
hieradata/common/service.yaml:
    [...]
    sites:
    - eqiad
    - codfw
    state: lvs_setup # <-- change
    discovery:
    - dnsdisc: echostore
      active_active: true
- Run puppet
sudo cumin 'O:lvs::balancer' 'run-puppet-agent'
- Ack PyBal diff checks on Icinga
- Find primary and secondary low traffic LVSs in
modules/lvs/manifests/configuration.pp
- Log on SAL and restart pybal (secondaries first)
sudo systemctl restart pybal
- Checks:
sudo ipvsadm -L -n
and (example from another service; substitute your own hostname, port and path)
curl -v -k http://eventgate-analytics.svc.eqiad.wmnet:31192/_info
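Scanning the full ipvsadm table by eye is error-prone on a busy load balancer. The sketch below reads the output of ipvsadm -L -n on stdin and succeeds only if a TCP virtual service for the given VIP and port is listed. The function name has_virtual_service is hypothetical, and the VIP and port in the usage comment are the example values from the service.yaml snippet on this page.

```shell
#!/bin/sh
# Read `ipvsadm -L -n` output on stdin and succeed if a TCP virtual service
# for the given VIP ($1) and port ($2) is listed.
has_virtual_service() {
    awk -v target="$1:$2" '
        $1 == "TCP" && $2 == target { found = 1 }
        END { exit !found }
    '
}

# On an LVS host (needs root):
#   sudo ipvsadm -L -n | has_virtual_service 10.2.2.49 8082 && echo "VIP present"
```

This only confirms that the virtual service exists; the real servers listed under it still need to be checked separately.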
Puppet Patch #3 - LVS production
hieradata/common/service.yaml:
    [...]
    sites:
    - eqiad
    - codfw
    state: production # <-- change
    discovery:
    - dnsdisc: echostore
      active_active: true
sudo cumin 'A:icinga or A:dns-auth' run-puppet-agent
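The three Puppet patches move the state field through a fixed order, one step per patch. As a compact reminder, the sketch below encodes only the states used in this howto; the helper next_state is made up for illustration and is not part of any tooling.

```shell
#!/bin/sh
# Print the state that follows the given one in this howto's rollout order:
# service_setup -> lvs_setup -> production.
next_state() {
    case "$1" in
        service_setup) echo lvs_setup ;;
        lvs_setup)     echo production ;;
        production)    echo "production is the last state in this howto" >&2; return 1 ;;
        *)             echo "unknown state: $1" >&2; return 1 ;;
    esac
}

next_state service_setup
next_state lvs_setup
```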
DNS Patch #2
Add discovery (DYNA) records in operations/dns
templates/wmnet
utils/mock_etc/discovery-geo-resources
- Review and merge
- Log in to ns0.wikimedia.org and run
sudo authdns-update
This will pull from operations/dns and generate zonefiles and gdnsd configs on each nameserver.
Pool!
$ confctl --object-type discovery select 'dnsdisc=echostore' set/pooled=true
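The command above pools the discovery record in every site at once. If you would rather pool one datacenter at a time, a dry-run helper like the following prints the command for review before you run it. Narrowing the selector with name=<site> is an assumption about how the discovery objects are tagged, and pool_cmd is a hypothetical name; check the selector with a read-only confctl query first.

```shell
#!/bin/sh
# Print (do not run) the confctl command that pools a discovery record.
# $1 is the dnsdisc name; an optional $2 narrows the selector to one site.
pool_cmd() {
    selector="dnsdisc=$1"
    if [ -n "${2:-}" ]; then
        selector="${selector},name=$2"
    fi
    echo "confctl --object-type discovery select '${selector}' set/pooled=true"
}

pool_cmd echostore
pool_cmd echostore codfw
```

The helper only echoes the command, so nothing is pooled until you copy the line and run it yourself.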