User:Razzi/alertname: Icinga/Check correctness of the icinga configuration
Saw alert on alerts.wikimedia.org
alertname: Icinga/Check correctness of the icinga configuration
summary: Icinga configuration contains errors
Runbook at https://wikitech.wikimedia.org/wiki/Icinga
Has section https://wikitech.wikimedia.org/wiki/Icinga#Check_validity_of_the_Icinga's_config
so I ran
sudo /usr/sbin/icinga -v /etc/icinga/icinga.cfg
It returned
Total Warnings: 0 Total Errors: 2
Looking at the errors:
Error: Service check command 'check_https_on_port:8090' specified in service 'puppetdb-api codfw port 8090/tcp - Puppetdb api microservice IPv4' for host 'puppetdb1002' (file '/etc/nagios/nagios_service.cfg', line 24535) not defined anywhere!
Error: Service check command 'check_https_on_port:8090' specified in service 'puppetdb-api eqiad port 8090/tcp - Puppetdb api microservice IPv4' for host 'puppetdb2002' (file '/etc/nagios/nagios_service.cfg', line 24557) not defined anywhere!
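To pull the host, file and line number out of errors like these without reading them by eye, a small parser can help. This is only a sketch: the regular expression is written against the two error messages quoted above, and the names `ERROR_RE` and `parse_undefined_commands` are made up for illustration. Other Icinga error formats would not match.

```python
import re

# Pattern written against the "not defined anywhere" errors quoted above.
# Assumption: other error types printed by `icinga -v` use different wording
# and are simply ignored here.
ERROR_RE = re.compile(
    r"Error: Service check command '(?P<command>[^']+)' "
    r"specified in service '(?P<service>[^']+)' "
    r"for host '(?P<host>[^']+)' "
    r"\(file '(?P<file>[^']+)', line (?P<line>\d+)\) "
    r"not defined anywhere!"
)


def parse_undefined_commands(output):
    """Return one dict per 'not defined anywhere' error found in output."""
    results = []
    for match in ERROR_RE.finditer(output):
        entry = match.groupdict()
        entry["line"] = int(entry["line"])  # line number as an integer
        results.append(entry)
    return results


# The two errors from the verification run, copied from above.
SAMPLE = (
    "Error: Service check command 'check_https_on_port:8090' specified in "
    "service 'puppetdb-api codfw port 8090/tcp - Puppetdb api microservice IPv4' "
    "for host 'puppetdb1002' (file '/etc/nagios/nagios_service.cfg', line 24535) "
    "not defined anywhere! "
    "Error: Service check command 'check_https_on_port:8090' specified in "
    "service 'puppetdb-api eqiad port 8090/tcp - Puppetdb api microservice IPv4' "
    "for host 'puppetdb2002' (file '/etc/nagios/nagios_service.cfg', line 24557) "
    "not defined anywhere!"
)

for err in parse_undefined_commands(SAMPLE):
    print(f"{err['file']}:{err['line']} {err['host']} -> {err['command']}")
```

The `file:line` pairs it prints are the places to look next.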
Maybe I can find these lines in the file
Yes, they are on icinga.wikimedia.org
Here are the lines
define service {
        ## --PUPPET_NAME-- (called '_naginator_name' in the manifest) alert1001 puppetdb1002_puppetdb-api
        active_checks_enabled          1
        check_command                  check_https_on_port:8090
        check_freshness                0
        check_interval                 1
        check_period                   24x7
        contact_groups                 admins
        host_name                      puppetdb1002
        is_volatile                    0
        max_check_attempts             3
        notes_url                      https://wikitech.wikimedia.org/wiki/Puppet#Micro_Service
        notification_interval          0
        notification_options           c,r,f
        notification_period            24x7
        notifications_enabled          1
        passive_checks_enabled         1
        retry_interval                 1
        service_description            puppetdb-api codfw port 8090/tcp - Puppetdb api microservice IPv4
        servicegroups                  lvs
}

define service {
        ## --PUPPET_NAME-- (called '_naginator_name' in the manifest) alert1001 puppetdb2002_puppetdb-api
        active_checks_enabled          1
        check_command                  check_https_on_port:8090
        check_freshness                0
        check_interval                 1
        check_period                   24x7
        contact_groups                 admins
        host_name                      puppetdb2002
        is_volatile                    0
        max_check_attempts             3
        notes_url                      https://wikitech.wikimedia.org/wiki/Puppet#Micro_Service
        notification_interval          0
        notification_options           c,r,f
        notification_period            24x7
        notifications_enabled          1
        passive_checks_enabled         1
        retry_interval                 1
        service_description            puppetdb-api eqiad port 8090/tcp - Puppetdb api microservice IPv4
        servicegroups                  lvs
}
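Finding the definition around a reported line number by hand is tedious in a file with tens of thousands of lines. Here is a rough sketch of how it could be scripted. It assumes the file has one directive per line, and the helper names `service_block_at` and `parse_directives` are hypothetical. The embedded `CONFIG` is a shortened excerpt of the definitions above, not the real file.

```python
def service_block_at(lines, lineno):
    """Return the 'define ... { ... }' block containing 1-based line lineno."""
    start = lineno - 1
    # Walk backwards to the 'define' line that opens the block.
    while start > 0 and not lines[start].lstrip().startswith("define "):
        start -= 1
    end = start
    # Walk forwards to the closing brace.
    while end < len(lines) - 1 and lines[end].strip() != "}":
        end += 1
    return lines[start:end + 1]


def parse_directives(block):
    """Turn a definition block into a {directive: value} dict."""
    directives = {}
    for raw in block:
        line = raw.strip()
        # Skip blanks, comments, and the opening/closing lines of the block.
        if not line or line.startswith("#") or line.startswith("define ") or line == "}":
            continue
        parts = line.split(None, 1)
        directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return directives


# Shortened excerpt of the two definitions above (most directives omitted).
CONFIG = """\
define service {
        ## --PUPPET_NAME-- (called '_naginator_name' in the manifest) alert1001 puppetdb1002_puppetdb-api
        check_command                  check_https_on_port:8090
        host_name                      puppetdb1002
        service_description            puppetdb-api codfw port 8090/tcp - Puppetdb api microservice IPv4
}
define service {
        check_command                  check_https_on_port:8090
        host_name                      puppetdb2002
}
""".splitlines()

service = parse_directives(service_block_at(CONFIG, 4))
print(service["host_name"], service["check_command"])
```

Against the real file one would read `/etc/nagios/nagios_service.cfg` into a list of lines and pass the line number from the error message.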
This was patched: https://gerrit.wikimedia.org/r/c/operations/puppet/+/693496
I didn't find the original string in my local checkout because I hadn't pulled the latest puppet code, and the error only existed in a later revision of the tree!