User:Razzi/2021-04-20

From Wikitech

upgrade flerovium and furud to buster: https://phabricator.wikimedia.org/T278421

made a PR for consolidated sqoop CSVs: https://phabricator.wikimedia.org/T280549

finished adding victorops notifications for critical alerts: https://phabricator.wikimedia.org/T273064

note: nrpe::monitor_service has a critical parameter that defaults to false:

   define nrpe::monitor_service
       Boolean $critical = false,

Example check command, and the range syntax from its help output:

   /usr/lib/nagios/plugins/check_procs -c 1:1
   /usr/lib/nagios/plugins/check_procs --help

   RANGEs are specified 'min:max' or 'min:' or ':max' (or 'max'). If
   specified 'max:min', a warning status will be generated if the
   count is inside the specified range
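
A minimal sketch of how the range rules quoted above can be read, assuming the alert fires when the process count falls outside a normal 'min:max' range and inside an inverted 'max:min' one. The function name `range_alerts` is hypothetical and this is not the plugin's actual code; treating the bounds of an inverted range as exclusive is also an assumption.

```python
from typing import Optional


def range_alerts(count: int, spec: str) -> bool:
    """Return True if `count` should raise the alert for range `spec`.

    Hypothetical helper: follows the help text quoted above, not the
    plugin source.
    """
    low: Optional[int]
    high: Optional[int]
    if ":" in spec:
        low_text, high_text = spec.split(":", 1)
        low = int(low_text) if low_text else None    # ':max' has no lower bound
        high = int(high_text) if high_text else None  # 'min:' has no upper bound
    else:
        # A bare number is read as 'max'.
        low, high = None, int(spec)

    if low is not None and high is not None and high < low:
        # Inverted 'max:min' form: alert when the count is inside the range.
        # Exclusive bounds are an assumption.
        return high < count < low

    if low is not None and count < low:
        return True
    if high is not None and count > high:
        return True
    return False
```

Read this way, `-c 1:1` means exactly one matching process is fine: `range_alerts(1, "1:1")` is False, while a count of 0 or 2 gives True.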

Looks like camus doesn't have critical alerts?

Now I can see why all systemd timers have a generic notes url:

   if $monitoring_enabled {
       systemd::monitor{$title:
           ensure        => $ensure,
           notes_url     => 'https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers',
           contact_group => $monitoring_contact_groups,
       }
   }

Do we conditionally contact victorops? https://phabricator.wikimedia.org/T273064#7022728