Nova Resource:Metricsinfra/SAL
2025-04-23
- 09:43 taavi: updating security group rules to include IPv6 terms
2025-01-31
- 11:38 dhinus: rebooting VM metricsinfra-prometheus-3 T385262
- 11:30 dhinus: systemctl restart prometheus@cloud on metricsinfra-prometheus-2 T385262
- 11:16 dhinus: systemctl restart prometheus@cloud on metricsinfra-prometheus-3 T385262
2025-01-20
- 13:01 dcaro: stopping and starting metricsinfra-alertmanager-3 to try to get the right network
2024-06-24
- 20:09 andrew@cloudcumin1001: END (FAIL) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=1)
- 19:56 andrew@cloudcumin1001: START - Cookbook wmcs.openstack.migrate_project_to_ovs
2024-03-13
- 12:14 taavi: MariaDB [prometheusconfig]> delete from alerts where name = 'GridQueueProblem'; # T314664
2023-11-30
- 18:53 taavi: no longer send quarry alerts to cloud services team
2023-11-18
- 14:09 taavi: reboot metricsinfra-alertmanager-1 to see if it stops flapping a puppet alert
2023-09-29
- 08:24 wm-bot2: dcaro@urcuchillay END (PASS) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=0)
- 08:17 wm-bot2: dcaro@urcuchillay START - Cookbook wmcs.openstack.cloudvirt.vm_console
- 08:17 wm-bot2: dcaro@urcuchillay END (PASS) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=0)
- 08:16 wm-bot2: dcaro@urcuchillay START - Cookbook wmcs.openstack.cloudvirt.vm_console
2023-05-10
- 17:17 wm-bot2: Increased quotas by 8 cores, 16384 ram (T336423) - cookbook ran by taavi@runko
2023-05-04
- 15:11 dcaro: rebooting metricsinfra-prometheus-2 as it was unresponsive
2023-04-24
- 14:16 dcaro: rebooting metricsinfra-prometheus-2, it's in a non-responsive state (no ssh, console hangs)
2023-04-21
- 21:58 andrewbogott: added raymond-ndibe as project member
2023-03-07
- 16:31 wm-bot2: removed instance metricsinfra-controller-1 - cookbook ran by dcaro@vulcanus
2023-02-13
- 23:37 bd808: metricsinfra-db-1.trove.eqiad1.wikimedia.cloud restarted via Horizon
- 23:35 bd808: metricsinfra-db-1.trove.eqiad1.wikimedia.cloud not responsive to ssh
- 23:32 bd808: grafana.wmcloud.org offline with db connection error. Investigating.
2022-12-20
- 15:59 dcaro: rebooting prometheus-2 due to being non-responsive
2022-06-16
- 14:18 taavi: add 'gitlab-runners' project to list of scraped projects
2022-03-01
- 11:38 dcaro: Reloading alertmanager to refresh new config (T302702)
- 11:37 dcaro: Adding runbook url annotation to GridQueueProblem alert on DB at metricsinfra-controller-1 (T302702)
2022-01-22
- 11:32 taavi: added project-proxy VMs to prometheus targets
2021-12-14
- 09:27 majavah: drop "analytics" project from current beta coverage, the setup is currently not compatible with pontoon
2021-09-11
- 08:41 majavah: silence deployment-prep alerts yet again
2021-07-12
- 15:45 bstorm: silenced deployment prep alerts for another 60 days
2021-06-15
- 16:12 balloons: add 8 CPU/16G RAM to quota T284973
2021-06-14
- 18:40 balloons: Add majavah as projectadmin T284938
2021-03-11
- 18:33 bstorm: silenced alerts from deployment-prep for another 60 days
2021-01-04
- 15:50 bstorm: silencing all alerts from deployment-prep for 60 more days
2020-09-29
- 16:53 bstorm: silence all the deployment-prep alerts for another 30 days