Server Admin Log/Archive 80

From Wikitech

2024-05-31

  • 23:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:23 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:23 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:22 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:21 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:21 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:19 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:17 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:17 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:13 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:13 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:30 logmsgbot: nshahquinn-wmf@deploy1002 Finished deploy [airflow-dags/analytics_product@f0284c6]: (no justification provided) (duration: 00m 03s)
  • 22:30 logmsgbot: nshahquinn-wmf@deploy1002 Started deploy [airflow-dags/analytics_product@f0284c6]: (no justification provided)
  • 22:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:27 logmsgbot: nshahquinn-wmf@deploy1002 Finished deploy [airflow-dags/analytics_product@f0284c6]: (no justification provided) (duration: 00m 07s)
  • 22:27 logmsgbot: nshahquinn-wmf@deploy1002 Started deploy [airflow-dags/analytics_product@f0284c6]: (no justification provided)
  • 22:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:18 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance
  • 22:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance
  • 22:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T364069)', diff saved to https://phabricator.wikimedia.org/P63803 and previous config saved to /var/cache/conftool/dbconfig/20240531-220920-marostegui.json
  • 22:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:05 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:05 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:03 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:03 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:03 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:03 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:57 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:55 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P63802 and previous config saved to /var/cache/conftool/dbconfig/20240531-215412-marostegui.json
  • 21:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:52 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:52 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P63801 and previous config saved to /var/cache/conftool/dbconfig/20240531-213904-marostegui.json
  • 21:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T364069)', diff saved to https://phabricator.wikimedia.org/P63800 and previous config saved to /var/cache/conftool/dbconfig/20240531-212356-marostegui.json
  • 21:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1228 (T364299)', diff saved to https://phabricator.wikimedia.org/P63799 and previous config saved to /var/cache/conftool/dbconfig/20240531-212101-marostegui.json
  • 21:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 21:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 21:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T364299)', diff saved to https://phabricator.wikimedia.org/P63798 and previous config saved to /var/cache/conftool/dbconfig/20240531-212038-marostegui.json
  • 21:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P63797 and previous config saved to /var/cache/conftool/dbconfig/20240531-210530-marostegui.json
  • 21:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:05 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:05 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:03 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:58 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P63796 and previous config saved to /var/cache/conftool/dbconfig/20240531-205022-marostegui.json
  • 20:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:46 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:46 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T364299)', diff saved to https://phabricator.wikimedia.org/P63795 and previous config saved to /var/cache/conftool/dbconfig/20240531-203514-marostegui.json
  • 20:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:24 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:01 rzl: sudo -i reprepro -C main include bullseye-wikimedia /home/rzl/httpbb/buster/httpbb_0.0.5-1+deb11u1_amd64.changes
  • 20:00 rzl: sudo -i reprepro -C main include buster-wikimedia /home/rzl/httpbb/buster/httpbb_0.0.5-1_amd64.changes
  • 19:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:53 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:53 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:51 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P63794 and previous config saved to /var/cache/conftool/dbconfig/20240531-194131-ladsgroup.json
  • 19:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2204 (T352010)', diff saved to https://phabricator.wikimedia.org/P63793 and previous config saved to /var/cache/conftool/dbconfig/20240531-194037-ladsgroup.json
  • 19:40 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2204.codfw.wmnet with reason: Maintenance
  • 19:40 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2204.codfw.wmnet with reason: Maintenance
  • 19:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P63792 and previous config saved to /var/cache/conftool/dbconfig/20240531-192625-ladsgroup.json
  • 19:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:12 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:12 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P63791 and previous config saved to /var/cache/conftool/dbconfig/20240531-191119-ladsgroup.json
  • 19:10 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:10 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:03 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:03 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P63790 and previous config saved to /var/cache/conftool/dbconfig/20240531-190138-root.json
  • 19:00 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:00 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:58 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:58 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:57 mutante: Phabricator - added 'JoelyRooke-WMDE (Jo)' to group WMF-NDA (https://phabricator.wikimedia.org/project/profile/61/) (T366145)
  • 18:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2205 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P63789 and previous config saved to /var/cache/conftool/dbconfig/20240531-185613-ladsgroup.json
  • 18:55 mutante: LDAP - added uid joelyrookewmde to groups wmde and nda (T366145)
  • 18:55 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:55 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:53 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:53 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:50 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P63788 and previous config saved to /var/cache/conftool/dbconfig/20240531-184632-root.json
  • 18:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P63787 and previous config saved to /var/cache/conftool/dbconfig/20240531-183125-root.json
  • 18:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:22 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:18 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P63785 and previous config saved to /var/cache/conftool/dbconfig/20240531-181619-root.json
  • 18:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:03 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:03 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P63782 and previous config saved to /var/cache/conftool/dbconfig/20240531-180113-root.json
  • 17:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P63781 and previous config saved to /var/cache/conftool/dbconfig/20240531-174607-root.json
  • 17:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P63780 and previous config saved to /var/cache/conftool/dbconfig/20240531-173101-root.json
  • 17:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:19 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:51 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:32 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:32 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T364299)', diff saved to https://phabricator.wikimedia.org/P63778 and previous config saved to /var/cache/conftool/dbconfig/20240531-161807-marostegui.json
  • 16:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 16:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 16:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T364299)', diff saved to https://phabricator.wikimedia.org/P63777 and previous config saved to /var/cache/conftool/dbconfig/20240531-161744-marostegui.json
  • 16:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P63775 and previous config saved to /var/cache/conftool/dbconfig/20240531-160236-marostegui.json
  • 16:00 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:00 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P63774 and previous config saved to /var/cache/conftool/dbconfig/20240531-154728-marostegui.json
  • 15:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:44 cgoubert@cumin1002: conftool action : set/pooled=yes; selector: name=parse1002.eqiad.wmnet,cluster=kubernetes,service=kubesvc
  • 15:43 claime: pooling and uncordoning parse1002 - T363086
  • 15:39 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 15:39 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 15:39 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 15:39 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 15:36 claime: homer 'cr*eqiad*' commit 'T363086'
  • 15:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T364299)', diff saved to https://phabricator.wikimedia.org/P63773 and previous config saved to /var/cache/conftool/dbconfig/20240531-153220-marostegui.json
  • 15:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:07 cgoubert@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:05 cgoubert@cumin1002: START - Cookbook sre.dns.netbox
  • 14:56 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:56 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:47 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:40 vriley@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:38 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:37 klausman@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main'.
  • 14:37 vriley@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:32 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main'.
  • 14:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:24 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 14:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P63772 and previous config saved to /var/cache/conftool/dbconfig/20240531-135629-root.json
  • 13:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:53 dcausse@deploy1002: Finished deploy [airflow-dags/search@b2f7795]: search: fix NTripleGenerator arguments (duration: 00m 21s)
  • 13:53 dcausse@deploy1002: Started deploy [airflow-dags/search@b2f7795]: search: fix NTripleGenerator arguments
  • 13:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:49 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 13:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:46 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 13:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P63771 and previous config saved to /var/cache/conftool/dbconfig/20240531-134122-root.json
  • 13:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:37 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 13:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:28 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 13:27 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:27 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:26 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P63770 and previous config saved to /var/cache/conftool/dbconfig/20240531-132616-root.json
  • 13:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:17 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 13:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P63769 and previous config saved to /var/cache/conftool/dbconfig/20240531-131110-root.json
  • 13:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:00 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:00 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:58 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:58 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P63768 and previous config saved to /var/cache/conftool/dbconfig/20240531-125604-root.json
  • 12:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P63767 and previous config saved to /var/cache/conftool/dbconfig/20240531-124058-root.json
  • 12:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2173', diff saved to https://phabricator.wikimedia.org/P63766 and previous config saved to /var/cache/conftool/dbconfig/20240531-123903-root.json
  • 12:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1209.eqiad.wmnet with OS bookworm
  • 12:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P63765 and previous config saved to /var/cache/conftool/dbconfig/20240531-122552-root.json
  • 12:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:21 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2038.codfw.wmnet with OS bookworm
  • 12:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:15 dcausse@deploy1002: Finished deploy [airflow-dags/search@45de44b]: search: bump rdf-spark-tools to 0.3.141 (duration: 00m 21s)
  • 12:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:15 dcausse@deploy1002: Started deploy [airflow-dags/search@45de44b]: search: bump rdf-spark-tools to 0.3.141
  • 12:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
  • 12:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
  • 12:03 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2038.codfw.wmnet with reason: host reimage
  • 12:00 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2038.codfw.wmnet with reason: host reimage
  • 12:00 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:00 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:58 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:58 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:53 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:53 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 11:52 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 11:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T352010)', diff saved to https://phabricator.wikimedia.org/P63764 and previous config saved to /var/cache/conftool/dbconfig/20240531-115244-ladsgroup.json
  • 11:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:51 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1209.eqiad.wmnet with OS bookworm
  • 11:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:46 jiji@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1039.eqiad.wmnet with OS bookworm
  • 11:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:42 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2038.codfw.wmnet with OS bookworm
  • 11:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P63763 and previous config saved to /var/cache/conftool/dbconfig/20240531-113735-ladsgroup.json
  • 11:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:27 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:26 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2039.codfw.wmnet with OS bookworm
  • 11:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P63762 and previous config saved to /var/cache/conftool/dbconfig/20240531-112227-ladsgroup.json
  • 11:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T364299)', diff saved to https://phabricator.wikimedia.org/P63761 and previous config saved to /var/cache/conftool/dbconfig/20240531-111833-marostegui.json
  • 11:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 11:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 11:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T364299)', diff saved to https://phabricator.wikimedia.org/P63760 and previous config saved to /var/cache/conftool/dbconfig/20240531-111809-marostegui.json
  • 11:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:09 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2039.codfw.wmnet with reason: host reimage
  • 11:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T352010)', diff saved to https://phabricator.wikimedia.org/P63759 and previous config saved to /var/cache/conftool/dbconfig/20240531-110719-ladsgroup.json
  • 11:06 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2039.codfw.wmnet with reason: host reimage
  • 11:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2179 (T364069)', diff saved to https://phabricator.wikimedia.org/P63758 and previous config saved to /var/cache/conftool/dbconfig/20240531-110347-marostegui.json
  • 11:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 11:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T364069)', diff saved to https://phabricator.wikimedia.org/P63757 and previous config saved to /var/cache/conftool/dbconfig/20240531-110324-marostegui.json
  • 11:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P63756 and previous config saved to /var/cache/conftool/dbconfig/20240531-110301-marostegui.json
  • 10:55 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:55 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:54 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2205.codfw.wmnet with reason: Maintenance
  • 10:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2205.codfw.wmnet with reason: Maintenance
  • 10:53 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:53 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P63755 and previous config saved to /var/cache/conftool/dbconfig/20240531-104816-marostegui.json
  • 10:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P63754 and previous config saved to /var/cache/conftool/dbconfig/20240531-104753-marostegui.json
  • 10:47 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2039.codfw.wmnet with OS bookworm
  • 10:47 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1039.eqiad.wmnet with OS bookworm
  • 10:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P63753 and previous config saved to /var/cache/conftool/dbconfig/20240531-103308-marostegui.json
  • 10:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T364299)', diff saved to https://phabricator.wikimedia.org/P63752 and previous config saved to /var/cache/conftool/dbconfig/20240531-103245-marostegui.json
  • 10:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:21 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2040.codfw.wmnet with OS bookworm
  • 10:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T364069)', diff saved to https://phabricator.wikimedia.org/P63751 and previous config saved to /var/cache/conftool/dbconfig/20240531-101800-marostegui.json
  • 10:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:14 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1040.eqiad.wmnet with OS bookworm
  • 10:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:03 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2040.codfw.wmnet with reason: host reimage
  • 10:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:00 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2040.codfw.wmnet with reason: host reimage
  • 09:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:58 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1040.eqiad.wmnet with reason: host reimage
  • 09:57 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:55 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1040.eqiad.wmnet with reason: host reimage
  • 09:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:42 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS bookworm
  • 09:41 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1040.eqiad.wmnet with OS bookworm
  • 09:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:27 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:27 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:25 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2041.codfw.wmnet with OS bookworm
  • 09:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:24 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:22 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:19 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1041.eqiad.wmnet with OS bookworm
  • 09:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:08 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2041.codfw.wmnet with reason: host reimage
  • 09:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:06 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2041.codfw.wmnet with reason: host reimage
  • 09:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:03 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1041.eqiad.wmnet with reason: host reimage
  • 09:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:00 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1041.eqiad.wmnet with reason: host reimage
  • 08:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:47 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2041.codfw.wmnet with OS bookworm
  • 08:47 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1041.eqiad.wmnet with OS bookworm
  • 08:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:12 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1042.eqiad.wmnet with OS bookworm
  • 08:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:03 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:03 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:02 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2042.codfw.wmnet with reason: host reimage
  • 07:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:58 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:58 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2042.codfw.wmnet with reason: host reimage
  • 07:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:56 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1042.eqiad.wmnet with reason: host reimage
  • 07:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:54 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1042.eqiad.wmnet with reason: host reimage
  • 07:53 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:40 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2042.codfw.wmnet with OS bookworm
  • 07:40 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1042.eqiad.wmnet with OS bookworm
  • 07:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:32 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 13335
  • 07:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:30 mvernon@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on moss-fe1002.eqiad.wmnet with reason: in development
  • 07:30 mvernon@cumin1002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on moss-fe1002.eqiad.wmnet with reason: in development
  • 07:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:18 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:58 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2043.codfw.wmnet with OS bookworm
  • 06:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:52 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1043.eqiad.wmnet with OS bookworm
  • 06:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:41 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2043.codfw.wmnet with reason: host reimage
  • 06:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:38 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2043.codfw.wmnet with reason: host reimage
  • 06:36 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1043.eqiad.wmnet with reason: host reimage
  • 06:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:33 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 13335
  • 06:33 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1043.eqiad.wmnet with reason: host reimage
  • 06:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:20 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1043.eqiad.wmnet with OS bookworm
  • 06:20 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2043.codfw.wmnet with OS bookworm
  • 06:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T364299)', diff saved to https://phabricator.wikimedia.org/P63750 and previous config saved to /var/cache/conftool/dbconfig/20240531-061219-marostegui.json
  • 06:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 06:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 06:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T364299)', diff saved to https://phabricator.wikimedia.org/P63749 and previous config saved to /var/cache/conftool/dbconfig/20240531-061156-marostegui.json
  • 06:00 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:00 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:58 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:58 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P63748 and previous config saved to /var/cache/conftool/dbconfig/20240531-055647-marostegui.json
  • 05:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P63747 and previous config saved to /var/cache/conftool/dbconfig/20240531-054139-marostegui.json
  • 05:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T364299)', diff saved to https://phabricator.wikimedia.org/P63746 and previous config saved to /var/cache/conftool/dbconfig/20240531-052631-marostegui.json
  • 05:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:18 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:55 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:55 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:53 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:53 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T364299)', diff saved to https://phabricator.wikimedia.org/P63745 and previous config saved to /var/cache/conftool/dbconfig/20240531-042604-marostegui.json
  • 04:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 04:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 04:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T364299)', diff saved to https://phabricator.wikimedia.org/P63744 and previous config saved to /var/cache/conftool/dbconfig/20240531-042540-marostegui.json
  • 04:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T352010)', diff saved to https://phabricator.wikimedia.org/P63743 and previous config saved to /var/cache/conftool/dbconfig/20240531-042414-ladsgroup.json
  • 04:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 04:23 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 04:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T352010)', diff saved to https://phabricator.wikimedia.org/P63742 and previous config saved to /var/cache/conftool/dbconfig/20240531-042350-ladsgroup.json
  • 04:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P63741 and previous config saved to /var/cache/conftool/dbconfig/20240531-041032-marostegui.json
  • 04:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P63740 and previous config saved to /var/cache/conftool/dbconfig/20240531-040842-ladsgroup.json
  • 04:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P63739 and previous config saved to /var/cache/conftool/dbconfig/20240531-035524-marostegui.json
  • 03:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P63738 and previous config saved to /var/cache/conftool/dbconfig/20240531-035334-ladsgroup.json
  • 03:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T364299)', diff saved to https://phabricator.wikimedia.org/P63737 and previous config saved to /var/cache/conftool/dbconfig/20240531-034016-marostegui.json
  • 03:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T352010)', diff saved to https://phabricator.wikimedia.org/P63736 and previous config saved to /var/cache/conftool/dbconfig/20240531-033826-ladsgroup.json
  • 03:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:24 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:24 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:18 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:51 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:27 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:27 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply

2024-05-30

  • 23:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T364069)', diff saved to https://phabricator.wikimedia.org/P63735 and previous config saved to /var/cache/conftool/dbconfig/20240530-235640-marostegui.json
  • 23:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 23:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 23:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T364069)', diff saved to https://phabricator.wikimedia.org/P63734 and previous config saved to /var/cache/conftool/dbconfig/20240530-235617-marostegui.json
  • 23:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P63733 and previous config saved to /var/cache/conftool/dbconfig/20240530-234109-marostegui.json
  • 23:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:28 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:28 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P63732 and previous config saved to /var/cache/conftool/dbconfig/20240530-232600-marostegui.json
  • 23:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:18 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T364069)', diff saved to https://phabricator.wikimedia.org/P63731 and previous config saved to /var/cache/conftool/dbconfig/20240530-231052-marostegui.json
  • 23:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T364299)', diff saved to https://phabricator.wikimedia.org/P63730 and previous config saved to /var/cache/conftool/dbconfig/20240530-230212-marostegui.json
  • 23:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 23:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 23:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 23:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 23:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T364299)', diff saved to https://phabricator.wikimedia.org/P63729 and previous config saved to /var/cache/conftool/dbconfig/20240530-230129-marostegui.json
  • 22:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:53 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:51 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P63728 and previous config saved to /var/cache/conftool/dbconfig/20240530-224621-marostegui.json
  • 22:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P63727 and previous config saved to /var/cache/conftool/dbconfig/20240530-223112-marostegui.json
  • 22:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:26 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:26 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T364299)', diff saved to https://phabricator.wikimedia.org/P63726 and previous config saved to /var/cache/conftool/dbconfig/20240530-221604-marostegui.json
  • 22:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:21 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:21 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:08 cjming: end of UTC late backport window
  • 21:07 cjming@deploy1002: Finished scap: Backport for Revert "IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath" (T361884) (duration: 11m 43s)
  • 21:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:04 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 21:04 Amir1: dropping old replication user from backup sources
  • 21:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:02 jclark@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:59 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:59 cjming@deploy1002: tchanders and cjming: Continuing with sync
  • 20:58 cjming@deploy1002: tchanders and cjming: Backport for Revert "IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath" (T361884) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:56 jclark@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:55 cjming@deploy1002: Started scap: Backport for Revert "IPInfo: Remove $wgIPInfoGeoIP2EnterprisePath" (T361884)
  • 20:55 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:53 cjming@deploy1002: Finished scap: Backport for Add a stream for tracking the API of WikiLambda (T356228 T360369) (duration: 28m 08s)
  • 20:51 jclark@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:44 cjming@deploy1002: cjming and dmartin: Continuing with sync
  • 20:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:35 robh@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:32 robh@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:29 cjming@deploy1002: cjming and dmartin: Backport for Add a stream for tracking the API of WikiLambda (T356228 T360369) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:27 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:27 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:25 cjming@deploy1002: Started scap: Backport for Add a stream for tracking the API of WikiLambda (T356228 T360369)
  • 20:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:18 cjming@deploy1002: Finished scap: Backport for Popups setting should be string not integer (T364347) (duration: 13m 39s)
  • 20:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:10 cjming@deploy1002: cjming and jdlrobson: Continuing with sync
  • 20:09 cjming@deploy1002: cjming and jdlrobson: Backport for Popups setting should be string not integer (T364347) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:05 cjming@deploy1002: Started scap: Backport for Popups setting should be string not integer (T364347)
  • 20:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:02 cdanis@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:59 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device ssw1-d8-codfw.mgmt.codfw.wmnet
  • 19:57 cdanis@cumin1002: START - Cookbook sre.hosts.provision for host parse1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T352010)', diff saved to https://phabricator.wikimedia.org/P63725 and previous config saved to /var/cache/conftool/dbconfig/20240530-193717-ladsgroup.json
  • 19:37 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 19:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 19:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T352010)', diff saved to https://phabricator.wikimedia.org/P63724 and previous config saved to /var/cache/conftool/dbconfig/20240530-193653-ladsgroup.json
  • 19:35 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.43.0-wmf.7 refs T361401
  • 19:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:28 jhathaway: bounce exim on mx2001
  • 19:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:27 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-d8-codfw - pt1979@cumin2002"
  • 19:26 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-d8-codfw - pt1979@cumin2002"
  • 19:24 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 19:24 pt1979@cumin2002: START - Cookbook sre.network.provision for device ssw1-d8-codfw.mgmt.codfw.wmnet
  • 19:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P63723 and previous config saved to /var/cache/conftool/dbconfig/20240530-192145-ladsgroup.json
  • 19:20 dancy@deploy1002: Finished scap: Backport for Temporarily silence noisy new warnings (T366268) (duration: 15m 39s)
  • 19:19 jhathaway: bouncing exim on mx1001
  • 19:11 dancy@deploy1002: jforrester and dancy: Continuing with sync
  • 19:11 dancy@deploy1002: jforrester and dancy: Backport for Temporarily silence noisy new warnings (T366268) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:08 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device ssw1-d1-codfw.mgmt.codfw.wmnet
  • 19:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P63722 and previous config saved to /var/cache/conftool/dbconfig/20240530-190633-ladsgroup.json
  • 19:05 dancy@deploy1002: Started scap: Backport for Temporarily silence noisy new warnings (T366268)
  • 18:59 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d8-codfw.mgmt.codfw.wmnet
  • 18:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T352010)', diff saved to https://phabricator.wikimedia.org/P63720 and previous config saved to /var/cache/conftool/dbconfig/20240530-185125-ladsgroup.json
  • 18:41 cdanis: T365571 💙root@deploy1002.eqiad.wmnet ~ 🕝⁉ kubectl delete node kubernetes2032.codfw.wmnet
  • 18:36 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:35 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-d1-codfw - pt1979@cumin2002"
  • 18:35 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for ssw1-d1-codfw - pt1979@cumin2002"
  • 18:31 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 18:31 pt1979@cumin2002: START - Cookbook sre.network.provision for device ssw1-d1-codfw.mgmt.codfw.wmnet
  • 18:29 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d7-codfw.mgmt.codfw.wmnet
  • 18:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:27 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d8-codfw - pt1979@cumin2002"
  • 18:26 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d8-codfw - pt1979@cumin2002"
  • 18:18 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 18:18 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d8-codfw.mgmt.codfw.wmnet
  • 18:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:08 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d6-codfw.mgmt.codfw.wmnet
  • 17:58 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:58 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d7-codfw - pt1979@cumin2002"
  • 17:57 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d7-codfw - pt1979@cumin2002"
  • 17:57 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:50 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 17:50 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d7-codfw.mgmt.codfw.wmnet
  • 17:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:49 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:40 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d5-codfw.mgmt.codfw.wmnet
  • 17:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:37 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d6-codfw - pt1979@cumin2002"
  • 17:36 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d6-codfw - pt1979@cumin2002"
  • 17:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:35 joal@deploy1002: Finished deploy [airflow-dags/analytics@e74e164]: Regular analytics weekly train HOTFIX [airflow-dags/analytics@e74e164f] (duration: 00m 27s)
  • 17:34 joal@deploy1002: Started deploy [airflow-dags/analytics@e74e164]: Regular analytics weekly train HOTFIX [airflow-dags/analytics@e74e164f]
  • 17:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:33 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 17:33 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d6-codfw.mgmt.codfw.wmnet
  • 17:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:30 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d4-codfw.mgmt.codfw.wmnet
  • 17:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:14 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:09 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:09 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d5-codfw - pt1979@cumin2002"
  • 17:08 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d5-codfw - pt1979@cumin2002"
  • 17:06 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 17:06 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d5-codfw.mgmt.codfw.wmnet
  • 17:04 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d3-codfw.mgmt.codfw.wmnet
  • 17:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 17:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 17:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:59 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:59 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d4-codfw - pt1979@cumin2002"
  • 16:58 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d4-codfw - pt1979@cumin2002"
  • 16:57 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1209', diff saved to https://phabricator.wikimedia.org/P63719 and previous config saved to /var/cache/conftool/dbconfig/20240530-165615-root.json
  • 16:55 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 16:55 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d4-codfw.mgmt.codfw.wmnet
  • 16:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T364299)', diff saved to https://phabricator.wikimedia.org/P63718 and previous config saved to /var/cache/conftool/dbconfig/20240530-165120-marostegui.json
  • 16:51 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-c7-codfw.mgmt.codfw.wmnet
  • 16:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 16:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T364299)', diff saved to https://phabricator.wikimedia.org/P63717 and previous config saved to /var/cache/conftool/dbconfig/20240530-165057-marostegui.json
  • 16:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1209 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P63716 and previous config saved to /var/cache/conftool/dbconfig/20240530-165034-root.json
  • 16:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P63715 and previous config saved to /var/cache/conftool/dbconfig/20240530-163549-marostegui.json
  • 16:34 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:33 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
  • 16:33 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d3-codfw - pt1979@cumin2002"
  • 16:32 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:32 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d3-codfw - pt1979@cumin2002"
  • 16:32 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
  • 16:31 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
  • 16:31 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:31 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 16:31 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 16:31 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 16:25 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:25 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 16:25 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:24 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:24 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 16:23 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 16:23 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:23 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 16:23 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 16:23 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 16:23 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d3-codfw.mgmt.codfw.wmnet
  • 16:22 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 16:22 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-d2-codfw.mgmt.codfw.wmnet
  • 16:22 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 16:21 kamila@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 16:21 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P63714 and previous config saved to /var/cache/conftool/dbconfig/20240530-162040-marostegui.json
  • 16:20 kamila@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 16:20 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 16:20 sukhe: [correction] sudo homer cr*magru* commit "add 198.35.27.0/24 for magru to announce ns2.wikimedia.org": T346722
  • 16:20 kamila@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 16:20 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:20 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:20 kamila@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 16:19 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 16:19 kamila@deploy1002: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 16:18 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 16:17 sukhe: sudo homer asw*magru* commit "add 198.35.27.0/24 for magru to announce ns2.wikimedia.org": T346722
  • 16:15 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:14 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 16:14 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 16:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 16:12 dancy@deploy1002: Finished scap: Backport for Migrate `wmfstatic` metrics to Prometheus store (T359255) (duration: 17m 19s)
  • 16:09 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:08 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 16:08 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 16:07 kamila@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T364299)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240530-160528-marostegui.json
  • 16:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:04 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:03 dancy@deploy1002: denisse and dancy: Continuing with sync
  • 15:59 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns700[1-2].wikimedia.org,service=authdns-ns2
  • 15:58 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
  • 15:57 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:57 dancy@deploy1002: denisse and dancy: Backport for Migrate `wmfstatic` metrics to Prometheus store (T359255) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:55 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
  • 15:54 dancy@deploy1002: Started scap: Backport for Migrate `wmfstatic` metrics to Prometheus store (T359255)
  • 15:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d2-codfw - pt1979@cumin2002"
  • 15:50 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-d2-codfw - pt1979@cumin2002"
  • 15:48 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 15:48 dancy@deploy1002: Finished scap: Testing (duration: 10m 43s)
  • 15:46 ejegg: payments-wiki upgraded from 8ff002ef to 0174d89c
  • 15:45 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:45 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:45 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-d2-codfw.mgmt.codfw.wmnet
  • 15:43 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 15:43 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c7-codfw.mgmt.codfw.wmnet
  • 15:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 100%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63713 and previous config saved to /var/cache/conftool/dbconfig/20240530-154208-arnaudb.json
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63712 and previous config saved to /var/cache/conftool/dbconfig/20240530-154127-arnaudb.json
  • 15:38 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 15:38 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-c5-codfw.mgmt.codfw.wmnet
  • 15:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:37 joal@deploy1002: Finished deploy [airflow-dags/analytics@3659547]: Regular analytics weekly train [airflow-dags/analytics@3659547f] (duration: 00m 29s)
  • 15:37 dancy@deploy1002: Started scap: Testing
  • 15:37 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-c6-codfw.mgmt.codfw.wmnet
  • 15:36 joal@deploy1002: Started deploy [airflow-dags/analytics@3659547]: Regular analytics weekly train [airflow-dags/analytics@3659547f]
  • 15:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:34 dancy@deploy1002: Finished scap: Backport for rdbms: Pass array values to makeList on insert/upsert (T366268) (duration: 11m 55s)
  • 15:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 75%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63710 and previous config saved to /var/cache/conftool/dbconfig/20240530-152703-arnaudb.json
  • 15:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63709 and previous config saved to /var/cache/conftool/dbconfig/20240530-152619-arnaudb.json
  • 15:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:24 dancy@deploy1002: umherirrender and dancy: Continuing with sync
  • 15:24 dancy@deploy1002: umherirrender and dancy: Backport for rdbms: Pass array values to makeList on insert/upsert (T366268) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:22 dancy@deploy1002: Started scap: Backport for rdbms: Pass array values to makeList on insert/upsert (T366268)
  • 15:19 aborrero@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1041
  • 15:19 aborrero@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1041
  • 15:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 50%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63708 and previous config saved to /var/cache/conftool/dbconfig/20240530-151155-arnaudb.json
  • 15:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63707 and previous config saved to /var/cache/conftool/dbconfig/20240530-151113-arnaudb.json
  • 15:06 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:06 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:06 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c6-codfw - pt1979@cumin2002"
  • 15:05 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c6-codfw - pt1979@cumin2002"
  • 15:05 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 15:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:59 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 14:59 pt1979@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 14:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:59 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c6-codfw.mgmt.codfw.wmnet
  • 14:59 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.provision (exit_code=99) for device lsw1-c2-codfw.mgmt.codfw.wmnet
  • 14:58 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:58 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for lsw1-c2-codfw - pt1979@cumin2002"
  • 14:58 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for lsw1-c2-codfw - pt1979@cumin2002"
  • 14:58 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 14:58 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c5-codfw.mgmt.codfw.wmnet
  • 14:57 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:57 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 25%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63706 and previous config saved to /var/cache/conftool/dbconfig/20240530-145648-arnaudb.json
  • 14:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63705 and previous config saved to /var/cache/conftool/dbconfig/20240530-145607-arnaudb.json
  • 14:54 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:54 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:52 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 14:51 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:51 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:49 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:42 hnowlan: Running `decommission` on 5 eqiad api appservers
  • 14:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db1165 (re)pooling @ 10%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63704 and previous config saved to /var/cache/conftool/dbconfig/20240530-144142-arnaudb.json
  • 14:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63703 and previous config saved to /var/cache/conftool/dbconfig/20240530-144101-arnaudb.json
  • 14:31 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1165.eqiad.wmnet
  • 14:26 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1165.eqiad.wmnet
  • 14:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 5%: post upgrade repool', diff saved to https://phabricator.wikimedia.org/P63702 and previous config saved to /var/cache/conftool/dbconfig/20240530-142555-arnaudb.json
  • 14:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db1165 depool for T356240', diff saved to https://phabricator.wikimedia.org/P63701 and previous config saved to /var/cache/conftool/dbconfig/20240530-142519-arnaudb.json
  • 14:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db[1155,1165].eqiad.wmnet with reason: upgrade db1165
  • 14:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db[1155,1165].eqiad.wmnet with reason: upgrade db1165
  • 14:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T366123)', diff saved to https://phabricator.wikimedia.org/P63700 and previous config saved to /var/cache/conftool/dbconfig/20240530-141914-marostegui.json
  • 14:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:11 dcausse: backport window done
  • 14:10 dcausse@deploy1002: Finished scap: Backport for Add UpdateGroup for weighted tags (duration: 11m 51s)
  • 14:05 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 14:05 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P63699 and previous config saved to /var/cache/conftool/dbconfig/20240530-140404-marostegui.json
  • 14:02 dcausse@deploy1002: dcausse: Continuing with sync
  • 14:01 dcausse@deploy1002: dcausse: Backport for Add UpdateGroup for weighted tags synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:58 dcausse@deploy1002: Started scap: Backport for Add UpdateGroup for weighted tags
  • 13:55 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:52 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:52 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P63698 and previous config saved to /var/cache/conftool/dbconfig/20240530-134856-marostegui.json
  • 13:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:46 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 13:45 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 13:43 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:43 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:43 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2044.codfw.wmnet with OS bookworm
  • 13:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:38 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1044.eqiad.wmnet with OS bookworm
  • 13:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T366123)', diff saved to https://phabricator.wikimedia.org/P63697 and previous config saved to /var/cache/conftool/dbconfig/20240530-133348-marostegui.json
  • 13:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1173.eqiad.wmnet with reason: upgrade
  • 13:27 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1173.eqiad.wmnet with reason: upgrade
  • 13:27 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:27 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:25 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2044.codfw.wmnet with reason: host reimage
  • 13:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:22 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2044.codfw.wmnet with reason: host reimage
  • 13:20 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1044.eqiad.wmnet with reason: host reimage
  • 13:20 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1173.eqiad.wmnet
  • 13:19 dcausse@deploy1002: Finished scap: Backport for cirrus: Send weighted tags to known clusters (duration: 12m 43s)
  • 13:17 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:17 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1044.eqiad.wmnet with reason: host reimage
  • 13:15 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:15 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1173.eqiad.wmnet
  • 13:14 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1173.eqiad.wmnet with reason: upgrade
  • 13:14 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1173.eqiad.wmnet with reason: upgrade
  • 13:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1173 T356240', diff saved to https://phabricator.wikimedia.org/P63695 and previous config saved to /var/cache/conftool/dbconfig/20240530-131349-arnaudb.json
  • 13:12 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:12 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:10 dcausse@deploy1002: dcausse and ebernhardson: Continuing with sync
  • 13:10 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2220 (T366123)', diff saved to https://phabricator.wikimedia.org/P63694 and previous config saved to /var/cache/conftool/dbconfig/20240530-131012-marostegui.json
  • 13:10 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 13:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 13:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T366123)', diff saved to https://phabricator.wikimedia.org/P63693 and previous config saved to /var/cache/conftool/dbconfig/20240530-130946-marostegui.json
  • 13:09 dcausse@deploy1002: dcausse and ebernhardson: Backport for cirrus: Send weighted tags to known clusters synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:08 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:06 dcausse@deploy1002: Started scap: Backport for cirrus: Send weighted tags to known clusters
  • 13:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:04 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1044.eqiad.wmnet with OS bookworm
  • 13:04 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2044.codfw.wmnet with OS bookworm
  • 13:01 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:01 joal@deploy1002: Finished deploy [analytics/refinery@ac0b789] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ac0b789b] (duration: 02m 54s)
  • 13:01 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:58 joal@deploy1002: Started deploy [analytics/refinery@ac0b789] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ac0b789b]
  • 12:58 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:58 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:55 joal@deploy1002: Finished deploy [analytics/refinery@ac0b789] (thin): Regular analytics weekly train THIN [analytics/refinery@ac0b789b] (duration: 04m 27s)
  • 12:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P63692 and previous config saved to /var/cache/conftool/dbconfig/20240530-125438-marostegui.json
  • 12:53 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 12:53 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 12:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T364069)', diff saved to https://phabricator.wikimedia.org/P63691 and previous config saved to /var/cache/conftool/dbconfig/20240530-125232-marostegui.json
  • 12:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 12:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 12:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 12:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 12:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T364069)', diff saved to https://phabricator.wikimedia.org/P63690 and previous config saved to /var/cache/conftool/dbconfig/20240530-125204-marostegui.json
  • 12:50 joal@deploy1002: Started deploy [analytics/refinery@ac0b789] (thin): Regular analytics weekly train THIN [analytics/refinery@ac0b789b]
  • 12:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:44 dcausse@deploy1002: Finished deploy [airflow-dags/search@0faf248]: search: use discolytics 0.23 (duration: 00m 26s)
  • 12:43 dcausse@deploy1002: Started deploy [airflow-dags/search@0faf248]: search: use discolytics 0.23
  • 12:43 joal@deploy1002: Finished deploy [analytics/refinery@ac0b789]: Regular analytics weekly train [analytics/refinery@ac0b789b] (duration: 12m 58s)
  • 12:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:41 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:41 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P63689 and previous config saved to /var/cache/conftool/dbconfig/20240530-123930-marostegui.json
  • 12:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P63688 and previous config saved to /var/cache/conftool/dbconfig/20240530-123655-marostegui.json
  • 12:30 joal@deploy1002: Started deploy [analytics/refinery@ac0b789]: Regular analytics weekly train [analytics/refinery@ac0b789b]
  • 12:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T366123)', diff saved to https://phabricator.wikimedia.org/P63687 and previous config saved to /var/cache/conftool/dbconfig/20240530-122422-marostegui.json
  • 12:22 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2208 (T366123)', diff saved to https://phabricator.wikimedia.org/P63686 and previous config saved to /var/cache/conftool/dbconfig/20240530-122206-marostegui.json
  • 12:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 12:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P63685 and previous config saved to /var/cache/conftool/dbconfig/20240530-122146-marostegui.json
  • 12:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 12:08 aikochou@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 12:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T364069)', diff saved to https://phabricator.wikimedia.org/P63684 and previous config saved to /var/cache/conftool/dbconfig/20240530-120638-marostegui.json
  • 12:04 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2205', diff saved to https://phabricator.wikimedia.org/P63683 and previous config saved to /var/cache/conftool/dbconfig/20240530-120455-root.json
  • 12:01 marostegui: Deploy schema changes on old s3 codfw master (db2205) dbmaint T364069
  • 11:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 11:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 11:48 cgoubert@deploy1002: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
  • 11:47 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:47 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:47 cgoubert@deploy1002: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
  • 11:47 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
  • 11:47 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
  • 11:46 cgoubert@deploy1002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
  • 11:46 cgoubert@deploy1002: helmfile [staging] START helmfile.d/services/push-notifications: apply
  • 11:44 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:44 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:35 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 11:34 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 11:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:33 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 11:32 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 11:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:32 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 11:31 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 11:30 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:30 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:26 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 11:26 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 11:25 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 11:24 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 11:23 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 11:23 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 11:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 11:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T366123)', diff saved to https://phabricator.wikimedia.org/P63682 and previous config saved to /var/cache/conftool/dbconfig/20240530-112047-marostegui.json
  • 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P63681 and previous config saved to /var/cache/conftool/dbconfig/20240530-110539-marostegui.json
  • 11:05 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:05 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:52 hnowlan: switched mw2300 to be an api canary + scap_proxy, removed mw228[34] as canaries
  • 10:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P63680 and previous config saved to /var/cache/conftool/dbconfig/20240530-105031-marostegui.json
  • 10:50 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:50 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:43 marostegui: Deploy schema changes on old s3 codfw master (db2205) dbmaint T364299
  • 10:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T364299)', diff saved to https://phabricator.wikimedia.org/P63679 and previous config saved to /var/cache/conftool/dbconfig/20240530-104034-marostegui.json
  • 10:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 10:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 10:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T364299)', diff saved to https://phabricator.wikimedia.org/P63678 and previous config saved to /var/cache/conftool/dbconfig/20240530-104011-marostegui.json
  • 10:38 dcausse@deploy1002: Finished deploy [airflow-dags/search@ded0f17]: search: fix alter table command (duration: 00m 20s)
  • 10:38 dcausse@deploy1002: Started deploy [airflow-dags/search@ded0f17]: search: fix alter table command
  • 10:36 jiji@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-ctrl1003.eqiad.wmnet
  • 10:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T366123)', diff saved to https://phabricator.wikimedia.org/P63677 and previous config saved to /var/cache/conftool/dbconfig/20240530-103523-marostegui.json
  • 10:28 effie: homer "cr*eqiad*" commit 'Add wikikube-ctrl1003'
  • 10:26 claime: Restarted rsyslog on mw1479
  • 10:25 effie: label wikikube-ctrl1003 as master
  • 10:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P63676 and previous config saved to /var/cache/conftool/dbconfig/20240530-102503-marostegui.json
  • 10:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T352010)', diff saved to https://phabricator.wikimedia.org/P63675 and previous config saved to /var/cache/conftool/dbconfig/20240530-102439-ladsgroup.json
  • 10:24 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 10:24 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 10:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138 (T352010)', diff saved to https://phabricator.wikimedia.org/P63674 and previous config saved to /var/cache/conftool/dbconfig/20240530-102414-ladsgroup.json
  • 10:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P63673 and previous config saved to /var/cache/conftool/dbconfig/20240530-100955-marostegui.json
  • 10:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138', diff saved to https://phabricator.wikimedia.org/P63672 and previous config saved to /var/cache/conftool/dbconfig/20240530-100906-ladsgroup.json
  • 10:07 effie: add wikikube-ctrl1003 to etcd and run puppet - T353464
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T366123)', diff saved to https://phabricator.wikimedia.org/P63671 and previous config saved to /var/cache/conftool/dbconfig/20240530-100554-marostegui.json
  • 10:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 10:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T366123)', diff saved to https://phabricator.wikimedia.org/P63670 and previous config saved to /var/cache/conftool/dbconfig/20240530-100531-marostegui.json
  • 09:59 dcausse@deploy1002: Finished deploy [airflow-dags/search@66de0db]: search: add missing lexeme fields (duration: 00m 19s)
  • 09:59 dcausse@deploy1002: Started deploy [airflow-dags/search@66de0db]: search: add missing lexeme fields
  • 09:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T364299)', diff saved to https://phabricator.wikimedia.org/P63669 and previous config saved to /var/cache/conftool/dbconfig/20240530-095447-marostegui.json
  • 09:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138', diff saved to https://phabricator.wikimedia.org/P63668 and previous config saved to /var/cache/conftool/dbconfig/20240530-095358-ladsgroup.json
  • 09:53 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:53 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P63667 and previous config saved to /var/cache/conftool/dbconfig/20240530-095021-marostegui.json
  • 09:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2205 mirror former candidate master weight T366242', diff saved to https://phabricator.wikimedia.org/P63666 and previous config saved to /var/cache/conftool/dbconfig/20240530-094936-root.json
  • 09:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2127 to s3 primary T366242', diff saved to https://phabricator.wikimedia.org/P63665 and previous config saved to /var/cache/conftool/dbconfig/20240530-094632-arnaudb.json
  • 09:45 arnaudb: Starting s3 codfw failover from db2205 to db2127 - T366242
  • 09:42 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:42 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:40 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:40 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138 (T352010)', diff saved to https://phabricator.wikimedia.org/P63664 and previous config saved to /var/cache/conftool/dbconfig/20240530-093850-ladsgroup.json
  • 09:38 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:38 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P63663 and previous config saved to /var/cache/conftool/dbconfig/20240530-093514-marostegui.json
  • 09:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2127 with weight 0 T366242', diff saved to https://phabricator.wikimedia.org/P63662 and previous config saved to /var/cache/conftool/dbconfig/20240530-093007-arnaudb.json
  • 09:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Primary switchover s3 T366242
  • 09:29 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 24 hosts with reason: Primary switchover s3 T366242
  • 09:25 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:25 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:22 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:22 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1215.eqiad.wmnet with OS bookworm
  • 09:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T366123)', diff saved to https://phabricator.wikimedia.org/P63661 and previous config saved to /var/cache/conftool/dbconfig/20240530-092004-marostegui.json
  • 09:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:18 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:17 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168 (T366123)', diff saved to https://phabricator.wikimedia.org/P63660 and previous config saved to /var/cache/conftool/dbconfig/20240530-091751-marostegui.json
  • 09:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 09:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T366123)', diff saved to https://phabricator.wikimedia.org/P63659 and previous config saved to /var/cache/conftool/dbconfig/20240530-091728-marostegui.json
  • 09:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:15 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2204 with weight 500 revert T366241', diff saved to https://phabricator.wikimedia.org/P63658 and previous config saved to /var/cache/conftool/dbconfig/20240530-091323-arnaudb.json
  • 09:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:09 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2204 with weight 0 T366241', diff saved to https://phabricator.wikimedia.org/P63656 and previous config saved to /var/cache/conftool/dbconfig/20240530-090840-arnaudb.json
  • 09:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s2 T366241
  • 09:07 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s2 T366241
  • 09:07 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:07 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
  • 09:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P63655 and previous config saved to /var/cache/conftool/dbconfig/20240530-090220-marostegui.json
  • 09:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
  • 08:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:47 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1215.eqiad.wmnet with OS bookworm
  • 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P63654 and previous config saved to /var/cache/conftool/dbconfig/20240530-084712-marostegui.json
  • 08:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:39 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:39 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:35 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:35 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T366123)', diff saved to https://phabricator.wikimedia.org/P63653 and previous config saved to /var/cache/conftool/dbconfig/20240530-083204-marostegui.json
  • 08:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T366123)', diff saved to https://phabricator.wikimedia.org/P63652 and previous config saved to /var/cache/conftool/dbconfig/20240530-083054-marostegui.json
  • 08:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 08:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 08:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 08:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 08:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T366123)', diff saved to https://phabricator.wikimedia.org/P63651 and previous config saved to /var/cache/conftool/dbconfig/20240530-083025-marostegui.json
  • 08:29 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:29 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:23 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:23 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P63650 and previous config saved to /var/cache/conftool/dbconfig/20240530-081517-marostegui.json
  • 08:13 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:13 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:04 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:03 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:02 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:02 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P63649 and previous config saved to /var/cache/conftool/dbconfig/20240530-080009-marostegui.json
  • 07:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T366123)', diff saved to https://phabricator.wikimedia.org/P63648 and previous config saved to /var/cache/conftool/dbconfig/20240530-074501-marostegui.json
  • 07:34 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:34 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:32 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:32 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T366123)', diff saved to https://phabricator.wikimedia.org/P63645 and previous config saved to /var/cache/conftool/dbconfig/20240530-071559-marostegui.json
  • 07:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 07:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T366123)', diff saved to https://phabricator.wikimedia.org/P63644 and previous config saved to /var/cache/conftool/dbconfig/20240530-071535-marostegui.json
  • 07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P63643 and previous config saved to /var/cache/conftool/dbconfig/20240530-070027-marostegui.json
  • 06:57 marostegui: Deploy schema changes on old s8 eqiad master (db1209) dbmaint T364299
  • 06:56 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 49666
  • 06:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:56 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 49666
  • 06:55 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 8674
  • 06:53 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 8674
  • 06:50 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 8674
  • 06:49 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 8674
  • 06:48 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:48 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:48 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'configure' for AS: 8674
  • 06:47 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 8674
  • 06:46 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:46 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P63642 and previous config saved to /var/cache/conftool/dbconfig/20240530-064519-marostegui.json
  • 06:36 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:36 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:33 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:33 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:31 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:31 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T366123)', diff saved to https://phabricator.wikimedia.org/P63641 and previous config saved to /var/cache/conftool/dbconfig/20240530-063011-marostegui.json
  • 06:19 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:19 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:00 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T366123)', diff saved to https://phabricator.wikimedia.org/P63640 and previous config saved to /var/cache/conftool/dbconfig/20240530-060023-marostegui.json
  • 06:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 06:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 06:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T366123)', diff saved to https://phabricator.wikimedia.org/P63639 and previous config saved to /var/cache/conftool/dbconfig/20240530-055959-marostegui.json
  • 05:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P63638 and previous config saved to /var/cache/conftool/dbconfig/20240530-054451-marostegui.json
  • 05:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P63636 and previous config saved to /var/cache/conftool/dbconfig/20240530-052941-marostegui.json
  • 05:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T364299)', diff saved to https://phabricator.wikimedia.org/P63635 and previous config saved to /var/cache/conftool/dbconfig/20240530-052006-marostegui.json
  • 05:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 05:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 05:17 marostegui: Deploy schema changes on old s8 eqiad master (db1209) dbmaint T355609 T356166
  • 05:16 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:16 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T366123)', diff saved to https://phabricator.wikimedia.org/P63634 and previous config saved to /var/cache/conftool/dbconfig/20240530-051433-marostegui.json
  • 05:14 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T366123)', diff saved to https://phabricator.wikimedia.org/P63633 and previous config saved to /var/cache/conftool/dbconfig/20240530-051220-marostegui.json
  • 05:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 05:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 05:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1209 T364541', diff saved to https://phabricator.wikimedia.org/P63632 and previous config saved to /var/cache/conftool/dbconfig/20240530-051132-root.json
  • 05:10 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1192 to s8 primary and set section read-write T364541', diff saved to https://phabricator.wikimedia.org/P63631 and previous config saved to /var/cache/conftool/dbconfig/20240530-051031-marostegui.json
  • 05:10 marostegui@cumin1002: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - T364541', diff saved to https://phabricator.wikimedia.org/P63630 and previous config saved to /var/cache/conftool/dbconfig/20240530-051012-marostegui.json
  • 05:09 marostegui: Starting s8 eqiad failover from db1209 to db1192 - T364541
  • 05:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 04:56 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:56 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:43 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db1192 from API/vslow/dump T364541', diff saved to https://phabricator.wikimedia.org/P63629 and previous config saved to /var/cache/conftool/dbconfig/20240530-044328-root.json
  • 04:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s8 T364541
  • 04:42 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1192 with weight 0 T364541', diff saved to https://phabricator.wikimedia.org/P63628 and previous config saved to /var/cache/conftool/dbconfig/20240530-044249-root.json
  • 04:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s8 T364541
  • 04:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 04:20 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:20 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:13 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:13 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:11 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:11 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:09 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:08 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:06 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:06 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T364299)', diff saved to https://phabricator.wikimedia.org/P63627 and previous config saved to /var/cache/conftool/dbconfig/20240530-025955-marostegui.json
  • 02:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P63626 and previous config saved to /var/cache/conftool/dbconfig/20240530-024447-marostegui.json
  • 02:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P63625 and previous config saved to /var/cache/conftool/dbconfig/20240530-022938-marostegui.json
  • 02:27 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-c4-codfw.mgmt.codfw.wmnet
  • 02:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T364299)', diff saved to https://phabricator.wikimedia.org/P63624 and previous config saved to /var/cache/conftool/dbconfig/20240530-021430-marostegui.json
  • 01:59 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:59 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:56 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 01:56 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c4-codfw - pt1979@cumin2002"
  • 01:55 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c4-codfw - pt1979@cumin2002"
  • 01:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2138 (T352010)', diff saved to https://phabricator.wikimedia.org/P63623 and previous config saved to /var/cache/conftool/dbconfig/20240530-014850-ladsgroup.json
  • 01:48 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 01:48 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 01:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T352010)', diff saved to https://phabricator.wikimedia.org/P63622 and previous config saved to /var/cache/conftool/dbconfig/20240530-014827-ladsgroup.json
  • 01:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T364069)', diff saved to https://phabricator.wikimedia.org/P63621 and previous config saved to /var/cache/conftool/dbconfig/20240530-014725-marostegui.json
  • 01:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 01:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 01:39 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 01:39 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c4-codfw.mgmt.codfw.wmnet
  • 01:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P63620 and previous config saved to /var/cache/conftool/dbconfig/20240530-013319-ladsgroup.json
  • 01:28 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-c3-codfw.mgmt.codfw.wmnet
  • 01:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P63619 and previous config saved to /var/cache/conftool/dbconfig/20240530-011810-ladsgroup.json
  • 01:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2220 (T364299)', diff saved to https://phabricator.wikimedia.org/P63618 and previous config saved to /var/cache/conftool/dbconfig/20240530-011518-marostegui.json
  • 01:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 01:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 01:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T364299)', diff saved to https://phabricator.wikimedia.org/P63617 and previous config saved to /var/cache/conftool/dbconfig/20240530-011454-marostegui.json
  • 01:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T352010)', diff saved to https://phabricator.wikimedia.org/P63616 and previous config saved to /var/cache/conftool/dbconfig/20240530-010302-ladsgroup.json
  • 00:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P63615 and previous config saved to /var/cache/conftool/dbconfig/20240530-005946-marostegui.json
  • 00:57 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 00:57 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c3-codfw - pt1979@cumin2002"
  • 00:56 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c3-codfw - pt1979@cumin2002"
  • 00:54 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 00:54 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c3-codfw.mgmt.codfw.wmnet
  • 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c2-codfw - pt1979@cumin2002"
  • 00:52 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c2-codfw - pt1979@cumin2002"
  • 00:50 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 00:50 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c2-codfw.mgmt.codfw.wmnet
  • 00:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P63614 and previous config saved to /var/cache/conftool/dbconfig/20240530-004438-marostegui.json
  • 00:41 pt1979@cumin2002: END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-c1-codfw.mgmt.codfw.wmnet
  • 00:37 logmsgbot: @deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:37 logmsgbot: @deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T364299)', diff saved to https://phabricator.wikimedia.org/P63613 and previous config saved to /var/cache/conftool/dbconfig/20240530-002930-marostegui.json
  • 00:09 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 00:09 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c1-codfw - pt1979@cumin2002"
  • 00:08 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c1-codfw - pt1979@cumin2002"
  • 00:06 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 00:06 pt1979@cumin2002: START - Cookbook sre.network.provision for device lsw1-c1-codfw.mgmt.codfw.wmnet

2024-05-29

  • 23:43 eileen: civicrm upgraded from 0e3c277e to 44900b8c
  • 23:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2208 (T364299)', diff saved to https://phabricator.wikimedia.org/P63612 and previous config saved to /var/cache/conftool/dbconfig/20240529-232924-marostegui.json
  • 23:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 23:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 22:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 22:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 22:16 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 22:06 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
  • 22:05 jclark@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
  • 22:05 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 22:05 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 22:03 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
  • 22:02 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
  • 22:01 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 22:00 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 21:47 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:47 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 21:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 21:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T364299)', diff saved to https://phabricator.wikimedia.org/P63611 and previous config saved to /var/cache/conftool/dbconfig/20240529-214338-marostegui.json
  • 21:42 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:41 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:21 jsn@deploy1002: Sync cancelled.
  • 21:21 jsn@deploy1002: jsn: Backport for Revert "feature(Popups): Conditional User Defaults Implementation" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:19 jsn@deploy1002: Started scap: Backport for Revert "feature(Popups): Conditional User Defaults Implementation"
  • 21:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P63609 and previous config saved to /var/cache/conftool/dbconfig/20240529-211321-marostegui.json
  • 21:05 eileen: config revision changed from 38360c6d to 9bbbf8d6
  • 20:59 eileen: civicrm upgraded from 755c7e7f to 0e3c277e
  • 20:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T364299)', diff saved to https://phabricator.wikimedia.org/P63608 and previous config saved to /var/cache/conftool/dbconfig/20240529-205813-marostegui.json
  • 20:56 eileen: civicrm upgraded from 8f236b05 to 755c7e7f
  • 20:53 eileen: civicrm upgraded from 5d536940 to 8f236b05
  • 20:49 jsn@deploy1002: Sync cancelled.
  • 20:45 eileen: config revision changed from 5b0b4d22 to d686119a
  • 20:44 jsn@deploy1002: jsn and jdlrobson: Backport for feature(Popups): Conditional User Defaults Implementation (T364347) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:41 jsn@deploy1002: Started scap: Backport for feature(Popups): Conditional User Defaults Implementation (T364347)
  • 20:40 jsn@deploy1002: Finished scap: Backport for Enable wmgUseSandboxLink for Swahili Wikipedia (T365970) (duration: 17m 08s)
  • 20:32 jsn@deploy1002: jsn and nmw03: Continuing with sync
  • 20:25 jsn@deploy1002: jsn and nmw03: Backport for Enable wmgUseSandboxLink for Swahili Wikipedia (T365970) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:23 jsn@deploy1002: Started scap: Backport for Enable wmgUseSandboxLink for Swahili Wikipedia (T365970)
  • 20:21 jsn@deploy1002: Finished scap: Backport for CommonSettings: correct AutoModerator load order (T366203) (duration: 11m 22s)
  • 20:12 jsn@deploy1002: jsn: Continuing with sync
  • 20:12 jsn@deploy1002: jsn: Backport for CommonSettings: correct AutoModerator load order (T366203) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:10 eileen: civicrm upgraded from cc402cd1 to 5d536940
  • 20:10 jsn@deploy1002: Started scap: Backport for CommonSettings: correct AutoModerator load order (T366203)
  • 19:58 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 19:51 cdanis@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 19:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T364299)', diff saved to https://phabricator.wikimedia.org/P63607 and previous config saved to /var/cache/conftool/dbconfig/20240529-194309-marostegui.json
  • 19:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 19:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 19:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T364299)', diff saved to https://phabricator.wikimedia.org/P63606 and previous config saved to /var/cache/conftool/dbconfig/20240529-194245-marostegui.json
  • 19:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 19:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 19:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T366123)', diff saved to https://phabricator.wikimedia.org/P63605 and previous config saved to /var/cache/conftool/dbconfig/20240529-194107-marostegui.json
  • 19:37 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@3287de9]: bump discolytics to 0.22.0 (duration: 00m 27s)
  • 19:36 ebernhardson@deploy1002: Started deploy [airflow-dags/search@3287de9]: bump discolytics to 0.22.0
  • 19:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P63604 and previous config saved to /var/cache/conftool/dbconfig/20240529-192735-marostegui.json
  • 19:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P63603 and previous config saved to /var/cache/conftool/dbconfig/20240529-192559-marostegui.json
  • 19:17 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.43.0-wmf.7 refs T361401
  • 19:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P63602 and previous config saved to /var/cache/conftool/dbconfig/20240529-191227-marostegui.json
  • 19:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P63601 and previous config saved to /var/cache/conftool/dbconfig/20240529-191049-marostegui.json
  • 18:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T364299)', diff saved to https://phabricator.wikimedia.org/P63600 and previous config saved to /var/cache/conftool/dbconfig/20240529-185719-marostegui.json
  • 18:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T366123)', diff saved to https://phabricator.wikimedia.org/P63599 and previous config saved to /var/cache/conftool/dbconfig/20240529-185541-marostegui.json
  • 18:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T352010)', diff saved to https://phabricator.wikimedia.org/P63598 and previous config saved to /var/cache/conftool/dbconfig/20240529-185035-ladsgroup.json
  • 18:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 18:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 18:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 18:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 18:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T352010)', diff saved to https://phabricator.wikimedia.org/P63597 and previous config saved to /var/cache/conftool/dbconfig/20240529-185006-ladsgroup.json
  • 18:41 cdanis: 💙cdanis@lvs1020.eqiad.wmnet ~ 🕝☕ sudo systemctl restart pybal.service
  • 18:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P63595 and previous config saved to /var/cache/conftool/dbconfig/20240529-183458-ladsgroup.json
  • 18:33 dancy@deploy1002: Finished scap: Backport for Revert "Wrap tables with JS" (T330527) (duration: 25m 10s)
  • 18:32 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 18:32 andrew@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1002"
  • 18:31 andrew@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1002"
  • 18:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T366123)', diff saved to https://phabricator.wikimedia.org/P63594 and previous config saved to /var/cache/conftool/dbconfig/20240529-182719-marostegui.json
  • 18:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 18:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 18:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T366123)', diff saved to https://phabricator.wikimedia.org/P63593 and previous config saved to /var/cache/conftool/dbconfig/20240529-182656-marostegui.json
  • 18:24 dancy@deploy1002: dancy and jdlrobson: Continuing with sync
  • 18:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P63592 and previous config saved to /var/cache/conftool/dbconfig/20240529-181950-ladsgroup.json
  • 18:16 rzl: evacuate cordoned node parse1002: kubectl -n linkrecommendation delete pod linkrecommendation-internal-load-datasets-28616700-7gsqs; kubectl -n linkrecommendation delete pod linkrecommendation-internal-load-datasets-28616700-xl7t4; kubectl -n toolhub delete pod toolhub-main-crawler-28616760-jrhbb # T363086
  • 18:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P63590 and previous config saved to /var/cache/conftool/dbconfig/20240529-181148-marostegui.json
  • 18:11 dancy@deploy1002: dancy and jdlrobson: Backport for Revert "Wrap tables with JS" (T330527) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 18:08 dancy@deploy1002: Started scap: Backport for Revert "Wrap tables with JS" (T330527)
  • 18:04 akosiaris: kubectl -n mw-debug delete pods mw-debug.eqiad.pinkunicorn-6d4d68cd79-nq695
  • 18:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T352010)', diff saved to https://phabricator.wikimedia.org/P63589 and previous config saved to /var/cache/conftool/dbconfig/20240529-180442-ladsgroup.json
  • 17:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P63588 and previous config saved to /var/cache/conftool/dbconfig/20240529-175640-marostegui.json
  • 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2168 (T364299)', diff saved to https://phabricator.wikimedia.org/P63587 and previous config saved to /var/cache/conftool/dbconfig/20240529-174829-marostegui.json
  • 17:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 17:48 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 17:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T364299)', diff saved to https://phabricator.wikimedia.org/P63586 and previous config saved to /var/cache/conftool/dbconfig/20240529-174806-marostegui.json
  • 17:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T366123)', diff saved to https://phabricator.wikimedia.org/P63585 and previous config saved to /var/cache/conftool/dbconfig/20240529-174132-marostegui.json
  • 17:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T366123)', diff saved to https://phabricator.wikimedia.org/P63584 and previous config saved to /var/cache/conftool/dbconfig/20240529-173921-marostegui.json
  • 17:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 17:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 17:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T366123)', diff saved to https://phabricator.wikimedia.org/P63583 and previous config saved to /var/cache/conftool/dbconfig/20240529-173857-marostegui.json
  • 17:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P63582 and previous config saved to /var/cache/conftool/dbconfig/20240529-173258-marostegui.json
  • 17:26 andrew@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1041.eqiad.wmnet with reason: host reimage
  • 17:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P63581 and previous config saved to /var/cache/conftool/dbconfig/20240529-172349-marostegui.json
  • 17:23 andrew@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1041.eqiad.wmnet with reason: host reimage
  • 17:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P63580 and previous config saved to /var/cache/conftool/dbconfig/20240529-171750-marostegui.json
  • 17:14 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 17:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P63579 and previous config saved to /var/cache/conftool/dbconfig/20240529-170841-marostegui.json
  • 17:08 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 17:08 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 17:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T364299)', diff saved to https://phabricator.wikimedia.org/P63578 and previous config saved to /var/cache/conftool/dbconfig/20240529-170242-marostegui.json
  • 16:59 stevemunene@deploy1002: Finished deploy [airflow-dags/analytics@229b278]: (no justification provided) (duration: 00m 26s)
  • 16:59 stevemunene@deploy1002: Started deploy [airflow-dags/analytics@229b278]: (no justification provided)
  • 16:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T366123)', diff saved to https://phabricator.wikimedia.org/P63577 and previous config saved to /var/cache/conftool/dbconfig/20240529-165333-marostegui.json
  • 16:52 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:51 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T366123)', diff saved to https://phabricator.wikimedia.org/P63576 and previous config saved to /var/cache/conftool/dbconfig/20240529-165121-marostegui.json
  • 16:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 16:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 16:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T366123)', diff saved to https://phabricator.wikimedia.org/P63575 and previous config saved to /var/cache/conftool/dbconfig/20240529-165057-marostegui.json
  • 16:50 dancy@deploy1002: Started scap: Backport for Revert "Wrap tables with JS" (T330527)
  • 16:40 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2045.codfw.wmnet with OS bookworm
  • 16:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P63574 and previous config saved to /var/cache/conftool/dbconfig/20240529-163549-marostegui.json
  • 16:35 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
  • 16:34 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1045.eqiad.wmnet with OS bookworm
  • 16:32 sukhe: restart pybal on lvs1019
  • 16:29 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 16:28 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 16:27 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:22 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2045.codfw.wmnet with reason: host reimage
  • 16:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P63573 and previous config saved to /var/cache/conftool/dbconfig/20240529-162040-marostegui.json
  • 16:19 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2045.codfw.wmnet with reason: host reimage
  • 16:18 ChrisDobbins901_: sudo cumin -b1 -s60 'A:cp and A:drmrs' 'run-puppet-agent --enable "merging CR 1037089"'
  • 16:17 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1045.eqiad.wmnet with reason: host reimage
  • 16:15 arnaudb@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63572 and previous config saved to /var/cache/conftool/dbconfig/20240529-161522-arnaudb.json
  • 16:15 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:14 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1045.eqiad.wmnet with reason: host reimage
  • 16:09 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T366123)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240529-160528-marostegui.json
  • 16:04 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:04 ChrisDobbins901_: sudo cumin 'A:cp and A:drmrs' 'disable-puppet "merging CR 1037089"'
  • 16:01 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1045.eqiad.wmnet with OS bookworm
  • 16:01 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2045.codfw.wmnet with OS bookworm
  • 16:00 arnaudb@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63570 and previous config saved to /var/cache/conftool/dbconfig/20240529-160016-arnaudb.json
  • 16:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 16:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 15:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T366123)', diff saved to https://phabricator.wikimedia.org/P63569 and previous config saved to /var/cache/conftool/dbconfig/20240529-155954-marostegui.json
  • 15:59 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:56 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:55 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:55 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 15:55 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 15:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T364299)', diff saved to https://phabricator.wikimedia.org/P63568 and previous config saved to /var/cache/conftool/dbconfig/20240529-155349-marostegui.json
  • 15:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 15:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 15:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 15:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 15:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T364299)', diff saved to https://phabricator.wikimedia.org/P63567 and previous config saved to /var/cache/conftool/dbconfig/20240529-155321-marostegui.json
  • 15:52 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['cloudvirt1041']
  • 15:49 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbprov2003.codfw.wmnet with reason: upgrade to 10.6
  • 15:49 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on dbprov2003.codfw.wmnet with reason: upgrade to 10.6
  • 15:49 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbprov1003.eqiad.wmnet with reason: upgrade to 10.6
  • 15:49 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on dbprov1003.eqiad.wmnet with reason: upgrade to 10.6
  • 15:48 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:48 jynus@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2141.codfw.wmnet with reason: upgrade to 10.6
  • 15:48 jynus@cumin1002: START - Cookbook sre.hosts.downtime for 5:00:00 on db2141.codfw.wmnet with reason: upgrade to 10.6
  • 15:48 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:45 arnaudb@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63566 and previous config saved to /var/cache/conftool/dbconfig/20240529-154510-arnaudb.json
  • 15:45 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P63565 and previous config saved to /var/cache/conftool/dbconfig/20240529-154446-marostegui.json
  • 15:39 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1041']
  • 15:38 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P63564 and previous config saved to /var/cache/conftool/dbconfig/20240529-153813-marostegui.json
  • 15:32 dancy@deploy1002: Finished scap: Backport for Remove the php symlink (v2) (T359643) (duration: 13m 03s)
  • 15:31 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1041']
  • 15:31 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:30 arnaudb@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63563 and previous config saved to /var/cache/conftool/dbconfig/20240529-153001-arnaudb.json
  • 15:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P63562 and previous config saved to /var/cache/conftool/dbconfig/20240529-152937-marostegui.json
  • 15:29 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cloudvirt1041']
  • 15:27 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: correct IPs for apus - mvernon@cumin2002"
  • 15:25 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: correct IPs for apus - mvernon@cumin2002"
  • 15:25 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:25 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:23 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:23 dancy@deploy1002: dancy: Continuing with sync
  • 15:23 mvernon@cumin2002: START - Cookbook sre.dns.netbox
  • 15:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P63561 and previous config saved to /var/cache/conftool/dbconfig/20240529-152305-marostegui.json
  • 15:22 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1041']
  • 15:22 dancy@deploy1002: dancy: Backport for Remove the php symlink (v2) (T359643) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:21 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:20 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-web: sync
  • 15:19 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: sync
  • 15:19 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: sync
  • 15:19 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: sync
  • 15:19 dancy@deploy1002: Started scap: Backport for Remove the php symlink (v2) (T359643)
  • 15:18 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-wikifunctions: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-ext: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-web: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-parsoid: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-wikifunctions: sync
  • 15:18 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
  • 15:17 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1041']
  • 15:17 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db1163 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63560 and previous config saved to /var/cache/conftool/dbconfig/20240529-151455-arnaudb.json
  • 15:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T366123)', diff saved to https://phabricator.wikimedia.org/P63559 and previous config saved to /var/cache/conftool/dbconfig/20240529-151430-marostegui.json
  • 15:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1181 (T366123)', diff saved to https://phabricator.wikimedia.org/P63558 and previous config saved to /var/cache/conftool/dbconfig/20240529-151219-marostegui.json
  • 15:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 15:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63557 and previous config saved to /var/cache/conftool/dbconfig/20240529-151152-arnaudb.json
  • 15:11 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 15:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T366123)', diff saved to https://phabricator.wikimedia.org/P63556 and previous config saved to /var/cache/conftool/dbconfig/20240529-151145-marostegui.json
  • 15:09 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 15:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1163.eqiad.wmnet with OS bookworm
  • 15:08 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T364299)', diff saved to https://phabricator.wikimedia.org/P63555 and previous config saved to /var/cache/conftool/dbconfig/20240529-150757-marostegui.json
  • 15:07 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:07 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2046.codfw.wmnet with OS bookworm
  • 15:07 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:06 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:06 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cloudvirt1041']
  • 15:05 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 15:05 andrew@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirt1041']
  • 15:05 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cloudvirt1041']
  • 15:04 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 14:58 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1046.eqiad.wmnet with OS bookworm
  • 14:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63554 and previous config saved to /var/cache/conftool/dbconfig/20240529-145646-arnaudb.json
  • 14:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P63553 and previous config saved to /var/cache/conftool/dbconfig/20240529-145637-marostegui.json
  • 14:56 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: sync
  • 14:54 cdanis@deploy1002: helmfile [codfw] START helmfile.d/services/mw-api-int: sync
  • 14:54 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:53 andrew@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirt1041']
  • 14:52 klausman@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:52 klausman@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 14:50 klausman@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 14:50 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:49 klausman@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 14:49 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2046.codfw.wmnet with reason: host reimage
  • 14:47 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
  • 14:47 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:45 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:45 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:44 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2046.codfw.wmnet with reason: host reimage
  • 14:44 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:44 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Discovery IPs for apus service - mvernon@cumin2002"
  • 14:43 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:43 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
  • 14:43 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Discovery IPs for apus service - mvernon@cumin2002"
  • 14:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 14:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 14:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137 (T364069)', diff saved to https://phabricator.wikimedia.org/P63552 and previous config saved to /var/cache/conftool/dbconfig/20240529-144229-marostegui.json
  • 14:42 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1046.eqiad.wmnet with reason: host reimage
  • 14:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63551 and previous config saved to /var/cache/conftool/dbconfig/20240529-144140-arnaudb.json
  • 14:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P63550 and previous config saved to /var/cache/conftool/dbconfig/20240529-144129-marostegui.json
  • 14:41 mvernon@cumin2002: START - Cookbook sre.dns.netbox
  • 14:39 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1046.eqiad.wmnet with reason: host reimage
  • 14:37 fabfur: enabled puppet on A:cp as https://gerrit.wikimedia.org/r/c/operations/puppet/+/1036711 has been reverted (not applied anywhere but cp4037) (T365718)
  • 14:33 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 14:30 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1163.eqiad.wmnet with OS bookworm
  • 14:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1163.eqiad.wmnet with reason: reimage
  • 14:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1163.eqiad.wmnet with reason: reimage
  • 14:28 jforrester@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 14:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1163 T364290', diff saved to https://phabricator.wikimedia.org/P63549 and previous config saved to /var/cache/conftool/dbconfig/20240529-142830-arnaudb.json
  • 14:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db1196 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63548 and previous config saved to /var/cache/conftool/dbconfig/20240529-142750-arnaudb.json
  • 14:27 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 14:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137', diff saved to https://phabricator.wikimedia.org/P63547 and previous config saved to /var/cache/conftool/dbconfig/20240529-142721-marostegui.json
  • 14:27 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:26 jforrester@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 14:26 jforrester@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 14:26 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2046.codfw.wmnet with OS bookworm
  • 14:26 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:26 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1046.eqiad.wmnet with OS bookworm
  • 14:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63546 and previous config saved to /var/cache/conftool/dbconfig/20240529-142627-arnaudb.json
  • 14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T366123)', diff saved to https://phabricator.wikimedia.org/P63545 and previous config saved to /var/cache/conftool/dbconfig/20240529-142619-marostegui.json
  • 14:25 cdanis@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 14:24 brouberol@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:24 brouberol@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:24 jforrester@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 14:24 jforrester@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 14:22 brouberol@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:22 jforrester@deploy1002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:22 brouberol@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 14:21 brouberol@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 14:20 brouberol@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 14:19 brouberol@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 14:19 brouberol@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 14:18 jforrester@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 14:18 brouberol@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:17 brouberol@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:17 brouberol@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:17 jforrester@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 14:16 jforrester@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 14:16 brouberol@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 14:15 jforrester@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 14:15 jforrester@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 14:14 jforrester@deploy1002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db1196 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63544 and previous config saved to /var/cache/conftool/dbconfig/20240529-141244-arnaudb.json
  • 14:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137', diff saved to https://phabricator.wikimedia.org/P63543 and previous config saved to /var/cache/conftool/dbconfig/20240529-141213-marostegui.json
  • 14:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63542 and previous config saved to /var/cache/conftool/dbconfig/20240529-141114-arnaudb.json
  • 14:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1169.eqiad.wmnet with OS bookworm
  • 13:57 arnaudb@cumin1002: dbctl commit (dc=all): 'db1196 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63541 and previous config saved to /var/cache/conftool/dbconfig/20240529-135738-arnaudb.json
  • 13:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137 (T364069)', diff saved to https://phabricator.wikimedia.org/P63540 and previous config saved to /var/cache/conftool/dbconfig/20240529-135706-marostegui.json
  • 13:55 jiji@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-ctrl1002.eqiad.wmnet
  • 13:55 effie: label wikikube-ctrl1002 as master
  • 13:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T364299)', diff saved to https://phabricator.wikimedia.org/P63539 and previous config saved to /var/cache/conftool/dbconfig/20240529-135300-marostegui.json
  • 13:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 13:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T364299)', diff saved to https://phabricator.wikimedia.org/P63538 and previous config saved to /var/cache/conftool/dbconfig/20240529-135237-marostegui.json
  • 13:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
  • 13:43 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1169.eqiad.wmnet with reason: host reimage
  • 13:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db1196 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63537 and previous config saved to /var/cache/conftool/dbconfig/20240529-134232-arnaudb.json
  • 13:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P63536 and previous config saved to /var/cache/conftool/dbconfig/20240529-133729-marostegui.json
  • 13:36 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2047.codfw.wmnet with OS bookworm
  • 13:30 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1169.eqiad.wmnet with OS bookworm
  • 13:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1169.eqiad.wmnet with reason: reimage
  • 13:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1169.eqiad.wmnet with reason: reimage
  • 13:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1169 T364290', diff saved to https://phabricator.wikimedia.org/P63535 and previous config saved to /var/cache/conftool/dbconfig/20240529-132818-arnaudb.json
  • 13:27 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1047.eqiad.wmnet with OS bookworm
  • 13:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db1196 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63534 and previous config saved to /var/cache/conftool/dbconfig/20240529-132726-arnaudb.json
  • 13:26 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1196.eqiad.wmnet with OS bookworm
  • 13:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T366123)', diff saved to https://phabricator.wikimedia.org/P63533 and previous config saved to /var/cache/conftool/dbconfig/20240529-132553-marostegui.json
  • 13:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 13:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 13:24 otto@deploy1002: Finished scap: Backport for Create eventlogging-processor legacy converter to proxy to eventgate for mediawiki.org (T353817 T323828) (duration: 18m 25s)
  • 13:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P63532 and previous config saved to /var/cache/conftool/dbconfig/20240529-132221-marostegui.json
  • 13:16 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2047.codfw.wmnet with reason: host reimage
  • 13:16 fabfur: temporarily disabling puppet on A:cp to roll out https://gerrit.wikimedia.org/r/c/operations/puppet/+/1036711 (T365718)
  • 13:14 otto@deploy1002: otto: Continuing with sync
  • 13:13 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2047.codfw.wmnet with reason: host reimage
  • 13:11 moritzm: installing apache2 security updates
  • 13:10 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1047.eqiad.wmnet with reason: host reimage
  • 13:08 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1047.eqiad.wmnet with reason: host reimage
  • 13:08 otto@deploy1002: otto: Backport for Create eventlogging-processor legacy converter to proxy to eventgate for mediawiki.org (T353817 T323828) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T364299)', diff saved to https://phabricator.wikimedia.org/P63531 and previous config saved to /var/cache/conftool/dbconfig/20240529-130713-marostegui.json
  • 13:05 otto@deploy1002: Started scap: Backport for Create eventlogging-processor legacy converter to proxy to eventgate for mediawiki.org (T353817 T323828)
  • 13:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
  • 13:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: mariadb::core
  • 13:01 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
  • 12:55 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2047.codfw.wmnet with OS bookworm
  • 12:54 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1047.eqiad.wmnet with OS bookworm
  • 12:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: mariadb::core
  • 12:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 12:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 12:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T366123)', diff saved to https://phabricator.wikimedia.org/P63530 and previous config saved to /var/cache/conftool/dbconfig/20240529-125255-marostegui.json
  • 12:46 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1196.eqiad.wmnet with OS bookworm
  • 12:45 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2048.codfw.wmnet with OS bookworm
  • 12:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db[1154,1196].eqiad.wmnet with reason: reimage db1196
  • 12:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db[1154,1196].eqiad.wmnet with reason: reimage db1196
  • 12:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1196 T364290', diff saved to https://phabricator.wikimedia.org/P63529 and previous config saved to /var/cache/conftool/dbconfig/20240529-124352-arnaudb.json
  • 12:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1196.eqiad.wmnet with reason: reimage
  • 12:43 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1196.eqiad.wmnet with reason: reimage
  • 12:42 marostegui: recreate triggers on s7 codfw db maint db2187:3317 T366167
  • 12:42 marostegui: recreate triggers on s7 eqiad db maint db1155:3317 T366167
  • 12:40 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1048.eqiad.wmnet with OS bookworm
  • 12:39 elukey: move thanos-fe100[3,4] and thanos-fe2* to PKI TLS certs (envoy, backends for thanos-swift.discovery.wmnet) - T344324
  • 12:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P63528 and previous config saved to /var/cache/conftool/dbconfig/20240529-123746-marostegui.json
  • 12:37 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
  • 12:35 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
  • 12:34 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2002.codfw.wmnet with OS bookworm
  • 12:30 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
  • 12:29 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
  • 12:28 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2048.codfw.wmnet with reason: host reimage
  • 12:26 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
  • 12:25 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/media-analytics: apply
  • 12:24 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1048.eqiad.wmnet with reason: host reimage
  • 12:22 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2048.codfw.wmnet with reason: host reimage
  • 12:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P63527 and previous config saved to /var/cache/conftool/dbconfig/20240529-122239-marostegui.json
  • 12:21 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
  • 12:19 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
  • 12:19 slyngs: Failover idp.wikimedia.org for CAS upgrade to 6.6.15
  • 12:19 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1048.eqiad.wmnet with reason: host reimage
  • 12:18 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
  • 12:17 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-staging2002.codfw.wmnet with reason: host reimage
  • 12:17 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
  • 12:16 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
  • 12:15 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
  • 12:14 klausman@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-staging2002.codfw.wmnet with reason: host reimage
  • 12:12 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
  • 12:11 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
  • 12:10 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
  • 12:08 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
  • 12:07 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
  • 12:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T366123)', diff saved to https://phabricator.wikimedia.org/P63526 and previous config saved to /var/cache/conftool/dbconfig/20240529-120730-marostegui.json
  • 12:06 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/page-analytics: apply
  • 12:05 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1048.eqiad.wmnet with OS bookworm
  • 12:04 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2048.codfw.wmnet with OS bookworm
  • 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1235.eqiad.wmnet
  • 11:53 klausman@cumin2002: START - Cookbook sre.hosts.reimage for host ml-staging2002.codfw.wmnet with OS bookworm
  • 11:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T364299)', diff saved to https://phabricator.wikimedia.org/P63525 and previous config saved to /var/cache/conftool/dbconfig/20240529-115051-marostegui.json
  • 11:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 11:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 11:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T364299)', diff saved to https://phabricator.wikimedia.org/P63524 and previous config saved to /var/cache/conftool/dbconfig/20240529-115025-marostegui.json
  • 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Mabualruz out of all services on: 2198 hosts
  • 11:46 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Mabualruz out of all services on: 2198 hosts
  • 11:44 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 11:42 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 11:42 marostegui: recreate triggers on s7 eqiad db maint db1155:3317 T366167
  • 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T366123)', diff saved to https://phabricator.wikimedia.org/P63523 and previous config saved to /var/cache/conftool/dbconfig/20240529-114153-marostegui.json
  • 11:41 hnowlan: homer "cr*eqiad*" commit 'adding bgp state for wikikube-ctrl1002'
  • 11:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 11:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 11:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T366123)', diff saved to https://phabricator.wikimedia.org/P63522 and previous config saved to /var/cache/conftool/dbconfig/20240529-114129-marostegui.json
  • 11:40 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 11:38 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 11:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P63521 and previous config saved to /var/cache/conftool/dbconfig/20240529-113517-marostegui.json
  • 11:26 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 11:26 akosiaris@cumin1002: conftool action : set/pooled=yes; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=wikikube-ctrl1001.eqiad.wmnet
  • 11:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P63520 and previous config saved to /var/cache/conftool/dbconfig/20240529-112621-marostegui.json
  • 11:26 akosiaris@cumin1002: conftool action : set/pooled=yes; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1002.eqiad.wmnet
  • 11:25 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 11:23 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
  • 11:23 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
  • 11:23 akosiaris: T366094 re-undeploy otel-collector; having it around increased traffic to the API by >50%
  • 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P63519 and previous config saved to /var/cache/conftool/dbconfig/20240529-112009-marostegui.json
  • 11:19 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: sync
  • 11:16 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: sync
  • 11:15 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 11:15 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 11:12 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: sync
  • 11:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P63518 and previous config saved to /var/cache/conftool/dbconfig/20240529-111112-marostegui.json
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: sync
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: sync
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: sync
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: sync
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 11:10 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: sync
  • 11:06 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
  • 11:06 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
  • 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T364299)', diff saved to https://phabricator.wikimedia.org/P63517 and previous config saved to /var/cache/conftool/dbconfig/20240529-110501-marostegui.json
  • 11:04 akosiaris: redeploy opentelemetry collector T366094
  • 11:03 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
  • 11:03 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
  • 11:03 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
  • 11:03 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
  • 11:03 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1049.eqiad.wmnet with OS bookworm
  • 11:03 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: sync
  • 10:57 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:57 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:56 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: sync
  • 10:56 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-web: sync
  • 10:56 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: sync
  • 10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T366123)', diff saved to https://phabricator.wikimedia.org/P63516 and previous config saved to /var/cache/conftool/dbconfig/20240529-105604-marostegui.json
  • 10:56 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: sync
  • 10:56 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:55 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T366123)', diff saved to https://phabricator.wikimedia.org/P63515 and previous config saved to /var/cache/conftool/dbconfig/20240529-105454-marostegui.json
  • 10:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 10:52 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-web: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: sync
  • 10:51 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: sync
  • 10:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 10:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 10:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 10:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 10:46 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1049.eqiad.wmnet with reason: host reimage
  • 10:45 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:45 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:43 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: sync
  • 10:43 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: sync
  • 10:43 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1049.eqiad.wmnet with reason: host reimage
  • 10:43 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:43 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:43 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: sync
  • 10:43 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-debug: sync
  • 10:38 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:38 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:36 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:36 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:35 akosiaris@cumin1002: conftool action : set/pooled=inactive; selector: name=parse1002.eqiad.wmnet
  • 10:29 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1049.eqiad.wmnet with OS bookworm
  • 10:26 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:26 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:26 moritzm: installing intel-microcode security updates
  • 10:24 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:24 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:24 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:24 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:24 akosiaris@cumin1002: conftool action : set/pooled=inactive; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1002.eqiad.wmnet
  • 10:19 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:19 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:17 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubemaster1002.eqiad.wmnet with reason: disable puppet and k8s controlplane
  • 10:16 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubemaster1002.eqiad.wmnet with reason: disable puppet and k8s controlplane
  • 10:16 moritzm: installing python-idna security updates
  • 10:16 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
  • 10:15 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
  • 10:14 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
  • 10:12 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
  • 10:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1235.eqiad.wmnet
  • 10:10 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 10:09 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 10:07 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 10:07 moritzm: installing systemd security updates
  • 10:06 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 10:06 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:06 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:05 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:05 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:05 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 10:05 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 10:05 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1002.eqiad.wmnet with reason: disable puppet and k8s controlplane
  • 10:05 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1002.eqiad.wmnet with reason: disable puppet and k8s controlplane
  • 10:04 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1001.eqiad.wmnet with reason: disable puppet and k8s controlplane
  • 10:04 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1001.eqiad.wmnet with reason: disable puppet and k8s controlplane
  • 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1234.eqiad.wmnet
  • 10:02 sfaci@deploy1002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 10:01 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:01 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:00 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:00 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:00 sfaci@deploy1002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 09:59 sfaci@deploy1002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 09:57 sfaci@deploy1002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 09:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T364299)', diff saved to https://phabricator.wikimedia.org/P63514 and previous config saved to /var/cache/conftool/dbconfig/20240529-095437-marostegui.json
  • 09:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 09:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 09:51 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:50 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1234.eqiad.wmnet
  • 09:39 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 09:38 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 09:38 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 09:37 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: sync
  • 09:36 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 09:36 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: sync
  • 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1232.eqiad.wmnet
  • 09:29 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 09:27 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 09:16 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1232.eqiad.wmnet
  • 09:12 marostegui: Deploy schema change on s7 eqiad dbmaint T307501
  • 09:05 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 8674
  • 08:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 08:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 08:40 brouberol@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 08:39 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 08:35 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 8674
  • 08:33 brouberol@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: sync
  • 08:31 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 49666
  • 08:29 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 49666
  • 08:22 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: sync
  • 08:10 brouberol@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: sync
  • 08:00 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: sync
  • 07:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1228.eqiad.wmnet
  • 07:54 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:54 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1228.eqiad.wmnet
  • 07:47 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1219.eqiad.wmnet
  • 07:35 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1219.eqiad.wmnet
  • 07:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 07:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T364299)', diff saved to https://phabricator.wikimedia.org/P63513 and previous config saved to /var/cache/conftool/dbconfig/20240529-073017-marostegui.json
  • 07:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P63512 and previous config saved to /var/cache/conftool/dbconfig/20240529-071509-marostegui.json
  • 07:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P63511 and previous config saved to /var/cache/conftool/dbconfig/20240529-070001-marostegui.json
  • 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1218.eqiad.wmnet
  • 06:49 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1218.eqiad.wmnet
  • 06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T364299)', diff saved to https://phabricator.wikimedia.org/P63510 and previous config saved to /var/cache/conftool/dbconfig/20240529-064453-marostegui.json
  • 04:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T364299)', diff saved to https://phabricator.wikimedia.org/P63509 and previous config saved to /var/cache/conftool/dbconfig/20240529-042402-marostegui.json
  • 04:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 04:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 04:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T364299)', diff saved to https://phabricator.wikimedia.org/P63508 and previous config saved to /var/cache/conftool/dbconfig/20240529-042339-marostegui.json
  • 04:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P63507 and previous config saved to /var/cache/conftool/dbconfig/20240529-040831-marostegui.json
  • 04:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2137 (T364069)', diff saved to https://phabricator.wikimedia.org/P63506 and previous config saved to /var/cache/conftool/dbconfig/20240529-040259-marostegui.json
  • 04:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 04:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 04:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T364069)', diff saved to https://phabricator.wikimedia.org/P63505 and previous config saved to /var/cache/conftool/dbconfig/20240529-040236-marostegui.json
  • 03:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T352010)', diff saved to https://phabricator.wikimedia.org/P63504 and previous config saved to /var/cache/conftool/dbconfig/20240529-035538-ladsgroup.json
  • 03:55 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 03:55 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 03:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P63503 and previous config saved to /var/cache/conftool/dbconfig/20240529-035323-marostegui.json
  • 03:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P63502 and previous config saved to /var/cache/conftool/dbconfig/20240529-034728-marostegui.json
  • 03:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T364299)', diff saved to https://phabricator.wikimedia.org/P63501 and previous config saved to /var/cache/conftool/dbconfig/20240529-033814-marostegui.json
  • 03:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P63500 and previous config saved to /var/cache/conftool/dbconfig/20240529-033221-marostegui.json
  • 03:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T364069)', diff saved to https://phabricator.wikimedia.org/P63499 and previous config saved to /var/cache/conftool/dbconfig/20240529-031710-marostegui.json
  • 02:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T364299)', diff saved to https://phabricator.wikimedia.org/P63498 and previous config saved to /var/cache/conftool/dbconfig/20240529-023432-marostegui.json
  • 02:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 02:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 02:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T364299)', diff saved to https://phabricator.wikimedia.org/P63497 and previous config saved to /var/cache/conftool/dbconfig/20240529-023409-marostegui.json
  • 02:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P63496 and previous config saved to /var/cache/conftool/dbconfig/20240529-021901-marostegui.json
  • 02:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P63495 and previous config saved to /var/cache/conftool/dbconfig/20240529-020353-marostegui.json
  • 01:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T364299)', diff saved to https://phabricator.wikimedia.org/P63494 and previous config saved to /var/cache/conftool/dbconfig/20240529-014845-marostegui.json
  • 00:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T364299)', diff saved to https://phabricator.wikimedia.org/P63493 and previous config saved to /var/cache/conftool/dbconfig/20240529-004343-marostegui.json
  • 00:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 00:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 00:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T364299)', diff saved to https://phabricator.wikimedia.org/P63492 and previous config saved to /var/cache/conftool/dbconfig/20240529-004319-marostegui.json
  • 00:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P63491 and previous config saved to /var/cache/conftool/dbconfig/20240529-002811-marostegui.json
  • 00:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P63490 and previous config saved to /var/cache/conftool/dbconfig/20240529-001303-marostegui.json

2024-05-28

  • 23:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T364299)', diff saved to https://phabricator.wikimedia.org/P63489 and previous config saved to /var/cache/conftool/dbconfig/20240528-235755-marostegui.json
  • 22:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T364299)', diff saved to https://phabricator.wikimedia.org/P63488 and previous config saved to /var/cache/conftool/dbconfig/20240528-225541-marostegui.json
  • 22:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 22:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 22:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T364299)', diff saved to https://phabricator.wikimedia.org/P63487 and previous config saved to /var/cache/conftool/dbconfig/20240528-225516-marostegui.json
  • 22:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:50 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:48 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:48 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:45 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:45 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:43 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:43 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:41 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P63486 and previous config saved to /var/cache/conftool/dbconfig/20240528-224008-marostegui.json
  • 22:39 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:39 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:37 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:37 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:32 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:32 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:30 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:30 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P63485 and previous config saved to /var/cache/conftool/dbconfig/20240528-222500-marostegui.json
  • 22:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T364299)', diff saved to https://phabricator.wikimedia.org/P63484 and previous config saved to /var/cache/conftool/dbconfig/20240528-220950-marostegui.json
  • 21:10 ejegg: payments-wiki upgraded from 0bed1814 to 8ff002ef
  • 21:06 ejegg: donorwiki upgraded from fa7de70f to 8ff002ef
  • 21:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1181 (T364299)', diff saved to https://phabricator.wikimedia.org/P63483 and previous config saved to /var/cache/conftool/dbconfig/20240528-210533-marostegui.json
  • 21:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 21:05 ejegg: payments-wiki upgraded from 2bfd247a to 0bed1814
  • 21:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 21:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T364299)', diff saved to https://phabricator.wikimedia.org/P63482 and previous config saved to /var/cache/conftool/dbconfig/20240528-210510-marostegui.json
  • 21:01 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 10m 21s)
  • 20:51 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 11m 23s)
  • 20:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P63481 and previous config saved to /var/cache/conftool/dbconfig/20240528-205001-marostegui.json
  • 20:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P63479 and previous config saved to /var/cache/conftool/dbconfig/20240528-203453-marostegui.json
  • 20:34 cjming@deploy1002: Finished scap: Backport for cirrus: Move remaining public writes to SUP (T363475) (duration: 12m 11s)
  • 20:31 eileen: config revision changed from 6c4cd6c2 to 5b0b4d22 revert schedule
  • 20:25 cjming@deploy1002: cjming and ebernhardson: Continuing with sync
  • 20:25 cjming@deploy1002: cjming and ebernhardson: Backport for cirrus: Move remaining public writes to SUP (T363475) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:22 cjming@deploy1002: Started scap: Backport for cirrus: Move remaining public writes to SUP (T363475)
  • 20:21 eileen: revert to Smarty 2 revision changed from de15d068 to cdc89b59
  • 20:20 cjming@deploy1002: Finished scap: Backport for deploy(Popups): Make use of conditional user defaults (T364347) (duration: 15m 52s)
  • 20:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T364299)', diff saved to https://phabricator.wikimedia.org/P63478 and previous config saved to /var/cache/conftool/dbconfig/20240528-201945-marostegui.json
  • 20:11 cjming@deploy1002: mabualruz and cjming: Continuing with sync
  • 20:08 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 20:07 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 20:07 cjming@deploy1002: mabualruz and cjming: Backport for deploy(Popups): Make use of conditional user defaults (T364347) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:05 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 20:05 cjming@deploy1002: Started scap: Backport for deploy(Popups): Make use of conditional user defaults (T364347)
  • 19:45 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_eqiad
  • 19:45 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_eqiad
  • 19:36 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 19:30 herron: disable swap on grafana1002
  • 19:28 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:28 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:26 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:26 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:24 herron: ganeti1027:~$ sudo gnt-instance reboot grafana1002
  • 19:22 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:22 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:19 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 19:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:18 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:01 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host elastic1056.eqiad.wmnet
  • 19:00 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P63476 and previous config saved to /var/cache/conftool/dbconfig/20240528-190021-root.json
  • 18:54 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host elastic1056.eqiad.wmnet
  • 18:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:53 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on elastic1056.eqiad.wmnet with reason: rebooting after abnormally high load
  • 18:53 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on elastic1056.eqiad.wmnet with reason: rebooting after abnormally high load
  • 18:51 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:51 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:50 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 18:47 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:45 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.7 refs T361401
  • 18:45 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P63475 and previous config saved to /var/cache/conftool/dbconfig/20240528-184515-root.json
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T364299)', diff saved to https://phabricator.wikimedia.org/P63474 and previous config saved to /var/cache/conftool/dbconfig/20240528-184110-marostegui.json
  • 18:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 18:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 18:37 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:36 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:35 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:30 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P63473 and previous config saved to /var/cache/conftool/dbconfig/20240528-183009-root.json
  • 18:21 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:20 dancy@deploy1002: sync-world aborted: Backport for Remove the php symlink (T359643) (duration: 00m 30s)
  • 18:19 dancy@deploy1002: Started scap: Backport for Remove the php symlink (T359643)
  • 18:19 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:17 dancy@deploy1002: sync-world aborted: Backport for Remove the php symlink (T359643) (duration: 01m 00s)
  • 18:16 dancy@deploy1002: Started scap: Backport for Remove the php symlink (T359643)
  • 18:16 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 18:15 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P63472 and previous config saved to /var/cache/conftool/dbconfig/20240528-181503-root.json
  • 17:59 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P63471 and previous config saved to /var/cache/conftool/dbconfig/20240528-175954-root.json
  • 17:58 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 17:56 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 17:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63470 and previous config saved to /var/cache/conftool/dbconfig/20240528-175638-arnaudb.json
  • 17:44 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P63469 and previous config saved to /var/cache/conftool/dbconfig/20240528-174448-root.json
  • 17:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63468 and previous config saved to /var/cache/conftool/dbconfig/20240528-174131-arnaudb.json
  • 17:41 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 17:41 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 17:39 ladsgroup@deploy1002: Finished scap: Backport for Set zhwiki to read new for pagelinks migration (T351237) (duration: 11m 48s)
  • 17:30 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 17:30 ladsgroup@deploy1002: ladsgroup: Backport for Set zhwiki to read new for pagelinks migration (T351237) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:29 marostegui@cumin1002: dbctl commit (dc=all): 'db1211 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P63467 and previous config saved to /var/cache/conftool/dbconfig/20240528-172942-root.json
  • 17:27 ladsgroup@deploy1002: Started scap: Backport for Set zhwiki to read new for pagelinks migration (T351237)
  • 17:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63466 and previous config saved to /var/cache/conftool/dbconfig/20240528-172625-arnaudb.json
  • 17:25 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic1056.eqiad.wmnet for ban highly-loaded node - bking@cumin2002
  • 17:25 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic1056.eqiad.wmnet for ban highly-loaded node - bking@cumin2002
  • 17:25 dduvall: removing blubberoid from eqiad, `helmfile -e eqiad destroy` (T365742)
  • 17:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: elastic1056 for ban highly-loaded node - bking@cumin2002
  • 17:25 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic1056 for ban highly-loaded node - bking@cumin2002
  • 17:24 sukhe: sudo -i puppet cert clean blubberoid.discovery.wmnet: T365742
  • 17:24 dduvall: removing blubberoid from codfw, `helmfile -e codfw destroy` (T365742)
  • 17:21 dduvall: removing blubberoid from staging, `helmfile -e staging destroy` (T365742)
  • 17:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1211.eqiad.wmnet with OS bookworm
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63464 and previous config saved to /var/cache/conftool/dbconfig/20240528-171119-arnaudb.json
  • 17:09 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2014.codfw.wmnet
  • 17:09 sukhe@cumin1002: START - Cookbook sre.hosts.remove-downtime for lvs2014.codfw.wmnet
  • 17:09 sukhe@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1020.eqiad.wmnet
  • 17:09 sukhe@cumin1002: START - Cookbook sre.hosts.remove-downtime for lvs1020.eqiad.wmnet
  • 17:08 sukhe: sudo cumin 'A:lvs-secondary-eqiad or A:lvs-low-traffic-eqiad' 'ipvsadm --delete-service --tcp-service 10.2.2.31:4666': T365742
  • 17:03 sukhe: removing blubberoid's IP from ipvsadm: T365742
  • 16:59 sukhe@cumin1002: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-secondary-eqiad or A:lvs-secondary-codfw and A:lvs (T365742)
  • 16:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
  • 16:57 sukhe: sudo cumin 'A:lvs-low-traffic-eqiad or A:lvs-low-traffic-codfw' 'systemctl restart pybal.service'
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63463 and previous config saved to /var/cache/conftool/dbconfig/20240528-165612-arnaudb.json
  • 16:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
  • 16:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2136 (T364069)', diff saved to https://phabricator.wikimedia.org/P63462 and previous config saved to /var/cache/conftool/dbconfig/20240528-165002-marostegui.json
  • 16:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 16:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 16:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 16:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 16:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T364299)', diff saved to https://phabricator.wikimedia.org/P63461 and previous config saved to /var/cache/conftool/dbconfig/20240528-164902-marostegui.json
  • 16:47 sukhe@cumin1002: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-secondary-eqiad or A:lvs-secondary-codfw and A:lvs (T365742)
  • 16:43 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 16:41 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1211.eqiad.wmnet with OS bookworm
  • 16:41 sukhe: sudo cumin 'O:lvs::balancer' 'run-puppet-agent': T365742
  • 16:41 sukhe: cumin 'O:lvs::balancer' 'run-puppet-agent'
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db1206 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63459 and previous config saved to /var/cache/conftool/dbconfig/20240528-164106-arnaudb.json
  • 16:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1211', diff saved to https://phabricator.wikimedia.org/P63458 and previous config saved to /var/cache/conftool/dbconfig/20240528-163810-marostegui.json
  • 16:36 arnaudb@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63457 and previous config saved to /var/cache/conftool/dbconfig/20240528-163647-arnaudb.json
  • 16:34 sukhe: running run-puppet-agent on A:dnsbox
  • 16:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1206.eqiad.wmnet with OS bookworm
  • 16:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P63455 and previous config saved to /var/cache/conftool/dbconfig/20240528-163353-marostegui.json
  • 16:33 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 16:21 arnaudb@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63454 and previous config saved to /var/cache/conftool/dbconfig/20240528-162141-arnaudb.json
  • 16:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P63453 and previous config saved to /var/cache/conftool/dbconfig/20240528-161845-marostegui.json
  • 16:17 ladsgroup@deploy1002: Finished scap: Backport for x-wikimedia-debug: add datacenter options for k8s (T365478) (duration: 12m 00s)
  • 16:14 Lucas_WMDE: lucaswerkmeister-wmde@stat1011:~$ sudo -u analytics-wmde rm -rf /srv/analytics-wmde/wdcm/ # T364965; contained src/ as a clean git clone as of c2b0a324e9 / I024691a148, and nothing else
  • 16:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
  • 16:12 hnowlan: kubectl node uncordon wikikube-worker2002.codfw.wmnet
  • 16:11 hnowlan@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-worker2002.codfw.wmnet
  • 16:10 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
  • 16:10 dani@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 16:09 dani@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 16:09 dani@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 16:08 dani@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 16:08 dani@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 16:08 dani@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 16:08 ladsgroup@deploy1002: ladsgroup and jiji: Continuing with sync
  • 16:08 ladsgroup@deploy1002: ladsgroup and jiji: Backport for x-wikimedia-debug: add datacenter options for k8s (T365478) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:06 arnaudb@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63452 and previous config saved to /var/cache/conftool/dbconfig/20240528-160635-arnaudb.json
  • 16:05 ladsgroup@deploy1002: Started scap: Backport for x-wikimedia-debug: add datacenter options for k8s (T365478)
  • 16:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T364299)', diff saved to https://phabricator.wikimedia.org/P63451 and previous config saved to /var/cache/conftool/dbconfig/20240528-160337-marostegui.json
  • 16:01 ladsgroup@deploy1002: Finished scap: Backport for Create electionadmin group on testwiki (T209892) (duration: 17m 48s)
  • 16:00 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 16:00 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 15:55 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1206.eqiad.wmnet with OS bookworm
  • 15:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1206.eqiad.wmnet with reason: reimage
  • 15:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1206.eqiad.wmnet with reason: reimage
  • 15:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1206 T364290', diff saved to https://phabricator.wikimedia.org/P63449 and previous config saved to /var/cache/conftool/dbconfig/20240528-155309-arnaudb.json
  • 15:52 ejegg: fundraising civicrm upgraded from 4dd78bcc to 3fee95bc
  • 15:51 arnaudb@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63448 and previous config saved to /var/cache/conftool/dbconfig/20240528-155129-arnaudb.json
  • 15:50 hnowlan: ran `sudo puppet node deactivate kubernetes2032.codfw.wmnet` to fix renamed host erroring in scap
  • 15:48 ladsgroup@deploy1002: tstarling and ladsgroup: Continuing with sync
  • 15:48 ladsgroup@deploy1002: tstarling and ladsgroup: Backport for Create electionadmin group on testwiki (T209892) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:45 sukhe: sudo cumin -b1 -s120 'A:dnsbox and not P{dns6001*}' 'run-puppet-agent --enable "merging CR 1034476"'
  • 15:45 brennen@deploy1002: Finished deploy [phabricator/deployment@e7093e2]: deploy phab1004 for T366075 (duration: 00m 32s)
  • 15:44 brennen@deploy1002: Started deploy [phabricator/deployment@e7093e2]: deploy phab1004 for T366075
  • 15:44 brennen@deploy1002: Finished deploy [phabricator/deployment@e7093e2]: deploy phab2002 for T366075 (duration: 00m 33s)
  • 15:44 ladsgroup@deploy1002: Started scap: Backport for Create electionadmin group on testwiki (T209892)
  • 15:43 brennen@deploy1002: Started deploy [phabricator/deployment@e7093e2]: deploy phab2002 for T366075
  • 15:41 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:30:00 on phabricator.wikimedia.org with reason: phabricator deploy
  • 15:41 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phabricator.wikimedia.org with reason: phabricator deploy
  • 15:40 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab.wmfusercontent.org with reason: phabricator deploy
  • 15:40 jiji@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: Kubernetes masters trouble - no deployments - serviceops (duration: 114m 39s)
  • 15:40 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab.wmfusercontent.org with reason: phabricator deploy
  • 15:39 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:30:00 on phabricator.wikimedia.org with reason: phabricator deploy
  • 15:39 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phabricator.wikimedia.org with reason: phabricator deploy
  • 15:38 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: phabricator deploy
  • 15:38 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab2002.codfw.wmnet with reason: phabricator deploy
  • 15:38 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: phabricator deploy
  • 15:38 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on phab1004.eqiad.wmnet with reason: phabricator deploy
  • 15:36 arnaudb@cumin1002: dbctl commit (dc=all): 'db1207 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63447 and previous config saved to /var/cache/conftool/dbconfig/20240528-153622-arnaudb.json
  • 15:35 sukhe: sudo cumin 'A:dnsbox' 'disable-puppet "merging CR 1034476"'
  • 15:32 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2002.codfw.wmnet with OS bullseye
  • 15:31 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 15:30 ejegg: fundraising civicrm upgraded from 7e998894 to 4dd78bcc
  • 15:29 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 15:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1206.eqiad.wmnet
  • 15:18 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1206.eqiad.wmnet
  • 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1186.eqiad.wmnet
  • 15:13 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 15:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2002.codfw.wmnet with reason: host reimage
  • 15:12 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 15:09 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2002.codfw.wmnet with reason: host reimage
  • 15:07 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1207.eqiad.wmnet with OS bookworm
  • 15:06 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 15:05 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1186.eqiad.wmnet
  • 15:05 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 14:56 akosiaris: migrate kubemaster1002 to ganeti1037
  • 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1184.eqiad.wmnet
  • 14:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host <spicerack.netbox.NetboxServer object at 0x7f5406426910>
  • 14:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2002
  • 14:49 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2002
  • 14:49 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2002.codfw.wmnet 223.16.192.10.in-addr.arpa 3.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:49 hnowlan@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2002.codfw.wmnet 223.16.192.10.in-addr.arpa 3.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 14:49 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:49 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2002 - hnowlan@cumin1002"
  • 14:48 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2002 - hnowlan@cumin1002"
  • 14:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1207.eqiad.wmnet with reason: host reimage
  • 14:44 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 14:44 akosiaris: gnt-instance replace-disks for kubemaster1002, set ganeti1037 as a secondary
  • 14:43 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1207.eqiad.wmnet with reason: host reimage
  • 14:37 akosiaris: reboot kubemaster1001 with 8 vcpus for consistency with kubemaster1002.
  • 14:37 akosiaris: repool kubemaster1001 with 8 vcpus for consistency with kubemaster1002.
  • 14:31 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 14:30 akosiaris: repool kubemaster1001, testing something
  • 14:29 akosiaris@cumin1002: conftool action : set/pooled=yes; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1001.eqiad.wmnet
  • 14:29 akosiaris: depool kubemaster1001, its CPU is saturated after a test roll restart
  • 14:29 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1207.eqiad.wmnet with OS bookworm
  • 14:28 arnaudb@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host db1207.eqiad.wmnet with OS bookworm
  • 14:28 akosiaris@cumin1002: conftool action : set/pooled=no; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1001.eqiad.wmnet
  • 14:27 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 14:26 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1184.eqiad.wmnet
  • 14:25 effie: enabling puppet on wikikube-ctrl100[1-2]*
  • 14:24 ejegg: fundraising civicrm upgraded from e2dc8f4e to 7e998894
  • 14:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db1218 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63444 and previous config saved to /var/cache/conftool/dbconfig/20240528-142431-arnaudb.json
  • 14:21 akosiaris@cumin1002: conftool action : set/weight=10; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1002.eqiad.wmnet
  • 14:19 akosiaris: add another 4 vcpus to kubemaster1002
  • 14:11 akosiaris: restart kube-apiserver on kubemaster1002
  • 14:09 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
  • 14:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db1218 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63442 and previous config saved to /var/cache/conftool/dbconfig/20240528-140925-arnaudb.json
  • 14:08 akosiaris@cumin1002: conftool action : set/weight=1; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1002.eqiad.wmnet
  • 14:07 akosiaris@cumin1002: conftool action : set/weight=5; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=kubemaster1002.eqiad.wmnet
  • 14:04 akosiaris: roll restart mw-api-int pods
  • 14:03 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
  • 14:03 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 14:03 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 14:01 akosiaris: remove wikikube-ctrl1002 from the rotation to test a theory
  • 14:01 akosiaris@cumin1002: conftool action : set/pooled=no; selector: service=kubemaster,dc=eqiad,cluster=kubernetes,name=wikikube-ctrl1001.eqiad.wmnet
  • 13:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T364299)', diff saved to https://phabricator.wikimedia.org/P63440 and previous config saved to /var/cache/conftool/dbconfig/20240528-135912-marostegui.json
  • 13:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T364299)', diff saved to https://phabricator.wikimedia.org/P63439 and previous config saved to /var/cache/conftool/dbconfig/20240528-135848-marostegui.json
  • 13:57 ejegg: fundraising civicrm upgraded from 6c1fdd4f to e2dc8f4e
  • 13:55 moritzm: installing pillow security updates
  • 13:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db1218 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63438 and previous config saved to /var/cache/conftool/dbconfig/20240528-135419-arnaudb.json
  • 13:54 akosiaris: add manually ferm client rule on wikikube-ctrl1002 and disable puppet
  • 13:51 akosiaris: run puppet and restart ferm on wikikube-ctrl1001
  • 13:51 akosiaris: run puppet and restart ferm
  • 13:46 jiji@deploy1002: Locking from deployment [ALL REPOSITORIES]: Kubernetes masters trouble - no deployments - serviceops
  • 13:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P63437 and previous config saved to /var/cache/conftool/dbconfig/20240528-134341-marostegui.json
  • 13:43 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Create electionadmin group on testwiki (T209892) (duration: 34m 29s)
  • 13:42 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1207.eqiad.wmnet with OS bookworm
  • 13:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1207.eqiad.wmnet with reason: reimage
  • 13:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1207.eqiad.wmnet with reason: reimage
  • 13:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1207 T364290', diff saved to https://phabricator.wikimedia.org/P63436 and previous config saved to /var/cache/conftool/dbconfig/20240528-134150-arnaudb.json
  • 13:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db1218 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63435 and previous config saved to /var/cache/conftool/dbconfig/20240528-133913-arnaudb.json
  • 13:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1169.eqiad.wmnet
  • 13:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P63434 and previous config saved to /var/cache/conftool/dbconfig/20240528-132833-marostegui.json
  • 13:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db1218 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63433 and previous config saved to /var/cache/conftool/dbconfig/20240528-132407-arnaudb.json
  • 13:20 sukhe: sudo cumin -b1 -s120 'A:dnsbox and not P{dns6001*}' 'run-puppet-agent --enable "merging CR 1036644"'
  • 13:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T364299)', diff saved to https://phabricator.wikimedia.org/P63432 and previous config saved to /var/cache/conftool/dbconfig/20240528-131325-marostegui.json
  • 13:12 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and tstarling: Continuing with sync
  • 13:11 moritzm: installing bzip2 bugfix updates
  • 13:11 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and tstarling: Backport for Create electionadmin group on testwiki (T209892) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:09 sukhe: sudo cumin 'A:dnsbox' 'disable-puppet "merging CR 1036644"'
  • 13:09 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Create electionadmin group on testwiki (T209892)
  • 13:06 moritzm: installing man-db bugfix updates
  • 13:04 hnowlan@cumin1002: START - Cookbook sre.hosts.move-vlan for host <spicerack.netbox.NetboxServer object at 0x7f5406426910>
  • 13:04 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2002.codfw.wmnet with OS bullseye
  • 13:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1218.eqiad.wmnet with OS bookworm
  • 12:58 vgutierrez: testing fifo-log-demux 0.7.5 on cp3081 and cp3073
  • 12:52 moritzm: installing python-urllib3 security updates
  • 12:51 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production
  • 12:47 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production
  • 12:47 tstarling@deploy1002: Synchronized wmf-config/core-Permissions.php: create electionadmin group on testwiki T209892 (attempt 2 after k8s-related rollback) (duration: 16m 02s)
  • 12:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
  • 12:40 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=thanos-fe1002.eqiad.wmnet
  • 12:38 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
  • 12:38 elukey: move thanos-fe1002's envoy TLS cert to CFSSL/PKI - T344324
  • 12:37 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=thanos-fe1002.eqiad.wmnet
  • 12:30 tstarling@deploy1002: Synchronized wmf-config/core-Permissions.php: create electionadmin group on testwiki T209892 (duration: 31m 52s)
  • 12:27 moritzm: installing jetty9 security updates
  • 12:25 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1218.eqiad.wmnet with OS bookworm
  • 12:25 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1218.eqiad.wmnet with reason: reimage
  • 12:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1218.eqiad.wmnet with reason: reimage
  • 12:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1218 T364290', diff saved to https://phabricator.wikimedia.org/P63431 and previous config saved to /var/cache/conftool/dbconfig/20240528-122442-arnaudb.json
  • 12:18 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2002.codfw.wmnet with OS bullseye
  • 12:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2002.codfw.wmnet with OS bullseye
  • 12:13 jiji@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-ctrl1001.eqiad.wmnet
  • 12:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2032 to wikikube-worker2002
  • 12:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2002
  • 12:11 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2002
  • 12:11 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:11 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2032 to wikikube-worker2002 - hnowlan@cumin1002"
  • 12:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T364299)', diff saved to https://phabricator.wikimedia.org/P63429 and previous config saved to /var/cache/conftool/dbconfig/20240528-121037-marostegui.json
  • 12:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 12:10 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2032 to wikikube-worker2002 - hnowlan@cumin1002"
  • 12:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 12:09 moritzm: installing glib2.0 security updates
  • 12:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 12:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 12:07 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 12:07 hnowlan@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2032 to wikikube-worker2002
  • 12:07 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1169.eqiad.wmnet
  • 12:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P63428 and previous config saved to /var/cache/conftool/dbconfig/20240528-120503-root.json
  • 11:51 hnowlan@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-worker2001.codfw.wmnet
  • 11:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P63426 and previous config saved to /var/cache/conftool/dbconfig/20240528-114957-root.json
  • 11:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P63425 and previous config saved to /var/cache/conftool/dbconfig/20240528-113451-root.json
  • 11:32 Dreamy_Jazz: Restarted MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 11:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1163.eqiad.wmnet
  • 11:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P63424 and previous config saved to /var/cache/conftool/dbconfig/20240528-111946-root.json
  • 11:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63423 and previous config saved to /var/cache/conftool/dbconfig/20240528-110817-arnaudb.json
  • 11:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P63422 and previous config saved to /var/cache/conftool/dbconfig/20240528-110440-root.json
  • 11:00 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1163.eqiad.wmnet
  • 10:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63421 and previous config saved to /var/cache/conftool/dbconfig/20240528-105311-arnaudb.json
  • 10:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P63420 and previous config saved to /var/cache/conftool/dbconfig/20240528-104934-root.json
  • 10:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63419 and previous config saved to /var/cache/conftool/dbconfig/20240528-103805-arnaudb.json
  • 10:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1243 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P63418 and previous config saved to /var/cache/conftool/dbconfig/20240528-103428-root.json
  • 10:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 10:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 10:23 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2216.codfw.wmnet
  • 10:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63417 and previous config saved to /var/cache/conftool/dbconfig/20240528-102259-arnaudb.json
  • 10:21 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2049.codfw.wmnet with OS bookworm
  • 10:18 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 10:08 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 10:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db1219 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63416 and previous config saved to /var/cache/conftool/dbconfig/20240528-100752-arnaudb.json
  • 10:05 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2216.codfw.wmnet
  • 10:02 moritzm: installing jinja2 security updates
  • 09:57 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2049.codfw.wmnet with reason: host reimage
  • 09:54 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2049.codfw.wmnet with reason: host reimage
  • 09:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db1228 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63415 and previous config saved to /var/cache/conftool/dbconfig/20240528-095058-arnaudb.json
  • 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2212.codfw.wmnet
  • 09:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1219.eqiad.wmnet with OS bookworm
  • 09:45 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:45 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:43 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:43 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 09:39 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1243.eqiad.wmnet with reason: unknown lag
  • 09:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1243.eqiad.wmnet with reason: unknown lag
  • 09:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2212.codfw.wmnet
  • 09:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db1228 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63414 and previous config saved to /var/cache/conftool/dbconfig/20240528-093552-arnaudb.json
  • 09:35 zabe@deploy1002: Finished scap: Backport for Stop writing to af_user(_text)/afh_user(_text) everywhere (T337920), Update interwiki cache (duration: 17m 49s)
  • 09:35 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS bookworm
  • 09:34 jiji@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts mc2049.codfw.wmnet
  • 09:33 jiji@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mc2049.codfw.wmnet
  • 09:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T364299)', diff saved to https://phabricator.wikimedia.org/P63413 and previous config saved to /var/cache/conftool/dbconfig/20240528-093344-marostegui.json
  • 09:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
  • 09:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
  • 09:21 zabe@deploy1002: zabe: Continuing with sync
  • 09:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db1228 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63412 and previous config saved to /var/cache/conftool/dbconfig/20240528-092046-arnaudb.json
  • 09:20 zabe@deploy1002: zabe: Backport for Stop writing to af_user(_text)/afh_user(_text) everywhere (T337920), Update interwiki cache synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P63411 and previous config saved to /var/cache/conftool/dbconfig/20240528-091836-marostegui.json
  • 09:17 zabe@deploy1002: Started scap: Backport for Stop writing to af_user(_text)/afh_user(_text) everywhere (T337920), Update interwiki cache
  • 09:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for gerrit1003.wikimedia.org
  • 09:14 jelto@cumin1002: START - Cookbook sre.hosts.remove-downtime for gerrit1003.wikimedia.org
  • 09:14 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for gerrit2002.wikimedia.org
  • 09:14 jelto@cumin1002: START - Cookbook sre.hosts.remove-downtime for gerrit2002.wikimedia.org
  • 09:13 zabe: zabe@mwmaint1002:~$ mwscript extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php --wiki=dtpwiki --cluster=all 2>&1 | tee /tmp/dtpwiki.UpdateSearchIndexConfig.log # T365220
  • 09:09 zabe@deploy1002: Finished scap: T365220 (duration: 19m 22s)
  • 09:08 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1219.eqiad.wmnet with OS bookworm
  • 09:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1219 T364290', diff saved to https://phabricator.wikimedia.org/P63410 and previous config saved to /var/cache/conftool/dbconfig/20240528-090724-arnaudb.json
  • 09:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db1228 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63409 and previous config saved to /var/cache/conftool/dbconfig/20240528-090538-arnaudb.json
  • 09:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P63408 and previous config saved to /var/cache/conftool/dbconfig/20240528-090328-marostegui.json
  • 09:03 jiji@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host mc2049.codfw.wmnet
  • 08:58 jiji@cumin2002: START - Cookbook sre.hosts.reboot-single for host mc2049.codfw.wmnet
  • 08:55 zabe@deploy1002: zabe: Continuing with sync
  • 08:54 zabe@deploy1002: zabe: T365220 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:53 jiji@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts mc2049.codfw.wmnet
  • 08:52 jiji@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host mc2049.codfw.wmnet
  • 08:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1219.eqiad.wmnet with reason: reimage
  • 08:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1219.eqiad.wmnet with reason: reimage
  • 08:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db1228 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63407 and previous config saved to /var/cache/conftool/dbconfig/20240528-085032-arnaudb.json
  • 08:50 zabe@deploy1002: Started scap: T365220
  • 08:49 hashar: Upgraded gerrit.wikimedia.org from Gerrit 3.8.5 to 3.8.6 # T365328
  • 08:48 zabe: create Wikipedia Central Dusun # T365220
  • 08:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T364299)', diff saved to https://phabricator.wikimedia.org/P63406 and previous config saved to /var/cache/conftool/dbconfig/20240528-084820-marostegui.json
  • 08:47 hashar@deploy1002: Finished deploy [gerrit/gerrit@c93e47d]: Gerrit to v3.8.6 on gerrit1003 - T365328 (duration: 00m 05s)
  • 08:47 hashar@deploy1002: Started deploy [gerrit/gerrit@c93e47d]: Gerrit to v3.8.6 on gerrit1003 - T365328
  • 08:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1228.eqiad.wmnet with OS bookworm
  • 08:45 hashar: Upgraded gerrit-replica.wikimedia.org from Gerrit 3.8.5 to 3.8.6 # T365328
  • 08:37 jiji@cumin2002: START - Cookbook sre.hosts.reboot-single for host mc2049.codfw.wmnet
  • 08:33 hashar@deploy1002: Finished deploy [gerrit/gerrit@c93e47d]: Gerrit to v3.8.6 on gerrit2002 - T365328 (duration: 00m 08s)
  • 08:33 hashar@deploy1002: Started deploy [gerrit/gerrit@c93e47d]: Gerrit to v3.8.6 on gerrit2002 - T365328
  • 08:27 moritzm: imported jenkins to 2.452.1 in component thirdparty/ci T366008
  • 08:25 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1228.eqiad.wmnet with reason: host reimage
  • 08:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1003.wikimedia.org with reason: Gerrit patchset upgrade
  • 08:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit1003.wikimedia.org with reason: Gerrit patchset upgrade
  • 08:23 jiji@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mc2049.codfw.wmnet
  • 08:23 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit2002.wikimedia.org with reason: Gerrit patchset upgrade
  • 08:23 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit2002.wikimedia.org with reason: Gerrit patchset upgrade
  • 08:22 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1228.eqiad.wmnet with reason: host reimage
  • 08:22 jiji@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc2049.codfw.wmnet with OS bookworm
  • 08:12 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2049.codfw.wmnet with reason: host reimage
  • 08:10 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1228.eqiad.wmnet with OS bookworm
  • 08:09 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 08:09 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2049.codfw.wmnet with reason: host reimage
  • 08:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on db1228.eqiad.wmnet with reason: reimage
  • 08:08 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3:00:00 on db1228.eqiad.wmnet with reason: reimage
  • 08:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1228 T364290', diff saved to https://phabricator.wikimedia.org/P63404 and previous config saved to /var/cache/conftool/dbconfig/20240528-080835-arnaudb.json
  • 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2203.codfw.wmnet
  • 07:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1243.eqiad.wmnet with OS bookworm
  • 07:51 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS bookworm
  • 07:51 jiji@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc2049.codfw.wmnet with OS bookworm
  • 07:49 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2001.codfw.wmnet with reason: host reimage
  • 07:46 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2001.codfw.wmnet with reason: host reimage
  • 07:42 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2203.codfw.wmnet
  • 07:41 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 07:38 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 07:35 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS bookworm
  • 07:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1243.eqiad.wmnet with reason: host reimage
  • 07:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1243.eqiad.wmnet with reason: host reimage
  • 07:30 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 07:23 kartik@deploy1002: Finished scap: Backport for Section Translation: Enable in newly created Wikipedias (T366003) (duration: 19m 51s)
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2209 (T364299)', diff saved to https://phabricator.wikimedia.org/P63403 and previous config saved to /var/cache/conftool/dbconfig/20240528-072006-marostegui.json
  • 07:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 07:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 07:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T364299)', diff saved to https://phabricator.wikimedia.org/P63402 and previous config saved to /var/cache/conftool/dbconfig/20240528-071942-marostegui.json
  • 07:17 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1243.eqiad.wmnet with OS bookworm
  • 07:07 kartik@deploy1002: kartik: Continuing with sync
  • 07:06 kartik@deploy1002: kartik: Backport for Section Translation: Enable in newly created Wikipedias (T366003) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P63401 and previous config saved to /var/cache/conftool/dbconfig/20240528-070434-marostegui.json
  • 07:03 kartik@deploy1002: Started scap: Backport for Section Translation: Enable in newly created Wikipedias (T366003)
  • 06:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P63400 and previous config saved to /var/cache/conftool/dbconfig/20240528-064926-marostegui.json
  • 06:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T364299)', diff saved to https://phabricator.wikimedia.org/P63399 and previous config saved to /var/cache/conftool/dbconfig/20240528-063417-marostegui.json
  • 05:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 05:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 05:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T364069)', diff saved to https://phabricator.wikimedia.org/P63398 and previous config saved to /var/cache/conftool/dbconfig/20240528-052952-marostegui.json
  • 05:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P63397 and previous config saved to /var/cache/conftool/dbconfig/20240528-051444-marostegui.json
  • 05:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2194 (T364299)', diff saved to https://phabricator.wikimedia.org/P63396 and previous config saved to /var/cache/conftool/dbconfig/20240528-050527-marostegui.json
  • 05:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 05:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 05:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T364299)', diff saved to https://phabricator.wikimedia.org/P63395 and previous config saved to /var/cache/conftool/dbconfig/20240528-050504-marostegui.json
  • 04:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P63394 and previous config saved to /var/cache/conftool/dbconfig/20240528-045936-marostegui.json
  • 04:50 dani@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 04:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P63393 and previous config saved to /var/cache/conftool/dbconfig/20240528-044955-marostegui.json
  • 04:49 dani@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 04:49 dani@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 04:48 dani@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 04:48 dani@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 04:48 dani@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 04:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T364069)', diff saved to https://phabricator.wikimedia.org/P63392 and previous config saved to /var/cache/conftool/dbconfig/20240528-044428-marostegui.json
  • 04:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P63391 and previous config saved to /var/cache/conftool/dbconfig/20240528-043446-marostegui.json
  • 04:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T364299)', diff saved to https://phabricator.wikimedia.org/P63390 and previous config saved to /var/cache/conftool/dbconfig/20240528-041937-marostegui.json
  • 04:03 mwpresync@deploy1002: Pruned MediaWiki: 1.43.0-wmf.4 (duration: 03m 44s)
  • 04:02 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.43.0-wmf.7 refs T361401 (duration: 59m 56s)
  • 03:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2213 (T364069)', diff saved to https://phabricator.wikimedia.org/P63389 and previous config saved to /var/cache/conftool/dbconfig/20240528-035915-marostegui.json
  • 03:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
  • 03:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
  • 03:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T364069)', diff saved to https://phabricator.wikimedia.org/P63388 and previous config saved to /var/cache/conftool/dbconfig/20240528-035852-marostegui.json
  • 03:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P63387 and previous config saved to /var/cache/conftool/dbconfig/20240528-034344-marostegui.json
  • 03:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P63386 and previous config saved to /var/cache/conftool/dbconfig/20240528-032835-marostegui.json
  • 03:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T364069)', diff saved to https://phabricator.wikimedia.org/P63385 and previous config saved to /var/cache/conftool/dbconfig/20240528-031327-marostegui.json
  • 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.43.0-wmf.7 refs T361401
  • 02:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T364299)', diff saved to https://phabricator.wikimedia.org/P63384 and previous config saved to /var/cache/conftool/dbconfig/20240528-025234-marostegui.json
  • 02:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 02:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 02:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T364299)', diff saved to https://phabricator.wikimedia.org/P63383 and previous config saved to /var/cache/conftool/dbconfig/20240528-025211-marostegui.json
  • 02:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P63382 and previous config saved to /var/cache/conftool/dbconfig/20240528-023703-marostegui.json
  • 02:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2211 (T364069)', diff saved to https://phabricator.wikimedia.org/P63381 and previous config saved to /var/cache/conftool/dbconfig/20240528-022627-marostegui.json
  • 02:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 02:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 02:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P63380 and previous config saved to /var/cache/conftool/dbconfig/20240528-022155-marostegui.json
  • 02:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T364299)', diff saved to https://phabricator.wikimedia.org/P63379 and previous config saved to /var/cache/conftool/dbconfig/20240528-020647-marostegui.json
  • 01:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 01:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 01:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T364069)', diff saved to https://phabricator.wikimedia.org/P63378 and previous config saved to /var/cache/conftool/dbconfig/20240528-013123-marostegui.json
  • 01:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P63377 and previous config saved to /var/cache/conftool/dbconfig/20240528-011615-marostegui.json
  • 01:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P63376 and previous config saved to /var/cache/conftool/dbconfig/20240528-010107-marostegui.json
  • 00:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T364069)', diff saved to https://phabricator.wikimedia.org/P63375 and previous config saved to /var/cache/conftool/dbconfig/20240528-004559-marostegui.json
  • 00:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T364299)', diff saved to https://phabricator.wikimedia.org/P63374 and previous config saved to /var/cache/conftool/dbconfig/20240528-003255-marostegui.json
  • 00:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 00:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 00:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T364299)', diff saved to https://phabricator.wikimedia.org/P63373 and previous config saved to /var/cache/conftool/dbconfig/20240528-003230-marostegui.json
  • 00:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P63372 and previous config saved to /var/cache/conftool/dbconfig/20240528-001721-marostegui.json
  • 00:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2192 (T364069)', diff saved to https://phabricator.wikimedia.org/P63371 and previous config saved to /var/cache/conftool/dbconfig/20240528-000602-marostegui.json
  • 00:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 00:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 00:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T364069)', diff saved to https://phabricator.wikimedia.org/P63370 and previous config saved to /var/cache/conftool/dbconfig/20240528-000549-marostegui.json
  • 00:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P63369 and previous config saved to /var/cache/conftool/dbconfig/20240528-000213-marostegui.json

2024-05-27

  • 23:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P63368 and previous config saved to /var/cache/conftool/dbconfig/20240527-235041-marostegui.json
  • 23:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T364299)', diff saved to https://phabricator.wikimedia.org/P63367 and previous config saved to /var/cache/conftool/dbconfig/20240527-234705-marostegui.json
  • 23:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P63366 and previous config saved to /var/cache/conftool/dbconfig/20240527-233533-marostegui.json
  • 23:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T364069)', diff saved to https://phabricator.wikimedia.org/P63365 and previous config saved to /var/cache/conftool/dbconfig/20240527-232025-marostegui.json
  • 22:28 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T364069)', diff saved to https://phabricator.wikimedia.org/P63364 and previous config saved to /var/cache/conftool/dbconfig/20240527-222759-marostegui.json
  • 22:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 22:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 22:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T364069)', diff saved to https://phabricator.wikimedia.org/P63363 and previous config saved to /var/cache/conftool/dbconfig/20240527-222735-marostegui.json
  • 22:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T364299)', diff saved to https://phabricator.wikimedia.org/P63362 and previous config saved to /var/cache/conftool/dbconfig/20240527-221330-marostegui.json
  • 22:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 22:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 22:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 22:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 22:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T364299)', diff saved to https://phabricator.wikimedia.org/P63361 and previous config saved to /var/cache/conftool/dbconfig/20240527-221302-marostegui.json
  • 22:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P63360 and previous config saved to /var/cache/conftool/dbconfig/20240527-221227-marostegui.json
  • 21:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P63359 and previous config saved to /var/cache/conftool/dbconfig/20240527-215754-marostegui.json
  • 21:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P63358 and previous config saved to /var/cache/conftool/dbconfig/20240527-215719-marostegui.json
  • 21:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P63357 and previous config saved to /var/cache/conftool/dbconfig/20240527-214246-marostegui.json
  • 21:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T364069)', diff saved to https://phabricator.wikimedia.org/P63356 and previous config saved to /var/cache/conftool/dbconfig/20240527-214210-marostegui.json
  • 21:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T364299)', diff saved to https://phabricator.wikimedia.org/P63355 and previous config saved to /var/cache/conftool/dbconfig/20240527-212738-marostegui.json
  • 20:46 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2171 (T364069)', diff saved to https://phabricator.wikimedia.org/P63353 and previous config saved to /var/cache/conftool/dbconfig/20240527-204653-marostegui.json
  • 20:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 20:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 20:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T364069)', diff saved to https://phabricator.wikimedia.org/P63352 and previous config saved to /var/cache/conftool/dbconfig/20240527-204630-marostegui.json
  • 20:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P63351 and previous config saved to /var/cache/conftool/dbconfig/20240527-203922-ladsgroup.json
  • 20:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P63350 and previous config saved to /var/cache/conftool/dbconfig/20240527-203122-marostegui.json
  • 20:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P63349 and previous config saved to /var/cache/conftool/dbconfig/20240527-202416-ladsgroup.json
  • 20:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P63348 and previous config saved to /var/cache/conftool/dbconfig/20240527-201614-marostegui.json
  • 20:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P63347 and previous config saved to /var/cache/conftool/dbconfig/20240527-200910-ladsgroup.json
  • 20:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T364069)', diff saved to https://phabricator.wikimedia.org/P63346 and previous config saved to /var/cache/conftool/dbconfig/20240527-200106-marostegui.json
  • 19:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P63345 and previous config saved to /var/cache/conftool/dbconfig/20240527-195404-ladsgroup.json
  • 19:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T364299)', diff saved to https://phabricator.wikimedia.org/P63344 and previous config saved to /var/cache/conftool/dbconfig/20240527-195232-marostegui.json
  • 19:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 19:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 19:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T364299)', diff saved to https://phabricator.wikimedia.org/P63343 and previous config saved to /var/cache/conftool/dbconfig/20240527-195158-marostegui.json
  • 19:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P63342 and previous config saved to /var/cache/conftool/dbconfig/20240527-193650-marostegui.json
  • 19:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P63341 and previous config saved to /var/cache/conftool/dbconfig/20240527-192142-marostegui.json
  • 19:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T364299)', diff saved to https://phabricator.wikimedia.org/P63340 and previous config saved to /var/cache/conftool/dbconfig/20240527-190634-marostegui.json
  • 19:01 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T364069)', diff saved to https://phabricator.wikimedia.org/P63339 and previous config saved to /var/cache/conftool/dbconfig/20240527-190155-marostegui.json
  • 19:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 19:01 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 19:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T364069)', diff saved to https://phabricator.wikimedia.org/P63338 and previous config saved to /var/cache/conftool/dbconfig/20240527-190132-marostegui.json
  • 18:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P63337 and previous config saved to /var/cache/conftool/dbconfig/20240527-184624-marostegui.json
  • 18:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P63336 and previous config saved to /var/cache/conftool/dbconfig/20240527-183115-marostegui.json
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T364069)', diff saved to https://phabricator.wikimedia.org/P63335 and previous config saved to /var/cache/conftool/dbconfig/20240527-181607-marostegui.json
  • 17:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2127 (T364299)', diff saved to https://phabricator.wikimedia.org/P63334 and previous config saved to /var/cache/conftool/dbconfig/20240527-173035-marostegui.json
  • 17:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 17:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 17:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2128 (T364069)', diff saved to https://phabricator.wikimedia.org/P63333 and previous config saved to /var/cache/conftool/dbconfig/20240527-172258-marostegui.json
  • 17:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 17:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 16:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 16:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 16:36 jiji@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc2049.codfw.wmnet with OS bookworm
  • 16:29 brouberol@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 16:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 16:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 16:18 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 16:09 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 16:03 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 16:03 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 16:02 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 15:57 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS bookworm
  • 15:56 elukey: run `apt-get clean` on dse-k8s-worker1001 to free space on the root partition
  • 15:56 jiji@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc2049.codfw.wmnet with OS bookworm
  • 15:54 brouberol@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 15:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 15:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 15:44 brouberol@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 15:30 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 15:28 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS bookworm
  • 15:22 effie: disable puppet on mc1049 pending OS upgrade
  • 15:20 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 15:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T360332)', diff saved to https://phabricator.wikimedia.org/P63332 and previous config saved to /var/cache/conftool/dbconfig/20240527-150735-arnaudb.json
  • 15:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 15:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 15:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T364299)', diff saved to https://phabricator.wikimedia.org/P63331 and previous config saved to /var/cache/conftool/dbconfig/20240527-150514-marostegui.json
  • 15:01 fabfur: enable puppet on A:cp (T365718)
  • 14:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P63330 and previous config saved to /var/cache/conftool/dbconfig/20240527-145226-arnaudb.json
  • 14:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P63329 and previous config saved to /var/cache/conftool/dbconfig/20240527-145004-marostegui.json
  • 14:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 14:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 14:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T364069)', diff saved to https://phabricator.wikimedia.org/P63328 and previous config saved to /var/cache/conftool/dbconfig/20240527-144538-marostegui.json
  • 14:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P63327 and previous config saved to /var/cache/conftool/dbconfig/20240527-143718-arnaudb.json
  • 14:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P63326 and previous config saved to /var/cache/conftool/dbconfig/20240527-143457-marostegui.json
  • 14:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240527-143025-marostegui.json
  • 14:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T360332)', diff saved to https://phabricator.wikimedia.org/P63324 and previous config saved to /var/cache/conftool/dbconfig/20240527-142210-arnaudb.json
  • 14:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T364299)', diff saved to https://phabricator.wikimedia.org/P63323 and previous config saved to /var/cache/conftool/dbconfig/20240527-141949-marostegui.json
  • 14:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1238 (T360332)', diff saved to https://phabricator.wikimedia.org/P63322 and previous config saved to /var/cache/conftool/dbconfig/20240527-141948-arnaudb.json
  • 14:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 14:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 14:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P63321 and previous config saved to /var/cache/conftool/dbconfig/20240527-141515-marostegui.json
  • 14:14 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --touched-after=20240524120000 2>&1 | tee -a ~/T315510-enwiki-7; date # cc T365974
  • 14:11 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript updateCollation.php bswikiquote --previous-collation=uppercase # T365133
  • 14:06 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 14:05 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Set $wgCategoryCollation to uca-bs-u-kn on Bosnian Wikiquote (T365133) (duration: 17m 25s)
  • 14:00 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 14:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T364069)', diff saved to https://phabricator.wikimedia.org/P63320 and previous config saved to /var/cache/conftool/dbconfig/20240527-140007-marostegui.json
  • 13:54 sfaci@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 13:54 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 13:51 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and nmw03: Continuing with sync
  • 13:50 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and nmw03: Backport for Set $wgCategoryCollation to uca-bs-u-kn on Bosnian Wikiquote (T365133) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:47 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Set $wgCategoryCollation to uca-bs-u-kn on Bosnian Wikiquote (T365133)
  • 13:46 elukey@deploy1002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:46 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Enable wgDiscussionToolsEnablePermalinksBackend on enwiki (T315353), Pre-emptively disable DiscussionToolsEnableThanks (no-op) (duration: 18m 15s)
  • 13:46 elukey@deploy1002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:44 sfaci@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 13:42 elukey@deploy1002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:42 elukey@deploy1002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:36 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T364299)', diff saved to https://phabricator.wikimedia.org/P63319 and previous config saved to /var/cache/conftool/dbconfig/20240527-133636-marostegui.json
  • 13:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 13:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 13:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T364299)', diff saved to https://phabricator.wikimedia.org/P63318 and previous config saved to /var/cache/conftool/dbconfig/20240527-133605-marostegui.json
  • 13:32 logmsgbot: lucaswerkmeister-wmde@deploy1002 esanders and matmarex and lucaswerkmeister-wmde: Continuing with sync
  • 13:32 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:32 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 13:32 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:30 logmsgbot: lucaswerkmeister-wmde@deploy1002 esanders and matmarex and lucaswerkmeister-wmde: Backport for Enable wgDiscussionToolsEnablePermalinksBackend on enwiki (T315353), Pre-emptively disable DiscussionToolsEnableThanks (no-op) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:29 fabfur: enabled puppet on cp4037 to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1035440 (T365718)
  • 13:28 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Enable wgDiscussionToolsEnablePermalinksBackend on enwiki (T315353), Pre-emptively disable DiscussionToolsEnableThanks (no-op)
  • 13:26 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Revert "arwiki: Disable Extension:ContentTranslation for non-autoreview users" (T255022) (duration: 20m 46s)
  • 13:21 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 13:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P63317 and previous config saved to /var/cache/conftool/dbconfig/20240527-132057-marostegui.json
  • 13:18 fabfur: disabling puppet on A:cp to safely apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1035440 (T365718)
  • 13:16 vgutierrez: test fifo-log-demux 0.7.5 on cp4052
  • 13:15 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1213 (T364069)', diff saved to https://phabricator.wikimedia.org/P63316 and previous config saved to /var/cache/conftool/dbconfig/20240527-131539-marostegui.json
  • 13:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 13:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 13:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T364069)', diff saved to https://phabricator.wikimedia.org/P63315 and previous config saved to /var/cache/conftool/dbconfig/20240527-131516-marostegui.json
  • 13:15 hnowlan@cumin1002: conftool action : set/pooled=no; selector: name=parse1002.eqiad.wmnet
  • 13:12 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and gergesshamon: Continuing with sync
  • 13:08 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and gergesshamon: Backport for Revert "arwiki: Disable Extension:ContentTranslation for non-autoreview users" (T255022) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:08 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 13:06 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Revert "arwiki: Disable Extension:ContentTranslation for non-autoreview users" (T255022)
  • 13:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P63314 and previous config saved to /var/cache/conftool/dbconfig/20240527-130549-marostegui.json
  • 13:05 elukey@deploy1002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 13:04 elukey@deploy1002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: sync
  • 13:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P63313 and previous config saved to /var/cache/conftool/dbconfig/20240527-130008-marostegui.json
  • 12:56 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 12:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T364299)', diff saved to https://phabricator.wikimedia.org/P63312 and previous config saved to /var/cache/conftool/dbconfig/20240527-125041-marostegui.json
  • 12:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P63311 and previous config saved to /var/cache/conftool/dbconfig/20240527-124500-marostegui.json
  • 12:42 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 12:40 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 12:19 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 12:18 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 12:18 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 12:18 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 12:17 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 12:17 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 12:11 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host <spicerack.netbox.NetboxServer object at 0x7f9776417550>
  • 12:11 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2001
  • 12:11 ayounsi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2001
  • 12:11 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker2001.codfw.wmnet 39.16.192.10.in-addr.arpa 9.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:11 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-worker2001.codfw.wmnet 39.16.192.10.in-addr.arpa 9.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 12:11 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:11 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2001 - ayounsi@cumin1002"
  • 12:10 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker2001 - ayounsi@cumin1002"
  • 12:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T364299)', diff saved to https://phabricator.wikimedia.org/P63309 and previous config saved to /var/cache/conftool/dbconfig/20240527-120732-marostegui.json
  • 12:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 12:07 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 12:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 12:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T364299)', diff saved to https://phabricator.wikimedia.org/P63308 and previous config saved to /var/cache/conftool/dbconfig/20240527-120709-marostegui.json
  • 12:06 ayounsi@cumin1002: START - Cookbook sre.hosts.move-vlan for host <spicerack.netbox.NetboxServer object at 0x7f9776417550>
  • 12:05 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 11:52 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 11:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P63307 and previous config saved to /var/cache/conftool/dbconfig/20240527-115200-marostegui.json
  • 11:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) netbox to netbox2002.codfw.wmnet,netbox1002.eqiad.wmnet with reason: add CasApereo auth and update wheels - ayounsi@cumin1002 - T308002
  • 11:49 ayounsi@cumin1002: START - Cookbook sre.deploy.python-code netbox to netbox2002.codfw.wmnet,netbox1002.eqiad.wmnet with reason: add CasApereo auth and update wheels - ayounsi@cumin1002 - T308002
  • 11:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) netbox to netbox-dev2002.codfw.wmnet with reason: add python-jose and update wheels - ayounsi@cumin1002 - T308002
  • 11:44 ayounsi@cumin1002: START - Cookbook sre.deploy.python-code netbox to netbox-dev2002.codfw.wmnet with reason: add python-jose and update wheels - ayounsi@cumin1002 - T308002
  • 11:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1210 (T364069)', diff saved to https://phabricator.wikimedia.org/P63306 and previous config saved to /var/cache/conftool/dbconfig/20240527-114316-marostegui.json
  • 11:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 11:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 11:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T364069)', diff saved to https://phabricator.wikimedia.org/P63305 and previous config saved to /var/cache/conftool/dbconfig/20240527-114252-marostegui.json
  • 11:41 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 11:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P63304 and previous config saved to /var/cache/conftool/dbconfig/20240527-113651-marostegui.json
  • 11:33 moritzm: installing jinja2 security updates
  • 11:29 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 11:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P63303 and previous config saved to /var/cache/conftool/dbconfig/20240527-112744-marostegui.json
  • 11:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T364299)', diff saved to https://phabricator.wikimedia.org/P63302 and previous config saved to /var/cache/conftool/dbconfig/20240527-112143-marostegui.json
  • 11:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P63301 and previous config saved to /var/cache/conftool/dbconfig/20240527-111236-marostegui.json
  • 10:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T364069)', diff saved to https://phabricator.wikimedia.org/P63299 and previous config saved to /var/cache/conftool/dbconfig/20240527-105728-marostegui.json
  • 10:55 Amir1: dbmaint s2@codfw (T364985)
  • 10:55 Amir1: main s2@codfw (T364985)
  • 10:52 slyngs: Upgrade IDM to Bitu 0.0.8
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T364299)', diff saved to https://phabricator.wikimedia.org/P63298 and previous config saved to /var/cache/conftool/dbconfig/20240527-103759-marostegui.json
  • 10:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 10:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T364299)', diff saved to https://phabricator.wikimedia.org/P63297 and previous config saved to /var/cache/conftool/dbconfig/20240527-103734-marostegui.json
  • 10:26 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 100%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63296 and previous config saved to /var/cache/conftool/dbconfig/20240527-102639-root.json
  • 10:26 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 10:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P63295 and previous config saved to /var/cache/conftool/dbconfig/20240527-102226-marostegui.json
  • 10:14 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 10:12 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:12 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 75%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63294 and previous config saved to /var/cache/conftool/dbconfig/20240527-101133-root.json
  • 10:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P63293 and previous config saved to /var/cache/conftool/dbconfig/20240527-100717-marostegui.json
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T364069)', diff saved to https://phabricator.wikimedia.org/P63292 and previous config saved to /var/cache/conftool/dbconfig/20240527-100523-marostegui.json
  • 10:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 10:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T364069)', diff saved to https://phabricator.wikimedia.org/P63291 and previous config saved to /var/cache/conftool/dbconfig/20240527-100459-marostegui.json
  • 10:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:58 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 09:56 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 50%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63290 and previous config saved to /var/cache/conftool/dbconfig/20240527-095626-root.json
  • 09:56 ladsgroup@deploy1002: Finished scap: Backport for Update tagline and wordmark of Persian Wikibooks (T365913) (duration: 16m 59s)
  • 09:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:52 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:52 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T364299)', diff saved to https://phabricator.wikimedia.org/P63289 and previous config saved to /var/cache/conftool/dbconfig/20240527-095208-marostegui.json
  • 09:49 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P63288 and previous config saved to /var/cache/conftool/dbconfig/20240527-094951-marostegui.json
  • 09:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:47 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:46 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:45 ayounsi@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) netbox to netbox-dev2002.codfw.wmnet with reason: add python-social-auth and update wheels - ayounsi@cumin1002 - T308002
  • 09:45 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 09:44 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:42 ladsgroup@deploy1002: ebrahim and ladsgroup: Continuing with sync
  • 09:41 ladsgroup@deploy1002: ebrahim and ladsgroup: Backport for Update tagline and wordmark of Persian Wikibooks (T365913) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:41 ayounsi@cumin1002: START - Cookbook sre.deploy.python-code netbox to netbox-dev2002.codfw.wmnet with reason: add python-social-auth and update wheels - ayounsi@cumin1002 - T308002
  • 09:41 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 25%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63287 and previous config saved to /var/cache/conftool/dbconfig/20240527-094120-root.json
  • 09:39 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:39 ladsgroup@deploy1002: Started scap: Backport for Update tagline and wordmark of Persian Wikibooks (T365913)
  • 09:39 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:39 zabe@deploy1002: Finished scap: Backport for Stop writing to af_user(_text)/afh_user(_text) in group1 wikis (T337920) (duration: 29m 43s)
  • 09:37 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 64096
  • 09:36 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 64096
  • 09:34 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 09:34 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
  • 09:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P63286 and previous config saved to /var/cache/conftool/dbconfig/20240527-093443-marostegui.json
  • 09:33 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129', diff saved to https://phabricator.wikimedia.org/P63285 and previous config saved to /var/cache/conftool/dbconfig/20240527-093306-root.json
  • 09:31 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:31 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:26 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 10%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63284 and previous config saved to /var/cache/conftool/dbconfig/20240527-092614-root.json
  • 09:25 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P63283 and previous config saved to /var/cache/conftool/dbconfig/20240527-092459-root.json
  • 09:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:23 zabe@deploy1002: zabe: Continuing with sync
  • 09:23 zabe@deploy1002: zabe: Backport for Stop writing to af_user(_text)/afh_user(_text) in group1 wikis (T337920) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T364069)', diff saved to https://phabricator.wikimedia.org/P63282 and previous config saved to /var/cache/conftool/dbconfig/20240527-091935-marostegui.json
  • 09:14 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:14 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 5%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63280 and previous config saved to /var/cache/conftool/dbconfig/20240527-091108-root.json
  • 09:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P63279 and previous config saved to /var/cache/conftool/dbconfig/20240527-090953-root.json
  • 09:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T364299)', diff saved to https://phabricator.wikimedia.org/P63278 and previous config saved to /var/cache/conftool/dbconfig/20240527-090938-marostegui.json
  • 09:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 09:09 zabe@deploy1002: Started scap: Backport for Stop writing to af_user(_text)/afh_user(_text) in group1 wikis (T337920)
  • 09:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 09:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T364299)', diff saved to https://phabricator.wikimedia.org/P63277 and previous config saved to /var/cache/conftool/dbconfig/20240527-090915-marostegui.json
  • 09:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:59 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:59 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:56 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'db2150 (re)pooling @ 1%: Repooling T365797', diff saved to https://phabricator.wikimedia.org/P63276 and previous config saved to /var/cache/conftool/dbconfig/20240527-085602-root.json
  • 08:55 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2129 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P63275 and previous config saved to /var/cache/conftool/dbconfig/20240527-085447-root.json
  • 08:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P63274 and previous config saved to /var/cache/conftool/dbconfig/20240527-085407-marostegui.json
  • 08:53 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:53 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:42 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:42 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:40 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:40 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P63273 and previous config saved to /var/cache/conftool/dbconfig/20240527-083859-marostegui.json
  • 08:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:32 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:32 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T364069)', diff saved to https://phabricator.wikimedia.org/P63272 and previous config saved to /var/cache/conftool/dbconfig/20240527-082603-marostegui.json
  • 08:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 08:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 08:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T364069)', diff saved to https://phabricator.wikimedia.org/P63271 and previous config saved to /var/cache/conftool/dbconfig/20240527-082539-marostegui.json
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T364299)', diff saved to https://phabricator.wikimedia.org/P63270 and previous config saved to /var/cache/conftool/dbconfig/20240527-082351-marostegui.json
  • 08:22 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:22 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2188.codfw.wmnet
  • 08:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:18 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:18 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:14 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:14 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P63269 and previous config saved to /var/cache/conftool/dbconfig/20240527-081031-marostegui.json
  • 08:01 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2188.codfw.wmnet
  • 08:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2129.codfw.wmnet with reason: Long schema change
  • 08:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2129.codfw.wmnet with reason: Long schema change
  • 08:00 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:00 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:58 marostegui: Deploy schema change on s6 codfw (old master) dbmaint T364299
  • 07:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2129 T365783', diff saved to https://phabricator.wikimedia.org/P63268 and previous config saved to /var/cache/conftool/dbconfig/20240527-075602-root.json
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P63267 and previous config saved to /var/cache/conftool/dbconfig/20240527-075524-marostegui.json
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2214 to s6 primary T365783', diff saved to https://phabricator.wikimedia.org/P63266 and previous config saved to /var/cache/conftool/dbconfig/20240527-075512-marostegui.json
  • 07:54 marostegui: Starting s6 codfw failover from db2129 to db2214 - T365783
  • 07:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2176.codfw.wmnet
  • 07:49 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T364299)', diff saved to https://phabricator.wikimedia.org/P63265 and previous config saved to /var/cache/conftool/dbconfig/20240527-074105-marostegui.json
  • 07:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 07:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T364299)', diff saved to https://phabricator.wikimedia.org/P63264 and previous config saved to /var/cache/conftool/dbconfig/20240527-074042-marostegui.json
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T364069)', diff saved to https://phabricator.wikimedia.org/P63263 and previous config saved to /var/cache/conftool/dbconfig/20240527-074009-marostegui.json
  • 07:36 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:36 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:35 root@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s6 T365783
  • 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2214 with weight 0 T365783', diff saved to https://phabricator.wikimedia.org/P63262 and previous config saved to /var/cache/conftool/dbconfig/20240527-073545-root.json
  • 07:35 root@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s6 T365783
  • 07:35 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:35 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:33 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:33 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:29 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2176.codfw.wmnet
  • 07:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2174.codfw.wmnet
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P63261 and previous config saved to /var/cache/conftool/dbconfig/20240527-072534-marostegui.json
  • 07:25 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:21 marostegui: Deploy schema change on s7 codfw dbmaint T307501
  • 07:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:19 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:18 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2174.codfw.wmnet
  • 07:12 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:12 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P63260 and previous config saved to /var/cache/conftool/dbconfig/20240527-071026-marostegui.json
  • 07:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T364299)', diff saved to https://phabricator.wikimedia.org/P63259 and previous config saved to /var/cache/conftool/dbconfig/20240527-065518-marostegui.json
  • 06:53 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:53 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:50 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:47 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:47 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:44 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1243.eqiad.wmnet with OS bookworm
  • 06:40 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:40 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1183 (T364069)', diff saved to https://phabricator.wikimedia.org/P63258 and previous config saved to /var/cache/conftool/dbconfig/20240527-063832-marostegui.json
  • 06:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 06:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 06:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T364069)', diff saved to https://phabricator.wikimedia.org/P63257 and previous config saved to /var/cache/conftool/dbconfig/20240527-063809-marostegui.json
  • 06:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:25 kart_: Updated cxserver to 2024-05-20-182409-production (T354666, T365230)
  • 06:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P63256 and previous config saved to /var/cache/conftool/dbconfig/20240527-062301-marostegui.json
  • 06:17 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 06:17 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:17 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:17 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 06:15 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 06:15 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 06:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1157 (T364299)', diff saved to https://phabricator.wikimedia.org/P63255 and previous config saved to /var/cache/conftool/dbconfig/20240527-061252-marostegui.json
  • 06:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 06:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 06:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P63253 and previous config saved to /var/cache/conftool/dbconfig/20240527-060752-marostegui.json
  • 06:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:00 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:59 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:58 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:53 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:53 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 05:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T364069)', diff saved to https://phabricator.wikimedia.org/P63252 and previous config saved to /var/cache/conftool/dbconfig/20240527-055244-marostegui.json
  • 05:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:43 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:43 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:24 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1243.eqiad.wmnet with OS bookworm
  • 05:24 marostegui@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1243.eqiad.wmnet with OS bookworm
  • 05:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:22 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:21 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:15 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:15 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:07 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1243.eqiad.wmnet with OS bookworm
  • 05:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1243', diff saved to https://phabricator.wikimedia.org/P63251 and previous config saved to /var/cache/conftool/dbconfig/20240527-050551-marostegui.json
  • 05:03 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:03 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T364069)', diff saved to https://phabricator.wikimedia.org/P63250 and previous config saved to /var/cache/conftool/dbconfig/20240527-045301-marostegui.json
  • 04:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 04:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 04:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 04:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 04:47 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:47 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:44 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 04:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 04:40 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:40 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:21 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:21 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:19 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:14 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:14 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:42 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:42 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:40 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:40 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:36 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:36 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:29 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:29 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:19 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 03:05 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 03:05 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:56 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:56 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:52 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:52 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:49 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:33 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:33 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:18 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:18 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:09 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:51 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:51 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:47 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:47 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:36 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:36 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:19 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:19 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:15 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:15 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:59 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:59 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:39 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:39 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:31 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:31 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:26 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:26 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:17 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:17 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:12 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:12 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 00:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply

2024-05-26

  • 23:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:51 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:51 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:48 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:48 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:46 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:46 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:42 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:42 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:33 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:33 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:22 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:22 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:05 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:05 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:55 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:55 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:44 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:42 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:42 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:33 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:33 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:23 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:23 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:10 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:10 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:00 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:00 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:58 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:55 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:55 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:53 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:53 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:49 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:43 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:43 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:36 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:36 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:32 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:32 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:53 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:53 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:43 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:43 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:40 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:40 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:36 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:36 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:28 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:28 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:22 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:22 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:50 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:44 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:35 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:35 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:25 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:25 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:18 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:18 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:10 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:10 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:59 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:55 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:55 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:45 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:41 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:25 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:25 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:23 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:22 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:17 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:17 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:03 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:59 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:50 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:40 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:40 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:29 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:29 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:23 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:23 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:06 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:06 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:58 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:51 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:51 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:46 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:46 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:31 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:31 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:27 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:27 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:25 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:25 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:02 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:02 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:56 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:56 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:45 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:45 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:42 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:26 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:26 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:42 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:42 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:37 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:37 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:35 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:21 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:21 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:13 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:13 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T364069)', diff saved to https://phabricator.wikimedia.org/P63249 and previous config saved to /var/cache/conftool/dbconfig/20240526-140250-marostegui.json
  • 14:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 14:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:59 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:59 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:50 jelto: restart apache2 on gerrit1003
  • 13:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P63248 and previous config saved to /var/cache/conftool/dbconfig/20240526-134742-marostegui.json
  • 13:44 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P63247 and previous config saved to /var/cache/conftool/dbconfig/20240526-133234-marostegui.json
  • 13:29 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:29 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T364069)', diff saved to https://phabricator.wikimedia.org/P63246 and previous config saved to /var/cache/conftool/dbconfig/20240526-131726-marostegui.json
  • 13:14 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:14 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:05 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:05 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:57 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:57 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:50 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:50 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:49 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:48 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:47 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:46 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 12:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 12:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:22 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:22 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 11:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 11:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2209 (T364069)', diff saved to https://phabricator.wikimedia.org/P63245 and previous config saved to /var/cache/conftool/dbconfig/20240526-111558-marostegui.json
  • 11:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 11:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T364069)', diff saved to https://phabricator.wikimedia.org/P63244 and previous config saved to /var/cache/conftool/dbconfig/20240526-111534-marostegui.json
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P63243 and previous config saved to /var/cache/conftool/dbconfig/20240526-110026-marostegui.json
  • 10:53 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:53 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:51 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:51 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P63242 and previous config saved to /var/cache/conftool/dbconfig/20240526-104518-marostegui.json
  • 10:34 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:34 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:31 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:31 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T364069)', diff saved to https://phabricator.wikimedia.org/P63241 and previous config saved to /var/cache/conftool/dbconfig/20240526-103010-marostegui.json
  • 10:16 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:16 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:11 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:11 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 10:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 10:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:56 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:56 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:48 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:48 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:35 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:35 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:20 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:20 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:15 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:15 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:12 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:12 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:01 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:01 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:46 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:46 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:39 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:39 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2194 (T364069)', diff saved to https://phabricator.wikimedia.org/P63240 and previous config saved to /var/cache/conftool/dbconfig/20240526-082333-marostegui.json
  • 08:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T364069)', diff saved to https://phabricator.wikimedia.org/P63239 and previous config saved to /var/cache/conftool/dbconfig/20240526-082310-marostegui.json
  • 08:09 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:09 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P63238 and previous config saved to /var/cache/conftool/dbconfig/20240526-080802-marostegui.json
  • 08:04 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:04 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:58 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:54 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:54 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P63237 and previous config saved to /var/cache/conftool/dbconfig/20240526-075253-marostegui.json
  • 07:44 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:44 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T364069)', diff saved to https://phabricator.wikimedia.org/P63236 and previous config saved to /var/cache/conftool/dbconfig/20240526-073745-marostegui.json
  • 07:29 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:29 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2204 (T364299)', diff saved to https://phabricator.wikimedia.org/P63235 and previous config saved to /var/cache/conftool/dbconfig/20240526-072316-marostegui.json
  • 07:09 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:09 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2204', diff saved to https://phabricator.wikimedia.org/P63234 and previous config saved to /var/cache/conftool/dbconfig/20240526-070808-marostegui.json
  • 06:58 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:56 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:56 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2204', diff saved to https://phabricator.wikimedia.org/P63233 and previous config saved to /var/cache/conftool/dbconfig/20240526-065259-marostegui.json
  • 06:41 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2204 (T364299)', diff saved to https://phabricator.wikimedia.org/P63232 and previous config saved to /var/cache/conftool/dbconfig/20240526-063752-marostegui.json
  • 06:30 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:30 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 06:15 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 06:15 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:49 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:49 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2204 (T364299)', diff saved to https://phabricator.wikimedia.org/P63231 and previous config saved to /var/cache/conftool/dbconfig/20240526-054305-marostegui.json
  • 05:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2204.codfw.wmnet with reason: Maintenance
  • 05:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2204.codfw.wmnet with reason: Maintenance
  • 05:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 05:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 05:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T364069)', diff saved to https://phabricator.wikimedia.org/P63230 and previous config saved to /var/cache/conftool/dbconfig/20240526-053127-marostegui.json
  • 05:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 05:31 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 05:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T364069)', diff saved to https://phabricator.wikimedia.org/P63229 and previous config saved to /var/cache/conftool/dbconfig/20240526-053103-marostegui.json
  • 05:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P63228 and previous config saved to /var/cache/conftool/dbconfig/20240526-051555-marostegui.json
  • 05:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P63227 and previous config saved to /var/cache/conftool/dbconfig/20240526-050047-marostegui.json
  • 04:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 04:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 04:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T364299)', diff saved to https://phabricator.wikimedia.org/P63226 and previous config saved to /var/cache/conftool/dbconfig/20240526-045357-marostegui.json
  • 04:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T364069)', diff saved to https://phabricator.wikimedia.org/P63225 and previous config saved to /var/cache/conftool/dbconfig/20240526-044539-marostegui.json
  • 04:41 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P63224 and previous config saved to /var/cache/conftool/dbconfig/20240526-043849-marostegui.json
  • 04:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 04:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 04:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P63223 and previous config saved to /var/cache/conftool/dbconfig/20240526-042341-marostegui.json
  • 04:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T364299)', diff saved to https://phabricator.wikimedia.org/P63222 and previous config saved to /var/cache/conftool/dbconfig/20240526-040833-marostegui.json
  • 03:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T364299)', diff saved to https://phabricator.wikimedia.org/P63221 and previous config saved to /var/cache/conftool/dbconfig/20240526-030259-marostegui.json
  • 03:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 03:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 03:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T364299)', diff saved to https://phabricator.wikimedia.org/P63220 and previous config saved to /var/cache/conftool/dbconfig/20240526-030236-marostegui.json
  • 02:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P63219 and previous config saved to /var/cache/conftool/dbconfig/20240526-024728-marostegui.json
  • 02:43 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:43 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P63218 and previous config saved to /var/cache/conftool/dbconfig/20240526-023220-marostegui.json
  • 02:29 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:29 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 02:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T364299)', diff saved to https://phabricator.wikimedia.org/P63217 and previous config saved to /var/cache/conftool/dbconfig/20240526-021711-marostegui.json
  • 02:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T364069)', diff saved to https://phabricator.wikimedia.org/P63216 and previous config saved to /var/cache/conftool/dbconfig/20240526-021238-marostegui.json
  • 02:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 02:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 02:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T364069)', diff saved to https://phabricator.wikimedia.org/P63215 and previous config saved to /var/cache/conftool/dbconfig/20240526-021213-marostegui.json
  • 01:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P63214 and previous config saved to /var/cache/conftool/dbconfig/20240526-015704-marostegui.json
  • 01:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P63213 and previous config saved to /var/cache/conftool/dbconfig/20240526-014156-marostegui.json
  • 01:37 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:37 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T364069)', diff saved to https://phabricator.wikimedia.org/P63212 and previous config saved to /var/cache/conftool/dbconfig/20240526-012648-marostegui.json
  • 01:08 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:08 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T364299)', diff saved to https://phabricator.wikimedia.org/P63211 and previous config saved to /var/cache/conftool/dbconfig/20240526-010523-marostegui.json
  • 01:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 01:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 01:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T364299)', diff saved to https://phabricator.wikimedia.org/P63210 and previous config saved to /var/cache/conftool/dbconfig/20240526-010500-marostegui.json
  • 00:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P63209 and previous config saved to /var/cache/conftool/dbconfig/20240526-004952-marostegui.json
  • 00:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P63208 and previous config saved to /var/cache/conftool/dbconfig/20240526-003444-marostegui.json
  • 00:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T364299)', diff saved to https://phabricator.wikimedia.org/P63207 and previous config saved to /var/cache/conftool/dbconfig/20240526-001936-marostegui.json

2024-05-25

  • 23:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T364299)', diff saved to https://phabricator.wikimedia.org/P63206 and previous config saved to /var/cache/conftool/dbconfig/20240525-230523-marostegui.json
  • 23:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 23:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 23:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138 (T364299)', diff saved to https://phabricator.wikimedia.org/P63205 and previous config saved to /var/cache/conftool/dbconfig/20240525-230500-marostegui.json
  • 22:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T364069)', diff saved to https://phabricator.wikimedia.org/P63204 and previous config saved to /var/cache/conftool/dbconfig/20240525-225331-marostegui.json
  • 22:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 22:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 22:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 22:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 22:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T364069)', diff saved to https://phabricator.wikimedia.org/P63203 and previous config saved to /var/cache/conftool/dbconfig/20240525-225251-marostegui.json
  • 22:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138', diff saved to https://phabricator.wikimedia.org/P63202 and previous config saved to /var/cache/conftool/dbconfig/20240525-224952-marostegui.json
  • 22:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P63201 and previous config saved to /var/cache/conftool/dbconfig/20240525-223743-marostegui.json
  • 22:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138', diff saved to https://phabricator.wikimedia.org/P63200 and previous config saved to /var/cache/conftool/dbconfig/20240525-223444-marostegui.json
  • 22:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P63199 and previous config saved to /var/cache/conftool/dbconfig/20240525-222235-marostegui.json
  • 22:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138 (T364299)', diff saved to https://phabricator.wikimedia.org/P63198 and previous config saved to /var/cache/conftool/dbconfig/20240525-221936-marostegui.json
  • 22:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T364069)', diff saved to https://phabricator.wikimedia.org/P63197 and previous config saved to /var/cache/conftool/dbconfig/20240525-220727-marostegui.json
  • 21:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2138 (T364299)', diff saved to https://phabricator.wikimedia.org/P63196 and previous config saved to /var/cache/conftool/dbconfig/20240525-210754-marostegui.json
  • 21:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 21:07 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 21:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T364299)', diff saved to https://phabricator.wikimedia.org/P63195 and previous config saved to /var/cache/conftool/dbconfig/20240525-210731-marostegui.json
  • 20:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P63194 and previous config saved to /var/cache/conftool/dbconfig/20240525-205223-marostegui.json
  • 20:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P63193 and previous config saved to /var/cache/conftool/dbconfig/20240525-203715-marostegui.json
  • 20:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T364299)', diff saved to https://phabricator.wikimedia.org/P63192 and previous config saved to /var/cache/conftool/dbconfig/20240525-202207-marostegui.json
  • 19:51 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:51 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T364069)', diff saved to https://phabricator.wikimedia.org/P63191 and previous config saved to /var/cache/conftool/dbconfig/20240525-193047-marostegui.json
  • 19:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 19:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 19:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T364299)', diff saved to https://phabricator.wikimedia.org/P63190 and previous config saved to /var/cache/conftool/dbconfig/20240525-191242-marostegui.json
  • 19:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 19:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 19:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 19:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 19:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T364299)', diff saved to https://phabricator.wikimedia.org/P63189 and previous config saved to /var/cache/conftool/dbconfig/20240525-191201-marostegui.json
  • 18:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P63188 and previous config saved to /var/cache/conftool/dbconfig/20240525-185653-marostegui.json
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P63187 and previous config saved to /var/cache/conftool/dbconfig/20240525-184145-marostegui.json
  • 18:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T364299)', diff saved to https://phabricator.wikimedia.org/P63186 and previous config saved to /var/cache/conftool/dbconfig/20240525-182637-marostegui.json
  • 16:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T364299)', diff saved to https://phabricator.wikimedia.org/P63185 and previous config saved to /var/cache/conftool/dbconfig/20240525-164506-marostegui.json
  • 16:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 16:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 16:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 16:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 16:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T364069)', diff saved to https://phabricator.wikimedia.org/P63184 and previous config saved to /var/cache/conftool/dbconfig/20240525-164135-marostegui.json
  • 16:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P63183 and previous config saved to /var/cache/conftool/dbconfig/20240525-162627-marostegui.json
  • 16:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P63182 and previous config saved to /var/cache/conftool/dbconfig/20240525-161118-marostegui.json
  • 15:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T364069)', diff saved to https://phabricator.wikimedia.org/P63181 and previous config saved to /var/cache/conftool/dbconfig/20240525-155610-marostegui.json
  • 15:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 15:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 13:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T364299)', diff saved to https://phabricator.wikimedia.org/P63180 and previous config saved to /var/cache/conftool/dbconfig/20240525-135800-marostegui.json
  • 13:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P63179 and previous config saved to /var/cache/conftool/dbconfig/20240525-134252-marostegui.json
  • 13:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P63178 and previous config saved to /var/cache/conftool/dbconfig/20240525-132744-marostegui.json
  • 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2127 (T364069)', diff saved to https://phabricator.wikimedia.org/P63177 and previous config saved to /var/cache/conftool/dbconfig/20240525-131619-marostegui.json
  • 13:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 13:15 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
  • 13:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T364299)', diff saved to https://phabricator.wikimedia.org/P63176 and previous config saved to /var/cache/conftool/dbconfig/20240525-131236-marostegui.json
  • 11:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1246 (T364299)', diff saved to https://phabricator.wikimedia.org/P63175 and previous config saved to /var/cache/conftool/dbconfig/20240525-110931-marostegui.json
  • 11:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1246.eqiad.wmnet with reason: Maintenance
  • 11:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1246.eqiad.wmnet with reason: Maintenance
  • 10:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 10:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 09:41 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 09:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 09:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T364299)', diff saved to https://phabricator.wikimedia.org/P63174 and previous config saved to /var/cache/conftool/dbconfig/20240525-093814-marostegui.json
  • 09:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P63173 and previous config saved to /var/cache/conftool/dbconfig/20240525-092306-marostegui.json
  • 09:10 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:10 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P63172 and previous config saved to /var/cache/conftool/dbconfig/20240525-090758-marostegui.json
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T364299)', diff saved to https://phabricator.wikimedia.org/P63171 and previous config saved to /var/cache/conftool/dbconfig/20240525-085250-marostegui.json
  • 08:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 08:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T364069)', diff saved to https://phabricator.wikimedia.org/P63170 and previous config saved to /var/cache/conftool/dbconfig/20240525-082057-marostegui.json
  • 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P63169 and previous config saved to /var/cache/conftool/dbconfig/20240525-080549-marostegui.json
  • 07:50 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P63168 and previous config saved to /var/cache/conftool/dbconfig/20240525-075041-marostegui.json
  • 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T364069)', diff saved to https://phabricator.wikimedia.org/P63167 and previous config saved to /var/cache/conftool/dbconfig/20240525-073533-marostegui.json
  • 06:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T364299)', diff saved to https://phabricator.wikimedia.org/P63166 and previous config saved to /var/cache/conftool/dbconfig/20240525-063712-marostegui.json
  • 06:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 06:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 06:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T364299)', diff saved to https://phabricator.wikimedia.org/P63165 and previous config saved to /var/cache/conftool/dbconfig/20240525-063649-marostegui.json
  • 06:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P63164 and previous config saved to /var/cache/conftool/dbconfig/20240525-062141-marostegui.json
  • 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T364069)', diff saved to https://phabricator.wikimedia.org/P63163 and previous config saved to /var/cache/conftool/dbconfig/20240525-061028-marostegui.json
  • 06:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 06:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 06:10 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 06:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T364069)', diff saved to https://phabricator.wikimedia.org/P63162 and previous config saved to /var/cache/conftool/dbconfig/20240525-060947-marostegui.json
  • 06:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P63161 and previous config saved to /var/cache/conftool/dbconfig/20240525-060633-marostegui.json
  • 05:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P63160 and previous config saved to /var/cache/conftool/dbconfig/20240525-055439-marostegui.json
  • 05:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T364299)', diff saved to https://phabricator.wikimedia.org/P63159 and previous config saved to /var/cache/conftool/dbconfig/20240525-055125-marostegui.json
  • 05:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P63158 and previous config saved to /var/cache/conftool/dbconfig/20240525-053931-marostegui.json
  • 05:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T364069)', diff saved to https://phabricator.wikimedia.org/P63157 and previous config saved to /var/cache/conftool/dbconfig/20240525-052423-marostegui.json
  • 04:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T364299)', diff saved to https://phabricator.wikimedia.org/P63156 and previous config saved to /var/cache/conftool/dbconfig/20240525-044304-marostegui.json
  • 04:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 04:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 03:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 03:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 03:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T364299)', diff saved to https://phabricator.wikimedia.org/P63155 and previous config saved to /var/cache/conftool/dbconfig/20240525-030316-marostegui.json
  • 02:57 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T364069)', diff saved to https://phabricator.wikimedia.org/P63154 and previous config saved to /var/cache/conftool/dbconfig/20240525-025742-marostegui.json
  • 02:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 02:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 02:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T364069)', diff saved to https://phabricator.wikimedia.org/P63153 and previous config saved to /var/cache/conftool/dbconfig/20240525-025719-marostegui.json
  • 02:48 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P63152 and previous config saved to /var/cache/conftool/dbconfig/20240525-024808-marostegui.json
  • 02:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P63151 and previous config saved to /var/cache/conftool/dbconfig/20240525-024211-marostegui.json
  • 02:33 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P63150 and previous config saved to /var/cache/conftool/dbconfig/20240525-023300-marostegui.json
  • 02:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P63149 and previous config saved to /var/cache/conftool/dbconfig/20240525-022703-marostegui.json
  • 02:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T364299)', diff saved to https://phabricator.wikimedia.org/P63148 and previous config saved to /var/cache/conftool/dbconfig/20240525-021752-marostegui.json
  • 02:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T364069)', diff saved to https://phabricator.wikimedia.org/P63147 and previous config saved to /var/cache/conftool/dbconfig/20240525-021154-marostegui.json
  • 01:38 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 01:38 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 01:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T364299)', diff saved to https://phabricator.wikimedia.org/P63146 and previous config saved to /var/cache/conftool/dbconfig/20240525-011423-marostegui.json
  • 01:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 01:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 01:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T364299)', diff saved to https://phabricator.wikimedia.org/P63145 and previous config saved to /var/cache/conftool/dbconfig/20240525-011359-marostegui.json
  • 00:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P63144 and previous config saved to /var/cache/conftool/dbconfig/20240525-005851-marostegui.json
  • 00:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P63143 and previous config saved to /var/cache/conftool/dbconfig/20240525-004343-marostegui.json
  • 00:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T364299)', diff saved to https://phabricator.wikimedia.org/P63142 and previous config saved to /var/cache/conftool/dbconfig/20240525-002835-marostegui.json

2024-05-24

  • 23:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T364069)', diff saved to https://phabricator.wikimedia.org/P63141 and previous config saved to /var/cache/conftool/dbconfig/20240524-234433-marostegui.json
  • 23:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 23:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 23:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T364069)', diff saved to https://phabricator.wikimedia.org/P63140 and previous config saved to /var/cache/conftool/dbconfig/20240524-234410-marostegui.json
  • 23:31 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:31 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P63139 and previous config saved to /var/cache/conftool/dbconfig/20240524-232902-marostegui.json
  • 23:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T364299)', diff saved to https://phabricator.wikimedia.org/P63138 and previous config saved to /var/cache/conftool/dbconfig/20240524-232508-marostegui.json
  • 23:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 23:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 23:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T364299)', diff saved to https://phabricator.wikimedia.org/P63137 and previous config saved to /var/cache/conftool/dbconfig/20240524-232445-marostegui.json
  • 23:16 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:16 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P63136 and previous config saved to /var/cache/conftool/dbconfig/20240524-231354-marostegui.json
  • 23:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P63135 and previous config saved to /var/cache/conftool/dbconfig/20240524-230937-marostegui.json
  • 22:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T364069)', diff saved to https://phabricator.wikimedia.org/P63134 and previous config saved to /var/cache/conftool/dbconfig/20240524-225846-marostegui.json
  • 22:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P63133 and previous config saved to /var/cache/conftool/dbconfig/20240524-225428-marostegui.json
  • 22:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T364299)', diff saved to https://phabricator.wikimedia.org/P63132 and previous config saved to /var/cache/conftool/dbconfig/20240524-223921-marostegui.json
  • 22:24 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:24 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:37 eileen: tools upgraded from 36840b71 to 8c98b674
  • 21:08 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 21:07 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 21:04 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 21:03 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:54 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:53 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:53 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:52 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:47 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:47 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T364299)', diff saved to https://phabricator.wikimedia.org/P63131 and previous config saved to /var/cache/conftool/dbconfig/20240524-203243-marostegui.json
  • 20:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 20:32 cdanis@deploy1002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 20:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T364299)', diff saved to https://phabricator.wikimedia.org/P63130 and previous config saved to /var/cache/conftool/dbconfig/20240524-203219-marostegui.json
  • 20:31 cdanis@deploy1002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T364069)', diff saved to https://phabricator.wikimedia.org/P63129 and previous config saved to /var/cache/conftool/dbconfig/20240524-203037-marostegui.json
  • 20:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 20:30 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 20:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T364069)', diff saved to https://phabricator.wikimedia.org/P63128 and previous config saved to /var/cache/conftool/dbconfig/20240524-203014-marostegui.json
  • 20:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P63127 and previous config saved to /var/cache/conftool/dbconfig/20240524-201711-marostegui.json
  • 20:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P63126 and previous config saved to /var/cache/conftool/dbconfig/20240524-201506-marostegui.json
  • 20:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P63125 and previous config saved to /var/cache/conftool/dbconfig/20240524-200203-marostegui.json
  • 19:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P63124 and previous config saved to /var/cache/conftool/dbconfig/20240524-195958-marostegui.json
  • 19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T364299)', diff saved to https://phabricator.wikimedia.org/P63123 and previous config saved to /var/cache/conftool/dbconfig/20240524-194655-marostegui.json
  • 19:44 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T364069)', diff saved to https://phabricator.wikimedia.org/P63122 and previous config saved to /var/cache/conftool/dbconfig/20240524-194450-marostegui.json
  • 19:32 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config: apply
  • 19:32 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 19:26 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 19:26 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 19:24 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 19:19 cdanis@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 18:45 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 18:45 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 18:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1162 (T364299)', diff saved to https://phabricator.wikimedia.org/P63121 and previous config saved to /var/cache/conftool/dbconfig/20240524-184009-marostegui.json
  • 18:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 18:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 18:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T364299)', diff saved to https://phabricator.wikimedia.org/P63120 and previous config saved to /var/cache/conftool/dbconfig/20240524-183945-marostegui.json
  • 18:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P63118 and previous config saved to /var/cache/conftool/dbconfig/20240524-182437-marostegui.json
  • 18:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P63117 and previous config saved to /var/cache/conftool/dbconfig/20240524-180929-marostegui.json
  • 17:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T364299)', diff saved to https://phabricator.wikimedia.org/P63116 and previous config saved to /var/cache/conftool/dbconfig/20240524-175421-marostegui.json
  • 17:50 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 17:45 cdanis@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 17:45 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 17:44 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 17:41 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 17:34 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 17:30 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 17:30 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 17:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T364069)', diff saved to https://phabricator.wikimedia.org/P63115 and previous config saved to /var/cache/conftool/dbconfig/20240524-171833-marostegui.json
  • 17:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 17:18 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 17:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T364069)', diff saved to https://phabricator.wikimedia.org/P63114 and previous config saved to /var/cache/conftool/dbconfig/20240524-171809-marostegui.json
  • 17:10 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 17:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P63113 and previous config saved to /var/cache/conftool/dbconfig/20240524-170301-marostegui.json
  • 16:58 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 16:57 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 16:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P63112 and previous config saved to /var/cache/conftool/dbconfig/20240524-164753-marostegui.json
  • 16:36 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 16:35 andrew@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 16:32 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T364069)', diff saved to https://phabricator.wikimedia.org/P63111 and previous config saved to /var/cache/conftool/dbconfig/20240524-163245-marostegui.json
  • 15:51 andrew@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 15:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T364299)', diff saved to https://phabricator.wikimedia.org/P63109 and previous config saved to /var/cache/conftool/dbconfig/20240524-154108-marostegui.json
  • 15:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 15:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 15:24 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:24 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:18 Lucas_WMDE: FINISHED lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["76882704"]' 2>&1 | tee -a ~/T315510-enwiki-6; date # a few minutes ago
  • 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1157 (T364069)', diff saved to https://phabricator.wikimedia.org/P63107 and previous config saved to /var/cache/conftool/dbconfig/20240524-145912-marostegui.json
  • 14:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 14:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 14:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 14:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 14:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 14:51 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T364299)', diff saved to https://phabricator.wikimedia.org/P63106 and previous config saved to /var/cache/conftool/dbconfig/20240524-145139-marostegui.json
  • 14:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P63105 and previous config saved to /var/cache/conftool/dbconfig/20240524-143630-marostegui.json
  • 14:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P63104 and previous config saved to /var/cache/conftool/dbconfig/20240524-142122-marostegui.json
  • 14:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T364299)', diff saved to https://phabricator.wikimedia.org/P63103 and previous config saved to /var/cache/conftool/dbconfig/20240524-140614-marostegui.json
  • 14:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 100%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63102 and previous config saved to /var/cache/conftool/dbconfig/20240524-140258-arnaudb.json
  • 13:59 hashar@deploy1002: Finished deploy [gerrit/gerrit@af1257f]: wm-pcc: add a run action - T363918 (duration: 00m 07s)
  • 13:59 hashar@deploy1002: Started deploy [gerrit/gerrit@af1257f]: wm-pcc: add a run action - T363918
  • 13:57 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["76882704"]' 2>&1 | tee -a ~/T315510-enwiki-6; date
  • 13:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 75%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63101 and previous config saved to /var/cache/conftool/dbconfig/20240524-134752-arnaudb.json
  • 13:36 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 13:36 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 13:36 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts snapshot1008.eqiad.wmnet
  • 13:36 btullis@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:36 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: snapshot1008.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 13:34 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: snapshot1008.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1002"
  • 13:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 50%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63100 and previous config saved to /var/cache/conftool/dbconfig/20240524-133245-arnaudb.json
  • 13:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T364299)', diff saved to https://phabricator.wikimedia.org/P63099 and previous config saved to /var/cache/conftool/dbconfig/20240524-132514-marostegui.json
  • 13:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 13:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 13:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T364299)', diff saved to https://phabricator.wikimedia.org/P63098 and previous config saved to /var/cache/conftool/dbconfig/20240524-132450-marostegui.json
  • 13:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 25%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63097 and previous config saved to /var/cache/conftool/dbconfig/20240524-131739-arnaudb.json
  • 13:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P63096 and previous config saved to /var/cache/conftool/dbconfig/20240524-130942-marostegui.json
  • 13:05 btullis@cumin1002: START - Cookbook sre.dns.netbox
  • 13:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 10%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63095 and previous config saved to /var/cache/conftool/dbconfig/20240524-130233-arnaudb.json
  • 13:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 100%: post clone (src) repool', diff saved to https://phabricator.wikimedia.org/P63094 and previous config saved to /var/cache/conftool/dbconfig/20240524-130217-arnaudb.json
  • 12:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P63093 and previous config saved to /var/cache/conftool/dbconfig/20240524-125433-marostegui.json
  • 12:53 btullis@cumin1002: START - Cookbook sre.hosts.decommission for hosts snapshot1008.eqiad.wmnet
  • 12:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 5%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63092 and previous config saved to /var/cache/conftool/dbconfig/20240524-124727-arnaudb.json
  • 12:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 75%: post clone (src) repool', diff saved to https://phabricator.wikimedia.org/P63091 and previous config saved to /var/cache/conftool/dbconfig/20240524-124711-arnaudb.json
  • 12:37 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
  • 12:25 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 12:24 btullis@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 12:23 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 12:23 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 12:23 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 12:22 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 12:20 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 12:20 btullis@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 12:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 1%: post clone (dst) repool', diff saved to https://phabricator.wikimedia.org/P63087 and previous config saved to /var/cache/conftool/dbconfig/20240524-121715-arnaudb.json
  • 12:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 25%: post clone (src) repool', diff saved to https://phabricator.wikimedia.org/P63086 and previous config saved to /var/cache/conftool/dbconfig/20240524-121659-arnaudb.json
  • 12:16 arnaudb@cumin1002: dbctl commit (dc=all): 'fix wrong weight', diff saved to https://phabricator.wikimedia.org/P63085 and previous config saved to /var/cache/conftool/dbconfig/20240524-121641-arnaudb.json
  • 12:16 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 12:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 12:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 12:15 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 12:15 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 12:15 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 25%: post clone (src) repool', diff saved to https://phabricator.wikimedia.org/P63084 and previous config saved to /var/cache/conftool/dbconfig/20240524-121523-arnaudb.json
  • 12:14 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 12:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2116.codfw.wmnet onto db2176.codfw.wmnet
  • 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T364299)', diff saved to https://phabricator.wikimedia.org/P63083 and previous config saved to /var/cache/conftool/dbconfig/20240524-115351-marostegui.json
  • 11:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 11:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 11:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T364299)', diff saved to https://phabricator.wikimedia.org/P63082 and previous config saved to /var/cache/conftool/dbconfig/20240524-115328-marostegui.json
  • 11:44 akosiaris: manually delete the 1 sessionstore pod running on parse1004
  • 11:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P63081 and previous config saved to /var/cache/conftool/dbconfig/20240524-113820-marostegui.json
  • 11:24 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 11:24 btullis@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 11:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P63080 and previous config saved to /var/cache/conftool/dbconfig/20240524-112310-marostegui.json
  • 11:22 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
  • 11:22 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
  • 11:21 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
  • 11:21 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
  • 11:21 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
  • 11:20 btullis@deploy1002: helmfile [staging] START helmfile.d/services/media-analytics: apply
  • 11:19 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
  • 11:18 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
  • 11:18 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
  • 11:17 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
  • 11:15 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
  • 11:15 btullis@deploy1002: helmfile [staging] START helmfile.d/services/page-analytics: apply
  • 11:10 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
  • 11:10 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
  • 11:10 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
  • 11:09 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T364299)', diff saved to https://phabricator.wikimedia.org/P63079 and previous config saved to /var/cache/conftool/dbconfig/20240524-110802-marostegui.json
  • 11:07 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
  • 11:07 btullis@deploy1002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
  • 11:06 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 10:56 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2150.codfw.wmnet with reason: reimage
  • 10:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2150.codfw.wmnet with reason: reimage
  • 10:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2150, hardware issues ', diff saved to https://phabricator.wikimedia.org/P63078 and previous config saved to /var/cache/conftool/dbconfig/20240524-104953-arnaudb.json
  • 10:27 arnaudb@cumin1002: START - Cookbook sre.mysql.clone of db2116.codfw.wmnet onto db2176.codfw.wmnet
  • 10:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2116 to clone on db2176 T365793', diff saved to https://phabricator.wikimedia.org/P63077 and previous config saved to /var/cache/conftool/dbconfig/20240524-102424-arnaudb.json
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T364299)', diff saved to https://phabricator.wikimedia.org/P63076 and previous config saved to /var/cache/conftool/dbconfig/20240524-102340-marostegui.json
  • 10:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 10:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T364299)', diff saved to https://phabricator.wikimedia.org/P63075 and previous config saved to /var/cache/conftool/dbconfig/20240524-102315-marostegui.json
  • 10:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P63074 and previous config saved to /var/cache/conftool/dbconfig/20240524-100807-marostegui.json
  • 09:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2176.codfw.wmnet with reason: Host has issues
  • 09:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2176.codfw.wmnet with reason: Host has issues
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P63073 and previous config saved to /var/cache/conftool/dbconfig/20240524-095259-marostegui.json
  • 09:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2176', diff saved to https://phabricator.wikimedia.org/P63072 and previous config saved to /var/cache/conftool/dbconfig/20240524-094703-arnaudb.json
  • 09:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T364299)', diff saved to https://phabricator.wikimedia.org/P63071 and previous config saved to /var/cache/conftool/dbconfig/20240524-093751-marostegui.json
  • 09:25 hashar@deploy1002: Finished deploy [gerrit/gerrit@159288a]: Allow users to recheck tests in checkers - T363918 (duration: 00m 07s)
  • 09:25 hashar@deploy1002: Started deploy [gerrit/gerrit@159288a]: Allow users to recheck tests in checkers - T363918
  • 08:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T364299)', diff saved to https://phabricator.wikimedia.org/P63070 and previous config saved to /var/cache/conftool/dbconfig/20240524-085423-marostegui.json
  • 08:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 08:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 08:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T364299)', diff saved to https://phabricator.wikimedia.org/P63069 and previous config saved to /var/cache/conftool/dbconfig/20240524-085400-marostegui.json
  • 08:41 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab to new version
  • 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P63068 and previous config saved to /var/cache/conftool/dbconfig/20240524-083851-marostegui.json
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P63067 and previous config saved to /var/cache/conftool/dbconfig/20240524-082343-marostegui.json
  • 08:10 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 08:10 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T364299)', diff saved to https://phabricator.wikimedia.org/P63066 and previous config saved to /var/cache/conftool/dbconfig/20240524-080835-marostegui.json
  • 07:40 dcausse@deploy1002: Finished deploy [airflow-dags/search@8f0b4a1]: search: fix import_ttl dag (duration: 00m 19s)
  • 07:40 dcausse@deploy1002: Started deploy [airflow-dags/search@8f0b4a1]: search: fix import_ttl dag
  • 07:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T364299)', diff saved to https://phabricator.wikimedia.org/P63065 and previous config saved to /var/cache/conftool/dbconfig/20240524-072639-marostegui.json
  • 07:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 07:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 07:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T364299)', diff saved to https://phabricator.wikimedia.org/P63064 and previous config saved to /var/cache/conftool/dbconfig/20240524-072616-marostegui.json
  • 07:13 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 07:13 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 07:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P63063 and previous config saved to /var/cache/conftool/dbconfig/20240524-071108-marostegui.json
  • 06:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P63062 and previous config saved to /var/cache/conftool/dbconfig/20240524-065600-marostegui.json
  • 06:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T364299)', diff saved to https://phabricator.wikimedia.org/P63061 and previous config saved to /var/cache/conftool/dbconfig/20240524-064053-marostegui.json
  • 06:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P63060 and previous config saved to /var/cache/conftool/dbconfig/20240524-061812-root.json
  • 06:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P63059 and previous config saved to /var/cache/conftool/dbconfig/20240524-060305-root.json
  • 05:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T364299)', diff saved to https://phabricator.wikimedia.org/P63058 and previous config saved to /var/cache/conftool/dbconfig/20240524-055616-marostegui.json
  • 05:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 05:55 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 05:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T364299)', diff saved to https://phabricator.wikimedia.org/P63057 and previous config saved to /var/cache/conftool/dbconfig/20240524-055553-marostegui.json
  • 05:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P63056 and previous config saved to /var/cache/conftool/dbconfig/20240524-054759-root.json
  • 05:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P63055 and previous config saved to /var/cache/conftool/dbconfig/20240524-054045-marostegui.json
  • 05:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P63054 and previous config saved to /var/cache/conftool/dbconfig/20240524-053250-root.json
  • 05:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P63053 and previous config saved to /var/cache/conftool/dbconfig/20240524-052537-marostegui.json
  • 05:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2122.codfw.wmnet with OS bookworm
  • 05:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P63052 and previous config saved to /var/cache/conftool/dbconfig/20240524-051744-root.json
  • 05:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T364299)', diff saved to https://phabricator.wikimedia.org/P63051 and previous config saved to /var/cache/conftool/dbconfig/20240524-051028-marostegui.json
  • 04:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2122.codfw.wmnet with reason: host reimage
  • 04:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2122.codfw.wmnet with reason: host reimage
  • 04:36 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2122.codfw.wmnet with OS bookworm
  • 04:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2122', diff saved to https://phabricator.wikimedia.org/P63050 and previous config saved to /var/cache/conftool/dbconfig/20240524-043441-root.json
  • 04:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T364299)', diff saved to https://phabricator.wikimedia.org/P63049 and previous config saved to /var/cache/conftool/dbconfig/20240524-042358-marostegui.json
  • 04:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 04:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 04:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 04:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 02:58 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 02:58 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 00:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T364299)', diff saved to https://phabricator.wikimedia.org/P63048 and previous config saved to /var/cache/conftool/dbconfig/20240524-004342-marostegui.json
  • 00:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P63047 and previous config saved to /var/cache/conftool/dbconfig/20240524-002834-marostegui.json
  • 00:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P63046 and previous config saved to /var/cache/conftool/dbconfig/20240524-001326-marostegui.json
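
The entries above share a regular shape (time, actor, message), and the staged repools (for example db2122 going through 10%, 25%, 50%, 75%, 100%) can be pulled out mechanically. What follows is a minimal sketch in Python, not an official Wikitech or conftool tool. The two regular expressions and the names `parse_entry` and `repool_steps` are assumptions inferred from the entries on this page.

```python
import re

# Assumed entry shape, inferred from this page: "• HH:MM actor: message".
ENTRY_RE = re.compile(
    r"^\s*•\s+(?P<time>\d{2}:\d{2})\s+(?P<actor>[^:\s]+):\s+(?P<message>.*)$"
)
# Assumed shape of a staged repool message: '<host> (re)pooling @ <N>%: <reason>'.
REPOOL_RE = re.compile(
    r"'(?P<host>\S+) \(re\)pooling @ (?P<pct>\d+)%: (?P<reason>[^']*)'"
)


def parse_entry(line):
    """Split one log line into time, actor and message; None if it does not match."""
    match = ENTRY_RE.match(line)
    return match.groupdict() if match else None


def repool_steps(lines):
    """Map each host to its (time, percent) repool steps, oldest first.

    The log lists the newest entry first, so the input is walked in reverse.
    Times are HH:MM only; the date comes from the enclosing heading.
    """
    steps = {}
    for line in reversed(list(lines)):
        entry = parse_entry(line)
        if entry is None:
            continue  # date headings, blank lines, anything unexpected
        match = REPOOL_RE.search(entry["message"])
        if match:
            steps.setdefault(match.group("host"), []).append(
                (entry["time"], int(match.group("pct")))
            )
    return steps


# Two entries copied from the 2024-05-24 list above (newest first).
sample = [
    "  • 06:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P63060 and previous config saved to /var/cache/conftool/dbconfig/20240524-061812-root.json",
    "  • 06:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P63059 and previous config saved to /var/cache/conftool/dbconfig/20240524-060305-root.json",
]
print(repool_steps(sample))  # {'db2122': [('06:03', 75), ('06:18', 100)]}
```

Entries that do not follow the `(re)pooling @ N%` wording, such as the plain "Repooling after maintenance" commits, are skipped by this sketch.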

2024-05-23

  • 23:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T364299)', diff saved to https://phabricator.wikimedia.org/P63045 and previous config saved to /var/cache/conftool/dbconfig/20240523-235817-marostegui.json
  • 23:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1238 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P63044 and previous config saved to /var/cache/conftool/dbconfig/20240523-233017-ladsgroup.json
  • 23:24 zabe@deploy1002: Finished scap: Backport for Deploy configuration for wrapping B type passwords with encrypted Argon2 (T112359) (duration: 16m 00s)
  • 23:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1238 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P63043 and previous config saved to /var/cache/conftool/dbconfig/20240523-231511-ladsgroup.json
  • 23:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2217 (T364299)', diff saved to https://phabricator.wikimedia.org/P63042 and previous config saved to /var/cache/conftool/dbconfig/20240523-231302-marostegui.json
  • 23:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 23:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 23:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214 (T364299)', diff saved to https://phabricator.wikimedia.org/P63041 and previous config saved to /var/cache/conftool/dbconfig/20240523-231238-marostegui.json
  • 23:11 zabe@deploy1002: zabe: Continuing with sync
  • 23:10 zabe@deploy1002: zabe: Backport for Deploy configuration for wrapping B type passwords with encrypted Argon2 (T112359) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:08 zabe@deploy1002: Started scap: Backport for Deploy configuration for wrapping B type passwords with encrypted Argon2 (T112359)
  • 23:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1238 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P63040 and previous config saved to /var/cache/conftool/dbconfig/20240523-230005-ladsgroup.json
  • 22:57 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214', diff saved to https://phabricator.wikimedia.org/P63039 and previous config saved to /var/cache/conftool/dbconfig/20240523-225730-marostegui.json
  • 22:54 eileen: tools upgraded from bce5f52b to 91893e29
  • 22:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1238 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P63038 and previous config saved to /var/cache/conftool/dbconfig/20240523-224459-ladsgroup.json
  • 22:43 zabe@deploy1002: Finished scap: Backport for Stop writing to af_user(_text)/afh_user(_text) in group0 wikis (T337920) (duration: 18m 39s)
  • 22:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214', diff saved to https://phabricator.wikimedia.org/P63037 and previous config saved to /var/cache/conftool/dbconfig/20240523-224222-marostegui.json
  • 22:30 zabe@deploy1002: zabe: Continuing with sync
  • 22:27 zabe@deploy1002: zabe: Backport for Stop writing to af_user(_text)/afh_user(_text) in group0 wikis (T337920) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214 (T364299)', diff saved to https://phabricator.wikimedia.org/P63036 and previous config saved to /var/cache/conftool/dbconfig/20240523-222714-marostegui.json
  • 22:26 eileen: civicrm upgraded from 72aa5118 to 6c1fdd4f
  • 22:24 zabe@deploy1002: Started scap: Backport for Stop writing to af_user(_text)/afh_user(_text) in group0 wikis (T337920)
  • 22:11 eileen: civicrm upgraded from 22a38356 to 72aa5118
  • 21:52 thcipriani@deploy1002: Finished scap: Backport for wikitech: (Un)block GitLab accounts when (un)blocked on wikitech (T316418) (duration: 18m 01s)
  • 21:46 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2214 (T364299)', diff saved to https://phabricator.wikimedia.org/P63035 and previous config saved to /var/cache/conftool/dbconfig/20240523-214614-marostegui.json
  • 21:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2214.codfw.wmnet with reason: Maintenance
  • 21:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2214.codfw.wmnet with reason: Maintenance
  • 21:40 thcipriani@deploy1002: thcipriani and bd808: Continuing with sync
  • 21:37 thcipriani@deploy1002: thcipriani and bd808: Backport for wikitech: (Un)block GitLab accounts when (un)blocked on wikitech (T316418) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:34 thcipriani@deploy1002: Started scap: Backport for wikitech: (Un)block GitLab accounts when (un)blocked on wikitech (T316418)
  • 21:29 eileen: civicrm upgraded from 55cb3cf7 to 22a38356
  • 21:26 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 21:18 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 21:14 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 21:14 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 21:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 21:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 21:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T364299)', diff saved to https://phabricator.wikimedia.org/P63034 and previous config saved to /var/cache/conftool/dbconfig/20240523-211044-marostegui.json
  • 20:59 eileen: civicrm upgraded from de92d6bc to 55cb3cf7
  • 20:55 jsn@deploy1002: Finished scap: Backport for Always use desktop watchlist HTML on mobile (T109277) (duration: 16m 23s)
  • 20:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P63033 and previous config saved to /var/cache/conftool/dbconfig/20240523-205536-marostegui.json
  • 20:44 jsn@deploy1002: jdlrobson and jsn: Continuing with sync
  • 20:42 jsn@deploy1002: jdlrobson and jsn: Backport for Always use desktop watchlist HTML on mobile (T109277) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P63032 and previous config saved to /var/cache/conftool/dbconfig/20240523-204028-marostegui.json
  • 20:39 jsn@deploy1002: Started scap: Backport for Always use desktop watchlist HTML on mobile (T109277)
  • 20:36 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 20:36 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 20:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T364299)', diff saved to https://phabricator.wikimedia.org/P63031 and previous config saved to /var/cache/conftool/dbconfig/20240523-202520-marostegui.json
  • 20:24 jsn@deploy1002: Finished scap: Backport for CommonSettings: Load AutoModerator extension (T361643), InitialiseSettings: testwiki enable AutoModerator (T361643) (duration: 17m 30s)
  • 20:24 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 20:23 cdanis@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 20:21 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 20:20 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 20:12 jsn@deploy1002: jsn: Continuing with sync
  • 20:09 jsn@deploy1002: jsn: Backport for CommonSettings: Load AutoModerator extension (T361643), InitialiseSettings: testwiki enable AutoModerator (T361643) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:08 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
  • 20:07 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/toolhub: apply
  • 20:06 jsn@deploy1002: Started scap: Backport for CommonSettings: Load AutoModerator extension (T361643), InitialiseSettings: testwiki enable AutoModerator (T361643)
  • 20:06 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
  • 20:05 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/toolhub: apply
  • 20:05 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 20:04 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
  • 20:04 bd808@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply
  • 19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T364299)', diff saved to https://phabricator.wikimedia.org/P63030 and previous config saved to /var/cache/conftool/dbconfig/20240523-194723-marostegui.json
  • 19:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 19:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 19:47 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T364299)', diff saved to https://phabricator.wikimedia.org/P63029 and previous config saved to /var/cache/conftool/dbconfig/20240523-194659-marostegui.json
  • 19:38 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 19:38 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 19:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P63028 and previous config saved to /var/cache/conftool/dbconfig/20240523-193152-marostegui.json
  • 19:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P63027 and previous config saved to /var/cache/conftool/dbconfig/20240523-191644-marostegui.json
  • 19:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T364299)', diff saved to https://phabricator.wikimedia.org/P63026 and previous config saved to /var/cache/conftool/dbconfig/20240523-190136-marostegui.json
  • 18:55 cdanis@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 18:55 cdanis@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 18:55 cdanis@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 18:54 cdanis@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 18:48 cdanis: T365626 helmfile destroy'd all opentelemetry-collector releases
  • 18:32 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2050.codfw.wmnet with OS bookworm
  • 18:26 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1050.eqiad.wmnet with OS bookworm
  • 18:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 18:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T364299)', diff saved to https://phabricator.wikimedia.org/P63025 and previous config saved to /var/cache/conftool/dbconfig/20240523-181643-marostegui.json
  • 18:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 18:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 18:16 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T364299)', diff saved to https://phabricator.wikimedia.org/P63024 and previous config saved to /var/cache/conftool/dbconfig/20240523-181630-marostegui.json
  • 18:14 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2050.codfw.wmnet with reason: host reimage
  • 18:11 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2050.codfw.wmnet with reason: host reimage
  • 18:09 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1050.eqiad.wmnet with reason: host reimage
  • 18:06 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1050.eqiad.wmnet with reason: host reimage
  • 18:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P63023 and previous config saved to /var/cache/conftool/dbconfig/20240523-180122-marostegui.json
  • 17:53 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2050.codfw.wmnet with OS bookworm
  • 17:53 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1050.eqiad.wmnet with OS bookworm
  • 17:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P63022 and previous config saved to /var/cache/conftool/dbconfig/20240523-174614-marostegui.json
  • 17:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T364299)', diff saved to https://phabricator.wikimedia.org/P63021 and previous config saved to /var/cache/conftool/dbconfig/20240523-173106-marostegui.json
  • 17:16 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
  • 17:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63020 and previous config saved to /var/cache/conftool/dbconfig/20240523-171022-arnaudb.json
  • 17:06 bd808@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply
  • 17:05 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 17:04 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 17:04 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 17:03 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 17:03 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 17:03 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 16:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63019 and previous config saved to /var/cache/conftool/dbconfig/20240523-165516-arnaudb.json
  • 16:43 dduvall@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
  • 16:43 dduvall@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
  • 16:43 dduvall@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply
  • 16:42 dduvall@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply
  • 16:42 dduvall@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply
  • 16:42 dduvall@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply
  • 16:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63018 and previous config saved to /var/cache/conftool/dbconfig/20240523-164010-arnaudb.json
  • 16:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2169 (T364299)', diff saved to https://phabricator.wikimedia.org/P63017 and previous config saved to /var/cache/conftool/dbconfig/20240523-164002-marostegui.json
  • 16:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 16:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 16:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T364299)', diff saved to https://phabricator.wikimedia.org/P63016 and previous config saved to /var/cache/conftool/dbconfig/20240523-163938-marostegui.json
  • 16:37 dduvall: destroying all blubberoid deployments as part of its decommissioning (T318289)
  • 16:27 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/image-suggestion: apply
  • 16:26 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/image-suggestion: apply
  • 16:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63015 and previous config saved to /var/cache/conftool/dbconfig/20240523-162457-arnaudb.json
  • 16:24 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
  • 16:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P63014 and previous config saved to /var/cache/conftool/dbconfig/20240523-162430-marostegui.json
  • 16:23 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
  • 16:21 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
  • 16:20 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
  • 16:19 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
  • 16:18 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
  • 16:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63013 and previous config saved to /var/cache/conftool/dbconfig/20240523-161755-arnaudb.json
  • 16:17 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
  • 16:16 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
  • 16:15 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
  • 16:15 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
  • 16:14 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 16:13 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 16:13 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 16:12 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 16:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63012 and previous config saved to /var/cache/conftool/dbconfig/20240523-160951-arnaudb.json
  • 16:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P63011 and previous config saved to /var/cache/conftool/dbconfig/20240523-160921-marostegui.json
  • 16:08 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 16:08 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 16:05 topranks: enabling BFD on transit circuit to telxius in magru
  • 16:04 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/image-suggestion: apply
  • 16:04 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 16:04 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 16:03 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/image-suggestion: apply
  • 16:02 rzl@deploy1002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 16:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63010 and previous config saved to /var/cache/conftool/dbconfig/20240523-160249-arnaudb.json
  • 16:02 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
  • 16:02 rzl@deploy1002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
  • 16:01 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
  • 16:00 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
  • 15:59 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
  • 15:58 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
  • 15:57 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
  • 15:56 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
  • 15:55 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 15:55 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
  • 15:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63009 and previous config saved to /var/cache/conftool/dbconfig/20240523-155444-arnaudb.json
  • 15:54 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T364299)', diff saved to https://phabricator.wikimedia.org/P63008 and previous config saved to /var/cache/conftool/dbconfig/20240523-155413-marostegui.json
  • 15:53 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
  • 15:51 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 15:50 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 15:47 rzl@deploy1002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 15:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63006 and previous config saved to /var/cache/conftool/dbconfig/20240523-154743-arnaudb.json
  • 15:47 rzl@deploy1002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
  • 15:41 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
  • 15:41 rzl@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 15:40 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
  • 15:40 rzl@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
  • 15:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63004 and previous config saved to /var/cache/conftool/dbconfig/20240523-153937-arnaudb.json
  • 15:37 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
  • 15:36 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/page-analytics: apply
  • 15:34 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
  • 15:34 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/media-analytics: apply
  • 15:32 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
  • 15:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63003 and previous config saved to /var/cache/conftool/dbconfig/20240523-153237-arnaudb.json
  • 15:32 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
  • 15:30 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
  • 15:30 jhathaway: moving phabricator outbound email to postfix based mx-out{1001,2001}
  • 15:29 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
  • 15:28 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
  • 15:28 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
  • 15:26 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 15:26 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
  • 15:25 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 15:25 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
  • 15:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db1235 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63002 and previous config saved to /var/cache/conftool/dbconfig/20240523-152431-arnaudb.json
  • 15:22 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 15:22 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 15:18 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
  • 15:18 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
  • 15:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63001 and previous config saved to /var/cache/conftool/dbconfig/20240523-151731-arnaudb.json
  • 15:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1235.eqiad.wmnet with OS bookworm
  • 15:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P63000 and previous config saved to /var/cache/conftool/dbconfig/20240523-150225-arnaudb.json
  • 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T364299)', diff saved to https://phabricator.wikimedia.org/P62999 and previous config saved to /var/cache/conftool/dbconfig/20240523-145938-marostegui.json
  • 14:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 14:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 14:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 14:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 14:59 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T364299)', diff saved to https://phabricator.wikimedia.org/P62998 and previous config saved to /var/cache/conftool/dbconfig/20240523-145858-marostegui.json
  • 14:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
  • 14:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
  • 14:51 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 14:51 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 14:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62997 and previous config saved to /var/cache/conftool/dbconfig/20240523-144719-arnaudb.json
  • 14:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host stat1008.eqiad.wmnet
  • 14:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P62996 and previous config saved to /var/cache/conftool/dbconfig/20240523-144351-marostegui.json
  • 14:39 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1235.eqiad.wmnet with OS bookworm
  • 14:38 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1235.eqiad.wmnet with reason: reimage
  • 14:38 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1235.eqiad.wmnet with reason: reimage
  • 14:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1235 T364290', diff saved to https://phabricator.wikimedia.org/P62995 and previous config saved to /var/cache/conftool/dbconfig/20240523-143742-arnaudb.json
  • 14:35 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host stat1008.eqiad.wmnet
  • 14:34 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1041.eqiad.wmnet with OS bookworm
  • 14:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2116 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62994 and previous config saved to /var/cache/conftool/dbconfig/20240523-143213-arnaudb.json
  • 14:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2116.codfw.wmnet with OS bookworm
  • 14:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P62993 and previous config saved to /var/cache/conftool/dbconfig/20240523-142843-marostegui.json
  • 14:26 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 5769
  • 14:25 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 5769
  • 14:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T364299)', diff saved to https://phabricator.wikimedia.org/P62992 and previous config saved to /var/cache/conftool/dbconfig/20240523-141334-marostegui.json
  • 14:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2116.codfw.wmnet with reason: host reimage
  • 14:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2116.codfw.wmnet with reason: host reimage
  • 13:56 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:49 reedy@deploy1002: Synchronized wmf-config/interwiki-labs.php: (no justification provided) (duration: 16m 30s)
  • 13:46 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2116.codfw.wmnet with OS bookworm
  • 13:41 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2051.codfw.wmnet with OS bookworm
  • 13:36 arnaudb@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2116.codfw.wmnet with OS bookworm
  • 13:29 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1051.eqiad.wmnet with OS bookworm
  • 13:22 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2051.codfw.wmnet with reason: host reimage
  • 13:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2116.codfw.wmnet with reason: host reimage
  • 13:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T364299)', diff saved to https://phabricator.wikimedia.org/P62991 and previous config saved to /var/cache/conftool/dbconfig/20240523-131734-marostegui.json
  • 13:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 13:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 13:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T364299)', diff saved to https://phabricator.wikimedia.org/P62990 and previous config saved to /var/cache/conftool/dbconfig/20240523-131710-marostegui.json
  • 13:16 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2051.codfw.wmnet with reason: host reimage
  • 13:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2116.codfw.wmnet with reason: host reimage
  • 13:13 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1051.eqiad.wmnet with reason: host reimage
  • 13:10 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1051.eqiad.wmnet with reason: host reimage
  • 13:02 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P62989 and previous config saved to /var/cache/conftool/dbconfig/20240523-130202-marostegui.json
  • 12:59 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2116.codfw.wmnet with OS bookworm
  • 12:58 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2051.codfw.wmnet with OS bookworm
  • 12:57 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1051.eqiad.wmnet with OS bookworm
  • 12:57 vgutierrez: repool upload@esams with IPIP encapsulation enabled - T357257
  • 12:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2116.codfw.wmnet with reason: reimage
  • 12:56 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2116.codfw.wmnet with reason: reimage
  • 12:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2116 T364290', diff saved to https://phabricator.wikimedia.org/P62988 and previous config saved to /var/cache/conftool/dbconfig/20240523-125641-arnaudb.json
  • 12:50 vgutierrez: rolling restart of pybal on lvs3010 and lvs3009 - T357257
  • 12:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62987 and previous config saved to /var/cache/conftool/dbconfig/20240523-124832-arnaudb.json
  • 12:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P62986 and previous config saved to /var/cache/conftool/dbconfig/20240523-124654-marostegui.json
  • 12:21 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2052.codfw.wmnet with OS bookworm
  • 12:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62983 and previous config saved to /var/cache/conftool/dbconfig/20240523-121819-arnaudb.json
  • 12:17 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1052.eqiad.wmnet with OS bookworm
  • 12:04 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2052.codfw.wmnet with reason: host reimage
  • 12:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62982 and previous config saved to /var/cache/conftool/dbconfig/20240523-120313-arnaudb.json
  • 12:01 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1052.eqiad.wmnet with reason: host reimage
  • 12:01 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2052.codfw.wmnet with reason: host reimage
  • 11:56 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1052.eqiad.wmnet with reason: host reimage
  • 11:52 vgutierrez: depool upload@esams before enabling IPIP encapsulation - T357257
  • 11:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62981 and previous config saved to /var/cache/conftool/dbconfig/20240523-114807-arnaudb.json
  • 11:43 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2052.codfw.wmnet with OS bookworm
  • 11:43 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1052.eqiad.wmnet with OS bookworm
  • 11:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T364299)', diff saved to https://phabricator.wikimedia.org/P62980 and previous config saved to /var/cache/conftool/dbconfig/20240523-114259-marostegui.json
  • 11:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 11:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 11:40 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host stat1008.eqiad.wmnet with OS bullseye
  • 11:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62979 and previous config saved to /var/cache/conftool/dbconfig/20240523-113301-arnaudb.json
  • 11:27 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62978 and previous config saved to /var/cache/conftool/dbconfig/20240523-112704-root.json
  • 11:24 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2053.codfw.wmnet with OS bookworm
  • 11:18 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1053.eqiad.wmnet with OS bookworm
  • 11:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62977 and previous config saved to /var/cache/conftool/dbconfig/20240523-111755-arnaudb.json
  • 11:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62976 and previous config saved to /var/cache/conftool/dbconfig/20240523-111157-root.json
  • 11:11 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on stat1008.eqiad.wmnet with reason: host reimage
  • 11:08 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on stat1008.eqiad.wmnet with reason: host reimage
  • 11:06 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2053.codfw.wmnet with reason: host reimage
  • 11:02 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1053.eqiad.wmnet with reason: host reimage
  • 11:02 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2053.codfw.wmnet with reason: host reimage
  • 11:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62975 and previous config saved to /var/cache/conftool/dbconfig/20240523-110249-arnaudb.json
  • 10:57 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1053.eqiad.wmnet with reason: host reimage
  • 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2170.codfw.wmnet
  • 10:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62974 and previous config saved to /var/cache/conftool/dbconfig/20240523-105651-root.json
  • 10:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2130.codfw.wmnet with OS bookworm
  • 10:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 10:47 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 10:45 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host stat1008.eqiad.wmnet with OS bullseye
  • 10:44 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc2053.codfw.wmnet with OS bookworm
  • 10:44 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc1053.eqiad.wmnet with OS bookworm
  • 10:42 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2170.codfw.wmnet
  • 10:42 hnowlan@cumin1002: conftool action : set/pooled=no; selector: name=wikikube-worker2001.codfw.wmnet
  • 10:41 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62973 and previous config saved to /var/cache/conftool/dbconfig/20240523-104145-root.json
  • 10:40 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host stat1008.eqiad.wmnet with OS bullseye
  • 10:39 hnowlan@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-worker2001.codfw.wmnet
  • 10:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2130.codfw.wmnet with reason: host reimage
  • 10:26 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62972 and previous config saved to /var/cache/conftool/dbconfig/20240523-102639-root.json
  • 10:25 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2130.codfw.wmnet with reason: host reimage
  • 10:25 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host stat1008.eqiad.wmnet with OS bullseye
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62971 and previous config saved to /var/cache/conftool/dbconfig/20240523-101133-root.json
  • 10:08 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host stat1008.eqiad.wmnet with OS bullseye
  • 10:06 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2130.codfw.wmnet with OS bookworm
  • 10:06 btullis@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool device-analytics in eqiad: maintenance
  • 10:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db2130 T364290', diff saved to https://phabricator.wikimedia.org/P62970 and previous config saved to /var/cache/conftool/dbconfig/20240523-100452-arnaudb.json
  • 10:04 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2130.codfw.wmnet with reason: reimage
  • 10:04 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2130.codfw.wmnet with reason: reimage
  • 10:01 btullis@cumin1002: START - Cookbook sre.discovery.service-route pool device-analytics in eqiad: maintenance
  • 09:57 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 09:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 09:56 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62969 and previous config saved to /var/cache/conftool/dbconfig/20240523-095627-root.json
  • 09:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2146.codfw.wmnet
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P62968 and previous config saved to /var/cache/conftool/dbconfig/20240523-095338-marostegui.json
  • 09:50 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 09:49 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 09:47 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host stat1008.eqiad.wmnet with OS bullseye
  • 09:42 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2146.codfw.wmnet
  • 09:42 btullis@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool device-analytics in eqiad: maintenance
  • 09:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T364299)', diff saved to https://phabricator.wikimedia.org/P62967 and previous config saved to /var/cache/conftool/dbconfig/20240523-093830-marostegui.json
  • 09:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T364299)', diff saved to https://phabricator.wikimedia.org/P62966 and previous config saved to /var/cache/conftool/dbconfig/20240523-093720-marostegui.json
  • 09:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 09:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 09:37 btullis@cumin1002: START - Cookbook sre.discovery.service-route depool device-analytics in eqiad: maintenance
  • 09:37 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 09:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 09:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T364299)', diff saved to https://phabricator.wikimedia.org/P62965 and previous config saved to /var/cache/conftool/dbconfig/20240523-093703-marostegui.json
  • 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2145.codfw.wmnet
  • 09:30 moritzm: installing zeromq3 bugfix updates from Bullseye point release
  • 09:30 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
  • 09:29 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/sessionstore: apply
  • 09:24 btullis@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool device-analytics in codfw: maintenance
  • 09:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P62963 and previous config saved to /var/cache/conftool/dbconfig/20240523-092153-marostegui.json
  • 09:19 btullis@cumin1002: START - Cookbook sre.discovery.service-route pool device-analytics in codfw: maintenance
  • 09:18 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2145.codfw.wmnet
  • 09:18 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 09:17 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 09:12 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2054.codfw.wmnet with OS bookworm
  • 09:12 btullis@cumin1002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool device-analytics in codfw: maintenance
  • 09:08 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS bookworm
  • 09:07 btullis@cumin1002: START - Cookbook sre.discovery.service-route depool device-analytics in codfw: maintenance
  • 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P62962 and previous config saved to /var/cache/conftool/dbconfig/20240523-090645-marostegui.json
  • 09:04 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 09:04 btullis@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 08:54 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2054.codfw.wmnet with reason: host reimage
  • 08:51 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2054.codfw.wmnet with reason: host reimage
  • 08:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T364299)', diff saved to https://phabricator.wikimedia.org/P62961 and previous config saved to /var/cache/conftool/dbconfig/20240523-085137-marostegui.json
  • 08:51 jiji@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
  • 08:51 marostegui: Deploy schema change on s4 eqiad old master db1238 dbmaint T356166
  • 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1238', diff saved to https://phabricator.wikimedia.org/P62960 and previous config saved to /var/cache/conftool/dbconfig/20240523-085023-root.json
  • 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2130.codfw.wmnet
  • 08:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db1238 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62959 and previous config saved to /var/cache/conftool/dbconfig/20240523-084834-arnaudb.json
  • 08:48 jiji@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
  • 08:44 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 08:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2130.codfw.wmnet
  • 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2116.codfw.wmnet
  • 08:35 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2001.codfw.wmnet
  • 08:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1238.eqiad.wmnet with OS bookworm
  • 08:33 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc2054.codfw.wmnet with OS bookworm
  • 08:33 jiji@cumin2002: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS bookworm
  • 08:31 aklapper@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.43.0-wmf.6 refs T361400
  • 08:25 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2001.codfw.wmnet with reason: host reimage
  • 08:23 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2001.codfw.wmnet with reason: host reimage
  • 08:20 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
  • 08:18 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2116.codfw.wmnet
  • 08:15 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
  • 08:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1238.eqiad.wmnet with reason: host reimage
  • 08:11 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-serve2001.codfw.wmnet
  • 08:10 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1238.eqiad.wmnet with reason: host reimage
  • 08:07 jayme@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-worker2001.codfw.wmnet with OS bullseye
  • 08:02 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from kubernetes2023 to wikikube-worker2001
  • 08:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2001
  • 08:01 ayounsi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2001
  • 08:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:01 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2023 to wikikube-worker2001 - ayounsi@cumin1002"
  • 07:59 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming kubernetes2023 to wikikube-worker2001 - ayounsi@cumin1002"
  • 07:57 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 07:57 ayounsi@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2023 to wikikube-worker2001
  • 07:56 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1238.eqiad.wmnet with OS bookworm
  • 07:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1238.eqiad.wmnet with reason: reimage
  • 07:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1238.eqiad.wmnet with reason: reimage
  • 07:49 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.rename (exit_code=99) from kubernetes2023 to wikikube-worker2001
  • 07:48 ayounsi@cumin1002: START - Cookbook sre.hosts.rename from kubernetes2023 to wikikube-worker2001
  • 07:42 dcausse@deploy1002: Finished scap: Backport for extension registration: Fix handling of null default values (T365190) (duration: 16m 56s)
  • 07:30 dcausse@deploy1002: dcausse: Continuing with sync
  • 07:28 dcausse@deploy1002: dcausse: Backport for extension registration: Fix handling of null default values (T365190) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:25 dcausse@deploy1002: Started scap: Backport for extension registration: Fix handling of null default values (T365190)
  • 07:20 dcausse@deploy1002: Finished scap: Backport for cirrus: Keep archive writes running through cirrus (duration: 17m 19s)
  • 07:08 dcausse@deploy1002: ebernhardson and dcausse: Continuing with sync
  • 07:06 dcausse@deploy1002: ebernhardson and dcausse: Backport for cirrus: Keep archive writes running through cirrus synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:03 dcausse@deploy1002: Started scap: Backport for cirrus: Keep archive writes running through cirrus
  • 06:45 dcausse@deploy1002: Finished deploy [airflow-dags/search@49369da]: search: automate graph split and n3 dump generation (duration: 00m 19s)
  • 06:45 dcausse@deploy1002: Started deploy [airflow-dags/search@49369da]: search: automate graph split and n3 dump generation
  • 06:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62954 and previous config saved to /var/cache/conftool/dbconfig/20240523-064027-root.json
  • 06:31 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 06:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Depool db1238 T363689', diff saved to https://phabricator.wikimedia.org/P62953 and previous config saved to /var/cache/conftool/dbconfig/20240523-063025-arnaudb.json
  • 06:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62952 and previous config saved to /var/cache/conftool/dbconfig/20240523-062521-root.json
  • 06:24 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 06:24 jelto@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 06:16 jelto@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version
  • 06:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db1160 to s4 primary and set section read-write T363689', diff saved to https://phabricator.wikimedia.org/P62951 and previous config saved to /var/cache/conftool/dbconfig/20240523-061524-arnaudb.json
  • 06:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Set s4 eqiad as read-only for maintenance - T363689', diff saved to https://phabricator.wikimedia.org/P62950 and previous config saved to /var/cache/conftool/dbconfig/20240523-061408-arnaudb.json
  • 06:13 arnaudb: Starting s4 eqiad failover from db1238 to db1160 - T363689
  • 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62949 and previous config saved to /var/cache/conftool/dbconfig/20240523-061014-root.json
  • 05:57 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62948 and previous config saved to /var/cache/conftool/dbconfig/20240523-055747-root.json
  • 05:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1155.eqiad.wmnet with OS bookworm
  • 05:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62947 and previous config saved to /var/cache/conftool/dbconfig/20240523-055508-root.json
  • 05:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db1160 with weight 0 T363689', diff saved to https://phabricator.wikimedia.org/P62946 and previous config saved to /var/cache/conftool/dbconfig/20240523-054816-arnaudb.json
  • 05:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s4 T363689
  • 05:47 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s4 T363689
  • 05:42 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62945 and previous config saved to /var/cache/conftool/dbconfig/20240523-054240-root.json
  • 05:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62944 and previous config saved to /var/cache/conftool/dbconfig/20240523-054002-root.json
  • 05:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1155.eqiad.wmnet with reason: host reimage
  • 05:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1155.eqiad.wmnet with reason: host reimage
  • 05:27 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62942 and previous config saved to /var/cache/conftool/dbconfig/20240523-052734-root.json
  • 05:24 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62941 and previous config saved to /var/cache/conftool/dbconfig/20240523-052456-root.json
  • 05:17 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1155.eqiad.wmnet with OS bookworm
  • 05:12 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62940 and previous config saved to /var/cache/conftool/dbconfig/20240523-051228-root.json
  • 05:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1174 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62939 and previous config saved to /var/cache/conftool/dbconfig/20240523-050950-root.json
  • 05:08 marostegui: Install 10.6.18 on db1174 T365338
  • 05:06 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1174', diff saved to https://phabricator.wikimedia.org/P62938 and previous config saved to /var/cache/conftool/dbconfig/20240523-050626-root.json
  • 04:57 marostegui@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62937 and previous config saved to /var/cache/conftool/dbconfig/20240523-045722-root.json
  • 03:16 eileen: civicrm upgraded from 50211434 to 252eed3c
  • 02:56 eileen: config revision changed from f8af8188 to d8905b73
  • 02:54 eileen: config revision changed from b9fbe283 to f8af8188
  • 02:53 eileen: tools upgraded from ad48f63e to bce5f52b
  • 01:50 eileen: civicrm upgraded from 5cb7c467 to 50211434
  • 01:26 eileen: civicrm upgraded from 172feea2 to 5cb7c467
  • 00:29 ejegg: fundraising civicrm upgraded from 84c36324 to 172feea2

2024-05-22

  • 23:49 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T364299)', diff saved to https://phabricator.wikimedia.org/P62936 and previous config saved to /var/cache/conftool/dbconfig/20240522-234937-marostegui.json
  • 23:49 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 23:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 23:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T364299)', diff saved to https://phabricator.wikimedia.org/P62935 and previous config saved to /var/cache/conftool/dbconfig/20240522-234914-marostegui.json
  • 23:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P62934 and previous config saved to /var/cache/conftool/dbconfig/20240522-233406-marostegui.json
  • 23:18 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P62933 and previous config saved to /var/cache/conftool/dbconfig/20240522-231858-marostegui.json
  • 23:03 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T364299)', diff saved to https://phabricator.wikimedia.org/P62932 and previous config saved to /var/cache/conftool/dbconfig/20240522-230350-marostegui.json
  • 22:24 ryankemper@cumin2002: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons.
  • 21:57 ryankemper@cumin2002: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons.
  • 21:56 eileen: civicrm upgraded from b0a3965a to 84c36324
  • 21:54 ryankemper: T363973 Finished manual rolling restart of hadoop masters `an-master100[3,4].eqiad.wmnet`
  • 21:47 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:47 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:50 jdrewniak@deploy1002: Finished scap: Backport for Revert "Add exclusion behaviour for "width" option in Appearance menu" (T364015), Small font size is not applying to excluded pages (T364887 T365408) (duration: 16m 46s)
  • 20:44 ejegg: payments-wiki upgraded from 5b86bd09 to d871e439
  • 20:39 Lucas_WMDE: STOPPED lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["76318767"]' 2>&1 | tee -a ~/T315510-enwiki-5; date # ca. 1 hour and 20 minutes ago, after running for a bit over 6 days; some errors
  • 20:37 jdrewniak@deploy1002: jdrewniak and jdlrobson: Continuing with sync
  • 20:36 jdrewniak@deploy1002: jdrewniak and jdlrobson: Backport for Revert "Add exclusion behaviour for "width" option in Appearance menu" (T364015), Small font size is not applying to excluded pages (T364887 T365408) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:33 jdrewniak@deploy1002: Started scap: Backport for Revert "Add exclusion behaviour for "width" option in Appearance menu" (T364015), Small font size is not applying to excluded pages (T364887 T365408)
  • 18:33 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts sretest2002.wikimedia.org
  • 18:33 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:33 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2002.wikimedia.org decommissioned, removing all IPs except the asset tag one - cmooney@cumin1002"
  • 18:32 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2002.wikimedia.org decommissioned, removing all IPs except the asset tag one - cmooney@cumin1002"
  • 18:29 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 18:18 cmooney@cumin1002: START - Cookbook sre.hosts.decommission for hosts sretest2002.wikimedia.org
  • 18:16 cmooney@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host sretest2002.wikimedia.org with OS bookworm
  • 18:05 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2002.wikimedia.org with OS bookworm
  • 17:58 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 17:58 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 17:55 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2002.wikimedia.org with OS bookworm
  • 17:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62930 and previous config saved to /var/cache/conftool/dbconfig/20240522-173900-arnaudb.json
  • 17:24 topranks: Setting DHCP in codfw row A to 'forward-only' mode to troubleshoot DHCP bug T365204
  • 17:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62929 and previous config saved to /var/cache/conftool/dbconfig/20240522-172354-arnaudb.json
  • 17:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62928 and previous config saved to /var/cache/conftool/dbconfig/20240522-170848-arnaudb.json
  • 17:07 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2002.wikimedia.org with OS bookworm
  • 16:58 ejegg: standalone SmashPig upgraded from a9c5ee43 to edf573bb
  • 16:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62926 and previous config saved to /var/cache/conftool/dbconfig/20240522-165558-root.json
  • 16:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62925 and previous config saved to /var/cache/conftool/dbconfig/20240522-165340-arnaudb.json
  • 16:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62924 and previous config saved to /var/cache/conftool/dbconfig/20240522-164052-root.json
  • 16:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62923 and previous config saved to /var/cache/conftool/dbconfig/20240522-163834-arnaudb.json
  • 16:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62922 and previous config saved to /var/cache/conftool/dbconfig/20240522-162546-root.json
  • 16:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62921 and previous config saved to /var/cache/conftool/dbconfig/20240522-162327-arnaudb.json
  • 16:19 kamila@deploy1002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply
  • 16:19 kamila@deploy1002: helmfile [codfw] START helmfile.d/services/recommendation-api: apply
  • 16:13 James_F: Running `mwscript extensions/WikiLambda/maintenance/migrateZ16K1StringsToZ61s.php --wiki=wikifunctionswiki --implement` on mwmaint1002 for T287153
  • 16:10 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62920 and previous config saved to /var/cache/conftool/dbconfig/20240522-161039-root.json
  • 16:08 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
  • 16:08 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/sessionstore: apply
  • 16:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62919 and previous config saved to /var/cache/conftool/dbconfig/20240522-160821-arnaudb.json
  • 15:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Long schema change
  • 15:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Long schema change
  • 15:56 kamila@deploy1002: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply
  • 15:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1226', diff saved to https://phabricator.wikimedia.org/P62918 and previous config saved to /var/cache/conftool/dbconfig/20240522-155621-root.json
  • 15:55 kamila@deploy1002: helmfile [eqiad] START helmfile.d/services/recommendation-api: apply
  • 15:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1192 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62917 and previous config saved to /var/cache/conftool/dbconfig/20240522-155533-root.json
  • 15:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2130 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62916 and previous config saved to /var/cache/conftool/dbconfig/20240522-155315-arnaudb.json
  • 15:50 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2130.codfw.wmnet with OS bookworm
  • 15:44 elukey: upload to bookworm-wikimedia dragonfly-{dfdaemon,dfget}, calicoctl, calico-cni - T365253
  • 15:42 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
  • 15:42 kamila@deploy1002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
  • 15:42 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2001.codfw.wmnet with OS bookworm
  • 15:40 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
  • 15:40 kamila@deploy1002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
  • 15:39 kamila@deploy1002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
  • 15:39 kamila@deploy1002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
  • 15:34 damilare: civicrm upgraded from 8c5fee40 to b0a3965a
  • 15:32 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
  • 15:32 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/sessionstore: apply
  • 15:27 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2130.codfw.wmnet with reason: host reimage
  • 15:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2130.codfw.wmnet with reason: host reimage
  • 15:22 elukey@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-staging2001.codfw.wmnet with reason: host reimage
  • 15:19 elukey@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-staging2001.codfw.wmnet with reason: host reimage
  • 15:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T352010)', diff saved to https://phabricator.wikimedia.org/P62915 and previous config saved to /var/cache/conftool/dbconfig/20240522-151923-ladsgroup.json
  • 15:16 vgutierrez: repool upload@drmrs with IPIP encapsulation enabled - T357257
  • 15:16 fabfur: enabling puppet on all cp-ulsfo (T365566)
  • 15:16 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts contint1003.eqiad.wmnet
  • 15:16 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:16 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: contint1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin1002"
  • 15:14 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: contint1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin1002"
  • 15:10 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 15:10 vgutierrez: rolling restart of pybal on lvs6003 and lvs6002 - T357257
  • 15:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2130.codfw.wmnet with reason: reimage
  • 15:06 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2130.codfw.wmnet with reason: reimage
  • 15:06 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2130.codfw.wmnet with OS bookworm
  • 15:05 dzahn@cumin1002: START - Cookbook sre.hosts.decommission for hosts contint1003.eqiad.wmnet
  • 15:05 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2130', diff saved to https://phabricator.wikimedia.org/P62914 and previous config saved to /var/cache/conftool/dbconfig/20240522-150516-arnaudb.json
  • 15:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P62913 and previous config saved to /var/cache/conftool/dbconfig/20240522-150415-ladsgroup.json
  • 15:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T352010)', diff saved to https://phabricator.wikimedia.org/P62912 and previous config saved to /var/cache/conftool/dbconfig/20240522-150333-ladsgroup.json
  • 15:01 jynus: stopping eqiad mediabackups for cleaning up missing files T365607
  • 14:58 elukey@cumin1002: START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS bookworm
  • 14:57 hnowlan: running `puppet cert revoke sessionstore.discovery.wmnet` T363996
  • 14:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P62911 and previous config saved to /var/cache/conftool/dbconfig/20240522-144907-ladsgroup.json
  • 14:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P62910 and previous config saved to /var/cache/conftool/dbconfig/20240522-144826-ladsgroup.json
  • 14:43 vgutierrez: depool upload@drmrs before enabling IPIP encapsulation - T357257
  • 14:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2220 (T352010)', diff saved to https://phabricator.wikimedia.org/P62909 and previous config saved to /var/cache/conftool/dbconfig/20240522-143359-ladsgroup.json
  • 14:33 jayme: drained, cordoned and pooled=inactive kubernetes2023 and kubernetes2032 for cookbook testing - T350152 T365571
  • 14:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P62908 and previous config saved to /var/cache/conftool/dbconfig/20240522-143318-ladsgroup.json
  • 14:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62907 and previous config saved to /var/cache/conftool/dbconfig/20240522-143238-arnaudb.json
  • 14:32 jayme@cumin1002: conftool action : set/pooled=inactive; selector: name=kubernetes20(23|32).codfw.wmnet
  • 14:28 elukey: copy calico, istio-cni, kubernetes-node packages from bullseye-wikimedia to bookworm-wikimedia - T365253
  • 14:28 fabfur: disabling puppet on all cp-ulsfo to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1034852 selectively (T365566)
  • 14:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2213 (T352010)', diff saved to https://phabricator.wikimedia.org/P62906 and previous config saved to /var/cache/conftool/dbconfig/20240522-141809-ladsgroup.json
  • 14:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62905 and previous config saved to /var/cache/conftool/dbconfig/20240522-141732-arnaudb.json
  • 14:14 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for PrefixSearch: Make sure $prefix is a string (T365565) (duration: 14m 58s)
  • 14:02 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde: Continuing with sync
  • 14:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62904 and previous config saved to /var/cache/conftool/dbconfig/20240522-140225-arnaudb.json
  • 14:02 logmsgbot: lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde: Backport for PrefixSearch: Make sure $prefix is a string (T365565) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:59 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for PrefixSearch: Make sure $prefix is a string (T365565)
  • 13:55 moritzm: installing libcaca security updates
  • 13:53 vgutierrez: repool upload@eqiad with IPIP encapsulation enabled - T357257
  • 13:48 moritzm: installing bind9 security updates (client-side tools/libs)
  • 13:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62902 and previous config saved to /var/cache/conftool/dbconfig/20240522-134717-arnaudb.json
  • 13:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62901 and previous config saved to /var/cache/conftool/dbconfig/20240522-134646-arnaudb.json
  • 13:39 vgutierrez: rolling restart of pybal on lvs1020 and lvs1018 - T357257
  • 13:36 logmsgbot: lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for Change $wgUploadNavigationUrl for azwiki (T364674) (duration: 16m 27s)
  • 13:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62900 and previous config saved to /var/cache/conftool/dbconfig/20240522-133209-arnaudb.json
  • 13:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62899 and previous config saved to /var/cache/conftool/dbconfig/20240522-133140-arnaudb.json
  • 13:27 kormat@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 100%: Repool db1246 T364552', diff saved to https://phabricator.wikimedia.org/P62898 and previous config saved to /var/cache/conftool/dbconfig/20240522-132712-kormat.json
  • 13:25 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1211 (T364299)', diff saved to https://phabricator.wikimedia.org/P62897 and previous config saved to /var/cache/conftool/dbconfig/20240522-132526-marostegui.json
  • 13:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 13:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 13:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T364299)', diff saved to https://phabricator.wikimedia.org/P62896 and previous config saved to /var/cache/conftool/dbconfig/20240522-132501-marostegui.json
  • 13:24 logmsgbot: lucaswerkmeister-wmde@deploy1002 nmw03 and lucaswerkmeister-wmde: Continuing with sync
  • 13:23 logmsgbot: lucaswerkmeister-wmde@deploy1002 nmw03 and lucaswerkmeister-wmde: Backport for Change $wgUploadNavigationUrl for azwiki (T364674) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:20 logmsgbot: lucaswerkmeister-wmde@deploy1002 Started scap: Backport for Change $wgUploadNavigationUrl for azwiki (T364674)
  • 13:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62895 and previous config saved to /var/cache/conftool/dbconfig/20240522-131700-arnaudb.json
  • 13:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62894 and previous config saved to /var/cache/conftool/dbconfig/20240522-131634-arnaudb.json
  • 13:12 kormat@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 90%: Repool db1246 T364552', diff saved to https://phabricator.wikimedia.org/P62893 and previous config saved to /var/cache/conftool/dbconfig/20240522-131206-kormat.json
  • 13:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P62892 and previous config saved to /var/cache/conftool/dbconfig/20240522-130954-marostegui.json
  • 13:08 urbanecm@deploy1002: Finished scap: Backport for foundationwiki: Grant autopatrol to the editor group (T365584), Remove forward slashes (T332580 T363815) (duration: 25m 09s)
  • 13:05 fabfur: restarting all benthos instances in A:cp-ulsfo
  • 13:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62891 and previous config saved to /var/cache/conftool/dbconfig/20240522-130154-arnaudb.json
  • 13:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62890 and previous config saved to /var/cache/conftool/dbconfig/20240522-130128-arnaudb.json
  • 13:00 vgutierrez: depool upload@eqiad before enabling IPIP encapsulation - T357257
  • 13:00 brouberol@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:59 brouberol@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:57 kormat@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 75%: Repool db1246 T364552', diff saved to https://phabricator.wikimedia.org/P62889 and previous config saved to /var/cache/conftool/dbconfig/20240522-125659-kormat.json
  • 12:55 urbanecm@deploy1002: urbanecm and cyndywikime: Continuing with sync
  • 12:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P62888 and previous config saved to /var/cache/conftool/dbconfig/20240522-125446-marostegui.json
  • 12:50 vgutierrez: repool upload@magru with IPIP encapsulation enabled - T357257
  • 12:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2145 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62887 and previous config saved to /var/cache/conftool/dbconfig/20240522-124648-arnaudb.json
  • 12:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62886 and previous config saved to /var/cache/conftool/dbconfig/20240522-124622-arnaudb.json
  • 12:45 urbanecm@deploy1002: urbanecm and cyndywikime: Backport for foundationwiki: Grant autopatrol to the editor group (T365584), Remove forward slashes (T332580 T363815) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:44 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2145.codfw.wmnet with OS bookworm
  • 12:26 kormat@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 45%: Repool db1246 T364552', diff saved to https://phabricator.wikimedia.org/P62881 and previous config saved to /var/cache/conftool/dbconfig/20240522-122647-kormat.json
  • 12:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2145.codfw.wmnet with reason: host reimage
  • 12:20 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2145.codfw.wmnet with reason: host reimage
  • 12:18 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub-next: sync on staging
  • 12:18 vgutierrez: depool upload@magru before enabling IPIP encapsulation - T357257
  • 12:18 daniel@deploy1002: Finished scap: Backport for REST: fix metrics keys (T365111) (duration: 16m 53s)
  • 12:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62880 and previous config saved to /var/cache/conftool/dbconfig/20240522-121611-arnaudb.json
  • 12:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2213 (T352010)', diff saved to https://phabricator.wikimedia.org/P62879 and previous config saved to /var/cache/conftool/dbconfig/20240522-121245-ladsgroup.json
  • 12:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2213.codfw.wmnet with reason: Maintenance
  • 12:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2213.codfw.wmnet with reason: Maintenance
  • 12:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T352010)', diff saved to https://phabricator.wikimedia.org/P62878 and previous config saved to /var/cache/conftool/dbconfig/20240522-121222-ladsgroup.json
  • 12:11 kormat@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 30%: Repool db1246 T364552', diff saved to https://phabricator.wikimedia.org/P62877 and previous config saved to /var/cache/conftool/dbconfig/20240522-121139-kormat.json
  • 12:07 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 12:06 daniel@deploy1002: daniel: Continuing with sync
  • 12:04 daniel@deploy1002: daniel: Backport for REST: fix metrics keys (T365111) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:04 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2145.codfw.wmnet with OS bookworm
  • 12:02 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2145.codfw.wmnet with reason: reimage
  • 12:02 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2145.codfw.wmnet with reason: reimage
  • 12:02 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2145', diff saved to https://phabricator.wikimedia.org/P62876 and previous config saved to /var/cache/conftool/dbconfig/20240522-120223-arnaudb.json
  • 12:01 daniel@deploy1002: Started scap: Backport for REST: fix metrics keys (T365111)
  • 12:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db2153 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62875 and previous config saved to /var/cache/conftool/dbconfig/20240522-120105-arnaudb.json
  • 12:00 daniel@deploy1002: Finished scap: Backport for REST: fix metrics keys (T365111) (duration: 17m 25s)
  • 11:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P62874 and previous config saved to /var/cache/conftool/dbconfig/20240522-115714-ladsgroup.json
  • 11:56 kormat@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 15%: Repool db1246 T364552', diff saved to https://phabricator.wikimedia.org/P62873 and previous config saved to /var/cache/conftool/dbconfig/20240522-115633-kormat.json
  • 11:55 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 100%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62872 and previous config saved to /var/cache/conftool/dbconfig/20240522-115458-kormat.json
  • 11:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62871 and previous config saved to /var/cache/conftool/dbconfig/20240522-115313-arnaudb.json
  • 11:47 daniel@deploy1002: daniel: Continuing with sync
  • 11:45 daniel@deploy1002: daniel: Backport for REST: fix metrics keys (T365111) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 11:42 daniel@deploy1002: Started scap: Backport for REST: fix metrics keys (T365111)
  • 11:42 mvolz@deploy1002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 11:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P62870 and previous config saved to /var/cache/conftool/dbconfig/20240522-114206-ladsgroup.json
  • 11:41 mvolz@deploy1002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 11:39 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 90%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62869 and previous config saved to /var/cache/conftool/dbconfig/20240522-113952-kormat.json
  • 11:39 mvolz@deploy1002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 11:38 mvolz@deploy1002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 11:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62868 and previous config saved to /var/cache/conftool/dbconfig/20240522-113807-arnaudb.json
  • 11:29 mvolz@deploy1002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 11:29 mvolz@deploy1002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 11:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T352010)', diff saved to https://phabricator.wikimedia.org/P62867 and previous config saved to /var/cache/conftool/dbconfig/20240522-112658-ladsgroup.json
  • 11:24 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 75%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62866 and previous config saved to /var/cache/conftool/dbconfig/20240522-112444-kormat.json
  • 11:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62865 and previous config saved to /var/cache/conftool/dbconfig/20240522-112301-arnaudb.json
  • 11:09 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 60%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62864 and previous config saved to /var/cache/conftool/dbconfig/20240522-110938-kormat.json
  • 11:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62863 and previous config saved to /var/cache/conftool/dbconfig/20240522-110754-arnaudb.json
  • 11:02 hnowlan@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-ctrl2003.codfw.wmnet
  • 10:54 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 45%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62862 and previous config saved to /var/cache/conftool/dbconfig/20240522-105432-kormat.json
  • 10:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2153.codfw.wmnet with OS bookworm
  • 10:52 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62861 and previous config saved to /var/cache/conftool/dbconfig/20240522-105248-arnaudb.json
  • 10:40 hnowlan@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-ctrl2002.codfw.wmnet
  • 10:39 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 30%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62860 and previous config saved to /var/cache/conftool/dbconfig/20240522-103924-kormat.json
  • 10:37 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62859 and previous config saved to /var/cache/conftool/dbconfig/20240522-103742-arnaudb.json
  • 10:32 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
  • 10:27 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
  • 10:24 kormat@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 15%: repool clone source T364552', diff saved to https://phabricator.wikimedia.org/P62858 and previous config saved to /var/cache/conftool/dbconfig/20240522-102418-kormat.json
  • 10:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62857 and previous config saved to /var/cache/conftool/dbconfig/20240522-102236-arnaudb.json
  • 10:10 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2153.codfw.wmnet with OS bookworm
  • 10:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2153.codfw.wmnet with reason: reimage
  • 10:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2153.codfw.wmnet with reason: reimage
  • 10:09 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: name=wikikube-ctrl2001.codfw.wmnet
  • 10:08 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2153', diff saved to https://phabricator.wikimedia.org/P62856 and previous config saved to /var/cache/conftool/dbconfig/20240522-100834-arnaudb.json
  • 10:07 arnaudb@cumin1002: dbctl commit (dc=all): 'db2173 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62855 and previous config saved to /var/cache/conftool/dbconfig/20240522-100730-arnaudb.json
  • 10:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2173.codfw.wmnet with OS bookworm
  • 10:02 cmooney@cumin1002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: Release v0.6.5 update to hostname to bgp group mappings - cmooney@cumin1002 - T353464
  • 10:00 cmooney@cumin1002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: Release v0.6.5 update to hostname to bgp group mappings - cmooney@cumin1002 - T353464
  • 09:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62854 and previous config saved to /var/cache/conftool/dbconfig/20240522-095507-arnaudb.json
  • 09:44 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
  • 09:40 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
  • 09:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62853 and previous config saved to /var/cache/conftool/dbconfig/20240522-094001-arnaudb.json
  • 09:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62852 and previous config saved to /var/cache/conftool/dbconfig/20240522-092455-arnaudb.json
  • 09:22 hnowlan: running homer to add bgp status for wikikube-ctrl2001
  • 09:21 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2173.codfw.wmnet with OS bookworm
  • 09:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2211 (T352010)', diff saved to https://phabricator.wikimedia.org/P62851 and previous config saved to /var/cache/conftool/dbconfig/20240522-091942-ladsgroup.json
  • 09:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 09:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 09:14 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 09:14 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 09:13 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 09:13 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
  • 09:13 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
  • 09:12 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 09:12 btullis@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 09:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62850 and previous config saved to /var/cache/conftool/dbconfig/20240522-090949-arnaudb.json
  • 09:06 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 09:06 btullis@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 08:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62849 and previous config saved to /var/cache/conftool/dbconfig/20240522-085443-arnaudb.json
  • 08:51 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 26162
  • 08:50 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 26162
  • 08:49 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 20121
  • 08:49 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 20121
  • 08:49 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
  • 08:49 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 20121
  • 08:49 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 20121
  • 08:49 btullis@deploy1002: helmfile [staging] START helmfile.d/services/device-analytics: apply
  • 08:48 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 20121
  • 08:48 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 20121
  • 08:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)
  • 08:46 arnaudb@cumin1002: Updating IPMI password on 1 hosts - arnaudb@cumin1002
  • 08:46 arnaudb@cumin1002: START - Cookbook sre.hosts.ipmi-password-reset
  • 08:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)
  • 08:46 arnaudb@cumin1002: Updating IPMI password on 1 hosts - arnaudb@cumin1002
  • 08:45 arnaudb@cumin1002: START - Cookbook sre.hosts.ipmi-password-reset
  • 08:45 arnaudb@cumin1002: END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99)
  • 08:45 arnaudb@cumin1002: START - Cookbook sre.hosts.ipmi-password-reset
  • 08:41 aklapper@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.43.0-wmf.6 refs T361400
  • 08:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62848 and previous config saved to /var/cache/conftool/dbconfig/20240522-083937-arnaudb.json
  • 08:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62846 and previous config saved to /var/cache/conftool/dbconfig/20240522-082431-arnaudb.json
  • 08:16 hashar@deploy1002: Finished scap: Backport for Fix fatal error due to missing signature on very old comments (T365495) (duration: 16m 27s)
  • 08:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2173.codfw.wmnet with reason: reimage
  • 08:13 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2173.codfw.wmnet with reason: reimage
  • 08:11 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2173', diff saved to https://phabricator.wikimedia.org/P62845 and previous config saved to /var/cache/conftool/dbconfig/20240522-081059-arnaudb.json
  • 08:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62844 and previous config saved to /var/cache/conftool/dbconfig/20240522-080924-arnaudb.json
  • 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1249.eqiad.wmnet
  • 08:02 hashar@deploy1002: jforrester and hashar: Continuing with sync
  • 08:02 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1232.eqiad.wmnet with OS bookworm
  • 08:02 hashar@deploy1002: jforrester and hashar: Backport for Fix fatal error due to missing signature on very old comments (T365495) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:00 hashar@deploy1002: Started scap: Backport for Fix fatal error due to missing signature on very old comments (T365495)
  • 07:56 kartik@deploy1002: Finished scap: Backport for SpecialNotifyTranslators: Fix group id in dropdown (T253984) (duration: 22m 42s)
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62843 and previous config saved to /var/cache/conftool/dbconfig/20240522-075142-root.json
  • 07:43 kartik@deploy1002: abi and kartik: Continuing with sync
  • 07:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1154.eqiad.wmnet with OS bookworm
  • 07:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
  • 07:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62842 and previous config saved to /var/cache/conftool/dbconfig/20240522-073636-root.json
  • 07:36 kartik@deploy1002: abi and kartik: Backport for SpecialNotifyTranslators: Fix group id in dropdown (T253984) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:33 kartik@deploy1002: Started scap: Backport for SpecialNotifyTranslators: Fix group id in dropdown (T253984)
  • 07:33 ayounsi@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 07:33 ayounsi@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 07:32 moritzm: installing postgresql-11 security updates
  • 07:30 kartik@deploy1002: Finished scap: Backport for Disable Section Translation on simplewiki (T361597) (duration: 19m 47s)
  • 07:26 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db1232.eqiad.wmnet with OS bookworm
  • 07:25 ayounsi@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 07:25 ayounsi@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 07:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1232.eqiad.wmnet with reason: reimage
  • 07:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1232.eqiad.wmnet with reason: reimage
  • 07:23 moritzm: installing nodejs security updates
  • 07:23 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db1232', diff saved to https://phabricator.wikimedia.org/P62841 and previous config saved to /var/cache/conftool/dbconfig/20240522-072307-arnaudb.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62840 and previous config saved to /var/cache/conftool/dbconfig/20240522-072130-root.json
  • 07:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1154.eqiad.wmnet with reason: host reimage
  • 07:17 kartik@deploy1002: kartik: Continuing with sync
  • 07:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1154.eqiad.wmnet with reason: host reimage
  • 07:13 kartik@deploy1002: kartik: Backport for Disable Section Translation on simplewiki (T361597) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:10 kartik@deploy1002: Started scap: Backport for Disable Section Translation on simplewiki (T361597)
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62839 and previous config saved to /var/cache/conftool/dbconfig/20240522-070624-root.json
  • 07:03 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1154.eqiad.wmnet with OS bookworm
  • 07:02 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1249.eqiad.wmnet
  • 07:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1248.eqiad.wmnet
  • 06:58 marostegui: Reimage db1154 (sanitarium) there will be lag in s1, s3, s5 and s8 in wiki replicas
  • 06:53 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 06:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 06:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T352010)', diff saved to https://phabricator.wikimedia.org/P62838 and previous config saved to /var/cache/conftool/dbconfig/20240522-065340-ladsgroup.json
  • 06:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1248.eqiad.wmnet
  • 06:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1247.eqiad.wmnet
  • 06:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62837 and previous config saved to /var/cache/conftool/dbconfig/20240522-065117-root.json
  • 06:41 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1247.eqiad.wmnet
  • 06:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P62836 and previous config saved to /var/cache/conftool/dbconfig/20240522-063832-ladsgroup.json
  • 06:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62835 and previous config saved to /var/cache/conftool/dbconfig/20240522-063610-root.json
  • 06:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P62834 and previous config saved to /var/cache/conftool/dbconfig/20240522-062324-ladsgroup.json
  • 06:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62833 and previous config saved to /var/cache/conftool/dbconfig/20240522-062103-root.json
  • 06:19 marostegui: Install 10.6.18 on db1249 T365338
  • 06:18 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1249', diff saved to https://phabricator.wikimedia.org/P62832 and previous config saved to /var/cache/conftool/dbconfig/20240522-061806-root.json
  • 06:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62831 and previous config saved to /var/cache/conftool/dbconfig/20240522-060901-root.json
  • 06:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T352010)', diff saved to https://phabricator.wikimedia.org/P62830 and previous config saved to /var/cache/conftool/dbconfig/20240522-060814-ladsgroup.json
  • 05:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1249.eqiad.wmnet with OS bookworm
  • 05:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1249 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62829 and previous config saved to /var/cache/conftool/dbconfig/20240522-055355-root.json
  • 05:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2220 (T352010)', diff saved to https://phabricator.wikimedia.org/P62828 and previous config saved to /var/cache/conftool/dbconfig/20240522-054857-ladsgroup.json
  • 05:48 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 05:48 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2220.codfw.wmnet with reason: Maintenance
  • 05:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T352010)', diff saved to https://phabricator.wikimedia.org/P62827 and previous config saved to /var/cache/conftool/dbconfig/20240522-054834-ladsgroup.json
  • 05:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1249.eqiad.wmnet with reason: host reimage
  • 05:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1249.eqiad.wmnet with reason: host reimage
  • 05:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P62826 and previous config saved to /var/cache/conftool/dbconfig/20240522-053326-ladsgroup.json
  • 05:22 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1249.eqiad.wmnet with OS bookworm
  • 05:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1249', diff saved to https://phabricator.wikimedia.org/P62825 and previous config saved to /var/cache/conftool/dbconfig/20240522-052108-root.json
  • 05:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P62824 and previous config saved to /var/cache/conftool/dbconfig/20240522-051818-ladsgroup.json
  • 05:07 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1192 for a schema change', diff saved to https://phabricator.wikimedia.org/P62823 and previous config saved to /var/cache/conftool/dbconfig/20240522-050727-root.json
  • 05:07 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Long schema change
  • 05:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Long schema change
  • 05:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T352010)', diff saved to https://phabricator.wikimedia.org/P62822 and previous config saved to /var/cache/conftool/dbconfig/20240522-050310-ladsgroup.json
  • 04:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2192 (T352010)', diff saved to https://phabricator.wikimedia.org/P62821 and previous config saved to /var/cache/conftool/dbconfig/20240522-041922-ladsgroup.json
  • 04:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 04:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 04:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T352010)', diff saved to https://phabricator.wikimedia.org/P62820 and previous config saved to /var/cache/conftool/dbconfig/20240522-041858-ladsgroup.json
  • 04:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P62819 and previous config saved to /var/cache/conftool/dbconfig/20240522-040349-ladsgroup.json
  • 03:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P62818 and previous config saved to /var/cache/conftool/dbconfig/20240522-034840-ladsgroup.json
  • 03:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T352010)', diff saved to https://phabricator.wikimedia.org/P62817 and previous config saved to /var/cache/conftool/dbconfig/20240522-033332-ladsgroup.json
  • 02:21 eileen: civicrm upgraded from f1c24cb7 to c9d64b68
  • 02:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1203 (T364299)', diff saved to https://phabricator.wikimedia.org/P62816 and previous config saved to /var/cache/conftool/dbconfig/20240522-021116-marostegui.json
  • 02:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 02:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 02:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T364299)', diff saved to https://phabricator.wikimedia.org/P62815 and previous config saved to /var/cache/conftool/dbconfig/20240522-021053-marostegui.json
  • 01:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P62814 and previous config saved to /var/cache/conftool/dbconfig/20240522-015545-marostegui.json
  • 01:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P62813 and previous config saved to /var/cache/conftool/dbconfig/20240522-014037-marostegui.json
  • 01:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T364299)', diff saved to https://phabricator.wikimedia.org/P62812 and previous config saved to /var/cache/conftool/dbconfig/20240522-012529-marostegui.json
  • 01:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T352010)', diff saved to https://phabricator.wikimedia.org/P62811 and previous config saved to /var/cache/conftool/dbconfig/20240522-011536-ladsgroup.json
  • 01:15 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 01:15 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 01:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T352010)', diff saved to https://phabricator.wikimedia.org/P62810 and previous config saved to /var/cache/conftool/dbconfig/20240522-011512-ladsgroup.json
  • 01:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P62809 and previous config saved to /var/cache/conftool/dbconfig/20240522-010004-ladsgroup.json
  • 00:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P62808 and previous config saved to /var/cache/conftool/dbconfig/20240522-004456-ladsgroup.json
  • 00:33 eileen: civicrm upgraded from c77df721 to f1c24cb7
  • 00:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T352010)', diff saved to https://phabricator.wikimedia.org/P62807 and previous config saved to /var/cache/conftool/dbconfig/20240522-002948-ladsgroup.json

2024-05-21

  • 23:46 eileen: civicrm upgraded from c77df721 to 9f65d36a
  • 23:40 eileen: config revision changed from 22106526 to b9fbe283
  • 23:40 eileen: civicrm upgraded from c77df721 to 9f65d36a
  • 23:39 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Apply change that makes encryption optional - eevans@cumin1002
  • 23:23 zabe@deploy1002: Finished scap: Backport for Use encrypted Argon2 Hashes to store user passwords (T150647 T216682) (duration: 26m 51s)
  • 23:19 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Apply change that makes encryption optional - eevans@cumin1002
  • 23:10 zabe@deploy1002: zabe: Continuing with sync
  • 22:59 zabe@deploy1002: zabe: Backport for Use encrypted Argon2 Hashes to store user passwords (T150647 T216682) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:56 zabe@deploy1002: Started scap: Backport for Use encrypted Argon2 Hashes to store user passwords (T150647 T216682)
  • 22:40 zabe@deploy1002: Finished scap: Backport for Stop writing to af_user(_text)/afh_user(_text) on test wikis (T337920) (duration: 16m 23s)
  • 22:27 zabe@deploy1002: zabe: Continuing with sync
  • 22:27 zabe@deploy1002: zabe: Backport for Stop writing to af_user(_text)/afh_user(_text) on test wikis (T337920) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:24 zabe@deploy1002: Started scap: Backport for Stop writing to af_user(_text)/afh_user(_text) on test wikis (T337920)
  • 22:17 zabe: zabe@mwmaint1002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=ukwiki --logwiki=metawiki 'QFTP2024' 'Organic2024' # T365533
  • 22:16 zabe: zabe@mwmaint1002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=ptwiki --logwiki=metawiki 'Aurelio de Sandoval' 'Aurelio Sandoval' # T365533
  • 22:15 zabe: zabe@mwmaint1002:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=eswiki --logwiki=metawiki '17420g' 'Ras I' # T365533
  • 21:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2171 (T352010)', diff saved to https://phabricator.wikimedia.org/P62806 and previous config saved to /var/cache/conftool/dbconfig/20240521-215924-ladsgroup.json
  • 21:59 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 21:59 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 21:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T352010)', diff saved to https://phabricator.wikimedia.org/P62805 and previous config saved to /var/cache/conftool/dbconfig/20240521-215900-ladsgroup.json
  • 21:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P62804 and previous config saved to /var/cache/conftool/dbconfig/20240521-214352-ladsgroup.json
  • 21:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P62803 and previous config saved to /var/cache/conftool/dbconfig/20240521-212842-ladsgroup.json
  • 21:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T352010)', diff saved to https://phabricator.wikimedia.org/P62802 and previous config saved to /var/cache/conftool/dbconfig/20240521-211335-ladsgroup.json
  • 21:09 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-ctrl1001.mgmt.eqiad.wmnet']
  • 21:08 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001.mgmt.eqiad.wmnet']
  • 21:07 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001.mgmt.eqiad.wmnet']
  • 20:56 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001.mgmt.eqiad.wmnet']
  • 20:51 jforrester@deploy1002: Finished scap: Backport for Drop responsive behaviour (T109277), Decouple MFUseDesktopSpecialWatchlistPage from EditWatchlist page, Enable desktop watchlist HTML on mobile (T109277), Don't define wmgUseListings, no longer read (duration: 18m 17s)
  • 20:38 jforrester@deploy1002: jforrester and jdlrobson: Continuing with sync
  • 20:36 jforrester@deploy1002: jforrester and jdlrobson: Backport for Drop responsive behaviour (T109277), Decouple MFUseDesktopSpecialWatchlistPage from EditWatchlist page, Enable desktop watchlist HTML on mobile (T109277), Don't define wmgUseListings, no longer read synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:34 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001.mgmt.eqiad.wmnet']
  • 20:34 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001.mgmt.eqiad.wmnet']
  • 20:33 jforrester@deploy1002: Started scap: Backport for Drop responsive behaviour (T109277), Decouple MFUseDesktopSpecialWatchlistPage from EditWatchlist page, Enable desktop watchlist HTML on mobile (T109277), Don't define wmgUseListings, no longer read
  • 20:33 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:32 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:31 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:31 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:27 jforrester@deploy1002: Finished scap: Backport for Cleanup night mode exclude list (T365084) (duration: 19m 32s)
  • 20:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2208 (T352010)', diff saved to https://phabricator.wikimedia.org/P62801 and previous config saved to /var/cache/conftool/dbconfig/20240521-202218-ladsgroup.json
  • 20:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 20:21 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 20:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62800 and previous config saved to /var/cache/conftool/dbconfig/20240521-201910-root.json
  • 20:13 jforrester@deploy1002: jforrester and jdlrobson: Continuing with sync
  • 20:10 jforrester@deploy1002: jforrester and jdlrobson: Backport for Cleanup night mode exclude list (T365084) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:08 jforrester@deploy1002: Started scap: Backport for Cleanup night mode exclude list (T365084)
  • 20:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62799 and previous config saved to /var/cache/conftool/dbconfig/20240521-200402-root.json
  • 20:03 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: T365467 (duration: 14m 56s)
  • 20:02 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:02 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:00 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:00 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:00 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 20:00 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 19:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62798 and previous config saved to /var/cache/conftool/dbconfig/20240521-194856-root.json
  • 19:47 reedy@deploy1002: Synchronized dblists-index.php: T365467 (duration: 15m 00s)
  • 19:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62797 and previous config saved to /var/cache/conftool/dbconfig/20240521-193349-root.json
  • 19:28 reedy@deploy1002: Synchronized multiversion/MWMultiVersion.php: T365467 (duration: 14m 59s)
  • 19:21 eileen: civicrm upgraded from f41f3432 to c77df721
  • 19:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62796 and previous config saved to /var/cache/conftool/dbconfig/20240521-191841-root.json
  • 19:11 reedy@deploy1002: Synchronized dblists/translate.dblist: T365467 (duration: 16m 59s)
  • 19:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62795 and previous config saved to /var/cache/conftool/dbconfig/20240521-190334-root.json
  • 18:50 mutante: gitlab-runners*.wmnet: ran puppet via cumin to deploy update of docker.gc service to use image 1.3.0 (from 1.2.0) - T350478
  • 18:48 marostegui@cumin1002: dbctl commit (dc=all): 'db1167 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62794 and previous config saved to /var/cache/conftool/dbconfig/20240521-184828-root.json
  • 18:44 ebernhardson: T363734: start reindex of cloudelastic
  • 18:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 18:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 18:38 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 18:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T352010)', diff saved to https://phabricator.wikimedia.org/P62793 and previous config saved to /var/cache/conftool/dbconfig/20240521-183735-ladsgroup.json
  • 18:37 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 18:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 18:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T352010)', diff saved to https://phabricator.wikimedia.org/P62792 and previous config saved to /var/cache/conftool/dbconfig/20240521-183711-ladsgroup.json
  • 18:35 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:35 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:34 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:25 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:25 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:22 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1001.eqiad.wmnet with reason: host reimage
  • 18:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P62791 and previous config saved to /var/cache/conftool/dbconfig/20240521-182203-ladsgroup.json
  • 18:18 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1001.eqiad.wmnet with reason: host reimage
  • 18:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl1003.eqiad.wmnet with OS bullseye
  • 18:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 18:13 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 18:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P62790 and previous config saved to /var/cache/conftool/dbconfig/20240521-180655-ladsgroup.json
  • 18:04 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 18:03 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 18:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 18:01 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 18:01 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl1002.eqiad.wmnet with OS bullseye
  • 18:01 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 17:59 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:59 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1003.eqiad.wmnet with reason: host reimage
  • 17:56 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 17:55 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1003.eqiad.wmnet with reason: host reimage
  • 17:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T352010)', diff saved to https://phabricator.wikimedia.org/P62789 and previous config saved to /var/cache/conftool/dbconfig/20240521-175146-ladsgroup.json
  • 17:43 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2002.codfw.wmnet with reason: host reimage
  • 17:41 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1002.eqiad.wmnet with reason: host reimage
  • 17:40 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2002.codfw.wmnet with reason: host reimage
  • 17:38 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1002.eqiad.wmnet with reason: host reimage
  • 17:27 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@49bc8eb]: discolytics to 0.21, update search metrics group ownership (duration: 00m 26s)
  • 17:27 ebernhardson@deploy1002: Started deploy [airflow-dags/search@49bc8eb]: discolytics to 0.21, update search metrics group ownership
  • 17:25 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 17:24 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 17:23 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 17:21 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-ctrl2002.codfw.wmnet on all recursors
  • 17:21 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-ctrl2002.codfw.wmnet on all recursors
  • 17:09 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 17:09 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 16:58 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons.
  • 16:42 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:42 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new entries for wikikube-ctrl2002.codfw.wmnet - cmooney@cumin1002"
  • 16:41 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new entries for wikikube-ctrl2002.codfw.wmnet - cmooney@cumin1002"
  • 16:40 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 16:40 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 16:38 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 16:38 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-ctrl2002.codfw.wmnet on all recursors
  • 16:38 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-ctrl2002.codfw.wmnet on all recursors
  • 16:35 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2003.codfw.wmnet with OS bullseye
  • 16:35 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 16:35 kormat@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1182.eqiad.wmnet onto db1246.eqiad.wmnet
  • 16:34 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 16:34 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2001.codfw.wmnet with OS bullseye
  • 16:34 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 16:30 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1003.eqiad.wmnet with OS bullseye
  • 16:30 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 16:30 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - hnowlan@cumin1002"
  • 16:25 ryankemper@cumin2002: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons.
  • 16:23 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1002.eqiad.wmnet with OS bullseye
  • 16:18 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 16:17 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 16:16 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2003.codfw.wmnet with reason: host reimage
  • 16:13 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2001.codfw.wmnet with reason: host reimage
  • 16:12 ladsgroup@deploy1002: Finished scap: Backport for x-wikimedia-debug: Update k8s-mwdebug label, move to front (T362662) (duration: 17m 16s)
  • 16:11 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2003.codfw.wmnet with reason: host reimage
  • 16:10 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2001.codfw.wmnet with reason: host reimage
  • 16:09 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 16:03 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1003']
  • 16:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: analytics_cluster::launcher
  • 16:00 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1003']
  • 16:00 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1003']
  • 16:00 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1002']
  • 15:59 ladsgroup@deploy1002: ladsgroup and jforrester: Continuing with sync
  • 15:59 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1003']
  • 15:59 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 15:59 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1002']
  • 15:58 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-ctrl1002']
  • 15:58 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-ctrl1003']
  • 15:58 ladsgroup@deploy1002: ladsgroup and jforrester: Backport for x-wikimedia-debug: Update k8s-mwdebug label, move to front (T362662) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:56 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2003.codfw.wmnet with OS bullseye
  • 15:55 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 15:55 ladsgroup@deploy1002: Started scap: Backport for x-wikimedia-debug: Update k8s-mwdebug label, move to front (T362662)
  • 15:54 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2001.codfw.wmnet with OS bullseye
  • 15:54 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 15:53 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-ctrl2003.codfw.wmnet with OS bullseye
  • 15:53 jmm@cumin2002: START - Cookbook sre.puppet.migrate-role for role: analytics_cluster::launcher
  • 15:50 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1003']
  • 15:49 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-ctrl1003']
  • 15:49 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1002']
  • 15:49 ladsgroup@deploy1002: Finished scap: Backport for configure parsercache servers via dbconfig in etcd (T362786) (duration: 23m 46s)
  • 15:48 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-ctrl1002']
  • 15:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 15:43 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 15:43 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 15:41 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:41 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1003']
  • 15:41 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:39 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1002']
  • 15:39 ryankemper@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 15:38 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['wikikube-ctrl1001']
  • 15:38 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001']
  • 15:36 ladsgroup@deploy1002: ladsgroup and swfrench: Continuing with sync
  • 15:35 brouberol@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 15:35 brouberol@cumin2002: START - Cookbook sre.wdqs.restart
  • 15:34 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2003.codfw.wmnet with OS bullseye
  • 15:33 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2002.codfw.wmnet with OS bullseye
  • 15:33 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-ctrl2003.codfw.wmnet with OS bullseye
  • 15:33 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-ctrl2001.codfw.wmnet with OS bullseye
  • 15:32 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 15:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2128 (T352010)', diff saved to https://phabricator.wikimedia.org/P62787 and previous config saved to /var/cache/conftool/dbconfig/20240521-153010-ladsgroup.json
  • 15:30 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 15:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 15:29 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 15:29 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:29 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 15:29 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 15:28 ladsgroup@deploy1002: ladsgroup and swfrench: Backport for configure parsercache servers via dbconfig in etcd (T362786) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:25 ladsgroup@deploy1002: Started scap: Backport for configure parsercache servers via dbconfig in etcd (T362786)
  • 15:22 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 15:21 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:19 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 15:17 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 15:17 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:15 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 15:15 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2001.codfw.wmnet with OS bullseye
  • 15:10 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl1002.eqiad.wmnet with OS bullseye
  • 15:10 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2003.codfw.wmnet with OS bullseye
  • 15:10 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl1003.eqiad.wmnet with OS bullseye
  • 15:10 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 15:07 jgiannelos@deploy1002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 15:07 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 15:07 jgiannelos@deploy1002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 15:06 jgiannelos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 15:06 kormat@cumin1002: START - Cookbook sre.mysql.clone of db1182.eqiad.wmnet onto db1246.eqiad.wmnet
  • 15:05 jgiannelos@deploy1002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 15:04 ejegg: fundraising civicrm upgraded from 8901b5b3 to f41f3432
  • 15:04 aklapper@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.6 refs T361400
  • 15:03 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 15:02 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 15:02 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync
  • 15:02 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 15:02 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 14:59 kormat@cumin1002: dbctl commit (dc=all): 'Depooling db1182 as cloning source T364552', diff saved to https://phabricator.wikimedia.org/P62785 and previous config saved to /var/cache/conftool/dbconfig/20240521-145924-kormat.json
  • 14:58 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 14:58 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 14:54 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2055.codfw.wmnet with OS bookworm
  • 14:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1193 (T364299)', diff saved to https://phabricator.wikimedia.org/P62784 and previous config saved to /var/cache/conftool/dbconfig/20240521-145451-marostegui.json
  • 14:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 14:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 14:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T364299)', diff saved to https://phabricator.wikimedia.org/P62783 and previous config saved to /var/cache/conftool/dbconfig/20240521-145428-marostegui.json
  • 14:52 elukey@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync
  • 14:51 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1003.eqiad.wmnet with OS bullseye
  • 14:51 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1002.eqiad.wmnet with OS bullseye
  • 14:48 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye
  • 14:46 aklapper@deploy1002: Finished scap: Backport for DocumentationAid: Fix fatal error (T365451) (duration: 16m 30s)
  • 14:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P62782 and previous config saved to /var/cache/conftool/dbconfig/20240521-143920-marostegui.json
  • 14:37 klausman@cumin1002: conftool action : set/pooled=yes; selector: name=ml-serve2002.codfw.wmnet
  • 14:37 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2055.codfw.wmnet with reason: host reimage
  • 14:36 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2006-dev.codfw.wmnet with OS bookworm
  • 14:36 vgutierrez: testing fifo-log-demux 0.7.4 on cp4052
  • 14:34 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2055.codfw.wmnet with reason: host reimage
  • 14:33 aklapper@deploy1002: aklapper: Continuing with sync
  • 14:32 aklapper@deploy1002: aklapper: Backport for DocumentationAid: Fix fatal error (T365451) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:30 aklapper@deploy1002: Started scap: Backport for DocumentationAid: Fix fatal error (T365451)
  • 14:29 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:29 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:26 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P62781 and previous config saved to /var/cache/conftool/dbconfig/20240521-142412-marostegui.json
  • 14:19 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:16 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc2055.codfw.wmnet with OS bookworm
  • 14:09 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet2006-dev.codfw.wmnet with reason: host reimage
  • 14:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T364299)', diff saved to https://phabricator.wikimedia.org/P62780 and previous config saved to /var/cache/conftool/dbconfig/20240521-140904-marostegui.json
  • 14:06 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2006-dev.codfw.wmnet with reason: host reimage
  • 14:04 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:04 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:03 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:55 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp2003.codfw.wmnet with OS bookworm
  • 13:47 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudnet2006-dev.codfw.wmnet with OS bookworm
  • 13:39 kormat@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1246.eqiad.wmnet with OS bookworm
  • 13:38 jiji@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp2003.codfw.wmnet with reason: host reimage
  • 13:34 jiji@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp2003.codfw.wmnet with reason: host reimage
  • 13:34 zabe@deploy1002: Finished scap: Backport for arwiki: Disable Extension:ContentTranslation for non-autoreview users (T255022) (duration: 23m 20s)
  • 13:30 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS bullseye
  • 13:29 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:27 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 13:26 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config-next: apply
  • 13:23 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 13:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 13:20 zabe@deploy1002: zabe and gergesshamon: Continuing with sync
  • 13:18 kormat@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1246.eqiad.wmnet with reason: host reimage
  • 13:17 jiji@cumin1002: START - Cookbook sre.hosts.reimage for host mc-gp2003.codfw.wmnet with OS bookworm
  • 13:15 kormat@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1246.eqiad.wmnet with reason: host reimage
  • 13:13 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:13 zabe@deploy1002: zabe and gergesshamon: Backport for arwiki: Disable Extension:ContentTranslation for non-autoreview users (T255022) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:13 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:13 marostegui: Deploy schema change on s3 eqiad with replication dbmaint T365465
  • 13:12 vgutierrez: re-enable puppet on acme-chief clients - T364589
  • 13:11 jclark@cumin1002: START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED
  • 13:11 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 13:11 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 13:10 zabe@deploy1002: Started scap: Backport for arwiki: Disable Extension:ContentTranslation for non-autoreview users (T255022)
  • 13:09 marostegui: Deploy schema change on s5 (azwikimedia wikifunctionswiki vewikimedia) eqiad with replication dbmaint T365465
  • 13:08 marostegui: Deploy schema change on s7 (metawiki and frwiktionary) eqiad with replication dbmaint T365465
  • 13:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T364299)', diff saved to https://phabricator.wikimedia.org/P62778 and previous config saved to /var/cache/conftool/dbconfig/20240521-130838-marostegui.json
  • 13:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 13:08 vgutierrez: upgrading to acme-chief 0.37 on acmechief instances - T364589
  • 13:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 13:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T364299)', diff saved to https://phabricator.wikimedia.org/P62777 and previous config saved to /var/cache/conftool/dbconfig/20240521-130814-marostegui.json
  • 13:07 marostegui: Deploy schema change on s4 eqiad with replication dbmaint T365465
  • 13:06 marostegui: Deploy schema change on s8 eqiad with replication dbmaint T365465
  • 13:04 vgutierrez: disable puppet on acme-chief clients - T364589
  • 13:01 kormat@cumin1002: START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm
  • 12:59 vgutierrez: upgrading to acme-chief 0.37 on acmechief-test instances - T364589
  • 12:55 vgutierrez: upload acme-chief 0.37 to apt.wm.org (bookworm-wikimedia) - T364589
  • 12:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P62775 and previous config saved to /var/cache/conftool/dbconfig/20240521-125306-marostegui.json
  • 12:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T364299)', diff saved to https://phabricator.wikimedia.org/P62771 and previous config saved to /var/cache/conftool/dbconfig/20240521-122250-marostegui.json
  • 12:15 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 12:13 stevemunene@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub-next: apply on staging
  • 12:07 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl1001.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:06 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet1006.eqiad.wmnet with OS bookworm
  • 12:04 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 12:02 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1244.eqiad.wmnet
  • 12:01 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 12:01 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:01 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new entries for wikikube-ctrl2002.codfw.wmnet - cmooney@cumin1002"
  • 12:00 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new entries for wikikube-ctrl2002.codfw.wmnet - cmooney@cumin1002"
  • 12:00 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 11:59 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-ctrl2002
  • 11:58 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 11:57 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-ctrl2002.codfw.wmnet on all recursors
  • 11:57 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache wikikube-ctrl2002.codfw.wmnet on all recursors
  • 11:57 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2002
  • 11:55 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1244.eqiad.wmnet
  • 11:55 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1243.eqiad.wmnet
  • 11:52 hnowlan@cumin1002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wikikube-ctrl2002
  • 11:51 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2002
  • 11:51 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-ctrl2003
  • 11:51 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2003
  • 11:49 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet1006.eqiad.wmnet with reason: host reimage
  • 11:49 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:47 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:46 hnowlan@cumin1002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wikikube-ctrl2002
  • 11:46 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet1006.eqiad.wmnet with reason: host reimage
  • 11:46 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2002
  • 11:46 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-ctrl2003
  • 11:43 hnowlan@cumin1002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wikikube-ctrl2002
  • 11:43 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2003
  • 11:43 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2002
  • 11:42 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:42 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:41 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2003.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:41 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:41 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2001.mgmt.codfw.wmnet with reboot policy FORCED
  • 11:40 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 11:39 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 11:36 hnowlan@cumin1002: START - Cookbook sre.hosts.provision for host wikikube-ctrl1001.mgmt.eqiad.wmnet with reboot policy FORCED
  • 11:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1243.eqiad.wmnet
  • 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1242.eqiad.wmnet
  • 11:32 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudnet1006.eqiad.wmnet with OS bookworm
  • 11:10 taavi@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudnet1005
  • 11:10 taavi@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudnet1005
  • 11:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-ctrl1003
  • 11:00 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl1003
  • 11:00 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-ctrl1002
  • 10:59 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl1002
  • 10:58 hnowlan@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-ctrl1001
  • 10:57 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl1001
  • 10:57 hnowlan@cumin1002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wikikube-ctrl2002
  • 10:57 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2002
  • 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
  • 10:51 hnowlan@cumin1002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wikikube-ctrl2001
  • 10:51 hnowlan@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-ctrl2001
  • 10:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:50 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add renamed k8s ctrl nodes - hnowlan@cumin1002"
  • 10:50 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad
  • 10:49 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add renamed k8s ctrl nodes - hnowlan@cumin1002"
  • 10:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1238.eqiad.wmnet
  • 10:46 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 10:43 hnowlan@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 10:41 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 10:41 hnowlan@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 10:38 joal@deploy1002: Finished deploy [analytics/refinery@4d42877]: Deploy of Refinery after reimage of an-launcher1002 [analytics/refinery@4d42877e] (duration: 01m 01s)
  • 10:37 joal@deploy1002: Started deploy [analytics/refinery@4d42877]: Deploy of Refinery after reimage of an-launcher1002 [analytics/refinery@4d42877e]
  • 10:36 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
  • 10:34 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 10:33 hnowlan@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 10:31 jmm@cumin2002: START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw
  • 10:31 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 10:24 aklapper@deploy1002: rebuilt and synchronized wikiversions files: Revert "group0 wikis to 1.43.0-wmf.5"
  • 10:21 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 10:21 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 10:20 effie: restart memcached on mc2055
  • 10:18 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1238.eqiad.wmnet
  • 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1199.eqiad.wmnet
  • 09:58 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw[2331,2361,2391].codfw.wmnet,mw[1372,1429,1436].eqiad.wmnet
  • 09:58 hnowlan@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:58 hnowlan@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mw[2331,2361,2391].codfw.wmnet,mw[1372,1429,1436].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - hnowlan@cumin1002"
  • 09:57 hnowlan@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mw[2331,2361,2391].codfw.wmnet,mw[1372,1429,1436].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - hnowlan@cumin1002"
  • 09:57 moritzm: installing mariadb-10.3 security updates (libs/tools as packaged in Debian, unrelated to wmf-db)
  • 09:56 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1199.eqiad.wmnet
  • 09:55 hnowlan@cumin1002: START - Cookbook sre.dns.netbox
  • 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1190.eqiad.wmnet
  • 09:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62769 and previous config saved to /var/cache/conftool/dbconfig/20240521-094744-root.json
  • 09:41 aklapper@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.43.0-wmf.6 refs T361400
  • 09:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1190.eqiad.wmnet
  • 09:34 hnowlan@cumin1002: START - Cookbook sre.hosts.decommission for hosts mw[2331,2361,2391].codfw.wmnet,mw[1372,1429,1436].eqiad.wmnet
  • 09:33 hnowlan: decommissioning 6 appservers in advance of reimaging to k8s control nodes
  • 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1160.eqiad.wmnet
  • 09:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62768 and previous config saved to /var/cache/conftool/dbconfig/20240521-093238-root.json
  • 09:31 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-launcher1002.eqiad.wmnet with reason: host reimage
  • 09:29 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet1005.eqiad.wmnet with OS bookworm
  • 09:28 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-launcher1002.eqiad.wmnet with reason: host reimage
  • 09:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62767 and previous config saved to /var/cache/conftool/dbconfig/20240521-091732-root.json
  • 09:16 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-launcher1002.eqiad.wmnet with OS bullseye
  • 09:13 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudnet1005.eqiad.wmnet with reason: host reimage
  • 09:10 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet1005.eqiad.wmnet with reason: host reimage
  • away: UTC morning deploys done
  • 09:06 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1160.eqiad.wmnet
  • 09:05 tgr@deploy1002: Finished scap: Backport for Temporarily restore $wgCentralAuthDatabase (T348486) (duration: 17m 45s)
  • 09:02 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62766 and previous config saved to /var/cache/conftool/dbconfig/20240521-090224-root.json
  • 09:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2219.codfw.wmnet
  • 08:55 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudnet1005.eqiad.wmnet with OS bookworm
  • 08:51 tgr@deploy1002: tgr: Continuing with sync
  • 08:51 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2219.codfw.wmnet
  • 08:50 tgr@deploy1002: tgr: Backport for Temporarily restore $wgCentralAuthDatabase (T348486) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2210.codfw.wmnet
  • 08:48 moritzm: installing edk2 security updates
  • 08:47 tgr@deploy1002: Started scap: Backport for Temporarily restore $wgCentralAuthDatabase (T348486)
  • 08:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62765 and previous config saved to /var/cache/conftool/dbconfig/20240521-084718-root.json
  • 08:43 moritzm: installing ghostscript security updates
  • 08:41 matthiasmullie: UTC morning backports done
  • 08:41 mlitn@deploy1002: Finished scap: Backport for Allow async (job queue based) chunked upload on all wikis (T364644) (duration: 17m 32s)
  • 08:40 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest2002.wikimedia.org on all recursors
  • 08:40 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache sretest2002.wikimedia.org on all recursors
  • 08:38 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:38 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add dns for sretest2002 - cmooney@cumin1002"
  • 08:38 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2210.codfw.wmnet
  • 08:37 effie: enable puppet on all mw* baremetal hosts
  • 08:37 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add dns for sretest2002 - cmooney@cumin1002"
  • 08:35 marostegui: Deploy schema change on s8 eqiad; this will cause a few hours of replication lag in s8 clouddb replicas T364299
  • 08:34 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 08:34 cmooney@cumin1002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
  • 08:34 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 08:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Long schema change
  • 08:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Long schema change
  • 08:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Long schema change
  • 08:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Long schema change
  • 08:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62764 and previous config saved to /var/cache/conftool/dbconfig/20240521-083212-root.json
  • 08:30 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1167 for a schema change', diff saved to https://phabricator.wikimedia.org/P62763 and previous config saved to /var/cache/conftool/dbconfig/20240521-083053-root.json
  • 08:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P62762 and previous config saved to /var/cache/conftool/dbconfig/20240521-082842-root.json
  • 08:27 mlitn@deploy1002: mlitn and bawolff: Continuing with sync
  • 08:26 mlitn@deploy1002: mlitn and bawolff: Backport for Allow async (job queue based) chunked upload on all wikis (T364644) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:23 mlitn@deploy1002: Started scap: Backport for Allow async (job queue based) chunked upload on all wikis (T364644)
  • 08:22 mlitn@deploy1002: Finished scap: Backport for Remove complicated synchronization of caption/description inputs (T365119) (duration: 17m 40s)
  • 08:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 08:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 08:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T352010)', diff saved to https://phabricator.wikimedia.org/P62761 and previous config saved to /var/cache/conftool/dbconfig/20240521-081930-ladsgroup.json
  • 08:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1221.eqiad.wmnet with OS bookworm
  • 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1221 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62760 and previous config saved to /var/cache/conftool/dbconfig/20240521-081706-root.json
  • 08:14 effie: enable puppet on mediawiki codfw servers
  • 08:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P62759 and previous config saved to /var/cache/conftool/dbconfig/20240521-081336-root.json
  • 08:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2206.codfw.wmnet
  • 08:09 mlitn@deploy1002: mlitn: Continuing with sync
  • 08:07 mlitn@deploy1002: mlitn: Backport for Remove complicated synchronization of caption/description inputs (T365119) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:04 mlitn@deploy1002: Started scap: Backport for Remove complicated synchronization of caption/description inputs (T365119)
  • 08:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P62758 and previous config saved to /var/cache/conftool/dbconfig/20240521-080422-ladsgroup.json
  • 08:04 mlitn@deploy1002: Finished scap: Backport for Fix automatic numbering of copied titles (T365107) (duration: 17m 02s)
  • 08:01 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62757 and previous config saved to /var/cache/conftool/dbconfig/20240521-080145-root.json
  • 07:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P62756 and previous config saved to /var/cache/conftool/dbconfig/20240521-075830-root.json
  • 07:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1221.eqiad.wmnet with reason: host reimage
  • 07:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1221.eqiad.wmnet with reason: host reimage
  • 07:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2206.codfw.wmnet
  • 07:51 moritzm: installing nginx security updates
  • 07:50 mlitn@deploy1002: mlitn: Continuing with sync
  • 07:49 mlitn@deploy1002: mlitn: Backport for Fix automatic numbering of copied titles (T365107) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:49 effie: disable puppet on all mediawiki hardware hosts - T345740
  • 07:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2179.codfw.wmnet
  • 07:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P62755 and previous config saved to /var/cache/conftool/dbconfig/20240521-074914-ladsgroup.json
  • 07:47 mlitn@deploy1002: Started scap: Backport for Fix automatic numbering of copied titles (T365107)
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62754 and previous config saved to /var/cache/conftool/dbconfig/20240521-074639-root.json
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P62753 and previous config saved to /var/cache/conftool/dbconfig/20240521-074323-root.json
  • 07:40 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2179.codfw.wmnet
  • 07:40 moritzm: installing python 3.7 security updates
  • 07:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1221.eqiad.wmnet with OS bookworm
  • 07:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2172.codfw.wmnet
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1221', diff saved to https://phabricator.wikimedia.org/P62752 and previous config saved to /var/cache/conftool/dbconfig/20240521-073727-marostegui.json
  • 07:35 kartik@deploy1002: Finished scap: Backport for Fix the mobile experience for a second group of Wikipedias where CX is in beta (T361597) (duration: 20m 18s)
  • 07:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T352010)', diff saved to https://phabricator.wikimedia.org/P62751 and previous config saved to /var/cache/conftool/dbconfig/20240521-073407-ladsgroup.json
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62750 and previous config saved to /var/cache/conftool/dbconfig/20240521-073133-root.json
  • 07:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1237.eqiad.wmnet with OS bookworm
  • 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1237 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P62749 and previous config saved to /var/cache/conftool/dbconfig/20240521-072817-root.json
  • 07:21 kartik@deploy1002: kartik: Continuing with sync
  • 07:17 kartik@deploy1002: kartik: Backport for Fix the mobile experience for a second group of Wikipedias where CX is in beta (T361597) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62748 and previous config saved to /var/cache/conftool/dbconfig/20240521-071627-root.json
  • 07:15 kartik@deploy1002: Started scap: Backport for Fix the mobile experience for a second group of Wikipedias where CX is in beta (T361597)
  • 07:14 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2172.codfw.wmnet
  • 07:14 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 11170
  • 07:13 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 11170
  • 07:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
  • 07:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2147.codfw.wmnet
  • 07:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
  • 07:08 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 8075
  • 07:03 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2147.codfw.wmnet
  • 07:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2140.codfw.wmnet
  • 07:01 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62747 and previous config saved to /var/cache/conftool/dbconfig/20240521-070121-root.json
  • 07:00 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 8075
  • 06:54 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS bookworm
  • 06:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1237 T358642', diff saved to https://phabricator.wikimedia.org/P62746 and previous config saved to /var/cache/conftool/dbconfig/20240521-065318-marostegui.json
  • 06:52 moritzm: installing postgresql-11 security updates
  • 06:47 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2140.codfw.wmnet
  • 06:46 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62745 and previous config saved to /var/cache/conftool/dbconfig/20240521-064615-root.json
  • 06:44 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 398203
  • 06:44 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 398203
  • 06:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2137.codfw.wmnet
  • 06:36 moritzm: installing nghttp2 security updates
  • 06:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS bookworm
  • 06:33 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2137.codfw.wmnet
  • 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2136.codfw.wmnet
  • 06:31 marostegui@cumin1002: dbctl commit (dc=all): 'db1182 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62744 and previous config saved to /var/cache/conftool/dbconfig/20240521-063109-root.json
  • 06:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2136.codfw.wmnet
  • 06:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
  • 06:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
  • 05:56 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS bookworm
  • 05:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1182 T361543', diff saved to https://phabricator.wikimedia.org/P62743 and previous config saved to /var/cache/conftool/dbconfig/20240521-055501-root.json
  • 05:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2102.codfw.wmnet with OS bookworm
  • 05:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1213 (T352010)', diff saved to https://phabricator.wikimedia.org/P62742 and previous config saved to /var/cache/conftool/dbconfig/20240521-053627-ladsgroup.json
  • 05:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 05:36 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 05:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T352010)', diff saved to https://phabricator.wikimedia.org/P62741 and previous config saved to /var/cache/conftool/dbconfig/20240521-053602-ladsgroup.json
  • 05:35 marostegui: Deploy schema change on s7 (metawiki) eqiad dbmaint T365352
  • 05:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2102.codfw.wmnet with reason: host reimage
  • 05:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P62740 and previous config saved to /var/cache/conftool/dbconfig/20240521-052054-ladsgroup.json
  • 05:20 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2102.codfw.wmnet with reason: host reimage
  • 05:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P62739 and previous config saved to /var/cache/conftool/dbconfig/20240521-050546-ladsgroup.json
  • 05:05 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2102.codfw.wmnet with OS bookworm
  • 05:00 marostegui: Deploy schema change on s7 (metawiki) codfw dbmaint T365352
  • 04:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Schema change
  • 04:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Schema change
  • 04:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T352010)', diff saved to https://phabricator.wikimedia.org/P62738 and previous config saved to /var/cache/conftool/dbconfig/20240521-045037-ladsgroup.json
  • 04:05 mwpresync@deploy1002: Pruned MediaWiki: 1.43.0-wmf.3 (duration: 05m 28s)
  • 04:01 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.43.0-wmf.6 refs T361400 (duration: 58m 51s)
  • 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.43.0-wmf.6 refs T361400
  • 02:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1210 (T352010)', diff saved to https://phabricator.wikimedia.org/P62737 and previous config saved to /var/cache/conftool/dbconfig/20240521-024715-ladsgroup.json
  • 02:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 02:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 02:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T352010)', diff saved to https://phabricator.wikimedia.org/P62736 and previous config saved to /var/cache/conftool/dbconfig/20240521-024652-ladsgroup.json
  • 02:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P62735 and previous config saved to /var/cache/conftool/dbconfig/20240521-023144-ladsgroup.json
  • 02:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P62734 and previous config saved to /var/cache/conftool/dbconfig/20240521-021634-ladsgroup.json
  • 02:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T352010)', diff saved to https://phabricator.wikimedia.org/P62733 and previous config saved to /var/cache/conftool/dbconfig/20240521-020126-ladsgroup.json
  • 01:50 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1178 (T364299)', diff saved to https://phabricator.wikimedia.org/P62732 and previous config saved to /var/cache/conftool/dbconfig/20240521-015014-marostegui.json
  • 01:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 01:49 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 01:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T364299)', diff saved to https://phabricator.wikimedia.org/P62731 and previous config saved to /var/cache/conftool/dbconfig/20240521-014949-marostegui.json
  • 01:34 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P62730 and previous config saved to /var/cache/conftool/dbconfig/20240521-013441-marostegui.json
  • 01:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P62729 and previous config saved to /var/cache/conftool/dbconfig/20240521-011931-marostegui.json
  • 01:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T364299)', diff saved to https://phabricator.wikimedia.org/P62728 and previous config saved to /var/cache/conftool/dbconfig/20240521-010423-marostegui.json
  • 00:18 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2002.codfw.wmnet with OS bullseye
  • 00:17 eileen: civicrm upgraded from 19b6a9a0 to 8901b5b3
  • 00:17 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest2002']
  • 00:16 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest2002']
  • 00:16 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 00:02 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 00:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 00:00 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED

2024-05-20

  • 23:53 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:52 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T352010)', diff saved to https://phabricator.wikimedia.org/P62727 and previous config saved to /var/cache/conftool/dbconfig/20240520-234431-ladsgroup.json
  • 23:44 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 23:44 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 23:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T352010)', diff saved to https://phabricator.wikimedia.org/P62726 and previous config saved to /var/cache/conftool/dbconfig/20240520-234406-ladsgroup.json
  • 23:33 eileen: civicrm upgraded from f838d84d to 19b6a9a0
  • 23:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P62725 and previous config saved to /var/cache/conftool/dbconfig/20240520-232858-ladsgroup.json
  • 23:26 mutante: LDAP - added jaycano to wmf group (T365349)
  • 23:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P62724 and previous config saved to /var/cache/conftool/dbconfig/20240520-231350-ladsgroup.json
  • 23:13 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 23:05 ryankemper@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 22:59 ryankemper@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 22:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T352010)', diff saved to https://phabricator.wikimedia.org/P62723 and previous config saved to /var/cache/conftool/dbconfig/20240520-225842-ladsgroup.json
  • 22:17 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:17 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:16 urbanecm@deploy1002: Finished scap: Backport for Add account_conversion event streams. (T363815) (duration: 16m 18s)
  • 22:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 22:02 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 22:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T352010)', diff saved to https://phabricator.wikimedia.org/P62722 and previous config saved to /var/cache/conftool/dbconfig/20240520-220247-ladsgroup.json
  • 22:00 urbanecm@deploy1002: Started scap: Backport for Add account_conversion event streams. (T363815)
  • 21:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P62721 and previous config saved to /var/cache/conftool/dbconfig/20240520-214739-ladsgroup.json
  • 21:38 bking@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 21:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P62720 and previous config saved to /var/cache/conftool/dbconfig/20240520-213230-ladsgroup.json
  • 21:32 bking@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-eqiad cluster: Roll restart of jvm daemons.
  • 21:29 bking@cumin2002: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 21:22 bking@cumin2002: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-flink-codfw cluster: Roll restart of jvm daemons.
  • 21:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T352010)', diff saved to https://phabricator.wikimedia.org/P62719 and previous config saved to /var/cache/conftool/dbconfig/20240520-211721-ladsgroup.json
  • 20:57 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:57 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:51 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:51 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:44 urbanecm@deploy1002: Finished scap: Backport for Remove readability survey tool (T349337), wgVectorShareUserScripts should be false now (T301212) (duration: 18m 34s)
  • 20:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T352010)', diff saved to https://phabricator.wikimedia.org/P62718 and previous config saved to /var/cache/conftool/dbconfig/20240520-203811-ladsgroup.json
  • 20:38 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 20:37 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 20:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T352010)', diff saved to https://phabricator.wikimedia.org/P62717 and previous config saved to /var/cache/conftool/dbconfig/20240520-203748-ladsgroup.json
  • 20:30 urbanecm@deploy1002: ksarabia and jdlrobson and urbanecm: Continuing with sync
  • 20:28 urbanecm@deploy1002: ksarabia and jdlrobson and urbanecm: Backport for Remove readability survey tool (T349337), wgVectorShareUserScripts should be false now (T301212) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:25 urbanecm@deploy1002: Started scap: Backport for Remove readability survey tool (T349337), wgVectorShareUserScripts should be false now (T301212)
  • 20:25 urbanecm@deploy1002: Finished scap: Backport for Introduce sample overrides to web_ui_actions (T361962), Disable wgParserEnableLegacyMediaDOM (T363597), Disable last remaining projects using share user scripts (T301212) (duration: 18m 18s)
  • 20:24 eileen: config revision changed from 21dba21a to 22106526
  • 20:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P62716 and previous config saved to /var/cache/conftool/dbconfig/20240520-202240-ladsgroup.json
  • 20:11 urbanecm@deploy1002: urbanecm and jdlrobson and ksarabia: Continuing with sync
  • 20:09 urbanecm@deploy1002: urbanecm and jdlrobson and ksarabia: Backport for Introduce sample overrides to web_ui_actions (T361962), Disable wgParserEnableLegacyMediaDOM (T363597), Disable last remaining projects using share user scripts (T301212) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P62715 and previous config saved to /var/cache/conftool/dbconfig/20240520-200732-ladsgroup.json
  • 20:06 urbanecm@deploy1002: Started scap: Backport for Introduce sample overrides to web_ui_actions (T361962), Disable wgParserEnableLegacyMediaDOM (T363597), Disable last remaining projects using share user scripts (T301212)
  • 19:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T352010)', diff saved to https://phabricator.wikimedia.org/P62714 and previous config saved to /var/cache/conftool/dbconfig/20240520-195224-ladsgroup.json
  • 19:46 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 19:45 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 19:33 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 19:32 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 19:31 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 19:31 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 19:23 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:23 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:20 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 19:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62713 and previous config saved to /var/cache/conftool/dbconfig/20240520-190908-root.json
  • 19:02 ebernhardson@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:02 ebernhardson@deploy1002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62712 and previous config saved to /var/cache/conftool/dbconfig/20240520-185402-root.json
  • 18:43 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:43 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62711 and previous config saved to /var/cache/conftool/dbconfig/20240520-183856-root.json
  • 18:29 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 18:28 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 18:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62710 and previous config saved to /var/cache/conftool/dbconfig/20240520-182350-root.json
  • 18:16 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 18:15 swfrench@deploy1002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 18:11 swfrench@deploy1002: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 18:09 swfrench@deploy1002: helmfile [staging] START helmfile.d/services/citoid: apply
  • 18:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62709 and previous config saved to /var/cache/conftool/dbconfig/20240520-180844-root.json
  • 18:00 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 17:59 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1008.eqiad.wmnet with OS bullseye
  • 17:59 akosiaris@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - akosiaris@cumin1002"
  • 17:59 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 17:59 akosiaris@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - akosiaris@cumin1002"
  • 17:59 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 17:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62708 and previous config saved to /var/cache/conftool/dbconfig/20240520-175337-root.json
  • 17:42 mforns@deploy1002: Finished deploy [airflow-dags/analytics@b977332]: (no justification provided) (duration: 00m 27s)
  • 17:42 mforns@deploy1002: Started deploy [airflow-dags/analytics@b977332]: (no justification provided)
  • 17:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62707 and previous config saved to /var/cache/conftool/dbconfig/20240520-173831-root.json
  • 17:33 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2006.codfw.wmnet with OS bullseye
  • 17:30 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2008.codfw.wmnet with OS bullseye
  • 17:25 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2010.codfw.wmnet with OS bullseye
  • 17:23 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - akosiaris@cumin1002"
  • 17:21 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2007.codfw.wmnet with OS bullseye
  • 17:16 akosiaris@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - akosiaris@cumin1002"
  • 17:16 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
  • 17:14 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 17:13 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
  • 17:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1183 (T352010)', diff saved to https://phabricator.wikimedia.org/P62706 and previous config saved to /var/cache/conftool/dbconfig/20240520-171228-ladsgroup.json
  • 17:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 17:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1183.eqiad.wmnet with reason: Maintenance
  • 17:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T352010)', diff saved to https://phabricator.wikimedia.org/P62705 and previous config saved to /var/cache/conftool/dbconfig/20240520-171204-ladsgroup.json
  • 17:09 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
  • 17:06 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
  • 17:04 jforrester@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 17:03 taavi@cumin1002: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
  • 17:03 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
  • 17:02 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
  • 17:01 jforrester@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 17:01 jforrester@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 17:01 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
  • 17:00 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
  • 17:00 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
  • 17:00 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
  • 16:59 jforrester@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 16:58 jforrester@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 16:58 taavi@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 16:57 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
  • 16:57 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 16:57 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:57 jforrester@deploy1002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 16:56 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 16:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P62704 and previous config saved to /var/cache/conftool/dbconfig/20240520-165656-ladsgroup.json
  • 16:55 robh@cumin2002: START - Cookbook sre.dns.netbox
  • 16:55 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
  • 16:54 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
  • 16:54 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
  • 16:54 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:54 jforrester@deploy1002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 16:53 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:53 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:53 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 16:53 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 16:52 jforrester@deploy1002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 16:52 jforrester@deploy1002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 16:52 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 16:52 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 16:51 jforrester@deploy1002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 16:50 jforrester@deploy1002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 16:50 jforrester@deploy1002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 16:48 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:46 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 16:42 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye
  • 16:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P62703 and previous config saved to /var/cache/conftool/dbconfig/20240520-164148-ladsgroup.json
  • 16:41 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS bullseye
  • 16:40 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS bullseye
  • 16:40 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS bullseye
  • 16:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:39 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:38 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 16:38 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1008.eqiad.wmnet with OS bullseye
  • 16:37 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 16:37 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:36 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:35 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:32 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:32 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2002 to codfw - jhancock@cumin2002"
  • 16:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2002 to codfw - jhancock@cumin2002"
  • 16:29 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 16:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T352010)', diff saved to https://phabricator.wikimedia.org/P62702 and previous config saved to /var/cache/conftool/dbconfig/20240520-162640-ladsgroup.json
  • 16:21 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 16:04 akosiaris@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
  • 16:00 akosiaris@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
  • 15:58 swfrench@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 15:56 swfrench@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 15:55 swfrench@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 15:52 swfrench@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 15:52 swfrench@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:51 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2010.codfw.wmnet with OS bullseye
  • 15:50 swfrench@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 15:50 swfrench@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 15:48 swfrench@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 15:41 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 15:38 vgutierrez: repool upload@codfw with IPIP encapsulation enabled - T357257
  • 15:33 hnowlan: move 100% of commons traffic to run on k8s
  • 15:30 vgutierrez: rolling restart of pybal on lvs2014 and lvs2012 - T357257
  • 15:27 ejegg: payments-wiki upgraded from 3c23d3d8 to bc25f115
  • 15:21 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest2002']
  • 15:21 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest2002']
  • 15:05 vgutierrez: depool upload@codfw before enabling IPIP encapsulation - T357257
  • 15:04 mforns@deploy1002: Finished deploy [analytics/refinery@4d42877] (hadoop-test): Deploy Commons Impact Metrics query improvements TEST [analytics/refinery@4d42877e] (duration: 03m 50s)
  • 15:02 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye
  • 15:01 mforns@deploy1002: Started deploy [analytics/refinery@4d42877] (hadoop-test): Deploy Commons Impact Metrics query improvements TEST [analytics/refinery@4d42877e]
  • 14:59 mforns@deploy1002: Finished deploy [analytics/refinery@4d42877] (thin): Deploy Commons Impact Metrics query improvements THIN [analytics/refinery@4d42877e] (duration: 04m 00s)
  • 14:55 mforns@deploy1002: Started deploy [analytics/refinery@4d42877] (thin): Deploy Commons Impact Metrics query improvements THIN [analytics/refinery@4d42877e]
  • 14:53 ejegg: re-enabled fundraising scheduled jobs
  • 14:53 mforns@deploy1002: Finished deploy [analytics/refinery@4d42877]: Deploy Commons Impact Metrics query improvements [analytics/refinery@4d42877e] (duration: 14m 08s)
  • 14:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest2002']
  • 14:48 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest2002']
  • 14:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest2002']
  • 14:48 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest2002']
  • 14:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:47 ejegg: fundraising civicrm upgraded from 4f6f2dc3 to 7839feb6
  • 14:46 ejegg: disabled fundraising scheduled jobs for Civi upgrade
  • 14:42 filippo@deploy1002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
  • 14:41 filippo@deploy1002: helmfile [eqiad] START helmfile.d/services/citoid: apply
  • 14:41 filippo@deploy1002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
  • 14:41 filippo@deploy1002: helmfile [codfw] START helmfile.d/services/citoid: apply
  • 14:41 filippo@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 14:40 filippo@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 14:40 filippo@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 14:39 filippo@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 14:39 filippo@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 14:39 filippo@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 14:38 mforns@deploy1002: Started deploy [analytics/refinery@4d42877]: Deploy Commons Impact Metrics query improvements [analytics/refinery@4d42877e]
  • 14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1177 (T364299)', diff saved to https://phabricator.wikimedia.org/P62700 and previous config saved to /var/cache/conftool/dbconfig/20240520-142621-marostegui.json
  • 14:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 14:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 14:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T364299)', diff saved to https://phabricator.wikimedia.org/P62699 and previous config saved to /var/cache/conftool/dbconfig/20240520-142558-marostegui.json
  • 14:19 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2010.codfw.wmnet with OS bullseye
  • 14:19 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 14:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62698 and previous config saved to /var/cache/conftool/dbconfig/20240520-141828-root.json
  • 14:18 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2008.codfw.wmnet with OS bullseye
  • 14:17 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main2007.codfw.wmnet with OS bullseye
  • 14:17 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main2006.codfw.wmnet with OS bullseye
  • 14:14 reedy@deploy1002: Synchronized wmf-config/core-Permissions.php: T360977 (duration: 15m 54s)
  • 14:12 vgutierrez: repool upload@eqsin with IPIP encapsulation enabled - T357257
  • 14:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P62697 and previous config saved to /var/cache/conftool/dbconfig/20240520-141050-marostegui.json
  • 14:06 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config: apply
  • 14:06 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 14:03 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62696 and previous config saved to /var/cache/conftool/dbconfig/20240520-140321-root.json
  • 14:03 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config: apply
  • 14:02 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 14:01 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2002.wikimedia.org with OS bookworm
  • 14:00 vgutierrez: rolling restart of pybal on lvs5005 and lvs5006 - T357257
  • 13:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P62695 and previous config saved to /var/cache/conftool/dbconfig/20240520-135542-marostegui.json
  • 13:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62694 and previous config saved to /var/cache/conftool/dbconfig/20240520-134815-root.json
  • 13:47 reedy@deploy1002: Synchronized wmf-config/throttle.php: T365221 (duration: 15m 20s)
  • 13:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T352010)', diff saved to https://phabricator.wikimedia.org/P62693 and previous config saved to /var/cache/conftool/dbconfig/20240520-134613-ladsgroup.json
  • 13:46 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:46 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 13:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 13:45 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 13:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T364299)', diff saved to https://phabricator.wikimedia.org/P62692 and previous config saved to /var/cache/conftool/dbconfig/20240520-134034-marostegui.json
  • 13:39 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 13:38 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 13:33 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62691 and previous config saved to /var/cache/conftool/dbconfig/20240520-133309-root.json
  • 13:29 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye
  • 13:28 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 13:27 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS bullseye
  • 13:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 13:27 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS bullseye
  • 13:26 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS bullseye
  • 13:25 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:24 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 13:24 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 13:23 reedy@deploy1002: Synchronized wmf-config/: T360989 T365323 (duration: 15m 35s)
  • 13:22 hnowlan: migrating 80% of commons traffic to k8s
  • 13:19 topranks: adding outbound ACL on irb.2002 on lsw1 switches in codfw to test DHCP function T365204
  • 13:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 13:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62689 and previous config saved to /var/cache/conftool/dbconfig/20240520-131803-root.json
  • 13:17 vgutierrez: depool upload@eqsin before enabling IPIP encapsulation - T357257
  • 13:11 vgutierrez: Re-enable puppet on A:ncredir && A:cp-upload_ulsfo - T365354
  • 13:04 Emperor: depool, restart swift-proxy, repool moss-fe1001 as ~12% connection failures reported by envoy since late 14th May T360913
  • 13:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62688 and previous config saved to /var/cache/conftool/dbconfig/20240520-130257-root.json
  • 12:59 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 12:54 vgutierrez: disable puppet on A:ncredir && A:cp-upload_ulsfo before merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/1034074 - T365354
  • 12:52 marostegui: Deploy schema change on s7 (only frwiktionary) eqiad with replication dbmaint T365352
  • 12:48 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2002.wikimedia.org with OS bookworm
  • 12:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2181 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62687 and previous config saved to /var/cache/conftool/dbconfig/20240520-124749-root.json
  • 12:46 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:46 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change mgmt dns for sretest2002 - cmooney@cumin1002"
  • 12:45 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change mgmt dns for sretest2002 - cmooney@cumin1002"
  • 12:44 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest2002.mgmt.codfw.wmnet on all recursors
  • 12:44 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache sretest2002.mgmt.codfw.wmnet on all recursors
  • 12:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS bookworm
  • 12:01 marostegui: Deploy schema change on s4 eqiad with replication dbmaint T365352
  • 11:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
  • 11:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
  • 11:56 marostegui: Deploy schema change on s5 eqiad with replication dbmaint T365352
  • 11:47 marostegui: Deploy urgent schema change on s8 eqiad with replication dbmaint T365352
  • 11:40 hnowlan: migrating 30% of commons traffic to k8s
  • 11:38 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS bookworm
  • 11:30 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P62685 and previous config saved to /var/cache/conftool/dbconfig/20240520-113038-root.json
  • 11:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Migration to bookworm
  • 11:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Migration to bookworm
  • 11:15 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)
  • 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P62684 and previous config saved to /var/cache/conftool/dbconfig/20240520-111530-root.json
  • 11:15 marostegui@cumin1002: Updating IPMI password on 1 hosts - marostegui@cumin1002
  • 11:14 marostegui@cumin1002: START - Cookbook sre.hosts.ipmi-password-reset
  • 11:14 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99)
  • 11:14 marostegui@cumin1002: START - Cookbook sre.hosts.ipmi-password-reset
  • 11:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: Migration to bookworm
  • 11:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: Migration to bookworm
  • 11:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2181 T363792', diff saved to https://phabricator.wikimedia.org/P62682 and previous config saved to /var/cache/conftool/dbconfig/20240520-110217-root.json
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P62681 and previous config saved to /var/cache/conftool/dbconfig/20240520-110023-root.json
  • 10:46 Dreamy_Jazz: Restarting MediaModeration scanning script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 10:45 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P62680 and previous config saved to /var/cache/conftool/dbconfig/20240520-104517-root.json
  • 10:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2175.codfw.wmnet with OS bookworm
  • 10:30 marostegui@cumin1002: dbctl commit (dc=all): 'db2175 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P62679 and previous config saved to /var/cache/conftool/dbconfig/20240520-103011-root.json
  • 10:18 godog: bounce prometheus@k8s in eqiad - T343529
  • 10:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2175.codfw.wmnet with reason: host reimage
  • 10:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2175.codfw.wmnet with reason: host reimage
  • 09:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T352010)', diff saved to https://phabricator.wikimedia.org/P62678 and previous config saved to /var/cache/conftool/dbconfig/20240520-095729-ladsgroup.json
  • 09:57 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 09:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 09:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62677 and previous config saved to /var/cache/conftool/dbconfig/20240520-095706-ladsgroup.json
  • 09:45 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2175.codfw.wmnet with OS bookworm
  • 09:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2175.codfw.wmnet with reason: Migration to bookworm
  • 09:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2175.codfw.wmnet with reason: Migration to bookworm
  • 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2175 T361543', diff saved to https://phabricator.wikimedia.org/P62676 and previous config saved to /var/cache/conftool/dbconfig/20240520-094352-marostegui.json
  • 09:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P62675 and previous config saved to /var/cache/conftool/dbconfig/20240520-094159-ladsgroup.json
  • 09:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P62674 and previous config saved to /var/cache/conftool/dbconfig/20240520-092651-ladsgroup.json
  • 09:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1014.eqiad.wmnet with reason: Testing new mariadb version
  • 09:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1014.eqiad.wmnet with reason: Testing new mariadb version
  • 09:18 marostegui: Install 10.6.18 on db1125 and pc1014 T365338
  • 09:17 hnowlan: Increasing commons on k8s traffic to 15%
  • 09:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62673 and previous config saved to /var/cache/conftool/dbconfig/20240520-091143-ladsgroup.json
  • 09:02 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
  • 09:02 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
  • 08:57 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
  • 08:56 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
  • 08:47 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1125.eqiad.wmnet with OS bookworm
  • 08:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1125.eqiad.wmnet with reason: host reimage
  • 08:28 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1125.eqiad.wmnet with reason: host reimage
  • 08:19 urbanecm@deploy1002: Finished scap: Backport for [Growth] enwiki: Enable AddLink backend (T308144) (duration: 17m 07s)
  • 08:16 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1125.eqiad.wmnet with OS bookworm
  • 08:06 urbanecm@deploy1002: urbanecm: Continuing with sync
  • 08:05 urbanecm@deploy1002: urbanecm: Backport for [Growth] enwiki: Enable AddLink backend (T308144) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:02 urbanecm@deploy1002: Started scap: Backport for [Growth] enwiki: Enable AddLink backend (T308144)
  • 07:41 urbanecm@deploy1002: Finished scap: Backport for Update interwiki.php cache (T363658) (duration: 27m 04s)
  • 07:14 urbanecm@deploy1002: Started scap: Backport for Update interwiki.php cache (T363658)
  • 06:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on db2161.codfw.wmnet with reason: Schema change T364299
  • 06:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 20:00:00 on db2161.codfw.wmnet with reason: Schema change T364299
  • 05:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2161 T365339', diff saved to https://phabricator.wikimedia.org/P62672 and previous config saved to /var/cache/conftool/dbconfig/20240520-055908-root.json
  • 05:58 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2165 to s8 primary T365339', diff saved to https://phabricator.wikimedia.org/P62671 and previous config saved to /var/cache/conftool/dbconfig/20240520-055812-root.json
  • 05:57 marostegui: Starting s8 codfw failover from db2161 to db2165 - T365339
  • 05:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s8 T365339
  • 05:35 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2165 with weight 0 T365339', diff saved to https://phabricator.wikimedia.org/P62670 and previous config saved to /var/cache/conftool/dbconfig/20240520-053523-root.json
  • 05:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s8 T365339
  • 03:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1172 (T364299)', diff saved to https://phabricator.wikimedia.org/P62669 and previous config saved to /var/cache/conftool/dbconfig/20240520-034057-marostegui.json
  • 03:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 03:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance

2024-05-19

  • 22:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62668 and previous config saved to /var/cache/conftool/dbconfig/20240519-223525-ladsgroup.json
  • 22:35 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 22:35 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 22:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T352010)', diff saved to https://phabricator.wikimedia.org/P62667 and previous config saved to /var/cache/conftool/dbconfig/20240519-223502-ladsgroup.json
  • 22:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P62666 and previous config saved to /var/cache/conftool/dbconfig/20240519-221954-ladsgroup.json
  • 22:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P62665 and previous config saved to /var/cache/conftool/dbconfig/20240519-220445-ladsgroup.json
  • 21:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T352010)', diff saved to https://phabricator.wikimedia.org/P62664 and previous config saved to /var/cache/conftool/dbconfig/20240519-214936-ladsgroup.json
  • 18:56 vgutierrez: vgutierrez@cp4049:~$ sudo rm /var/lib/prometheus/node.d/realserver-mss.prom
  • 18:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 18:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 18:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62663 and previous config saved to /var/cache/conftool/dbconfig/20240519-182447-marostegui.json
  • 18:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P62662 and previous config saved to /var/cache/conftool/dbconfig/20240519-180939-marostegui.json
  • 17:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P62661 and previous config saved to /var/cache/conftool/dbconfig/20240519-175431-marostegui.json
  • 17:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62660 and previous config saved to /var/cache/conftool/dbconfig/20240519-173923-marostegui.json
  • 16:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Schema change
  • 16:39 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Schema change
  • 16:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62658 and previous config saved to /var/cache/conftool/dbconfig/20240519-163855-marostegui.json
  • 16:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 16:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 16:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 16:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 11:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P62657 and previous config saved to /var/cache/conftool/dbconfig/20240519-112730-ladsgroup.json
  • 11:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P62656 and previous config saved to /var/cache/conftool/dbconfig/20240519-111222-ladsgroup.json
  • 10:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P62655 and previous config saved to /var/cache/conftool/dbconfig/20240519-105714-ladsgroup.json
  • 10:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P62654 and previous config saved to /var/cache/conftool/dbconfig/20240519-104206-ladsgroup.json
  • 10:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T352010)', diff saved to https://phabricator.wikimedia.org/P62653 and previous config saved to /var/cache/conftool/dbconfig/20240519-102315-ladsgroup.json
  • 10:23 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 10:23 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 10:23 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 10:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 10:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T352010)', diff saved to https://phabricator.wikimedia.org/P62652 and previous config saved to /var/cache/conftool/dbconfig/20240519-102247-ladsgroup.json
  • 10:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P62651 and previous config saved to /var/cache/conftool/dbconfig/20240519-100739-ladsgroup.json
  • 09:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P62650 and previous config saved to /var/cache/conftool/dbconfig/20240519-095231-ladsgroup.json
  • 09:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T352010)', diff saved to https://phabricator.wikimedia.org/P62649 and previous config saved to /var/cache/conftool/dbconfig/20240519-093723-ladsgroup.json
  • 07:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2217 (T352010)', diff saved to https://phabricator.wikimedia.org/P62648 and previous config saved to /var/cache/conftool/dbconfig/20240519-074556-ladsgroup.json
  • 07:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 07:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 07:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214 (T352010)', diff saved to https://phabricator.wikimedia.org/P62647 and previous config saved to /var/cache/conftool/dbconfig/20240519-074532-ladsgroup.json
  • 07:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214', diff saved to https://phabricator.wikimedia.org/P62646 and previous config saved to /var/cache/conftool/dbconfig/20240519-073025-ladsgroup.json
  • 07:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214', diff saved to https://phabricator.wikimedia.org/P62645 and previous config saved to /var/cache/conftool/dbconfig/20240519-071517-ladsgroup.json
  • 07:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2214 (T352010)', diff saved to https://phabricator.wikimedia.org/P62644 and previous config saved to /var/cache/conftool/dbconfig/20240519-070008-ladsgroup.json
  • 05:32 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 05:32 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 05:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2214 (T352010)', diff saved to https://phabricator.wikimedia.org/P62643 and previous config saved to /var/cache/conftool/dbconfig/20240519-051029-ladsgroup.json
  • 05:10 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2214.codfw.wmnet with reason: Maintenance
  • 05:10 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2214.codfw.wmnet with reason: Maintenance
  • 01:43 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 01:43 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 01:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T352010)', diff saved to https://phabricator.wikimedia.org/P62642 and previous config saved to /var/cache/conftool/dbconfig/20240519-014335-ladsgroup.json
  • 01:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P62641 and previous config saved to /var/cache/conftool/dbconfig/20240519-012827-ladsgroup.json
  • 01:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P62640 and previous config saved to /var/cache/conftool/dbconfig/20240519-011320-ladsgroup.json
  • 00:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T352010)', diff saved to https://phabricator.wikimedia.org/P62639 and previous config saved to /var/cache/conftool/dbconfig/20240519-005811-ladsgroup.json

2024-05-18

  • 23:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T352010)', diff saved to https://phabricator.wikimedia.org/P62638 and previous config saved to /var/cache/conftool/dbconfig/20240518-230800-ladsgroup.json
  • 23:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 23:07 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 23:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62637 and previous config saved to /var/cache/conftool/dbconfig/20240518-230736-ladsgroup.json
  • 22:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P62636 and previous config saved to /var/cache/conftool/dbconfig/20240518-225228-ladsgroup.json
  • 22:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P62635 and previous config saved to /var/cache/conftool/dbconfig/20240518-223720-ladsgroup.json
  • 22:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T352010)', diff saved to https://phabricator.wikimedia.org/P62634 and previous config saved to /var/cache/conftool/dbconfig/20240518-222748-ladsgroup.json
  • 22:27 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 22:27 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 22:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T352010)', diff saved to https://phabricator.wikimedia.org/P62633 and previous config saved to /var/cache/conftool/dbconfig/20240518-222725-ladsgroup.json
  • 22:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62632 and previous config saved to /var/cache/conftool/dbconfig/20240518-222212-ladsgroup.json
  • 22:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P62631 and previous config saved to /var/cache/conftool/dbconfig/20240518-221216-ladsgroup.json
  • 21:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P62630 and previous config saved to /var/cache/conftool/dbconfig/20240518-215708-ladsgroup.json
  • 21:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T352010)', diff saved to https://phabricator.wikimedia.org/P62629 and previous config saved to /var/cache/conftool/dbconfig/20240518-214200-ladsgroup.json
  • 20:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62628 and previous config saved to /var/cache/conftool/dbconfig/20240518-200322-ladsgroup.json
  • 20:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 20:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 20:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T352010)', diff saved to https://phabricator.wikimedia.org/P62627 and previous config saved to /var/cache/conftool/dbconfig/20240518-200258-ladsgroup.json
  • 19:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P62626 and previous config saved to /var/cache/conftool/dbconfig/20240518-194750-ladsgroup.json
  • 19:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P62625 and previous config saved to /var/cache/conftool/dbconfig/20240518-193240-ladsgroup.json
  • 19:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T352010)', diff saved to https://phabricator.wikimedia.org/P62624 and previous config saved to /var/cache/conftool/dbconfig/20240518-191732-ladsgroup.json
  • 18:59 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw
  • 18:58 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw
  • 18:56 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2090.codfw.wmnet with OS bullseye
  • 18:36 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
  • 18:33 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2090.codfw.wmnet with reason: host reimage
  • 18:16 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2090.codfw.wmnet with OS bullseye
  • 16:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 16:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 16:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T364299)', diff saved to https://phabricator.wikimedia.org/P62623 and previous config saved to /var/cache/conftool/dbconfig/20240518-162907-marostegui.json
  • 16:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P62622 and previous config saved to /var/cache/conftool/dbconfig/20240518-161400-marostegui.json
  • 15:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P62621 and previous config saved to /var/cache/conftool/dbconfig/20240518-155852-marostegui.json
  • 15:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2169 (T352010)', diff saved to https://phabricator.wikimedia.org/P62620 and previous config saved to /var/cache/conftool/dbconfig/20240518-155136-ladsgroup.json
  • 15:51 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 15:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 15:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62619 and previous config saved to /var/cache/conftool/dbconfig/20240518-155112-ladsgroup.json
  • 15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T364299)', diff saved to https://phabricator.wikimedia.org/P62618 and previous config saved to /var/cache/conftool/dbconfig/20240518-154343-marostegui.json
  • 15:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P62617 and previous config saved to /var/cache/conftool/dbconfig/20240518-153604-ladsgroup.json
  • 15:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P62616 and previous config saved to /var/cache/conftool/dbconfig/20240518-152056-ladsgroup.json
  • 15:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62615 and previous config saved to /var/cache/conftool/dbconfig/20240518-150548-ladsgroup.json
  • 11:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62614 and previous config saved to /var/cache/conftool/dbconfig/20240518-112824-ladsgroup.json
  • 11:28 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:28 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:28 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 11:27 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 11:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T352010)', diff saved to https://phabricator.wikimedia.org/P62613 and previous config saved to /var/cache/conftool/dbconfig/20240518-112745-ladsgroup.json
  • 11:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P62612 and previous config saved to /var/cache/conftool/dbconfig/20240518-111237-ladsgroup.json
  • 10:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P62611 and previous config saved to /var/cache/conftool/dbconfig/20240518-105729-ladsgroup.json
  • 10:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T352010)', diff saved to https://phabricator.wikimedia.org/P62610 and previous config saved to /var/cache/conftool/dbconfig/20240518-104222-ladsgroup.json
  • 07:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T352010)', diff saved to https://phabricator.wikimedia.org/P62609 and previous config saved to /var/cache/conftool/dbconfig/20240518-071726-ladsgroup.json
  • 07:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 07:17 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 07:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T352010)', diff saved to https://phabricator.wikimedia.org/P62608 and previous config saved to /var/cache/conftool/dbconfig/20240518-071703-ladsgroup.json
  • 07:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P62607 and previous config saved to /var/cache/conftool/dbconfig/20240518-070155-ladsgroup.json
  • 06:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P62606 and previous config saved to /var/cache/conftool/dbconfig/20240518-064646-ladsgroup.json
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T364299)', diff saved to https://phabricator.wikimedia.org/P62605 and previous config saved to /var/cache/conftool/dbconfig/20240518-063529-marostegui.json
  • 06:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 06:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T364299)', diff saved to https://phabricator.wikimedia.org/P62604 and previous config saved to /var/cache/conftool/dbconfig/20240518-063505-marostegui.json
  • 06:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T352010)', diff saved to https://phabricator.wikimedia.org/P62603 and previous config saved to /var/cache/conftool/dbconfig/20240518-063138-ladsgroup.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P62602 and previous config saved to /var/cache/conftool/dbconfig/20240518-061958-marostegui.json
  • 06:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P62601 and previous config saved to /var/cache/conftool/dbconfig/20240518-060450-marostegui.json
  • 05:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T352010)', diff saved to https://phabricator.wikimedia.org/P62600 and previous config saved to /var/cache/conftool/dbconfig/20240518-055125-ladsgroup.json
  • 05:51 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 05:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 05:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T352010)', diff saved to https://phabricator.wikimedia.org/P62599 and previous config saved to /var/cache/conftool/dbconfig/20240518-055100-ladsgroup.json
  • 05:49 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T364299)', diff saved to https://phabricator.wikimedia.org/P62598 and previous config saved to /var/cache/conftool/dbconfig/20240518-054942-marostegui.json
  • 05:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P62597 and previous config saved to /var/cache/conftool/dbconfig/20240518-053550-ladsgroup.json
  • 05:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P62596 and previous config saved to /var/cache/conftool/dbconfig/20240518-052043-ladsgroup.json
  • 05:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T352010)', diff saved to https://phabricator.wikimedia.org/P62595 and previous config saved to /var/cache/conftool/dbconfig/20240518-050535-ladsgroup.json
  • 03:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T352010)', diff saved to https://phabricator.wikimedia.org/P62594 and previous config saved to /var/cache/conftool/dbconfig/20240518-030359-ladsgroup.json
  • 03:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 03:03 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 02:39 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2090.codfw.wmnet with OS bullseye
  • 01:18 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2090.codfw.wmnet with OS bullseye
  • 00:45 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 00:45 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 00:35 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic2090* for ban elastic2090 before reimage - ryankemper@cumin2002 - T353878
  • 00:35 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic2090* for ban elastic2090 before reimage - ryankemper@cumin2002 - T353878
  • 00:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 00:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
  • 00:02 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"

2024-05-17

  • 23:46 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
  • 23:43 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
  • 23:41 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 23:41 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 23:08 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 23:06 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1008.eqiad.wmnet with OS bullseye
  • 23:05 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 22:43 akosiaris@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 22:21 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS bullseye
  • 22:20 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1008.eqiad.wmnet with OS bullseye
  • 22:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 22:19 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS bullseye
  • 21:57 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 21:57 akosiaris@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 21:47 akosiaris@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 21:10 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 21:02 ryankemper@puppetmaster1001: conftool action : set/weight=10:pooled=yes; selector: name=elastic2090\.codfw\.wmnet
  • 20:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 20:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 19:43 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 19:42 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:38 dzahn@cumin1002: conftool action : set/pooled=no; selector: name=ml-serve2002.codfw.wmnet
  • 19:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 18:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2181 (T364299)', diff saved to https://phabricator.wikimedia.org/P62592 and previous config saved to /var/cache/conftool/dbconfig/20240517-184554-marostegui.json
  • 18:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 18:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 18:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62591 and previous config saved to /var/cache/conftool/dbconfig/20240517-184530-marostegui.json
  • 18:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P62590 and previous config saved to /var/cache/conftool/dbconfig/20240517-183022-marostegui.json
  • 18:22 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P62589 and previous config saved to /var/cache/conftool/dbconfig/20240517-181515-marostegui.json
  • 18:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62588 and previous config saved to /var/cache/conftool/dbconfig/20240517-180006-marostegui.json
  • 17:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T352010)', diff saved to https://phabricator.wikimedia.org/P62587 and previous config saved to /var/cache/conftool/dbconfig/20240517-173608-ladsgroup.json
  • 17:36 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 17:35 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 17:18 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 17:18 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 17:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:35 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2002.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:22 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:22 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2002 to codfw - jhancock@cumin2002"
  • 16:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2002 to codfw - jhancock@cumin2002"
  • 16:17 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 16:15 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 15:22 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 15:21 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 15:20 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye
  • 14:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 14:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 14:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T352010)', diff saved to https://phabricator.wikimedia.org/P62585 and previous config saved to /var/cache/conftool/dbconfig/20240517-140648-ladsgroup.json
  • 13:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P62584 and previous config saved to /var/cache/conftool/dbconfig/20240517-135138-ladsgroup.json
  • 13:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P62583 and previous config saved to /var/cache/conftool/dbconfig/20240517-133630-ladsgroup.json
  • 13:26 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:25 arnaudb@cumin1002: END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:24 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:23 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:22 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:22 arnaudb@cumin1002: START - Cookbook sre.mysql.clone Will create a clone of db2114.codfw.wmnet onto db1125.eqiad.wmnet
  • 13:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T352010)', diff saved to https://phabricator.wikimedia.org/P62582 and previous config saved to /var/cache/conftool/dbconfig/20240517-132122-ladsgroup.json
  • 12:56 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kubestagetcd[1004-1006].eqiad.wmnet
  • 12:56 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:56 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagetcd[1004-1006].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:55 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagetcd[1004-1006].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ldap-replica1006.wikimedia.org
  • 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica1006.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:46 kevinbazira@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ldap-replica1005.wikimedia.org
  • 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica1005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica1005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:24 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on kubestagetcd[1004-1006].eqiad.wmnet with reason: decom
  • 12:24 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on kubestagetcd[1004-1006].eqiad.wmnet with reason: decom
  • 12:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 12:12 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kubestagemaster[1001-1002].eqiad.wmnet
  • 12:12 jayme@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:12 jayme@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagemaster[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:11 jayme@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubestagemaster[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1002"
  • 12:11 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ldap-replica1005.wikimedia.org
  • 12:11 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 12:09 jayme@cumin1002: START - Cookbook sre.dns.netbox
  • 12:08 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 12:07 tchin@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/datasets-config: apply
  • 12:07 tchin@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/datasets-config: apply
  • 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ldap-replica2008.wikimedia.org
  • 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica2008.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:05 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica2008.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 12:02 jayme@cumin1002: START - Cookbook sre.hosts.decommission for hosts kubestagemaster[1001-1002].eqiad.wmnet
  • 11:56 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 11:53 jayme@cumin1002: conftool action : set/pooled=inactive; selector: name=kubestagemaster100[12].eqiad.wmnet
  • 11:51 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ldap-replica2008.wikimedia.org
  • 11:51 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on kubestagemaster[1001-1002].eqiad.wmnet with reason: decom
  • 11:51 jayme@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on kubestagemaster[1001-1002].eqiad.wmnet with reason: decom
  • 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ldap-replica2007.wikimedia.org
  • 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica2007.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 11:47 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ldap-replica2007.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 11:44 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 11:39 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ldap-replica2007.wikimedia.org
  • 11:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T352010)', diff saved to https://phabricator.wikimedia.org/P62579 and previous config saved to /var/cache/conftool/dbconfig/20240517-113142-ladsgroup.json
  • 11:31 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 11:31 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 11:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T352010)', diff saved to https://phabricator.wikimedia.org/P62578 and previous config saved to /var/cache/conftool/dbconfig/20240517-113119-ladsgroup.json
  • 11:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P62577 and previous config saved to /var/cache/conftool/dbconfig/20240517-111611-ladsgroup.json
  • 11:08 jayme@cumin1002: conftool action : set/pooled=yes:weight=10; selector: name=kubestagemaster100[3-5].eqiad.wmnet
  • 11:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P62576 and previous config saved to /var/cache/conftool/dbconfig/20240517-110101-ladsgroup.json
  • 10:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T352010)', diff saved to https://phabricator.wikimedia.org/P62575 and previous config saved to /var/cache/conftool/dbconfig/20240517-104553-ladsgroup.json
  • 09:44 isaranto@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 09:39 isaranto@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 09:25 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1016.eqiad.wmnet
  • 09:17 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host snapshot1016.eqiad.wmnet
  • 09:06 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host snapshot1015.eqiad.wmnet
  • 09:01 btullis@cumin1002: START - Cookbook sre.hosts.reboot-single for host snapshot1015.eqiad.wmnet
  • 08:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T352010)', diff saved to https://phabricator.wikimedia.org/P62574 and previous config saved to /var/cache/conftool/dbconfig/20240517-082636-ladsgroup.json
  • 08:26 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 08:26 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 08:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T352010)', diff saved to https://phabricator.wikimedia.org/P62573 and previous config saved to /var/cache/conftool/dbconfig/20240517-082613-ladsgroup.json
  • 08:17 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
  • 08:17 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/zotero: apply
  • 08:16 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
  • 08:16 jayme@deploy1002: helmfile [codfw] START helmfile.d/services/zotero: apply
  • 08:15 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 08:14 jayme@deploy1002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 08:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P62572 and previous config saved to /var/cache/conftool/dbconfig/20240517-081105-ladsgroup.json
  • 07:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P62571 and previous config saved to /var/cache/conftool/dbconfig/20240517-075558-ladsgroup.json
  • 07:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T352010)', diff saved to https://phabricator.wikimedia.org/P62570 and previous config saved to /var/cache/conftool/dbconfig/20240517-074050-ladsgroup.json
  • 06:59 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2167 (T364299)', diff saved to https://phabricator.wikimedia.org/P62568 and previous config saved to /var/cache/conftool/dbconfig/20240517-065920-marostegui.json
  • 06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 06:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T364299)', diff saved to https://phabricator.wikimedia.org/P62567 and previous config saved to /var/cache/conftool/dbconfig/20240517-065857-marostegui.json
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P62566 and previous config saved to /var/cache/conftool/dbconfig/20240517-064350-marostegui.json
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P62565 and previous config saved to /var/cache/conftool/dbconfig/20240517-062842-marostegui.json
  • 06:18 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 55 hosts
  • 06:17 ryankemper@cumin2002: START - Cookbook sre.hosts.remove-downtime for 55 hosts
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T364299)', diff saved to https://phabricator.wikimedia.org/P62564 and previous config saved to /var/cache/conftool/dbconfig/20240517-061334-marostegui.json
  • 06:10 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: T363975 eqiad cluster restart - ryankemper@cumin2002 - T363975
  • 05:52 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: T363975 eqiad cluster restart - ryankemper@cumin2002 - T363975
  • 05:52 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 55 hosts with reason: T363975
  • 05:50 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on 55 hosts with reason: T363975
  • 05:17 marostegui: Restart wikibugs
  • 05:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T352010)', diff saved to https://phabricator.wikimedia.org/P62563 and previous config saved to /var/cache/conftool/dbconfig/20240517-051721-ladsgroup.json
  • 05:17 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 05:17 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 05:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62562 and previous config saved to /var/cache/conftool/dbconfig/20240517-051658-ladsgroup.json
  • 05:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 05:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 05:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P62561 and previous config saved to /var/cache/conftool/dbconfig/20240517-050150-ladsgroup.json
  • 04:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P62560 and previous config saved to /var/cache/conftool/dbconfig/20240517-044642-ladsgroup.json
  • 04:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62559 and previous config saved to /var/cache/conftool/dbconfig/20240517-043134-ladsgroup.json
  • 02:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T352010)', diff saved to https://phabricator.wikimedia.org/P62558 and previous config saved to /var/cache/conftool/dbconfig/20240517-021211-ladsgroup.json
  • 02:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 02:11 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 02:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62557 and previous config saved to /var/cache/conftool/dbconfig/20240517-021148-ladsgroup.json
  • 01:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62556 and previous config saved to /var/cache/conftool/dbconfig/20240517-015640-ladsgroup.json
  • 01:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62555 and previous config saved to /var/cache/conftool/dbconfig/20240517-014132-ladsgroup.json
  • 01:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62554 and previous config saved to /var/cache/conftool/dbconfig/20240517-012622-ladsgroup.json

2024-05-16

  • 23:43 cwhite: restart apache on gerrit1003
  • 23:17 zabe@deploy1002: Synchronized private/PrivateSettings.php: Add secret for encrypting user password hashes - T150647 (duration: 16m 42s)
  • 23:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62553 and previous config saved to /var/cache/conftool/dbconfig/20240516-230951-ladsgroup.json
  • 23:09 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 23:09 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 23:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62552 and previous config saved to /var/cache/conftool/dbconfig/20240516-230939-ladsgroup.json
  • 23:05 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@312e2be]: Correct new range partition sensor granularity (duration: 00m 21s)
  • 23:04 ebernhardson@deploy1002: Started deploy [airflow-dags/search@312e2be]: Correct new range partition sensor granularity
  • 22:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P62551 and previous config saved to /var/cache/conftool/dbconfig/20240516-225430-ladsgroup.json
  • 22:47 jsn@deploy1002: Finished scap: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036) (duration: 21m 57s)
  • 22:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P62550 and previous config saved to /var/cache/conftool/dbconfig/20240516-223922-ladsgroup.json
  • 22:27 jsn@deploy1002: jsn and cscott: Continuing with sync
  • 22:27 jsn@deploy1002: jsn and cscott: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:25 jsn@deploy1002: Started scap: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036)
  • 22:24 jsn@deploy1002: Sync cancelled.
  • 22:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62549 and previous config saved to /var/cache/conftool/dbconfig/20240516-222414-ladsgroup.json
  • 22:02 ebernhardson@deploy1002: Finished deploy [airflow-dags/search@cb359e4]: add dags to collect daily webrequest and satisfaction search metrics (duration: 00m 25s)
  • 22:02 ebernhardson@deploy1002: Started deploy [airflow-dags/search@cb359e4]: add dags to collect daily webrequest and satisfaction search metrics
  • 21:52 jsn@deploy1002: cscott and jsn: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:49 jsn@deploy1002: Started scap: Backport for [JsonCodec, ParserCache] Improve debugging of serializability failures (T365036)
  • 21:31 jsn@deploy1002: Finished scap: Backport for Update VE core submodule to master (27296e0e3) (T230323 T365052) (duration: 25m 10s)
  • 21:11 jsn@deploy1002: jsn and esanders: Continuing with sync
  • 21:09 mutante: LDAP - added uid rickijay to group nda (T365138)
  • 21:08 jsn@deploy1002: jsn and esanders: Backport for Update VE core submodule to master (27296e0e3) (T230323 T365052) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:06 jsn@deploy1002: Started scap: Backport for Update VE core submodule to master (27296e0e3) (T230323 T365052)
  • 21:05 mutante: LDAP - added uid dmuthuri to group wmf T364320
  • 20:43 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 20:43 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 20:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T352010)', diff saved to https://phabricator.wikimedia.org/P62548 and previous config saved to /var/cache/conftool/dbconfig/20240516-204342-ladsgroup.json
  • 20:33 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1013.eqiad.wmnet
  • 20:33 eevans@cumin1002: START - Cookbook sre.hosts.remove-downtime for aqs1013.eqiad.wmnet
  • 20:33 mutante: contint2002 - as usual have to manually "a2dismod mpm_event" on a machine using apache that has just been installed to fix the race condition with apache modules
  • 20:33 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2002.wikimedia.org with OS bullseye
  • 20:31 jdrewniak@deploy1002: Finished scap: Backport for Fix exclude list for dark mode (T365084) (duration: 22m 36s)
  • 20:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P62547 and previous config saved to /var/cache/conftool/dbconfig/20240516-202834-ladsgroup.json
  • 20:14 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
  • 20:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P62546 and previous config saved to /var/cache/conftool/dbconfig/20240516-201326-ladsgroup.json
  • 20:12 jdrewniak@deploy1002: jdrewniak and mabualruz: Continuing with sync
  • 20:11 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
  • 20:11 jdrewniak@deploy1002: jdrewniak and mabualruz: Backport for Fix exclude list for dark mode (T365084) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 20:08 ryankemper: [Hadoop] Restarted `hadoop-hdfs-datanode` on `an-worker1172`
  • 20:08 jdrewniak@deploy1002: Started scap: Backport for Fix exclude list for dark mode (T365084)
  • 20:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T352010)', diff saved to https://phabricator.wikimedia.org/P62545 and previous config saved to /var/cache/conftool/dbconfig/20240516-200618-ladsgroup.json
  • 20:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 20:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 20:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T352010)', diff saved to https://phabricator.wikimedia.org/P62544 and previous config saved to /var/cache/conftool/dbconfig/20240516-200552-ladsgroup.json
  • 20:03 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hadoop.roll-restart-workers (exit_code=99) restart workers for Hadoop analytics cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 19:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T352010)', diff saved to https://phabricator.wikimedia.org/P62543 and previous config saved to /var/cache/conftool/dbconfig/20240516-195817-ladsgroup.json
  • 19:55 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 19:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P62542 and previous config saved to /var/cache/conftool/dbconfig/20240516-195044-ladsgroup.json
  • 19:46 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T364299)', diff saved to https://phabricator.wikimedia.org/P62541 and previous config saved to /var/cache/conftool/dbconfig/20240516-194613-marostegui.json
  • 19:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 19:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T364299)', diff saved to https://phabricator.wikimedia.org/P62540 and previous config saved to /var/cache/conftool/dbconfig/20240516-194548-marostegui.json
  • 19:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P62539 and previous config saved to /var/cache/conftool/dbconfig/20240516-193535-ladsgroup.json
  • 19:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P62538 and previous config saved to /var/cache/conftool/dbconfig/20240516-193040-marostegui.json
  • 19:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T352010)', diff saved to https://phabricator.wikimedia.org/P62537 and previous config saved to /var/cache/conftool/dbconfig/20240516-192027-ladsgroup.json
  • 19:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P62536 and previous config saved to /var/cache/conftool/dbconfig/20240516-191532-marostegui.json
  • 19:00 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T364299)', diff saved to https://phabricator.wikimedia.org/P62535 and previous config saved to /var/cache/conftool/dbconfig/20240516-190024-marostegui.json
  • 18:58 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS buster
  • 18:46 dzahn@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host contint2002.wikimedia.org with OS bullseye
  • 18:32 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 18:17 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 18:15 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host contint2002.wikimedia.org
  • 18:13 cmooney@cumin1002: START - Cookbook sre.hosts.dhcp for host contint2002.wikimedia.org
  • 18:04 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint2002.wikimedia.org with OS buster
  • 17:53 brennen@deploy1002: Finished deploy [phabricator/deployment@7d858df]: test scap deployment with keyholder key misconfigured for T313624 (duration: 00m 38s)
  • 17:52 brennen@deploy1002: Started deploy [phabricator/deployment@7d858df]: test scap deployment with keyholder key misconfigured for T313624
  • 17:45 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye
  • 17:34 dani@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 17:34 dani@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 17:34 dani@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 17:33 dani@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 17:33 dani@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 17:33 dani@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 17:02 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 17:02 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 17:02 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 17:01 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 17:01 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 17:00 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 17:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T352010)', diff saved to https://phabricator.wikimedia.org/P62529 and previous config saved to /var/cache/conftool/dbconfig/20240516-170035-ladsgroup.json
  • 17:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:00 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 17:00 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 16:59 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 16:58 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS buster
  • 16:57 ryankemper@cumin2002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop analytics cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 16:57 dzahn@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host contint2002.wikimedia.org with OS bullseye
  • 16:57 ryankemper@cumin2002: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 16:41 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 16:41 ryankemper@cumin2002: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
  • 16:40 dzahn@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host contint2002.wikimedia.org with OS bullseye
  • 16:39 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62528 and previous config saved to /var/cache/conftool/dbconfig/20240516-163915-arnaudb.json
  • 16:39 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:38 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:37 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:37 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:37 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:37 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:32 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:31 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:31 robh@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2002']
  • 16:30 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2002']
  • 16:24 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62526 and previous config saved to /var/cache/conftool/dbconfig/20240516-162408-arnaudb.json
  • 16:12 topranks: announcing wikidough anycast ranges to Internet (transit) in magru T362421
  • 16:09 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62525 and previous config saved to /var/cache/conftool/dbconfig/20240516-160902-arnaudb.json
  • 15:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62523 and previous config saved to /var/cache/conftool/dbconfig/20240516-155356-arnaudb.json
  • 15:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 100%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62522 and previous config saved to /var/cache/conftool/dbconfig/20240516-155034-arnaudb.json
  • 15:45 dhinus: systemctl restart mariadb@s4.service on clouddb1015 (using too much RAM) T365164
  • 15:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62521 and previous config saved to /var/cache/conftool/dbconfig/20240516-153850-arnaudb.json
  • 15:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 75%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62520 and previous config saved to /var/cache/conftool/dbconfig/20240516-153527-arnaudb.json
  • 15:25 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 15:24 dzahn@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host contint2002.wikimedia.org with OS bullseye
  • 15:23 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62519 and previous config saved to /var/cache/conftool/dbconfig/20240516-152343-arnaudb.json
  • 15:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 50%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62518 and previous config saved to /var/cache/conftool/dbconfig/20240516-152021-arnaudb.json
  • 15:08 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62517 and previous config saved to /var/cache/conftool/dbconfig/20240516-150837-arnaudb.json
  • 15:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 25%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62516 and previous config saved to /var/cache/conftool/dbconfig/20240516-150515-arnaudb.json
  • 15:03 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS bullseye
  • 14:53 arnaudb@cumin1002: dbctl commit (dc=all): 'db2174 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62515 and previous config saved to /var/cache/conftool/dbconfig/20240516-145330-arnaudb.json
  • 14:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 10%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62514 and previous config saved to /var/cache/conftool/dbconfig/20240516-145009-arnaudb.json
  • 14:49 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P62513 and previous config saved to /var/cache/conftool/dbconfig/20240516-144945-root.json
  • 14:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2174.codfw.wmnet with OS bookworm
  • 14:43 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on contint2002.wikimedia.org with reason: T334517
  • 14:43 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on contint2002.wikimedia.org with reason: T334517
  • 14:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 5%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62512 and previous config saved to /var/cache/conftool/dbconfig/20240516-143503-arnaudb.json
  • 14:34 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P62511 and previous config saved to /var/cache/conftool/dbconfig/20240516-143439-root.json
  • 14:28 ladsgroup@deploy1002: Finished scap: Backport for Stop writing to the old columns of pagelinks in s6 (T352010) (duration: 15m 42s)
  • 14:28 hnowlan: migrated 5% of commons traffic to k8s
  • 14:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
  • 14:25 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
  • 14:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 2%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62510 and previous config saved to /var/cache/conftool/dbconfig/20240516-141957-arnaudb.json
  • 14:19 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P62509 and previous config saved to /var/cache/conftool/dbconfig/20240516-141932-root.json
  • 14:15 ladsgroup@deploy1002: ladsgroup: Continuing with sync
  • 14:15 ladsgroup@deploy1002: ladsgroup: Backport for Stop writing to the old columns of pagelinks in s6 (T352010) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:13 ladsgroup@deploy1002: Started scap: Backport for Stop writing to the old columns of pagelinks in s6 (T352010)
  • 14:09 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint1002:~$ time mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["76318767"]' 2>&1 | tee -a ~/T315510-enwiki-5; date
  • 14:08 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2174.codfw.wmnet with OS bookworm
  • 14:07 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2174.codfw.wmnet with reason: reimage
  • 14:07 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2174.codfw.wmnet with reason: reimage
  • 14:06 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2174', diff saved to https://phabricator.wikimedia.org/P62508 and previous config saved to /var/cache/conftool/dbconfig/20240516-140620-arnaudb.json
  • 14:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2176 (re)pooling @ 1%: post reimage repool', diff saved to https://phabricator.wikimedia.org/P62507 and previous config saved to /var/cache/conftool/dbconfig/20240516-140451-arnaudb.json
  • 14:04 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P62506 and previous config saved to /var/cache/conftool/dbconfig/20240516-140426-root.json
  • 14:04 jsn@deploy1002: Finished scap: Backport for Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001), Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001) (duration: 16m 11s)
  • 14:03 Emperor: depool, restart swift-proxy, repool ms-fe1010 as ~12% connection failures reported by envoy since late 14th May T360913
  • 13:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2176.codfw.wmnet with OS bookworm
  • 13:51 jsn@deploy1002: jsn and lucaswerkmeister-wmde: Continuing with sync
  • 13:50 jsn@deploy1002: jsn and lucaswerkmeister-wmde: Backport for Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001), Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:49 marostegui@cumin1002: dbctl commit (dc=all): 'es1024 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P62505 and previous config saved to /var/cache/conftool/dbconfig/20240516-134918-root.json
  • 13:48 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1024.eqiad.wmnet with OS bookworm
  • 13:47 jsn@deploy1002: Started scap: Backport for Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001), Make EntitySchemaValue::getArrayValue() match EntityIdValue (T362955 T362001)
  • 13:37 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
  • 13:34 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
  • 13:32 jsn@deploy1002: Finished scap: Backport for Enable async jobqueue-powered URL uploads on commons (T295007) (duration: 18m 18s)
  • 13:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1024.eqiad.wmnet with reason: host reimage
  • 13:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1024.eqiad.wmnet with reason: host reimage
  • 13:19 jsn@deploy1002: jsn and hnowlan: Continuing with sync
  • 13:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62503 and previous config saved to /var/cache/conftool/dbconfig/20240516-131800-ladsgroup.json
  • 13:17 arnaudb@cumin1002: START - Cookbook sre.hosts.reimage for host db2176.codfw.wmnet with OS bookworm
  • 13:16 jsn@deploy1002: jsn and hnowlan: Backport for Enable async jobqueue-powered URL uploads on commons (T295007) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:15 arnaudb@cumin1002: END (ERROR) - Cookbook sre.mysql.upgrade (exit_code=97) for db2176.codfw.wmnet
  • 13:15 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2176.codfw.wmnet
  • 13:14 arnaudb@cumin1002: dbctl commit (dc=all): 'T364290 db2176', diff saved to https://phabricator.wikimedia.org/P62502 and previous config saved to /var/cache/conftool/dbconfig/20240516-131429-arnaudb.json
  • 13:14 jsn@deploy1002: Started scap: Backport for Enable async jobqueue-powered URL uploads on commons (T295007)
  • 13:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1024.eqiad.wmnet with OS bookworm
  • 13:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1024 T364289', diff saved to https://phabricator.wikimedia.org/P62501 and previous config saved to /var/cache/conftool/dbconfig/20240516-131111-root.json
  • 13:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62500 and previous config saved to /var/cache/conftool/dbconfig/20240516-130252-ladsgroup.json
  • 12:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62499 and previous config saved to /var/cache/conftool/dbconfig/20240516-124743-ladsgroup.json
  • 10:48 fnegri@cumin1002: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
  • 10:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P62497 and previous config saved to /var/cache/conftool/dbconfig/20240516-104601-ladsgroup.json
  • 10:43 claime: New redirects for T25216 T204830 T31186 operational
  • 10:37 fnegri@cumin1002: START - Cookbook sre.wikireplicas.update-views
  • 10:32 claime: cumin 'A:all-mw' -b30 "run-puppet-agent -q" - T25216 T204830 T31186
  • 10:31 claime: cumin 'A:all-mw' "enable-puppet 'New redirects T25216 T204830 T31186 - cgoubert'"
  • 10:31 marostegui@cumin1002: dbctl commit (dc=all): 'Test pc4 master switch', diff saved to https://phabricator.wikimedia.org/P62496 and previous config saved to /var/cache/conftool/dbconfig/20240516-103148-marostegui.json
  • 10:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P62495 and previous config saved to /var/cache/conftool/dbconfig/20240516-103055-ladsgroup.json
  • 10:30 marostegui@cumin1002: dbctl commit (dc=all): 'Test pc4 master switch', diff saved to https://phabricator.wikimedia.org/P62494 and previous config saved to /var/cache/conftool/dbconfig/20240516-103039-marostegui.json
  • 10:30 cgoubert@deploy1002: Finished scap: Deploy new redirects to mw-on-k8s - T25216 T204830 T31186 (duration: 08m 06s)
  • 10:22 cgoubert@deploy1002: Started scap: Deploy new redirects to mw-on-k8s - T25216 T204830 T31186
  • 10:21 claime: New redirects ok on mwdebug - T25216 T204830 T31186
  • 10:19 cgoubert@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 10:19 cgoubert@deploy1002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 10:18 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2015 and pc1015 to pc4 as depooled spares T362786', diff saved to https://phabricator.wikimedia.org/P62493 and previous config saved to /var/cache/conftool/dbconfig/20240516-101829-marostegui.json
  • 10:15 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 100%: After reimage', diff saved to https://phabricator.wikimedia.org/P62492 and previous config saved to /var/cache/conftool/dbconfig/20240516-101553-root.json
  • 10:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P62491 and previous config saved to /var/cache/conftool/dbconfig/20240516-101548-ladsgroup.json
  • 10:15 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2016 and pc1016 to pc4 T362786', diff saved to https://phabricator.wikimedia.org/P62490 and previous config saved to /var/cache/conftool/dbconfig/20240516-101543-marostegui.json
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2014 and pc1014 to pc4 T362786', diff saved to https://phabricator.wikimedia.org/P62489 and previous config saved to /var/cache/conftool/dbconfig/20240516-101122-marostegui.json
  • 10:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 100%: post fix repool', diff saved to https://phabricator.wikimedia.org/P62488 and previous config saved to /var/cache/conftool/dbconfig/20240516-101018-arnaudb.json
  • 10:10 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2013 and pc1013 to pc2 T362786', diff saved to https://phabricator.wikimedia.org/P62487 and previous config saved to /var/cache/conftool/dbconfig/20240516-101009-marostegui.json
  • 10:09 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2012 and pc1012 to pc2 T362786', diff saved to https://phabricator.wikimedia.org/P62486 and previous config saved to /var/cache/conftool/dbconfig/20240516-100858-marostegui.json
  • 10:07 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc2011 to pc1 T362786', diff saved to https://phabricator.wikimedia.org/P62485 and previous config saved to /var/cache/conftool/dbconfig/20240516-100744-marostegui.json
  • 10:04 marostegui@cumin1002: dbctl commit (dc=all): 'Add pc1011 to pc1 T362786', diff saved to https://phabricator.wikimedia.org/P62484 and previous config saved to /var/cache/conftool/dbconfig/20240516-100418-marostegui.json
  • 10:02 claime: cumin 'A:all-mw' "disable-puppet 'New redirects T25216 T204830 T31186 - cgoubert'"
  • 10:00 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 75%: After reimage', diff saved to https://phabricator.wikimedia.org/P62483 and previous config saved to /var/cache/conftool/dbconfig/20240516-100040-root.json
  • 09:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1202 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P62482 and previous config saved to /var/cache/conftool/dbconfig/20240516-095927-ladsgroup.json
  • 09:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T352010)', diff saved to https://phabricator.wikimedia.org/P62481 and previous config saved to /var/cache/conftool/dbconfig/20240516-095817-ladsgroup.json
  • 09:58 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 09:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 09:56 ladsgroup@cumin1002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 09:56 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 09:54 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 75%: post fix repool', diff saved to https://phabricator.wikimedia.org/P62480 and previous config saved to /var/cache/conftool/dbconfig/20240516-095459-arnaudb.json
  • 09:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T352010)', diff saved to https://phabricator.wikimedia.org/P62479 and previous config saved to /var/cache/conftool/dbconfig/20240516-094717-ladsgroup.json
  • 09:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 09:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 09:45 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 50%: After reimage', diff saved to https://phabricator.wikimedia.org/P62478 and previous config saved to /var/cache/conftool/dbconfig/20240516-094534-root.json
  • 09:44 godog: clean up MediaWiki.rest_api_latency and MediaWiki.rest_api_errors - T365111
  • 09:38 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 50%: post fix repool', diff saved to https://phabricator.wikimedia.org/P62476 and previous config saved to /var/cache/conftool/dbconfig/20240516-093803-arnaudb.json
  • 09:30 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 25%: After reimage', diff saved to https://phabricator.wikimedia.org/P62475 and previous config saved to /var/cache/conftool/dbconfig/20240516-093028-root.json
  • 09:28 logmsgbot: @deploy1002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 09:28 logmsgbot: @deploy1002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 09:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 (re)pooling @ 25%: post fix repool', diff saved to https://phabricator.wikimedia.org/P62474 and previous config saved to /var/cache/conftool/dbconfig/20240516-092257-arnaudb.json
  • 09:18 dani@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 09:18 dani@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 09:18 dani@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 09:17 dani@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 09:17 dani@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 09:17 dani@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 09:16 arnaudb@cumin1002: dbctl commit (dc=all): 'vslow/dump T364814 fix', diff saved to https://phabricator.wikimedia.org/P62473 and previous config saved to /var/cache/conftool/dbconfig/20240516-091613-arnaudb.json
  • 09:15 marostegui@cumin1002: dbctl commit (dc=all): 'es1021 (re)pooling @ 10%: After reimage', diff saved to https://phabricator.wikimedia.org/P62472 and previous config saved to /var/cache/conftool/dbconfig/20240516-091522-root.json
  • 09:15 arnaudb@cumin1002: dbctl commit (dc=all): 'vslow/dump T364814 fix', diff saved to https://phabricator.wikimedia.org/P62471 and previous config saved to /var/cache/conftool/dbconfig/20240516-091515-arnaudb.json
  • 09:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Promote db2204 to vslow/dump T364814', diff saved to https://phabricator.wikimedia.org/P62470 and previous config saved to /var/cache/conftool/dbconfig/20240516-091400-arnaudb.json
  • 09:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Group test readd', diff saved to https://phabricator.wikimedia.org/P62469 and previous config saved to /var/cache/conftool/dbconfig/20240516-090753-arnaudb.json
  • 09:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Group test removal', diff saved to https://phabricator.wikimedia.org/P62468 and previous config saved to /var/cache/conftool/dbconfig/20240516-090732-arnaudb.json
  • 09:03 Dreamy_Jazz: Stopping MediaModeration scanning script on `medium.dblist`
  • 09:03 Dreamy_Jazz: Stopping MediaModeration scanning script on `enwiki`
  • 08:59 Dreamy_Jazz: Scanning `enwiki` with MediaModeration script - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 08:58 Dreamy_Jazz: Starting MediaModeration scanning script on `medium.dblist` - https://wikitech.wikimedia.org/wiki/MediaModeration
  • 08:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2204 with weight 500 T364814', diff saved to https://phabricator.wikimedia.org/P62466 and previous config saved to /var/cache/conftool/dbconfig/20240516-085123-arnaudb.json
  • 08:44 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2207 to s2 primary T364814', diff saved to https://phabricator.wikimedia.org/P62465 and previous config saved to /var/cache/conftool/dbconfig/20240516-084420-root.json
  • 08:41 arnaudb: Starting s2 codfw failover from db2204 to db2207 - T364814
  • 08:33 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 08:33 jiji@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 08:23 hashar@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.43.0-wmf.5 refs T361399
  • 08:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db2126 depool', diff saved to https://phabricator.wikimedia.org/P62463 and previous config saved to /var/cache/conftool/dbconfig/20240516-081207-arnaudb.json
  • 08:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T360332)', diff saved to https://phabricator.wikimedia.org/P62462 and previous config saved to /var/cache/conftool/dbconfig/20240516-081136-arnaudb.json
  • 08:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2165 (T364299)', diff saved to https://phabricator.wikimedia.org/P62461 and previous config saved to /var/cache/conftool/dbconfig/20240516-081107-marostegui.json
  • 08:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 08:10 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T364299)', diff saved to https://phabricator.wikimedia.org/P62460 and previous config saved to /var/cache/conftool/dbconfig/20240516-081044-marostegui.json
  • 08:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1021.eqiad.wmnet with reason: host reimage
  • 08:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1021.eqiad.wmnet with reason: host reimage
  • 07:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62458 and previous config saved to /var/cache/conftool/dbconfig/20240516-075628-arnaudb.json
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P62457 and previous config saved to /var/cache/conftool/dbconfig/20240516-075537-marostegui.json
  • 07:51 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1021.eqiad.wmnet with OS bookworm
  • 07:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Remove db2207 from API/vslow/dump T364814', diff saved to https://phabricator.wikimedia.org/P62456 and previous config saved to /var/cache/conftool/dbconfig/20240516-075024-arnaudb.json
  • 07:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Set db2207 with weight 0 T364814', diff saved to https://phabricator.wikimedia.org/P62455 and previous config saved to /var/cache/conftool/dbconfig/20240516-074927-arnaudb.json
  • 07:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s2 T364814
  • 07:48 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 26 hosts with reason: Primary switchover s2 T364814
  • 07:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1021 T364289', diff saved to https://phabricator.wikimedia.org/P62454 and previous config saved to /var/cache/conftool/dbconfig/20240516-074837-root.json
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'Increase es1024 weight', diff saved to https://phabricator.wikimedia.org/P62453 and previous config saved to /var/cache/conftool/dbconfig/20240516-074625-marostegui.json
  • 07:44 mabualruz@deploy1002: Finished scap: Backport for Correct behaviour of ConfigHelper, add tests (T365084) (duration: 17m 31s)
  • 07:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P62452 and previous config saved to /var/cache/conftool/dbconfig/20240516-074121-arnaudb.json
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P62451 and previous config saved to /var/cache/conftool/dbconfig/20240516-074030-marostegui.json
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'Increase es1024 weight', diff saved to https://phabricator.wikimedia.org/P62450 and previous config saved to /var/cache/conftool/dbconfig/20240516-073750-marostegui.json
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1025 to es5 primary master T365094', diff saved to https://phabricator.wikimedia.org/P62449 and previous config saved to /var/cache/conftool/dbconfig/20240516-073719-marostegui.json
  • 07:30 mabualruz@deploy1002: mabualruz: Continuing with sync
  • 07:30 mabualruz@deploy1002: mabualruz: Backport for Correct behaviour of ConfigHelper, add tests (T365084) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 07:26 mabualruz@deploy1002: Started scap: Backport for Correct behaviour of ConfigHelper, add tests (T365084)
  • 07:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T360332)', diff saved to https://phabricator.wikimedia.org/P62448 and previous config saved to /var/cache/conftool/dbconfig/20240516-072614-arnaudb.json
  • 07:25 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T364299)', diff saved to https://phabricator.wikimedia.org/P62447 and previous config saved to /var/cache/conftool/dbconfig/20240516-072521-marostegui.json
  • 07:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T360332)', diff saved to https://phabricator.wikimedia.org/P62446 and previous config saved to /var/cache/conftool/dbconfig/20240516-072355-arnaudb.json
  • 07:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 07:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P62445 and previous config saved to /var/cache/conftool/dbconfig/20240516-065823-root.json
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P62444 and previous config saved to /var/cache/conftool/dbconfig/20240516-064317-root.json
  • 06:34 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Making es4 standalone T364447
  • 06:34 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Making es4 standalone T364447
  • 06:33 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Making es5 standalone T364447
  • 06:33 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Making es5 standalone T364447
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P62443 and previous config saved to /var/cache/conftool/dbconfig/20240516-062812-root.json
  • 06:18 marostegui: Make es5 standalone and disconnect replication T364447
  • 06:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Making es5 standalone T364447
  • 06:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Making es5 standalone T364447
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P62442 and previous config saved to /var/cache/conftool/dbconfig/20240516-061306-root.json
  • 06:05 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1020 to es4 primary master T364816', diff saved to https://phabricator.wikimedia.org/P62441 and previous config saved to /var/cache/conftool/dbconfig/20240516-060532-marostegui.json
  • 05:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: After schema change', diff saved to https://phabricator.wikimedia.org/P62440 and previous config saved to /var/cache/conftool/dbconfig/20240516-055759-root.json
  • 05:43 marostegui: Make es4 standalone and disconnect replication T364447
  • 05:37 marostegui@cumin1002: dbctl commit (dc=all): 'Increase es1021 weight', diff saved to https://phabricator.wikimedia.org/P62439 and previous config saved to /var/cache/conftool/dbconfig/20240516-053746-marostegui.json
  • 05:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Making es4 standalone T364447
  • 05:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Making es4 standalone T364447
  • 05:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 05:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 05:23 marostegui: Deploy schema change dbmaint db1173 eqiad s6 T355609
  • 05:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1173 T364523', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20240516-051853-root.json
  • 05:18 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db1231 to s6 primary and set section read-write T364523', diff saved to https://phabricator.wikimedia.org/P62437 and previous config saved to /var/cache/conftool/dbconfig/20240516-051808-marostegui.json
  • 05:17 marostegui: Starting s6 eqiad failover from db1173 to db1231 - T364523
  • 04:58 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s6 T364523
  • 04:58 marostegui@cumin1002: dbctl commit (dc=all): 'Set db1231 with weight 0 T364523', diff saved to https://phabricator.wikimedia.org/P62435 and previous config saved to /var/cache/conftool/dbconfig/20240516-045831-marostegui.json
  • 04:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s6 T364523
  • 04:04 eileen: civicrm upgraded from 26e7422a to 4f6f2dc3
  • 02:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T352010)', diff saved to https://phabricator.wikimedia.org/P62434 and previous config saved to /var/cache/conftool/dbconfig/20240516-020200-ladsgroup.json
  • 02:01 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 02:01 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 02:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62433 and previous config saved to /var/cache/conftool/dbconfig/20240516-020137-ladsgroup.json
  • 01:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P62432 and previous config saved to /var/cache/conftool/dbconfig/20240516-014630-ladsgroup.json
  • 01:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P62431 and previous config saved to /var/cache/conftool/dbconfig/20240516-013122-ladsgroup.json
  • 01:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T352010)', diff saved to https://phabricator.wikimedia.org/P62430 and previous config saved to /var/cache/conftool/dbconfig/20240516-011613-ladsgroup.json
  • 01:12 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2009.codfw.wmnet with OS bullseye
  • 00:28 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS bullseye

