Search Platform/Weekly Updates/2023-05-23
Summary
Another WDQS incident this week disrupted our flow of work.
Dealing with page redirects in the context of the Search Update Pipeline is more complex than expected and involves multiple teams (Data Engineering, MediaWiki core, ML). Hopefully this additional work will benefit other teams, ML in particular.
What we've accomplished
Improve multilingual zero-results rate
- Documentation and partial implementation of the framework to evaluate impact - https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Language_Analyzer_Harmonization_Notes
Search Update Pipeline
- Discussions with Data Engineering and MediaWiki core about page redirects - https://phabricator.wikimedia.org/T325315#8871319
Finalize Search SLOs
- Cleanup of the final list of SLOs - https://docs.google.com/document/d/1gYROXo8Fl7JSxReHAVI22EhcPvG-INVkq79a1C3tfK0/edit#heading=h.2v65rh3w6ii0
Operations / SRE
- Dealing with the WDQS outage - https://wikitech.wikimedia.org/wiki/Incidents/2023-05-23_wdqs_CODFW_5xx_errors
- Fixed Prometheus metrics not being reported by flink-1.16 - https://phabricator.wikimedia.org/T336872
Misc
- Trying to make sense of the perceived limitations of Search when used in the context of a ChatGPT plugin
- The language team finished the work on https://phabricator.wikimedia.org/T322284, and a patch was sent to enable this new config. This should eventually allow us to quickly switch traffic from one DC to another (unblocking https://phabricator.wikimedia.org/T143553)
- Turkish has been reindexed to enable unpacking, with a write-up on mediawiki.org: https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Unpacking_Notes#better_apostrophe_Impact_on_Turkish_Wikipedia_(T337064) - https://phabricator.wikimedia.org/T337064