Search Platform/Weekly Updates/2023-06-30
Summary
This is the last update of the quarter. As expected, most of our projects will carry over into next quarter.
Search Update Pipeline work is on track to be functionally complete by the end of next quarter, with some risk that the deployment strategy will not be ready in time. Discussions are ongoing about how to prioritize that work across Data Platform Engineering teams so that we get the support we need on time.
Improvements to the multilingual zero-results rate are coming along. The most time-consuming part has been measuring the impact, not implementing the improvements. We will review the metrics for next quarter to see whether we can find ones that are easier to compute while still giving us a good sense of the impact of the improvements.
Search SLOs have been defined; we should be able to implement them next quarter.
Note that multiple team members will be on vacation next quarter, so we will be at reduced capacity.
What we've accomplished
Improve multilingual zero-results rate
- Ongoing work on aggressive splitting and word_break_helper - https://phabricator.wikimedia.org/T219108 / https://phabricator.wikimedia.org/T170625
Search Update Pipeline
- Ongoing work on support for redirects - https://phabricator.wikimedia.org/T325315
- Ongoing work on support for reordering and optimization of change events - https://phabricator.wikimedia.org/T325672
Operations / SRE
- Fix metric collection for Blazegraph on Debian Bullseye - https://phabricator.wikimedia.org/T336540
- Investigation of potential performance issues with new WDQS servers - https://phabricator.wikimedia.org/T336443
Misc
- Progress on migrating from ORES to Liftwing for ingesting articletopics in Search - https://phabricator.wikimedia.org/T328276
- First version of an SLO dashboard to track update lag for WDQS - https://phabricator.wikimedia.org/T324811
- Optimize the Elasticsearch analysis settings for Wikibase - https://phabricator.wikimedia.org/T334194