Search Platform/Goals/OKR 2023-2024 Q1
Q1 Overview
Q1 will focus on finishing work started in the previous fiscal year. We aim to complete the Search Update Pipeline in Q1, though deployment and migration may overflow into Q2.
Q1 OKRs
Things we ship
Search Update Pipeline
Higher level Objective: Users can reliably query and search for content, so that they can depend on Wikimedia Foundation projects as knowledge infrastructure (ERF: Technical Infrastructure, Product Platform)
KR:
- X% of search update load is migrated away from JobRunner
- Search update lag is < X minutes X% of the time over a 3 month period
- Less than X% of updates are in error over a 3 month period
Description: The search update pipeline is currently unreliable: updates are lost sporadically, often enough that users are reporting bugs. The Saneitizer is supposed to resolve this, but it was shut off for a while and, since being restarted, has been running with ~4 weeks of lag. We want to resolve the issues in the update pipeline so that our indexes are not out of sync by more than XX days. Work is needed to measure the current rate of lost updates, and we want to reduce the error rate to something close to 0.
The current update pipeline processes MediaWiki updates as a stream, with < 5 minutes of lag. Most additional data is processed in batches, with a lag > 1h. We want to harmonize the system so that all updates are processed as streams, reducing the lag of secondary data sources from > 1h to < 5 minutes.
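As a rough illustration of how the lag and error-rate key results could be measured, the sketch below computes both indicators from a list of update samples. The `UpdateSample` type, the function names, and the choice to measure lag per update (rather than per time window) are assumptions made for this example; they are not part of the existing pipeline.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class UpdateSample:
    """One observed index update (hypothetical record shape)."""
    lag_seconds: float  # time between the edit and the index update
    failed: bool        # True if the update ended in error


def lag_sli(samples: Sequence[UpdateSample], threshold_seconds: float) -> float:
    """Fraction of successful updates whose lag is below the threshold."""
    succeeded = [s for s in samples if not s.failed]
    if not succeeded:
        return 0.0
    within = sum(1 for s in succeeded if s.lag_seconds < threshold_seconds)
    return within / len(succeeded)


def error_rate(samples: Sequence[UpdateSample]) -> float:
    """Fraction of all updates that ended in error."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if s.failed) / len(samples)
```

An SLO would then compare these values, taken over the 3 month period, against the targets still marked X in the key results.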
Docs:
Phab: https://phabricator.wikimedia.org/T317045
Milestones:
- SLI and SLO are created for Search update lag and error rates
- Supports all edit types and optimizations
- Support for redirects
- Support for page re-render
- Deployment on k8s with Flink operators
- Testing plan is created (stretch: testing plan is executed)
- Migration plan is created (stretch: migration for all wikis is completed)
Improve multilingual zero-results rate
Objective: Searchers of emerging languages can search in their own language
KR: Increase recall (reduce ZRR and/or increase number of results returned) for 75% of relevant languages.
Description: Following the work of unpacking all the language analyzers, we can now work on harmonising language processing across wikis and deploy global improvements.
To ensure that our users can more easily understand how search works, and that improvements to search are replicated across languages, we want differences in how we treat different languages to be linguistic, not accidental. For example, how we treat CamelCase or apostrophes should be the same in all languages.
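To make the CamelCase and apostrophe example concrete, here is a minimal sketch of language-independent token handling. It is not the production analysis chain, which is configured in the search engine's analyzers; the function names and the small set of apostrophe-like characters are assumptions chosen for illustration.

```python
# Illustrative subset of apostrophe-like characters mapped to the ASCII apostrophe.
APOSTROPHE_MAP = str.maketrans({
    "\u2019": "'",  # right single quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u02bc": "'",  # modifier letter apostrophe
})


def normalize_apostrophes(token: str) -> str:
    """Map apostrophe variants to one character, regardless of language."""
    return token.translate(APOSTROPHE_MAP)


def split_camel_case(token: str) -> list[str]:
    """Split at each lowercase-to-uppercase boundary, using Unicode case."""
    parts = []
    start = 0
    for i in range(1, len(token)):
        if token[i - 1].islower() and token[i].isupper():
            parts.append(token[start:i])
            start = i
    parts.append(token[start:])
    return parts
```

Because the split relies on Unicode case properties rather than on the ASCII range, the same rule applies to Latin, Cyrillic, or Greek tokens, which is the kind of consistency described above.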
In Q1 we will continue to focus on increasing recall (with decreasing zero-results rates and increasing number of results as proxy metrics), assuming that increased recall improves the odds of content discovery, especially on smaller language wikis. Note that this is an imperfect KPI for search relevancy overall.
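The zero-results rate used as a proxy here is the share of queries that return no results. The sketch below shows one way to compute it, and one way to check what share of languages improved against the 75% target. The function names and the per-language dictionaries are hypothetical and stand in for whatever the real query logs provide.

```python
from typing import Mapping, Sequence


def zero_results_rate(result_counts: Sequence[int]) -> float:
    """Share of queries that returned zero results."""
    if not result_counts:
        return 0.0
    return sum(1 for n in result_counts if n == 0) / len(result_counts)


def share_of_languages_improved(
    before: Mapping[str, float], after: Mapping[str, float]
) -> float:
    """Share of languages whose zero-results rate went down.

    Both mappings go from a language code to its zero-results rate and
    are assumed to cover the same set of languages.
    """
    if not before:
        return 0.0
    improved = sum(1 for lang in before if after[lang] < before[lang])
    return improved / len(before)
```

A lower zero-results rate only indicates that more queries return something; as noted above, it says nothing about whether those results are relevant.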
Phab: https://phabricator.wikimedia.org/T219550
Milestones:
- Smarter handling of acronyms for word_break_helper in language analyzers
- Investigate applying aggressive_splitting everywhere, not just on English-language wikis
- Repair multi-script tokens split by the ICU tokenizer
- Standardize ASCII-folding/ICU-folding across analyzers
- Stretch: Look into enabling hiragana/katakana mapping everywhere
Search SLOs
Description: To ensure that we can understand the quality of our search and invest the appropriate efforts in operating it, we want to have clear SLOs for key aspects of the Search experience.
Doc: Search SLOs
Phab: https://phabricator.wikimedia.org/T335576
Milestones:
- Required metrics are collected
- Standard SLO dashboard is created
Things we plan
Split the WDQS Graph
KR: SDS3.1 Reduce the number of unsatisfied requests for Wikidata by 50%
Description: Splitting the WDQS graph has been identified as the highest-priority work for scaling WDQS, and thus for ensuring that we can continue to serve queries in the medium term. The first experiment will investigate a split of Scholarly Articles. We want to understand the impact of such a split in terms of complexity for the users of WDQS, added technical complexity, and long-term stability.