Q1 Overview

Q1 will focus on finishing work that was started in the previous fiscal year. We expect to complete the Search Update Pipeline in Q1, although deployment and migration may overflow into Q2.

Q1 OKRs

Things we ship

Search Update Pipeline

Higher level Objective: Users can reliably query and search for content, so that they can depend on Wikimedia Foundation projects as knowledge infrastructure (ERF: Technical Infrastructure, Product Platform)

KR:

X% of search update load is migrated away from JobRunner
Search update lag is < X minutes X% of the time over a 3 month period
Less than X% of updates are in error over a 3 month period
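The lag and error-rate KRs above could be measured along these lines. This is a minimal sketch, not the team's actual SLI definition: UpdateSample, lag_sli and error_rate are hypothetical names, the threshold stands in for the still-unspecified X, and the sketch counts updates rather than wall-clock time, which only approximates "X% of the time".

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class UpdateSample:
    """One observed search update (hypothetical record shape)."""
    lag_seconds: float  # delay between the edit and the index update
    failed: bool        # True if the update ended in error


def lag_sli(samples: Sequence[UpdateSample], threshold_seconds: float) -> Optional[float]:
    """Fraction of updates whose lag is below the threshold, or None if no data."""
    if not samples:
        return None
    within = sum(1 for s in samples if s.lag_seconds < threshold_seconds)
    return within / len(samples)


def error_rate(samples: Sequence[UpdateSample]) -> Optional[float]:
    """Fraction of updates that ended in error, or None if no data."""
    if not samples:
        return None
    return sum(1 for s in samples if s.failed) / len(samples)


# Illustrative data only; these are not real measurements.
samples = [
    UpdateSample(lag_seconds=60, failed=False),
    UpdateSample(lag_seconds=120, failed=False),
    UpdateSample(lag_seconds=600, failed=True),
    UpdateSample(lag_seconds=30, failed=False),
]
print(lag_sli(samples, threshold_seconds=300))
print(error_rate(samples))
```

An SLO would then compare each value against its target over the 3 month window.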

Description: The search update pipeline is currently broken: updates are lost sporadically, often enough that users are reporting bugs. The Saneitizer is supposed to repair these losses, but it was shut off for a while and, since being restarted, has been running with ~4 weeks of lag. We want to resolve the issues in the update pipeline so that our indexes are not out of sync by more than XX days. Work is still needed to measure the current rate of lost updates. The goal is to bring the error rate close to 0.

The current update pipeline processes MediaWiki updates as a stream, with < 5 minutes of lag. Most additional data is processed in batches, with a lag of > 1 hour. We want to harmonize the system so that all updates are processed as streams, reducing the lag of secondary data sources from > 1 hour to < 5 minutes.

Docs:

Phab: https://phabricator.wikimedia.org/T317045

Milestones:

SLI and SLO are created for Search update lag and error rates
Support for all edit types and optimizations
Support for redirects
Support for page re-render
Deployment on k8s with Flink operators
Testing plan is created (stretch: testing plan is executed)
Migration plan is created (stretch: migration for all wikis is completed)

Improve multilingual zero-results rate

Objective: Searchers of emerging languages can search in their own language

KR: Increase recall (reduce ZRR and/or increase number of results returned) for 75% of relevant languages.

Description: Now that all the language analyzers have been unpacked, we can work on harmonizing language processing across wikis and deploying global improvements.

To ensure that our users can more easily understand how search is working, and that improvements to search are replicated across languages, we want differences in how we treat different languages to be linguistic, not accidental. For example, CamelCase and apostrophes should be handled the same way in all languages.
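As an illustration of language-independent handling, the sketch below normalizes apostrophe variants and splits CamelCase tokens using only Unicode character properties, so the same rule applies to any script with upper and lower case. It is not the production analysis chain: the function names and the list of apostrophe variants are assumptions made for this example.

```python
from typing import List

# Assumed set of apostrophe look-alikes: right and left single quotes,
# modifier letter apostrophe, fullwidth apostrophe.
APOSTROPHE_VARIANTS = "\u2019\u2018\u02bc\uff07"
_APOSTROPHE_TABLE = {ord(c): "'" for c in APOSTROPHE_VARIANTS}


def normalize_apostrophes(text: str) -> str:
    """Map every apostrophe variant to the plain ASCII apostrophe."""
    return text.translate(_APOSTROPHE_TABLE)


def split_camel_case(token: str) -> List[str]:
    """Split a token wherever a lowercase letter is followed by an uppercase one."""
    parts = []
    start = 0
    for i in range(1, len(token)):
        if token[i - 1].islower() and token[i].isupper():
            parts.append(token[start:i])
            start = i
    parts.append(token[start:])
    return parts


print(split_camel_case("CamelCase"))
print(split_camel_case("ВикиПедия"))
print(normalize_apostrophes("l\u2019eau"))
```

Because the split relies on str.islower and str.isupper rather than an ASCII character class, the Latin and Cyrillic examples are treated identically.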

In Q1 we will continue to focus on increasing recall (with decreasing zero-results rates and increasing number of results as proxy metrics), assuming that increased recall improves the odds of content discovery, especially on smaller language wikis. Note that this is an imperfect KPI for search relevancy overall.
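The zero-results rate proxy, and the "75% of relevant languages" target in the KR, can be sketched as follows. The function names and sample numbers are hypothetical, and "improved" is assumed here to mean a strictly lower ZRR after a change; the KR also allows an increase in the number of results, which this sketch does not cover.

```python
from typing import Dict, Optional, Sequence


def zero_results_rate(result_counts: Sequence[int]) -> Optional[float]:
    """Fraction of queries that returned no results, or None if no queries."""
    if not result_counts:
        return None
    zero = sum(1 for n in result_counts if n == 0)
    return zero / len(result_counts)


def share_of_languages_improved(
    before: Dict[str, float], after: Dict[str, float]
) -> Optional[float]:
    """Fraction of languages (present in both maps) whose ZRR went down."""
    languages = before.keys() & after.keys()
    if not languages:
        return None
    improved = sum(1 for lang in languages if after[lang] < before[lang])
    return improved / len(languages)


# Illustrative numbers only; these are not real measurements.
print(zero_results_rate([0, 3, 0, 12]))
before = {"aa": 0.3, "bb": 0.2, "cc": 0.1, "dd": 0.4}
after = {"aa": 0.2, "bb": 0.1, "cc": 0.1, "dd": 0.3}
print(share_of_languages_improved(before, after))
```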

Phab: https://phabricator.wikimedia.org/T219550

Milestones:

Smarter handling of acronyms for word_break_helper in language analyzers
Investigate applying aggressive_splitting everywhere, not just on English-language wikis
Repair multi-script tokens split by the ICU tokenizer
Standardize ASCII-folding/ICU-folding across analyzers
Stretch: Look into enabling hiragana/katakana mapping everywhere

Search SLOs

Description: To understand the quality of our search and invest the appropriate effort in operating it, we want clear SLOs for key aspects of the Search experience.

Doc: Search SLOs

Phab: https://phabricator.wikimedia.org/T335576

Milestones:

Required metrics are collected
Standard SLO dashboard is created

Things we plan

Split the WDQS Graph

KR: SDS3.1 Reduce the number of unsatisfied requests for Wikidata by 50%

Description: Splitting the WDQS graph has been identified as the highest-priority work for scaling WDQS, and thus for ensuring that we can continue to serve queries in the medium term. The first experiment will investigate splitting out Scholarly Articles. We want to understand the impact of such a split in terms of complexity for WDQS users, added technical complexity, and long-term stability.