Search Platform/Weekly Updates/2023-03-17
Summary
We are mostly on track to deliver what we planned for Q3.
- The Spark 3 upgrade might slip by a few weeks while we finalize the deployment, but the code should be ready before the end of the quarter.
- The Search Update Pipeline work was identified early as too ambitious for this quarter and will continue as planned in Q4.
- We are very close to having unpacked all of our Elasticsearch analysis chains, which will enable us to roll out future improvements to all languages at the same time.
What we've accomplished
Search Update Pipeline
- Slow progress on validating the use of Flink k8s Operators - https://phabricator.wikimedia.org/T328675
- Added code to monitor update lag of most CirrusSearch updates (new revision/page refresh/page deletion/file uploads).
Search Analysis
- Unpacked the Romanian and Sorani Elasticsearch analyzers. Notes on mediawiki.org: https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Unpacking_Notes#Romanian_and_Sorani_Notes_(T325091) - https://phabricator.wikimedia.org/T325091
- Finished "better_apostrophe" filter for Turkish - https://phabricator.wikimedia.org/T329762
Operations / SRE
- 10G networking enabled on all Elasticsearch servers, which should allow for faster maintenance and faster recovery if we lose servers.
Misc
- Starting discussion on Q4 planning. It will be mostly about the Search Update Pipeline.
- WDQS Scaling workshops continue. The current focus is on splitting the graph. We are starting to work out a more detailed plan.