Search Platform/Weekly Updates/2024-07-19
Summary
The WDQS Graph Split is behind schedule but moving forward nicely. We have a working updater deployed; it will need a couple of weeks to generate enough data, after which we will be able to expose the new SPARQL endpoints.
We are starting work on migrating private wikis to the new Search Update Pipeline.
We finally have a definite answer to how many languages are supported by our search engine, but that answer isn't simple: https://techblog.wikimedia.org/2024/07/18/how-many-languages-does-wikimedia-search-support/
What we've accomplished
Search Update Pipeline / Private Wikis
- Delved into EventBus / EventStreams to understand how they work and to ensure we can safely produce private streams without risk of producing private events into public streams (with guidance from Andrew Otto). Patches and configuration are mostly complete; deployment will take ~2 weeks for train deploys and verification. - https://phabricator.wikimedia.org/T346046
Search Update Pipeline
- Let wikidata updates bypass deduplication window to reduce lag - https://phabricator.wikimedia.org/T365831
- SUP: Retry 429 (rate limit) at HTTP client level - https://phabricator.wikimedia.org/T367691
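For readers unfamiliar with the pattern, retrying a 429 at the HTTP client level can be sketched roughly as below. This is an illustration only and not the SUP implementation: the names `send_with_retry` and `retry_delay`, the default delays, and the retry count are all hypothetical, and the request is abstracted as a callable so the sketch does not depend on any particular HTTP library.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base_delay: float = 0.5, max_delay: float = 30.0) -> float:
    """Return how many seconds to wait before the next attempt.

    Honors a Retry-After header given in seconds (exact-case key lookup,
    an assumption of this sketch); otherwise falls back to exponential backoff.
    """
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return min(max(float(retry_after), 0.0), max_delay)
        except ValueError:
            pass  # the HTTP-date form of Retry-After is not handled here
    return min(base_delay * (2 ** attempt), max_delay)


def send_with_retry(send: Callable[[], Response], max_retries: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() and retry only when the server answers 429.

    Any other status is returned to the caller unchanged. After
    max_retries retries the last 429 response is returned as-is.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    attempt = 0
    while True:
        response = send()
        status, headers, _body = response
        if status != 429 or attempt == max_retries:
            return response
        sleep(retry_delay(headers, attempt))
        attempt += 1
```

Handling the rate limit inside the client keeps a transient 429 from surfacing as a failure in the rest of the pipeline; whether the real change honors Retry-After or uses a fixed backoff is not stated in the task title.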
WDQS graph splitting
- The split graph updater is running in production
- Writing an Internal Federation Guide on wiki
- Query throttling and internal federation appear to work OK on experimental endpoints - https://phabricator.wikimedia.org/T361950
Misc
- New blog post on how many languages are supported by Search: https://techblog.wikimedia.org/2024/07/18/how-many-languages-does-wikimedia-search-support/
- Homogenise Jackson version in WDQS - https://phabricator.wikimedia.org/T365158
- Search and haswebstatement not working for EntitySchema - https://phabricator.wikimedia.org/T368010 / https://phabricator.wikimedia.org/T369495
- PHP Deprecated: Implicit conversion from float 75000.00000000001 to int loses precision - https://phabricator.wikimedia.org/T366589
- Completion suggester can promote a bad build (not actually fixed, but we have metrics in place in case the issue is reproduced) - https://phabricator.wikimedia.org/T363521