Search Platform/Weekly Updates/2024-06-28
Summary
Vacations and sabbaticals make this a slow week.
We still had to fix a few minor issues on the new Search Update Pipeline. Work on WDQS Graph Split is moving forward but it will still take a couple of weeks before we have endpoints publicly available.
What we've accomplished
Search Update Pipeline
- Working on a fix for https://phabricator.wikimedia.org/T331127 (support cross-index page moves)
- Looking into https://phabricator.wikimedia.org/T368010 (wikidata entity schema)
- Thinking about adding support for page rerender upsert to allow refreshing non-indexed pages
- Opened a patch (https://gitlab.wikimedia.org/repos/search-platform/cirrus-streaming-updater/-/merge_requests/143) to hopefully help with this kind of maintenance task
- Root cause of the problem is that the entity schema was erroneously tagged as non-content
WDQS graph splitting
- Discussion with Traffic to solve our problem of how internal federation can bypass throttling (https://phabricator.wikimedia.org/T361950)
- Seems like we can rely on a header (X-Client-Ip is standard enough that it should be safe to use it)
- Going back to LVS is not ideal, and they prefer that we use Envoy as a load balancer. Some research is needed to see if and how this is done elsewhere in our infra
- Started reviewing the graph split code in the updater, added some more tests to illustrate comments
Other work
- Dumps incident documentation for https://phabricator.wikimedia.org/T368098
- Mute helmfile apply notifications from cirrus-streaming-updater deploys - https://phabricator.wikimedia.org/T366346