Search Platform/Weekly Updates/2024-05-17
Summary
The consultation period for the WDQS Graph Split proposal is over. We received a few comments, but no request to fundamentally change our approach. A Signpost article by Blueraspberry is adding some visibility to our efforts (https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2024-05-16/Op-Ed).
What we've accomplished
Search Update Pipeline
- Some failures in Wikidata updates, linked to a bug in Flink. The bug is patched in our version, and the fix has been submitted (and merged) upstream - https://phabricator.wikimedia.org/T364837
- We started to shift updates to the new SUP for codfw.
WDQS graph splitting
- Signpost article published. Only a few comments on the talk page at the moment, and it seems that the issue is mostly understood and accepted. It has generated some additional feedback on the project pages - https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2024-05-16/Op-Ed
- Massive refactor of the data-reload cookbook - https://phabricator.wikimedia.org/T349069
- Work started on "Adapt the WDQS Streaming Updater to update multiple WDQS subgraphs" - https://phabricator.wikimedia.org/T361935
Search Metrics
- Metric collection code moved from Jupyter notebooks to our usual git repo - https://gitlab.wikimedia.org/repos/search-platform/discolytics/-/merge_requests/33
- Dashboard proposal at https://phabricator.wikimedia.org/T364600#9801594
Misc
- Peter is preparing a proposal for the Flink Forward conference - https://www.flink-forward.org/berlin-2024
- serviceops is moving to Calico for network policies. The change was applied to all k8s clusters running Flink jobs and went smoothly - https://phabricator.wikimedia.org/T287491