Search Platform/Weekly Updates/2024-10-04
Summary
We are starting work on publicly exposing the stream of RDF updates used by WDQS. This will make it easier for third-party WDQS deployments to stay up to date. More crucially for us, it will allow external organizations working on RDF backends to test ingesting WDQS data live, which is the groundwork needed before a potential migration away from Blazegraph to another RDF backend.
What we've accomplished
Search Update Pipeline - Private Wikis
- Follow-up on cleaning up how we use Kafka to publish private events - T374335 The SUP producer should ship private wiki update events to a separate stream
- Follow-up on how closed wikis interact with SUP - T374987 "Account autocreation denied for CirrusSearch Streaming Updater by ClosedWikiProvider"
Improve multilingual zero-results rate
- T332342 Standardize ASCII-folding/ICU-folding across analyzers. Full write-up on MediaWiki.
WDQS Expose RDF stream publicly
- Started work on T374919 Adapt the rdf-streaming-updater Flink job to use wikimedia-eventutilities-flink