Search Platform/Weekly Updates/2024-06-21
Summary
While we are making progress on a split updater and on the underlying infrastructure for new SPARQL endpoints, those endpoints will likely only be available in mid-July.
What we've accomplished
WDQS graph splitting
- Data import completed on wdqs2023
- Loaded 15B triples (we expect 10B triples per subgraph for a 20B triple full graph)
- The data reload alone took 7 days 15h; the host still needs to catch up on the initial lag
- Validates our automation process - https://phabricator.wikimedia.org/T364077 / https://phabricator.wikimedia.org/T349069
- Permission to create new Kafka topics for the split updater - https://phabricator.wikimedia.org/T367510
- Initial implementation of a split updater, working on a test suite - https://phabricator.wikimedia.org/T361935
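For a rough sense of scale, the reload figures reported above (about 15B triples in 7 days 15h) can be turned into an average load rate. This is a back-of-envelope sketch only: the function name is hypothetical, the triple count is the rounded figure from this update, and the result says nothing about peak or sustained rates during the import.

```python
def average_load_rate(triples: float, days: int, hours: int) -> float:
    """Return the average number of triples loaded per second.

    Assumes the whole duration was spent loading, which is a
    simplification: the real import may have had idle phases.
    """
    total_seconds = (days * 24 + hours) * 3600
    return triples / total_seconds


# Rounded figures from the wdqs2023 reload reported in this update.
rate_per_second = average_load_rate(15e9, days=7, hours=15)
rate_per_hour = rate_per_second * 3600

print(f"{rate_per_second:,.0f} triples/s, {rate_per_hour:,.0f} triples/h")
```

With these inputs the average works out to roughly 22.8k triples per second, or about 82M triples per hour.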
Misc
- Started to migrate Cirrus statsd metrics to statsv - https://phabricator.wikimedia.org/T359033