Search Platform/Weekly Updates/2023-05-05
Summary
Planning, budget, vacations, and sickness: not a lot of concrete output this week.
What we've accomplished
Search Analysis
- Blog post about the work on language analysis: https://diff.wikimedia.org/2023/04/28/language-harmony-and-unpacking-a-year-in-the-life-of-a-search-nerd/. Lots of positive feedback and more ideas for the future.
Operations / SRE
- Using a different library for WDQS data transfers https://phabricator.wikimedia.org/T321605
- Bootstrap WDQS on Bullseye https://phabricator.wikimedia.org/T331300
- Decommission query-preview.wikidata.org: this endpoint was exposed to allow our communities to test the WDQS streaming updater and is no longer needed. The full cleanup of any configuration related to this temporary service is now complete. https://phabricator.wikimedia.org/T333656
Misc
- Mike had conversations with Chris (ML team) about Search and ML. There are interesting ideas around an LLM-powered search experience, and collaboration opportunities around either infrastructure or features.
- Presentation from Pratt students on how editors use search. Nothing profoundly new, but a few notes on problems we already know:
  - People don't know about the search function / page, or don't know how to find it.
  - Advanced search is confusing, for example "search in:" is about namespaces, but some people don't even know what a namespace is. They expect a filter / facet function instead.
  - Search results on Wikipedia don't seem as relevant as Google's results for the same search.
- Capex requests (server refresh and new servers for capacity expansion) completed. It looks like we are over budget overall as an organization, so we need to cut some of the spending.