Search Platform/Weekly Updates/2024-11-08

From Wikitech

Ongoing work

Search Update Pipeline / Weighted tags

Language Stuff: Kuromoji

  • I've cleaned up the Japanese tokenization data as much as I can and written up instructions for evaluation by Japanese speakers. One volunteer has agreed to look over the samples and evaluate the output of both the Kuromoji and ICU tokenizers, and I've asked a couple more people to help.

What we've accomplished

Misc / Operations