Event Platform/History
The first 'event platform' at WMF was EventLogging. This system originally used ZeroMQ to transport messages between its various services, but was later improved to use Kafka. It was built to collect client-side performance measurements and click-tracking data, and was also adopted in our mobile apps. It used a Meta-Wiki (meta.wikimedia.org) namespace to store its schemas, which were referenced by stable revision ID to validate incoming events before producing them to the primary ZeroMQ/Kafka topics for consumption.
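For illustration, here is a minimal sketch (not the actual EventLogging implementation) of how an incoming event could be validated against a schema fetched from the Meta-Wiki Schema namespace by revision ID. The capsule field names, schema title, revision number, and URL parameters are assumptions for the example:

<syntaxhighlight lang="python">
import json
import urllib.request

import jsonschema  # pip install jsonschema

def fetch_schema(title: str, revision: int) -> dict:
    # Hypothetical helper: fetch a pinned schema revision as raw JSON from
    # the Meta-Wiki Schema namespace. Exact URL parameters are assumed here.
    url = (
        "https://meta.wikimedia.org/w/index.php"
        f"?title=Schema:{title}&oldid={revision}&action=raw"
    )
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())

# An incoming event "capsule" names its schema and the pinned revision ID.
incoming = {
    "schema": "SearchSatisfaction",   # illustrative schema title
    "revision": 123456,               # illustrative revision ID
    "event": {"action": "click", "position": 3},
}

schema = fetch_schema(incoming["schema"], incoming["revision"])
jsonschema.validate(instance=incoming["event"], schema=schema)  # raises on invalid events
# Only events that validate would then be produced to the ZeroMQ/Kafka topic.
</syntaxhighlight>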
The original ZeroMQ-based EventLogging eventually hit a scaling limit in 2014. It ran from a single server without automatic failover and did not support multiple eventlogging-processor servers running concurrently, sustaining around 1000 events per second.[1] That year, EventLogging intake was migrated to Kafka, allowing the eventlogging-processor instances to scale horizontally across multiple servers, with Kafka naturally distributing traffic and failing over as needed.
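As an illustration of how Kafka enables that horizontal scaling, here is a minimal sketch (not the actual eventlogging-processor code) using the kafka-python client; the topic name, group id, and broker address are assumptions for the example. Running the same script on several servers with the same group_id causes Kafka to split the topic's partitions among them and rebalance automatically when an instance dies:

<syntaxhighlight lang="python">
from kafka import KafkaConsumer  # pip install kafka-python

# Each processor-like instance joins the same consumer group. Kafka assigns
# each instance a subset of the topic's partitions, so adding servers spreads
# the load, and losing one triggers a rebalance (effective failover).
consumer = KafkaConsumer(
    "eventlogging-client-side",          # illustrative raw-events topic name
    group_id="eventlogging-processors",  # shared group id enables horizontal scaling
    bootstrap_servers=["kafka.example.internal:9092"],  # assumed broker address
)

for message in consumer:
    raw_event = message.value  # bytes of the raw, unvalidated event
    # ... decode, validate against its schema revision, and re-produce the
    # cleaned event to a downstream topic ...
</syntaxhighlight>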
EventLogging was designed for instrumenting features for telemetry: tracking interactions and recording measurements to give us insight into how features are used by real users. The system had no built-in support for responding to incoming events or taking actions based on them. Such larger pipelines were supported by using the python-eventlogging client and deploying a separate microservice based on a minimal template.
In 2015, an effort was made to extend the analytics focus of EventLogging to "from and to production" event-driven pipelines. This effort was dubbed "EventBus" and culminated in three new components: the EventBus extension for MediaWiki, the mediawiki/event-schemas Git repository, and eventlogging-service-eventbus. eventlogging-service-eventbus was an internal HTTP frontend that accepted POSTed events, validated them against more tightly controlled production schemas, and produced them to Kafka. EventBus was used to build the ChangeProp service. We originally intended to merge the analytics and production uses of EventLogging.
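A minimal sketch of what producing an event to that kind of internal HTTP intake service looks like; the host, port, endpoint path, and event fields below are assumptions for illustration, not the exact production configuration:

<syntaxhighlight lang="python">
import json
import urllib.request

# Illustrative event body; real production events follow schemas in the
# mediawiki/event-schemas repository.
event = {
    "meta": {
        "topic": "mediawiki.revision-create",  # assumed topic/stream name
        "domain": "en.wikipedia.org",
    },
    "database": "enwiki",
    "page_title": "Example",
}

req = urllib.request.Request(
    "http://eventbus.example.internal:8085/v1/events",  # assumed endpoint
    data=json.dumps([event]).encode("utf-8"),            # events posted as a JSON array
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    # The service validates each event against its production schema before
    # producing it to Kafka; a 2xx response means the events were accepted.
    print(resp.status)
</syntaxhighlight>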
In 2018, we started the "Modern Event Platform" program, which included EventBus's original analytics+production unification goal as well as rebuilding other parts of WMF's event processing stack using open source (non-homegrown) components where possible. The EventLogging python codebase was specific to WMF and MediaWiki, which made this unification more difficult, so it was decided to build a new, more generic and extensible JSONSchema event service, eventually named EventGate.
In 2019, EventGate, along with other Modern Event Platform components, replaced eventlogging-service-eventbus, and is intended to eventually replace the 'analytics' deployment of EventLogging services (e.g. eventlogging-processor).
Today, events logged via EventLogging no longer go to a MySQL database; instead, data analysts can find the data in Kafka and Hadoop, which allows a much greater volume of events to be stored.
In 2022, the "Event Platform Value Stream" working group was created and tasked with working on the Stream Connectors and Stream Processing components. Evolution of these components is being driven by work to improve the ability to externalize the current state of MediaWiki pages using event streams.
In 2025, after a long migration, the legacy EventLogging backend system was finally decommissioned.
- ↑ Event Infrastructure at WMF (2007-2018) by Andrew Otto (WMF-restricted)