Metrics Platform/Client/Implementations
This page provides an overview of the existing Event Platform Client (EPC) implementations and describes the outstanding differences among them. Our goal is for the various implementations to match one another as closely as possible, while allowing for differences imposed by programming language constructs, project-specific design patterns, and similar constraints.
Note: At present there is a 1:1 relationship between programming language and client: effectively, the JavaScript client is the MediaWiki frontend client; the PHP client is the MediaWiki backend client; the Java client is the Android app client; and the Swift client is the iOS app client. In principle, however, the libraries should be language-specific but platform-agnostic, and should be able to be shared by multiple clients implemented in the same language.
JavaScript
The EPC JavaScript implementation is provided in the EventLogging extension. Specifically, it lives in the core.js file shipped to clients as part of the ResourceLoader module ext.eventLogging, and its methods are exposed via mw.eventLog.
Public methods
Note: Some methods intended for internal use are currently defined on the core object and thereby exported via mw.eventLog. These include streamInSample() and all methods under mw.eventLog.id and mw.eventLog.storage. These should not be invoked by external callers and will be removed from the de facto public interface in a future refactor.
mw.eventLog.submit( streamName, eventData )
Evaluates the stream and submitted data for submission to the Event Platform intake service according to any sampling and filtering rules specified in the stream configuration. If the event passes all sampling and filtering rules, it is supplemented with additional metadata and submitted to the Event Platform intake service.
Params
- streamName (string): The destination stream name. Unless stream configuration is globally disabled, streamName must correspond to a stream configured in $wgEventStreams, or the request will fail.
- eventData (object): An object containing the event data.
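The evaluate, supplement, and submit flow described above can be sketched roughly as follows. This is a minimal illustration and not the code in core.js: the streamConfigs shape, the sampleRate field, the inSample and addMetadata helpers, the meta.stream and dt fields, and the send callback are all hypothetical names chosen for the example.

```javascript
// Sketch only: every name here is an assumption made for illustration,
// not the actual EventLogging implementation.

// Hypothetical stream configuration, keyed by stream name.
const streamConfigs = {
  'test.stream': { sampleRate: 1.0 },
  'test.disabled': { sampleRate: 0.0 }
};

// Hypothetical sampling check. `rand` is injected so the result is testable.
function inSample(config, rand) {
  const rate = config.sampleRate === undefined ? 1.0 : config.sampleRate;
  return rand < rate;
}

// Hypothetical metadata step: copies the event and adds a stream name and timestamp.
function addMetadata(streamName, eventData, now) {
  return Object.assign({}, eventData, {
    meta: { stream: streamName },
    dt: now.toISOString()
  });
}

// Evaluates, supplements, and hands off an event.
// Returns true if the event was passed to `send`, false otherwise.
function submit(streamName, eventData, send, rand = Math.random(), now = new Date()) {
  const config = streamConfigs[streamName];
  if (!config) {
    return false; // unconfigured stream: the submission fails
  }
  if (!inSample(config, rand)) {
    return false; // sampled out
  }
  send(addMetadata(streamName, eventData, now));
  return true;
}
```

In MediaWiki itself a caller would only invoke mw.eventLog.submit( streamName, eventData ); the extra send, rand, and now parameters exist here solely to keep the sketch self-contained and deterministic.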
PHP
The EPC PHP implementation is also part of the EventLogging extension, where it can be found in includes/EventLogging.php along with helper methods in includes/EventLoggingHelper.php.
No sampling support is provided in the PHP client at present. Sampling support can be added if and when there is a use case for it and an appropriate sampling unit defined.
Public methods
EventLogging::submit($streamName, $event, $logger = null)
Evaluates the stream and submitted data for submission to the Event Platform intake service according to any sampling and filtering rules specified in the stream configuration. If the event passes all sampling and filtering rules, it is supplemented with additional metadata and submitted to the Event Platform intake service.
After filtering and supplementing the event, the implementation delegates to EventBus::send for event submission.
Params
- $streamName (string): The destination stream name. Unless stream configuration is globally disabled, $streamName must correspond to a stream configured in $wgEventStreams, or the request will fail.
- $event (array): An associative (string-keyed) array containing the event data.
- $logger (?Psr\Log\LoggerInterface): An optional Logger instance. This is intended only for automated testing, and should not be used by production callers.
Java
The Java EPC library is currently implemented in the org.wikipedia.analytics.eventplatform package in the Wikipedia for Android app repository (see app/src/main/java/org/wikipedia/analytics/eventplatform/). The main implementation is in the EventPlatformClient class, and the remaining classes are largely POJOs used for serializing to and deserializing from JSON strings using Gson and Retrofit.
Stream configurations are fetched from the MediaWiki API via Meta-Wiki on app startup and stored in SharedPreferences for use in future sessions. There is currently no attempt to retry fetching the stream configs in case of failure, and events that occur before the stream configs are fetched are not retained for submission once the configs become available. Outgoing events are enqueued in an OutputBuffer and submitted in batches every 30 seconds. Additionally, if the queue exceeds 128 events in size, all queued events are immediately sent to the Event Platform intake service.
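The batching behavior described above can be sketched as follows. The sketch is written in JavaScript for illustration rather than in Java, and it is not the Android app's code: the sendBatch callback, the constructor parameters, and the choice to start the 30-second timer when the first event is queued are all assumptions.

```javascript
// Sketch only: class, method, and parameter names are assumptions.
class OutputBuffer {
  constructor(sendBatch, maxSize = 128, intervalMs = 30000) {
    this.sendBatch = sendBatch; // hypothetical callback that posts a batch of events
    this.maxSize = maxSize;     // flush immediately once the queue exceeds this size
    this.intervalMs = intervalMs;
    this.queue = [];
    this.timer = null;
  }

  // Enqueues one event. Flushes at once if the queue grows past maxSize;
  // otherwise makes sure a timed flush is scheduled.
  enqueue(event) {
    this.queue.push(event);
    if (this.queue.length > this.maxSize) {
      this.flush();
    } else if (this.timer === null) {
      this.timer = setTimeout(() => this.flush(), this.intervalMs);
    }
  }

  // Sends everything currently queued as a single batch and cancels any pending timer.
  flush() {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.queue.length === 0) {
      return;
    }
    const batch = this.queue;
    this.queue = [];
    this.sendBatch(batch);
  }
}
```

Swapping the queue for a fresh array before calling sendBatch means events enqueued during the send are kept for the next batch instead of being lost.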
Public methods
EventPlatformClient.submit(Event event)
Evaluates the stream and submitted data for submission to the Event Platform intake service according to any sampling and filtering rules specified in the stream configuration. If the event passes all sampling and filtering rules, it is supplemented with additional metadata and enqueued for submission to the Event Platform intake service.
Params
- event (Event): The event data. The Event class is intended as a base class containing all fields that are required of all app analytics events. It allows modeling event data as Gson POJOs and can be subclassed for specific event types (e.g., UserContributionEvent). The stream name and schema are passed to the constructor.
Swift
The Swift EPC library is contained in the Event Platform group within the WMF Framework module of the Wikipedia app for iOS. The main client functionality is implemented in the EventPlatformClient class, along with the StorageManager and SamplingController support classes defined in separate files. Additional files in the group contain Core Data model definitions for event storage.
Stream configurations are fetched from the MediaWiki API via Meta-Wiki on app startup. If the stream configuration request fails, it is retried up to 10 times, with an increasing delay period between retries. Stream configurations are not held in persistent storage for subsequent launches.
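The retry behavior described above can be sketched as follows, in JavaScript for illustration rather than in Swift. The source states only that the delay increases between retries, so the linear formula in retryDelayMs is an assumption, as are the fetchConfigs and sleep callbacks and every other name in the sketch.

```javascript
// Sketch only: the delay formula and all names are assumptions.
const MAX_RETRIES = 10;

// Hypothetical backoff: the delay grows linearly with the retry number.
function retryDelayMs(retryNumber, baseMs = 1000) {
  return baseMs * retryNumber;
}

// fetchConfigs: async function that resolves with the stream configs or rejects.
// sleep: async function that waits for the given number of milliseconds.
// Makes one initial attempt plus up to MAX_RETRIES retries, then rethrows the last error.
async function fetchWithRetry(fetchConfigs, sleep, baseMs = 1000) {
  let lastError;
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    if (attempt > 0) {
      await sleep(retryDelayMs(attempt, baseMs));
    }
    try {
      return await fetchConfigs();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

Passing sleep in as a parameter keeps the sketch independent of any particular timer API; a caller could supply something like (ms) => new Promise((resolve) => setTimeout(resolve, ms)).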
Before stream configurations are fetched, any events received by the client are stored unconditionally in an InputBuffer with a maximum size of 128 events. If the input buffer reaches its maximum size, the oldest events are removed as needed to make room for new events. After stream configurations are loaded, events in the InputBuffer are evaluated and conditionally moved to persistent storage in Core Data, where they are held for eventual submission to the Event Platform intake service. Subsequently, the InputBuffer is no longer used, and all events received are evaluated and conditionally held in Core Data for submission.
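A bounded buffer with this drop-oldest behavior can be sketched as follows, in JavaScript for illustration rather than in Swift. The class shape and the accept and store callbacks (standing in for the sampling and filtering check and for Core Data storage) are assumptions, not the iOS app's code.

```javascript
// Sketch only: class, method, and callback names are assumptions.
class InputBuffer {
  constructor(maxSize = 128) {
    this.maxSize = maxSize;
    this.events = [];
  }

  // Stores an event unconditionally, evicting the oldest entries when full.
  push(event) {
    this.events.push(event);
    while (this.events.length > this.maxSize) {
      this.events.shift();
    }
  }

  // Called once stream configs are loaded: each buffered event is checked with
  // `accept`, and those that pass are handed to `store`. The buffer is emptied.
  drain(accept, store) {
    const pending = this.events;
    this.events = [];
    for (const event of pending) {
      if (accept(event)) {
        store(event);
      }
    }
  }
}
```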
Every 30 seconds, stale entries are pruned from the event storage table, and an attempt is made to submit all remaining entries pending submission. A stale entry is defined as one that has either been successfully submitted or has existed in the storage table for more than 30 days.
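The staleness rule above can be expressed as a small predicate, sketched here in JavaScript for illustration rather than in Swift. The entry shape (a submitted flag and a recordedAtMs timestamp) and the function names are assumptions; the actual Core Data model is not described on this page.

```javascript
// Sketch only: the entry fields and function names are assumptions.
const MAX_AGE_MS = 30 * 24 * 60 * 60 * 1000; // 30 days in milliseconds

// An entry is stale if it has already been submitted
// or has been in storage for more than 30 days.
function isStale(entry, nowMs) {
  return entry.submitted || nowMs - entry.recordedAtMs > MAX_AGE_MS;
}

// Returns the entries that remain in storage and are still pending submission.
function pruneStale(entries, nowMs) {
  return entries.filter((entry) => !isStale(entry, nowMs));
}
```

A periodic task would call pruneStale first and then attempt to submit whatever it returns.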
Public methods
EventPlatformClient.submit<E: EventInterface>(stream: Stream, event: E, domain: String? = nil)
Evaluates the stream and submitted data for submission to the Event Platform intake service according to any sampling and filtering rules specified in the stream configuration. If the event passes all sampling and filtering rules, it is supplemented with additional metadata and enqueued for submission to the Event Platform intake service.
Params
- stream (Stream): The destination stream name. Stream is an enum defined in the EventPlatformClient class that contains the expected destination stream names as values.
- event (E: EventInterface): The event data. EventInterface is a protocol (interface) requiring that the event data contain a schema field and implement Codable.
- domain (String?): An optional domain string, intended to be used where the wiki domain for the current app language does not apply to the event being submitted.
See also
- https://github.com/linehan/wmf-epc (early prototype implementations)