
Metrics Platform/Implementations

From Wikitech

This page provides an overview of the Metrics Platform Client (MPC) API and behaviors, its implementations, and the key differences between them. MPC implementations can be found in data-engineering/metrics-platform.

Quickstart

To code an instrument, submit Metrics Platform events to an event stream by calling submitClick or submitInteraction. See these example patches: gerrit:1059293, gerrit:1029238.

Setup

To set up a local development environment for writing instrument code, follow the setup guide for MediaWiki and Metrics Platform. For more context and tooling recommendations, see the introduction to Metrics Platform development.

Submit a click event

submitClick provides a simplified method to submit click events that use the Metrics Platform base schemas. To specify a custom value for action or to use a custom schema, use submitInteraction.

Here's an example of an instrument in JavaScript that uses mw.eventLog.submitClick to submit an event when a user clicks on an interwiki link.

// 'a.extiw' will match anchors that have an extiw class. extiw is used for interwiki links.
$( '#content' ).on( 'click', 'a.extiw', function ( jqEvent ) {
    var link = jqEvent.target;
    var linkClickInteractionData = {
        action_source: link.href,
        action_context: link.title,
    };

    var streamName = 'mediawiki.product_metrics.example_click';
    mw.eventLog.submitClick( streamName, linkClickInteractionData );
});

The resulting event:

  • by default, includes action: click
  • includes optional interaction data (action_source and action_context)
  • by default, is validated against the latest Metrics Platform base schema for web
  • is published to the specified event stream (in this case, mediawiki.product_metrics.example_click)

In PHP, you can use EventLogging::getMetricsPlatformClient()->submitClick() to send a server-side event:

$interactionData = [
    'action_source' => 'value for action source',
    // ... Other interaction data fields
];

$streamName = 'mediawiki.product_metrics.example_click';
EventLogging::getMetricsPlatformClient()->submitClick( $streamName, $interactionData );

In Java, instantiate a MetricsClient object before invoking MetricsClient::submitClick:

// These parameters are derived from the client when the MetricsClient object is instantiated.
// Dynamic arguments can also be passed in at the time of event submission.
ClientData clientData = new ClientData(
   agentData,
   pageData,
   mediawikiData,
   performerData
);
InteractionData interactionData = new InteractionData(
   "action_value",
   "action_subtype_value",
   "action_source_value",
   "action_context_value"
);

metricsClient.submitClick(clientData, interactionData);

Submit an interaction event

An interaction event represents a basic interaction with a target, or the occurrence of some event. For example, a user hovers over a UI element or an app notifies the server of its current state.

Here's an example of an instrument in JavaScript that uses mw.eventLog.submitInteraction to submit an event when a user hovers over an interwiki link.

// 'a.extiw' will match anchors that have an extiw class. extiw is used for interwiki links.
$( '#content' ).on( 'mouseover', 'a.extiw', function ( jqEvent ) {
    var link = jqEvent.target;
    var linkHoverInteractionData = {
        action_source: link.href,
        action_context: link.title,
    };

    var action = 'hover';
    var streamName = 'mediawiki.product_metrics.example_hover';
    var schemaId = '/analytics/product_metrics/web/base/1.3.0';
    mw.eventLog.submitInteraction( streamName, schemaId, action, linkHoverInteractionData );
} );

The resulting event:

  • includes the specified value of action
  • includes optional interaction data you have provided in linkHoverInteractionData (action_source and action_context)
  • is validated against the specified schema (in this case, the Metrics Platform base schema for web)
  • is published to the specified event stream (in this case, mediawiki.product_metrics.example_hover)

In PHP, you can use EventLogging::getMetricsPlatformClient()->submitInteraction() to send a server-side event:

$interactionData = [
    'action_source' => 'value for action source',
    // ... Other interaction data fields
];

$streamName = 'mediawiki.product_metrics.example_hover';
$action = 'hover';
$schemaID = '/analytics/product_metrics/web/base/1.3.0';

EventLogging::getMetricsPlatformClient()->submitInteraction( $streamName, $interactionData, $action, $schemaID );

In Java, instantiate a MetricsClient object before invoking MetricsClient::submitInteraction:

Map<String, Object> customData = new HashMap<String, Object>();
customData.put("font_size", "small");
customData.put("is_full_width", true);
customData.put("screen_size", 1080);

metricsClient.submitInteraction(
   "custom_schema_id",
   "some_prefix.some_event_name",
   clientData,
   interactionData,
   customData
);

Language-specific notes and key differences

At present there is a 1:1 relationship between programming language and client: effectively, the JavaScript MPC is the MediaWiki frontend client; the PHP client is the MediaWiki backend client; the Java client is the Android app client; and the Swift client is the iOS app client. In principle, however, the libraries should be language-specific but platform-agnostic, and should be able to be shared by multiple clients implemented in the same language. Our goal is for all MPCs to provide the same API and behave the same way, and to be as clear as possible when they do not.

JavaScript

In the context of MediaWiki frontend development, the JavaScript MPC is provided by the EventLogging extension. It is delivered as part of the ext.eventLogging ResourceLoader module and its methods are exposed via mw.eventLog.

The JavaScript MPC and stream configurations are both delivered as part of the module and the stream configurations are not fetched again until the user navigates to another page. This minimizes the number of HTTP requests per pageview.

The EventLogging extension maintains a list of streams to be included in the module, $wgEventLoggingStreamNames, which can be used to minimize the size of the module. When $wgEventLoggingStreamNames is falsy the JavaScript MPC will not validate whether the destination stream is configured before submitting the event to the destination event service.
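The stream check described above can be pictured with a minimal sketch. It is not taken from the EventLogging source: the function name canSubmitToStream is hypothetical, and it assumes that the stream configurations reach the client either as a falsy value (no stream list configured) or as an object keyed by stream name.

```javascript
// Hypothetical sketch, not the EventLogging implementation.
// streamConfigs is assumed to be either falsy (no stream list configured)
// or an object whose keys are the configured stream names.
function canSubmitToStream( streamName, streamConfigs ) {
    if ( !streamConfigs ) {
        // No stream list is available, so the check is skipped entirely.
        return true;
    }
    // Otherwise, only streams present in the delivered configuration pass.
    return Object.prototype.hasOwnProperty.call( streamConfigs, streamName );
}
```

Under these assumptions, an event for an unlisted stream is only rejected when a stream list is actually present.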

PHP

The PHP MPC is also part of the EventLogging extension. Its methods are exposed via MediaWiki\Extension\EventLogging\EventLogging.

Like the JavaScript MPC, when $wgEventLoggingStreamNames is falsy the PHP MPC will not validate whether the destination stream is configured before submitting the event to the destination event service.

The PHP MPC does not support sampling by pageview ID or session ID because they are not known to the server (i.e., the PHP layer of MediaWiki). Sampling support can be added if and when there is a use case for it and an appropriate sampling unit defined.

Java

The Java MPC is a standalone library that requires the instantiation of a MetricsClient object, which accepts an instance of ClientMetadata in its constructor. The ClientMetadata instance must provide app-specific getters for the values of contextual attributes. As described in the API documentation, these values are submitted along with whatever custom data is passed into submitMetricsEvent. The build method of the nested Builder class of MetricsClient is responsible for instantiating an EventProcessor and a MetricsClient, and for fetching stream configurations.

Stream configurations are fetched periodically, at the interval defined by streamConfigFetchInterval in the MetricsClient class. MetricsClient also periodically submits queued events to the destination event intake service. If stream configurations are not yet available, events are held temporarily in the event queue (input buffer) as space allows. The queue's capacity is set in the definition of eventQueue in MetricsClient and can be overridden using the eventQueueCapacity method.

If the event queue is full, events are dropped from the queue and a warning is logged (by the submit method of MetricsClient). The metrics client continues to send queued events to the destination event service at regular intervals (defined by sendEventsInterval). Successfully submitted events are removed from the event queue. Failed event submissions produce an error, and the affected events remain in the event queue to be retried on the next submission attempt.
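The retry behavior described above can be illustrated with a small sketch. It is written in JavaScript for consistency with the other examples on this page and is not the Java client's code: sendQueuedEvents and trySend are hypothetical names, and trySend is assumed to throw when a submission fails.

```javascript
// Illustrative sketch only, not the Java MetricsClient implementation.
// trySend is a hypothetical callback that throws when submission fails.
function sendQueuedEvents( queue, trySend ) {
    var remaining = [];
    queue.forEach( function ( event ) {
        try {
            // A successful submission means the event leaves the queue.
            trySend( event );
        } catch ( error ) {
            // A failed submission keeps the event for the next attempt.
            remaining.push( event );
        }
    } );
    return remaining;
}
```

Calling this function at each send interval with the previous return value as the new queue gives the "retry on the next attempt" behavior.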

Swift

The Swift MPC library is contained in the Event Platform group within the WMF Framework module of the Wikipedia app for iOS. The main client functionality is implemented in EventPlatformClient, along with the StorageManager and SamplingController support classes defined in separate files. Additional files in the group contain Core Data model definitions for event storage.

Stream configurations are fetched from the MediaWiki API via Meta-Wiki on app startup. If the stream configuration request fails, it is retried up to 10 times, with an increasing delay period between retries. Stream configurations are not held in persistent storage for subsequent launches.
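A retry loop with an increasing delay can be sketched as follows. This is a JavaScript illustration, not the Swift client's code. The source does not state the delay schedule, so the doubling in retryDelays is an assumption, and retryDelays, fetchWithRetries, fetchOnce and sleep are all hypothetical names. fetchOnce and sleep are assumed to return promises.

```javascript
// Assumed schedule: each delay is double the previous one.
// The real client's schedule is not documented here.
function retryDelays( maxRetries, baseDelayMs ) {
    var delays = [];
    for ( var i = 0; i < maxRetries; i++ ) {
        delays.push( baseDelayMs * Math.pow( 2, i ) );
    }
    return delays;
}

// Calls fetchOnce, and on failure waits for the next delay before retrying.
// Gives up (rejects with the last error) once all delays are used.
function fetchWithRetries( fetchOnce, delays, sleep ) {
    function attempt( retryIndex ) {
        return fetchOnce().catch( function ( error ) {
            if ( retryIndex >= delays.length ) {
                throw error;
            }
            return sleep( delays[ retryIndex ] ).then( function () {
                return attempt( retryIndex + 1 );
            } );
        } );
    }
    return attempt( 0 );
}
```

With retryDelays( 10, baseDelayMs ), this makes one initial request followed by up to 10 retries, matching the limit described above.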

Before stream configurations are fetched, any events received by the client are stored unconditionally in an InputBuffer with a maximum size of 128 events. If the input buffer reaches its maximum size, the oldest events are removed as needed to make room for new events. After stream configurations are loaded, events in the InputBuffer are evaluated and conditionally moved to persistent storage in Core Data, where they are held for eventual submission to the Event Platform intake service. Subsequently, the InputBuffer is no longer used, and all events received are evaluated and conditionally held in Core Data for submission.
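The buffering behavior described above amounts to a fixed-capacity buffer that evicts its oldest entry when full and is drained once stream configurations arrive. The following is a minimal JavaScript sketch of that idea, not the Swift InputBuffer itself. InputBufferSketch, drain and the shouldKeep predicate are hypothetical names.

```javascript
// Illustrative sketch only, not the Swift InputBuffer implementation.
function InputBufferSketch( capacity ) {
    this.capacity = capacity;
    this.items = [];
}

// Adds an event, removing the oldest one first if the buffer is full.
InputBufferSketch.prototype.push = function ( event ) {
    if ( this.items.length >= this.capacity ) {
        this.items.shift();
    }
    this.items.push( event );
};

// Empties the buffer and returns only the events accepted by shouldKeep,
// which stands in for the sampling and filtering evaluation.
InputBufferSketch.prototype.drain = function ( shouldKeep ) {
    var kept = this.items.filter( shouldKeep );
    this.items = [];
    return kept;
};
```

In this sketch, new InputBufferSketch( 128 ) would mirror the maximum size given above, and the result of drain is what would be moved to persistent storage.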

Every 30 seconds, stale entries are pruned from the event storage table, and an attempt is made to submit all remaining entries for which submission is pending. A stale entry is defined as one that has either been successfully submitted or has existed in the storage table for more than 30 days.
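The staleness rule above can be expressed as a simple predicate. This is a JavaScript sketch for illustration, not the Swift StorageManager code. The entry fields submitted and createdAtMs, and the function names, are assumptions.

```javascript
var THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// An entry is stale if it was already submitted or is older than 30 days.
// The field names submitted and createdAtMs are hypothetical.
function isStale( entry, nowMs ) {
    return entry.submitted || ( nowMs - entry.createdAtMs ) > THIRTY_DAYS_MS;
}

// Returns the entries that survive pruning, i.e. those still pending.
function pruneStaleEntries( entries, nowMs ) {
    return entries.filter( function ( entry ) {
        return !isStale( entry, nowMs );
    } );
}
```

Each 30-second pass would then attempt to submit whatever pruneStaleEntries returns.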

The database storage model is a deviation from the Event Platform Client specification and was carried over from the previous analytics client implementation at the iOS app team's request. There is no plan to update other clients to match this behavior.

Reference

EventLogging::submit (PHP)

After filtering and supplementing the event, the implementation delegates to EventBus::send for event submission.

Use submitInteraction and submitClick when creating new instrumentation.

Parameters:

  • streamName (string): The destination stream name. Unless stream configuration is globally disabled, streamName must correspond to a stream configured in $wgEventStreams, or the request will fail.
  • event (array): An associative (string-keyed) array containing the event data.
  • logger (?Psr\Log\LoggerInterface): An optional Logger instance. This is intended only for automated testing, and should not be used by production callers.

EventPlatformClient.submit (Java)

Evaluates the stream and submitted data for submission to the Event Platform intake service according to any sampling and filtering rules specified in the stream configuration. If the event passes all sampling and filtering rules, it is supplemented with additional metadata and enqueued for submission to the Event Platform intake service.

Use submitInteraction and submitClick when creating new instrumentation.

Parameters:

  • event (Event): The event data. The Event class is intended as a base class containing all fields that are required of all app analytics events. It allows modeling event data as Gson POJOs and can be subclassed for specific event types (e.g., UserContributionEvent). The stream name and schema are passed in to the constructor.

EventPlatformClient.submit (Swift)

Evaluates the stream and submitted data for submission to the Event Platform intake service according to any sampling and filtering rules specified in the stream configuration. If the event passes all sampling and filtering rules, it is supplemented with additional metadata and enqueued for submission to the Event Platform intake service.

Use submitInteraction and submitClick when creating new instrumentation.

Parameters:

  • stream (Stream): The destination stream name. Stream is an enum defined in the EventPlatformClient class that contains the expected destination stream names as values.
  • event (E: EventInterface): The event data. EventInterface is a protocol (interface) requiring that the event data contain a schema field and implement Codable.
  • domain (String?): An optional domain string, intended to be used where the wiki domain for the current app language does not apply to the event being submitted.

Types:

  • Stream: Enum defined in EventPlatformClient that contains allowed destination stream names as values
  • EventInterface: Protocol requiring that the event data has a schema property and implements Codable

mw.eventLog.submit (JavaScript)

Evaluates the stream and submitted data for submission to the Event Platform intake service according to any sampling rules specified in the stream configuration. If the event passes those sampling rules, then it is supplemented with additional metadata and submitted to the intake service.

Use submitInteraction and submitClick when creating new instrumentation.

Parameters:

  • streamName (string): The destination stream name. Unless stream configuration is globally disabled, streamName must correspond to a stream configured in $wgEventStreams, or the request will fail.
  • eventData (object): An object containing event data relevant to the occurrence and specific to the instrumentation (for example: the UI element that was clicked)

mw.eventLog.submitClick (JavaScript)

Submits a click event to a stream and validates it against the core interaction base schema.

Parameters:

  • streamName (string): Name of the event stream where the event should be submitted
  • (optional) interactionData (key/value pairs): Additional event data. Each value must be a string. Available keys:
    • action_subtype
    • action_source
    • action_context
    • element_id
    • element_friendly_name
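
Because each interactionData value must be a string and only the keys listed above are available, it can help to normalize the data before submitting it. The helper below is a hypothetical sketch and is not part of mw.eventLog. It keeps only the listed keys whose values are strings.

```javascript
// Keys listed for interactionData in the submitClick reference above.
var INTERACTION_DATA_KEYS = [
    'action_subtype',
    'action_source',
    'action_context',
    'element_id',
    'element_friendly_name'
];

// Hypothetical helper, not part of the MPC API: returns a copy of candidate
// containing only the listed keys whose values are strings.
function pickInteractionData( candidate ) {
    var result = {};
    INTERACTION_DATA_KEYS.forEach( function ( key ) {
        if ( typeof candidate[ key ] === 'string' ) {
            result[ key ] = candidate[ key ];
        }
    } );
    return result;
}
```

In a MediaWiki page context, the returned object could be passed as the second argument to mw.eventLog.submitClick.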

mw.eventLog.submitInteraction (JavaScript)

Submits an interaction event to a stream.

Parameters:

  • streamName (string): Name of the event stream where the event should be submitted
  • schemaId (string): Name of the schema to validate the event against
  • action (string): Name of the interaction
  • (optional) interactionData (key/value pairs): Additional event data. Each value must be a string. Available keys:
    • action_subtype
    • action_source
    • action_context

See also