Metrics Platform/Create an instrument
This guide helps you create a Metrics Platform instrument from start to finish, including:
- Creating a measurement plan
- Creating an instrumentation specification
- Following the data collection guidelines
- Configuring the stream
- Coding the instrument
- Documenting the instrument
- Reviewing your data
- Decommissioning the instrument
Create a measurement plan
Instruments collect data about user interactions so that we can answer questions about product experiences. Before you can start collecting data, create a measurement plan (template) that documents what data you plan to collect, why, and how you plan to analyze the data.
You can write your measurement plan in a document or on a Phabricator task, depending on the scale of the project. For examples of measurement plans, see the folder on Google Drive.
Create an instrumentation specification
Once you have a measurement plan, the next step is to create an instrumentation specification (template). The instrumentation spec defines all the data you'll collect for your instrument. The spec is also a useful tool for engineers to ensure that all events are being produced and received correctly. For a template and examples of instrumentation specs, see the folder on Google Drive.
Specifying event data
Start by mapping each metric defined in your measurement plan to one or more events. An event is a data object that represents a user interaction happening at a definite time.
For example:
Metric | Events |
---|---|
Proportion of pages that are read, measured by users scrolling at least halfway down the article | User loads page; User scrolls down 50% |
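To make the mapping concrete, here is a minimal Python sketch of how the two events could be combined into the metric during analysis. It is an illustration, not part of the Metrics Platform: it assumes each event is a plain dict that uses the action, action_subtype, action_context, and performer_pageview_id properties described in this guide, and the "page_load" action value and the sample data are hypothetical.

```python
def proportion_read(events):
    """Share of loaded pageviews in which the user scrolled down at least 50%."""
    loaded = set()
    read = set()
    for event in events:
        pageview = event["performer_pageview_id"]
        if event["action"] == "page_load":  # hypothetical action value
            loaded.add(pageview)
        elif (
            event["action"] == "scroll"
            and event.get("action_subtype") == "down"
            and float(event.get("action_context", 0)) >= 0.5
        ):
            read.add(pageview)
    if not loaded:
        return 0.0
    # Only count scrolls that belong to a pageview we also saw load.
    return len(read & loaded) / len(loaded)


# Made-up sample data: four pageviews, two of which reach the halfway point.
sample_events = [
    {"action": "page_load", "performer_pageview_id": "pv1"},
    {"action": "scroll", "action_subtype": "down", "action_context": "0.5",
     "performer_pageview_id": "pv1"},
    {"action": "page_load", "performer_pageview_id": "pv2"},
    {"action": "page_load", "performer_pageview_id": "pv3"},
    {"action": "scroll", "action_subtype": "up", "action_context": "0.2",
     "performer_pageview_id": "pv3"},
    {"action": "page_load", "performer_pageview_id": "pv4"},
    {"action": "scroll", "action_subtype": "down", "action_context": "0.75",
     "performer_pageview_id": "pv4"},
]

print(proportion_read(sample_events))
```

Writing the calculation out like this before launch is a quick way to confirm that the events in your spec carry enough data to compute each metric in the measurement plan.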
Once you have a set of events, define the data that each event should send based on the standard set of properties supported by the Metrics Platform. Read the following sections to learn about what event data is available and how to use it.
Here's an example of an event data specification for a "User scrolls down 50%" event:
Interaction data:
action
: set to "scroll" in the instrument
action_subtype
: set to either "up" or "down" in the instrument
action_context
: percent of the page where the user completes scrolling (example: "0.5" for 50%), set in the instrument
page_content_language
: populated by the Metrics Platform
page_title
: populated by the Metrics Platform
mediawiki_skin
: populated by the Metrics Platform
performer_pageview_id
: populated by the Metrics Platform
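A specification like this can also be written down as data and used to check received events. The following Python sketch is one possible approach under stated assumptions: the SCROLL_EVENT_SPEC dict and the missing_properties helper are hypothetical names, and the flat event layout is a simplification that may not match the shape of events produced by a real schema.

```python
# Each property in the spec, mapped to whoever is responsible for setting it.
# Hypothetical structure, mirroring the example specification above.
SCROLL_EVENT_SPEC = {
    "action": "instrument",
    "action_subtype": "instrument",
    "action_context": "instrument",
    "page_content_language": "platform",
    "page_title": "platform",
    "mediawiki_skin": "platform",
    "performer_pageview_id": "platform",
}


def missing_properties(event, spec):
    """Return the spec properties absent from an event, grouped by who sets them."""
    missing = {}
    for name, source in spec.items():
        if name not in event:
            missing.setdefault(source, []).append(name)
    return missing


# Made-up received event that lacks two properties.
received = {
    "action": "scroll",
    "action_subtype": "down",
    "page_content_language": "en",
    "mediawiki_skin": "vector-2022",
    "performer_pageview_id": "0123456789abcdef",
}

print(missing_properties(received, SCROLL_EVENT_SPEC))
```

Grouping the gaps by source hints at where to look: a missing "instrument" property suggests a bug in the instrument code, while a missing "platform" property suggests that the attribute is not enabled for the stream.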
The action property
Each instrument is required to set the action property when submitting an event. For click interactions, the value of action should be set to "click". For other types of interactions, you can choose a custom value for action, such as "session_init".
Interaction data
In addition to action, Metrics Platform supports interaction data that you can customize to fit the needs of your instrument. You can set these properties to any meaningful value to provide the data required for your event.
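The relationship between the required action and the optional interaction data can be sketched as a small helper. This is a hypothetical function written for illustration, not the Metrics Platform client API (see the API docs for the real calls); it only uses the action_subtype and action_context properties mentioned in this guide.

```python
def build_interaction(action, **interaction_data):
    """Combine the required action with optional interaction data.

    Hypothetical helper: it rejects a missing action and drops any
    interaction data properties that were left unset (None).
    """
    if not isinstance(action, str) or not action.strip():
        raise ValueError("Every event must set a non-empty 'action'.")
    data = {name: value for name, value in interaction_data.items() if value is not None}
    return {"action": action, **data}


# A click needs no extra data; a scroll carries direction and depth.
click_event = build_interaction("click")
scroll_event = build_interaction("scroll", action_subtype="down", action_context="0.5")

print(click_event)
print(scroll_event)
```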
Contextual attributes
Contextual attributes are fields in the event data that provide information about the performer who triggered the event and the wiki where the event occurred. The values of contextual attributes are populated automatically by Metrics Platform when the event is generated. While each client includes a few contextual attributes automatically, most attributes must be enabled in the instrument's event stream configuration. For a list of available attributes, see Contextual attributes.
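The effect of enabling attributes per stream can be pictured with the Python sketch below. It is a simplified model under loud assumptions: add_contextual_attributes is a hypothetical function, the enabled list stands in for the real stream configuration (whose actual format is covered in the stream configuration guide), and which attributes a client always includes is left as a parameter because it varies by client.

```python
def add_contextual_attributes(event, context, enabled, always_included=()):
    """Copy only the permitted contextual attributes onto an event.

    event: data set by the instrument
    context: every attribute value the platform could provide
    enabled: attribute names enabled for this stream (hypothetical config)
    always_included: attribute names the client adds regardless of config
    """
    allowed = set(enabled) | set(always_included)
    attributes = {name: value for name, value in context.items() if name in allowed}
    return {**event, **attributes}


# Made-up values for illustration.
available_context = {
    "page_content_language": "en",
    "page_title": "Example",
    "mediawiki_skin": "vector-2022",
    "performer_pageview_id": "0123456789abcdef",
}

enriched = add_contextual_attributes(
    {"action": "scroll"},
    available_context,
    enabled=["page_title", "mediawiki_skin"],
)

print(sorted(enriched))
```

In this model, an attribute that is listed in the instrumentation spec but not enabled for the stream never reaches the event, which is a common reason for empty fields when reviewing data.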
Choosing a schema
Every instrument must designate a schema that will be used to validate the event data. Most instruments should use one of the Metrics Platform base schemas:
If your event requires data that is not supported by the base schemas, you can create a custom schema for your instrument.
Follow the data collection guidelines
All data collection activities must follow the data collection guidelines. Once you've identified the applicable risk tier, you can use your measurement plan and instrumentation spec to complete the steps in the guidelines under "What should WMF teams do next?".
Configure the stream
Once you've completed the necessary steps in the data collection guidelines, you're ready to launch your instrument. The first step to launching an instrument is configuring the event stream where the events will be published. See the stream configuration guide.
Code the instrument
You can write your instrument code in the WikimediaEvents extension or in your product codebase. See the API docs to learn how to code an instrument.
Document the instrument
Now that your instrument is active, complete these steps to document your instrument:
- Publish a summary of your instrument on wiki: Create a wiki page that summarizes your measurement plan and the data collected by your instrument. This helps provide transparency to the wider community about the data that WMF collects. You can create this page as a subpage of your codebase or project page. For example, see mw:Extension:WikiLambda/Metrics.
- Update the instrument list: Add your instrument to the instrument list with links to documentation.
- Make your instrument documentation discoverable: Your measurement plan and instrumentation spec are important resources for data analysts and code maintainers. To help people find these documents, link to them from your codebase wiki page, README, project page, DataHub page, or other frequently used documents. Duplicated documentation is more likely to be outdated, so always try to maintain a single source of information.
Review your data
The Event Platform documentation describes how to view and query events in both the Beta and Production environments. This is currently the best way to check that your instrument is generating events as expected.
Decommission the instrument
While instruments that measure product health may be long lived, instruments that capture metrics related to an experiment should be disabled once the experiment is complete. See Decommission an instrument.