Talk:Data Platform/Systems/Event Data retention
Outdated parts
This page seems a bit outdated in some aspects:
- The clientIp field (hashed IP) was removed from the event capsule more than a year ago.
- "It is being reviewed right now" (now = July 2016) - should note the outcome of the review.
Regards, HaeB (talk) 15:34, 3 May 2017 (UTC)
- Thanks for spotting this! I've updated the reference to clientIp. Sadly enough, the "It is being reviewed right now" part is still valid :], meaning it is still being reviewed. This task has been waiting on the DBA team for a long time because of a lack of resources (they have lots of urgent work). As you saw, we in Analytics have recently taken over that task and will be tackling it ourselves. Mforns (talk) 20:03, 3 May 2017 (UTC)
Browsing history
Regarding "browsing history: the pages visited by a user": I understand that this part refers to the information that a particular page was viewed by a particular user. Clearly, the names of the viewed pages per se are not sensitive personal information (we even publish them all the time as part of our public pageview data). Regards, HaeB (talk) 15:34, 3 May 2017 (UTC)
- Yes, exactly. The wording "Any information that both" means data that has PII (meaning anything that can potentially identify a user, like ID, editCount, userAgent, etc.) AND contains browsing patterns or other similar data that can convey personal preferences. I changed the text a bit to make clear that we refer to the combination of browsing history AND its identified user. Thanks for the comment :] Mforns (talk) 20:11, 3 May 2017 (UTC)
Configuration for EventLogging purge strategy
One piece of information I couldn't find in this otherwise comprehensive document is where the purging strategy is configured for each EventLogging schema. I see the per-field whitelist, but not the overall strategy corresponding to Purging Strategies. Awight (talk) 14:55, 4 March 2020 (UTC)