
Service/Etcd

From Wikitech

Etcd

Description

etcd (https://etcd.io/) is an open source key-value store with a focus on reliability, used to store configuration and state data for distributed systems. At WMF we run a number of etcd clusters. This document addresses the two etcd Main clusters, one installed in each of the primary datacenters, eqiad and codfw. A number of applications, including MediaWiki's read/write configuration, store state data in etcd.

Categories

Relevant service categories (wiki categories) for grouping by similar services, owner, etc.

Service Type

Etcd is a foundational service.

Service Dependencies

No hard dependencies beyond hardware and networking. Note that server hardware and networking have their own availability levels, which are in the 99% range. Etcd as configured can tolerate certain types of failure within a local datacenter.
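
To make "certain types of failure" more concrete: etcd replicates data with the Raft consensus protocol, so a cluster stays writable only while a majority (quorum) of its members is reachable. The sketch below computes how many member failures a cluster of a given size can absorb. The cluster sizes used are illustrative assumptions; this page does not state how many members the Main clusters have.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """Number of members that can fail while a quorum remains."""
    return members - quorum(members)


if __name__ == "__main__":
    # Hypothetical cluster sizes, chosen only for illustration.
    for size in (1, 3, 4, 5):
        print(f"{size} members: quorum {quorum(size)}, "
              f"tolerates {fault_tolerance(size)} failure(s)")
```

An even-sized cluster tolerates no more failures than the next smaller odd size, which is why odd member counts are the usual choice.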


Confd: a lightweight configuration management tool that keeps local configuration files up to date using data stored in etcd.
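
The pattern confd follows can be sketched roughly as: take values from the key-value store, render them into a template, and rewrite the local file only when the result differs. The Python below is a minimal illustration of that idea, not confd's implementation (confd is a separate tool with its own template format). The function name `render_if_changed` is hypothetical, and the values are passed in as a plain dictionary standing in for data read from etcd.

```python
import os
import tempfile
from string import Template


def render_if_changed(template_text: str, values: dict, dest_path: str) -> bool:
    """Render `template_text` with `values`; rewrite `dest_path` only on change.

    `values` is a stand-in for keys fetched from etcd (an assumption made
    for this sketch). Returns True if the file was written.
    """
    rendered = Template(template_text).substitute(values)

    # Skip the write when the file already holds the desired content.
    try:
        with open(dest_path, encoding="utf-8") as fh:
            if fh.read() == rendered:
                return False
    except FileNotFoundError:
        pass

    # Write to a temporary file in the same directory, then rename it into
    # place, so readers never observe a partially written configuration file.
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(rendered)
    os.replace(tmp_path, dest_path)
    return True
```

A caller would typically reload the consuming service only when the function returns True, so that unchanged data does not trigger needless restarts.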


Ownership

Etcd is owned by the Service Operations SRE team, which is responsible for all aspects including operation, scalability, backups and software updates.

Technology Department / Site Reliability Engineering / Service Operations


Supporting documentation and relevant information

  • Design documents
  • Operational documentation
  • Phabricator component query links
  • Netbox links
  • LibreNMS
  • Icinga
  • Links to other relevant SRE Tooling™
  • Links to Runbooks
  • Related service request types
  • Any supporting or underpinning services (e.g. dependencies)
  • Who is entitled to request/view the service