Adding and removing transit providers

From Wikitech

This page is a checklist of actions to take when changing transit circuits in the "production" realm (in other words, it does not apply to the Wikimedia corporate/office network). It is based on input from netops/Ferran, but may be incomplete.

Adding transit circuits

Preparation work

  • Ensure there is a sound business case for procuring an additional transit circuit (e.g. SRE/business case/Network - 4th transit for drmrs)
  • Create a procurement task (e.g. task T314929)
  • Consider various criteria for selecting the actual transit provider (criteria will vary depending on needs)
    • Diversity (circuit, router, linecard, X-connect if applicable)
    • Cost
    • Backbone capacity and connectivity to the Internet/customer cone size (e.g. consider CAIDA AS Rank, or use NetFlow samples to determine which transit providers have the shortest AS path for popular IP prefixes)
    • DDoS mitigation capabilities
    • Deployment timeframe
    • Jumbo MTU (if applicable, for GRE tunnels between sites)
    • (list is not exhaustive, feel free to consider additional criteria)
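
The AS-path criterion above can be sketched as a small ranking exercise. This is only an illustration: the provider names, prefixes, path lengths and byte counts are made-up sample values, and the function names are hypothetical. In practice the input would come from NetFlow exports joined with BGP path data.

```python
from collections import defaultdict

# Hypothetical samples: (provider, destination prefix, AS-path length, bytes).
# Prefixes are documentation ranges; none of this is real traffic data.
SAMPLES = [
    ("ProviderA", "192.0.2.0/24", 2, 700),
    ("ProviderB", "192.0.2.0/24", 3, 700),
    ("ProviderA", "198.51.100.0/24", 4, 300),
    ("ProviderB", "198.51.100.0/24", 2, 300),
]


def weighted_path_length(samples):
    """Return {provider: traffic-weighted mean AS-path length}."""
    weighted = defaultdict(float)
    total = defaultdict(float)
    for provider, _prefix, path_len, traffic in samples:
        weighted[provider] += path_len * traffic
        total[provider] += traffic
    return {provider: weighted[provider] / total[provider] for provider in weighted}


def rank_providers(samples):
    """Sort providers from shortest to longest weighted AS path."""
    scores = weighted_path_length(samples)
    return sorted(scores, key=scores.get)


if __name__ == "__main__":
    for name in rank_providers(SAMPLES):
        print(name, round(weighted_path_length(SAMPLES)[name], 2))
```

Weighting by traffic volume keeps a short path to a rarely used prefix from outweighing a longer path to a popular one. AS-path length is only one of the criteria listed above and should be read alongside the others.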

Implementation (once PO signed)

  1. Create the circuit in Netbox with the information available and status "provisioning"; update Netbox as more information arrives
  2. Assign router port by adding a planned Netbox cable between the circuit and the disabled interface (+ run Homer)
  3. Communicate configuration info to provider (e.g. AS number, prefixes, MTU)
  4. Once Letter of Authorization has been received, create a cross-connect task in Phabricator
    1. Purchase optics and spares if needed
    2. Ensure cross-connect path diversity if needed
    3. When getting close to the cross-connect setup ETA, enable router port in Netbox (add no-mon in description, run Homer) so remote hands can check light
    4. Communicate X-connect ETA/details to provider
  5. Once IP/MTU/etc config received from provider, add them to Netbox (+ run Homer)
  6. Once physical connectivity has been established, update Netbox (remove no-mon, set patch cable to active, add cross-connect details)
  7. Some providers will require a turn-up call at this point
  8. Adjust AS14907 ASPA to reflect the transit AS (task T372161)
  9. Adjust AS14907 import & export routing policies in the appropriate IRR databases (depending on the site where the circuit is added, this does NOT have to be ARIN-only!) to reflect correct AS-sets for the transit AS (example: import: from [transit_as] accept ANY; export: to [transit_as] announce AS-WIKIMEDIA)
  10. Configure BGP session via the router's transits config in operations/homer-public/config/devices.yaml, and configure export policies for anycast (e.g. https://gerrit.wikimedia.org/r/c/operations/homer/public/+/870904)
  11. Verify prefixes sent/received, check looking glass for propagation + correct communities
  12. Ensure the transit AS is part of the 'critical BGP peer list' in the check_bgp Icinga config, for correct alerting
  13. Update LibreNMS bills to account for this new provider (site global + contract specific)
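
As an illustration of the IRR step above, the following sketch builds the two policy lines from the example in that step and reports which of them are missing from an existing aut-num object. It is a minimal sketch under stated assumptions: the helper names are hypothetical, AS64500 is a documentation-only AS number standing in for the transit AS, and the code only compares strings. It does not query or update any IRR database, and real aut-num objects may express policy in other forms.

```python
from __future__ import annotations


def irr_policy_lines(transit_asn: int, as_set: str = "AS-WIKIMEDIA") -> list[str]:
    """Build the import/export lines for one transit AS (format from step 9)."""
    if not 0 < transit_asn < 2**32:
        raise ValueError(f"not a valid AS number: {transit_asn}")
    return [
        f"import: from AS{transit_asn} accept ANY",
        f"export: to AS{transit_asn} announce {as_set}",
    ]


def _normalise(line: str) -> str:
    """Collapse whitespace and case so formatting differences are ignored."""
    return " ".join(line.split()).lower()


def missing_policy_lines(
    aut_num_text: str, transit_asn: int, as_set: str = "AS-WIKIMEDIA"
) -> list[str]:
    """Return the expected lines that are absent from the aut-num text."""
    present = {_normalise(line) for line in aut_num_text.splitlines()}
    return [
        line
        for line in irr_policy_lines(transit_asn, as_set)
        if _normalise(line) not in present
    ]


if __name__ == "__main__":
    # Hypothetical excerpt of an aut-num object; AS64500 is a placeholder.
    current = "aut-num: AS14907\nimport:   from AS64500 accept ANY\n"
    for line in missing_policy_lines(current, 64500):
        print("missing:", line)
```

The same comparison can be run in reverse when removing a circuit, to confirm that no lines for the transit AS remain in the databases where they should have been removed.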

Removing transit circuits

  1. (only if transit AS not used elsewhere within Wikimedia AS) Remove the transit AS from the 'critical BGP peer list' in the check_bgp Icinga config - this may help reduce false positive alerts
  2. Remove the BGP session from the router's transits config in operations/homer-public/config/devices.yaml, and remove export policies for anycast (e.g. https://gerrit.wikimedia.org/r/c/operations/homer/public/+/870904) - then run Homer to stop the BGP session
  3. Verify Internet connectivity has failed over to remaining transit providers
  4. In Netbox, disable the interface, set the circuit's status to decommissioning, then run Homer
  5. (only if transit AS not used elsewhere within Wikimedia AS) Remove the transit AS from the AS14907 ASPA record (task T372161)
  6. Remove the transit AS from the AS14907 import/export policies in the appropriate IRR databases (again, this does not have to be ARIN-only, and if the transit AS is used elsewhere within the Wikimedia AS, it does not have to be removed from all IRR databases)
  7. Update LibreNMS bills to remove references to the circuit (site global + contract specific)
  8. Re-assign the task to DCops for physical de-cabling
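
Several steps above apply only if the transit AS is not used elsewhere within the Wikimedia AS. That condition can be sketched as a simple lookup over a circuit inventory. The inventory layout, circuit IDs, site names and AS numbers below are hypothetical placeholders (64500 and 64501 are documentation-only AS numbers); in practice the data would more likely be read from Netbox.

```python
# Hypothetical inventory: circuit ID -> (site, transit AS number).
CIRCUITS = {
    "circuit-1": ("site-a", 64500),
    "circuit-2": ("site-b", 64500),
    "circuit-3": ("site-b", 64501),
}


def asn_still_in_use(circuits, removed_circuit):
    """True if another circuit keeps using the removed circuit's transit AS."""
    _site, asn = circuits[removed_circuit]
    return any(
        other_asn == asn
        for circuit_id, (_other_site, other_asn) in circuits.items()
        if circuit_id != removed_circuit
    )


def conditional_steps_apply(circuits, removed_circuit):
    """The 'only if not used elsewhere' steps apply when the AS disappears."""
    return not asn_still_in_use(circuits, removed_circuit)


if __name__ == "__main__":
    for circuit_id in CIRCUITS:
        print(circuit_id, conditional_steps_apply(CIRCUITS, circuit_id))
```

With this sample data, removing circuit-1 leaves AS64500 in use through circuit-2, so the Icinga and ASPA entries would stay. Removing circuit-3 removes the last use of AS64501, so those steps would apply.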