Wikimedia Cloud Services team/EnhancementProposals/2020 Network refresh/Implementation details
This page contains the implementation details for the 2020 Network refresh project.
eqiad
In the eqiad datacenter, related to the eqiad1 openstack deployment.
specs for eqiad1
On cloudgw side, each server:
- Hardware, misc box
- CPU: 16 CPU
- RAM: 32 GB
- Disk: 500GB
- 2 x 10Gbps NICs. NICs are bonded/teamed/aggregated for redundancy.
- Software
- standard puppet management
- prometheus metrics, icinga monitoring
- netfilter for NAT/firewalling
- keepalived or corosync/pacemaker for HA
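As an illustration of the netfilter NAT role listed above, here is a minimal sketch assuming nftables is the chosen tool; the table/chain names and the SNAT address are placeholders, not the final cloudgw configuration:
# minimal sketch only: names and the SNAT address below are placeholders
nft add table ip cloudgw_nat
nft add chain ip cloudgw_nat postrouting '{ type nat hook postrouting priority srcnat ; }'
# dmz_cidr-style exception: instance traffic towards production ranges leaves un-NATed
nft add rule ip cloudgw_nat postrouting ip saddr 172.16.0.0/21 ip daddr 10.0.0.0/8 accept
# everything else from the instance range is source-NATed (routing_source_ip equivalent)
nft add rule ip cloudgw_nat postrouting ip saddr 172.16.0.0/21 snat to 185.15.56.1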
On cloudsw side, each device:
- Juniper QFX5100 switches with L3 routing licenses
network setup in eqiad1
allocations
IPv4 allocations:
185.15.56.0/24
185.15.56.0/25 - Openstack instances NAT
185.15.56.128/26 - reserved for growth of the above
185.15.56.192/27 - unused
185.15.56.224/28 - unused
185.15.56.240/28 - infrastructure
185.15.56.240/29 - 1120 - cloud-instances-transport1
185.15.56.248/31 - 1104 - cloudsw1-c8<->cloudsw1-d5 - cloud-xlink1
185.15.56.250/31 - unused
185.15.56.252/30 - loopbacks
VLAN allocations:
1102 - cr1<->cloudsw1-c8 - cloud-transit1-eqiad
1103 - cr2<->cloudsw1-d5 - cloud-transit2-eqiad
1104 - cloudsw1-c8<->cloudsw1-d5 - cloud-xlink1-eqiad
1105 - cloud-instances1-eqiad
1106 - cloud-storage1-eqiad
1107 - cloudsw1<->cloudgw - cloud-gw-transport-eqiad ?
1118 - cloud-hosts1-eqiad
1120 - cloud-instances-transport1-eqiad
stage 0
: starting point, current network setup
VLAN | Switched on | L2 Members | L3 Gateway (“to internet”)
cloud-hosts1-eqiad | asw2-b | cr1/2, all cloudvirt eth0, all Ceph OSD eth0 | cr1/2 (via asw2-b)
cloud-instances2-eqiad | asw2-b | all cloud VPS, all cloudvirt eth1, cloudnet1003/1004 eth1 | cloudnet1003/1004 eth1
cloud-instances-transport1-eqiad | asw2-b | cloudnet1003/1004 eth0 | cr1/2
cloud-storage1-eqiad | asw2-b | all cloudcephosd eth1 | (none)
The cloud-hosts vlan, which is part of the production realm, is currently routed on cr1/2-eqiad:ae2.1118, the interfaces facing asw2-b-eqiad.
For a better separation of the WMCS and production realms, that routing should be moved to cr1/2-eqiad:xe-3/0/4.1118, the interfaces facing cloudsw.
This already contributes to goals (A) and (C) and was a low-complexity change. See https://phabricator.wikimedia.org/T261866 for the implementation.
stage 1
: validate cloudgw changes in codfw
This is a NOOP in the eqiad DC.
stage 2
: enable L3 routing on cloudsw nodes
This will contribute to goals (A), (B), (C) and (D) of the project.

Steps and implementation on https://phabricator.wikimedia.org/T265288
At this stage:
VLAN | Switched on | L2 Members | L3 Gateway (“to internet”)
cloud-hosts1-eqiad | asw2-b*, cloudsw | cr1/2, all cloudvirt eth0, all Ceph OSD eth0 | cr1/2 (via cloudsw)
cloud-instances2-eqiad | asw2-b*, cloudsw | all cloud VPS, all cloudvirt eth1, cloudnet1003/1004 eth1 | cloudnet1003/1004 eth1
cloud-instances-transport1-eqiad | asw2-b*, cloudsw | cloudsw, cloudnet1003/1004 eth0 | cloudsw
cloud-transit1/2-eqiad | cloudsw | cr1/2, cloudsw | cr1/2
cloud-storage1-eqiad | asw2-b*, cloudsw | all cloudcephosd eth1 | (none)
* To be removed when hosts are moved away from that device
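To make this stage more concrete, the L3 piece on a cloudsw could look roughly like the following Junos sketch (a sketch only, assuming an irb interface per routed vlan; the addresses shown are placeholders, and the reviewed steps live in the task referenced above):
# hypothetical sketch, not the reviewed change; addresses are placeholders
set vlans cloud-instances-transport1-eqiad vlan-id 1120
set vlans cloud-instances-transport1-eqiad l3-interface irb.1120
set interfaces irb unit 1120 family inet address 185.15.56.241/29
# route the floating-IP range towards Neutron (next-hop address is a placeholder)
set routing-options static route 185.15.56.0/25 next-hop 185.15.56.244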
stage 3
: enable L3 routing on cloudgw nodes
- connectivity between cloudgw and the cloud-hosts1-b-eqiad subnet:
  - L3:
    - a single IP address allocated by standard methods for ssh management, puppet, monitoring, etc. The gateway for this subnet lives on the core routers, but is switched through cloudgw after stage 1.
  - L2:
    - cloudgw has 2 NICs bonded/teamed/aggregated and then trunked with 3 vlans:
      - cloud-hosts1-b-eqiad (vlan 1118) 10.64.20.0/24
      - cloud-instances-transport1-b-eqiad (vlan 1120) 208.80.155.88/29
      - cloud-new-transport-eqiad (vlan 11XX) final CIDR TBD
- connectivity between Neutron (cloudnet) and cloudgw:
  - L3:
    - keep the current cloud-instances-transport1-b-eqiad (vlan 1120) 208.80.155.88/29
    - keep the current cloud-instances2-b-eqiad (vlan 1105) 172.16.0.0/21
  - L2:
    - cloudnet keeps 2 NICs, each with a different setup:
      - one connected to cloud-hosts1-b-eqiad (vlan 1118) 10.64.20.0/24
      - the other trunked with vlan 1105 and vlan 1120 (cloud-virt-instance-trunk)
    - cloudgw has 2 NICs bonded/teamed/aggregated and then trunked with 3 vlans:
      - cloud-hosts1-b-eqiad (vlan 1118) 10.64.20.0/24
      - cloud-instances-transport1-b-eqiad (vlan 1120) 208.80.155.88/29
      - cloud-new-transport-eqiad (vlan 11XX) final CIDR TBD
- connectivity between cloudgw and cloudsw (see the sketch after this list):
  - L3:
    - allocate a new transport range and vlan 11XX
    - static routes between cloudgw and cloudsw
  - L2:
    - cloudsw has ports aggregated and trunked with vlan 11XX to connect with cloudgw
    - cloudgw has 2 NICs bonded/teamed/aggregated and then trunked with 3 vlans:
      - cloud-hosts1-b-eqiad (vlan 1118) 10.64.20.0/24
      - cloud-instances-transport1-b-eqiad (vlan 1120) 208.80.155.88/29
      - cloud-new-transport-eqiad (vlan 11XX) final CIDR TBD
- connectivity between cloudsw and the prod core routers:
  - L1: cloudsw are directly connected to the prod core routers using 1x10G port each
  - L2: 2 vlans are trunked between the two sides: vlan 1118 (cloud-hosts) and 1102 (public interco vlan)
  - L3: allocate two new interco /31 prefixes (208.80.154.210/31 and 208.80.154.212/31), configure eBGP in stage 2A
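As referenced in the list above, here is a minimal sketch of the static routing on the cloudgw side; since the new transport vlan is still 11XX and its CIDR is TBD, RFC 5737 addresses (192.0.2.x) and illustrative interface names stand in for the real values:
# default route towards cloudsw over the new transport vlan 11XX (placeholder next-hop)
ip route add default via 192.0.2.1 dev vlan11xx
# instance and floating-IP ranges towards Neutron (cloudnet) over vlan 1120 (placeholder next-hop)
ip route add 172.16.0.0/21 via 192.0.2.9 dev vlan1120
ip route add 185.15.56.0/25 via 192.0.2.9 dev vlan1120
The cloudsw side would hold the mirror static routes pointing the cloud ranges at cloudgw, with eBGP towards the core routers configured in stage 2A.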
codfw
In the codfw datacenter, related to the codfw1dev openstack deployment.
specs for codfw1dev
For cloudgw, repurpose labtestvirt2003 as cloudgw2001-dev.
For cloudsw, we assume we won't have the device anytime soon.
network setup in codfw1dev
Specific configuration details for each stage.
allocations
IPv4 allocations:
185.15.57.0/24
185.15.57.0/29 - Openstack instances NAT (floating IPs)
185.15.57.8/30 - 2107 - cloud-gw-transport-codfw (cloudgw <-> neutron)
185.15.57.16/28 - unused
185.15.57.32/27 - unused
185.15.57.64/26 - unused
185.15.57.128/25 - infrastructure
208.80.153.184/29 - 2120 - cloud-instances-transport1-b-codfw (cr-codfw <-> cloudgw)
VLAN allocations:
2105 - cloud-instances1-codfw (172.16.128.0/24)
2107 - cloud-gw-transport-codfw (cloudgw <-> neutron) (185.15.57.8/30)
2118 - cloud-hosts1-codfw (10.192.20.0/24)
2120 - cloud-instances-transport1-codfw (cr-codfw <-> cloudgw) (208.80.153.184/29)
stage 0
: starting point, current network setup
TODO: for reference, include here some bits about the starting setup of the network?
stage 1
: validate cloudgw changes in codfw
Given we don't have hardware for testing the cloudsw setup in codfw, we assume we are working with core routers and asw.
In this stage, we validate all the cloudgw changes that will be later implemented in eqiad. We use the labtestvirt2003.codfw.wmnet server acting as cloudgw in this PoC.
- connectivity between cloudgw and the cloud-hosts1-b-codfw subnet:
  - L3:
    - a single IP address allocated by standard methods for ssh management, puppet, monitoring, etc. The gateway for this subnet lives in the core router (we don't have cloudsw)
  - L2:
    - cloudgw has 2 NICs, the control plane one (eno1) connected to:
      - cloud-hosts1-b-codfw (vlan 2118) 10.192.20.0/24 (untagged)
- connectivity between Neutron (cloudnet) and cloudgw:
  - L3:
    - cloudnet keeps the current connection to the cloud-hosts1-b-codfw subnet for ssh management, puppet, monitoring, etc. The gateway for this subnet lives in the core router (we don't have cloudsw).
    - drop (or leave unused) the current cloud-instances-transport1-b-codfw (vlan 2120) 208.80.153.184/29
    - add cloud-gw-transport-codfw (cloudgw <-> neutron) (vlan 2107) 185.15.57.8/30
    - keep the current cloud-instances2-b-codfw (vlan 2105) 172.16.128.0/24
  - L2:
    - cloudnet keeps 2 NICs, each with a different setup:
      - one connected to cloud-hosts1-b-codfw (vlan 2118) 10.192.20.0/24
      - the other trunked with at least vlan 2105 and vlan 2107 (cloud-virt-instance-trunk)
    - cloudgw has 2 NICs, the data plane one (eno2) being trunked with (see the sketch after this list):
      - cloud-instances-transport1-b-codfw (cr-codfw<->cloudgw) (vlan 2120) 208.80.153.184/29
      - cloud-gw-transport-codfw (cloudgw <-> neutron) (vlan 2107) 185.15.57.8/30
- connectivity between cloudgw and cr-codfw:
  - L3:
    - interconnect using cloud-instances-transport1-b-codfw (cr-codfw<->cloudgw) (vlan 2120) 208.80.153.184/29
  - L2:
    - cloudgw has 2 NICs, the data plane one (eno2) being trunked with:
      - cloud-instances-transport1-b-codfw (cr-codfw<->cloudgw) (vlan 2120) 208.80.153.184/29
      - cloud-gw-transport-codfw (cloudgw <-> neutron) (vlan 2107) 185.15.57.8/30
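As referenced in the list above, here is a minimal sketch of bringing up the data-plane trunk on cloudgw (labtestvirt2003 / cloudgw2001-dev) by hand with iproute2; the interface names beyond eno2 and the host address chosen inside the /29 are illustrative only, not the final allocation:
# vlan subinterfaces on the data-plane NIC (eno2)
ip link add link eno2 name vlan2120 type vlan id 2120
ip link add link eno2 name vlan2107 type vlan id 2107
# cloud-instances-transport1-b-codfw, towards cr-codfw (illustrative host address in 208.80.153.184/29)
ip addr add 208.80.153.190/29 dev vlan2120
# cloud-gw-transport-codfw, towards Neutron (185.15.57.9 matches the gateway_ip of the subnet created in the next section)
ip addr add 185.15.57.9/30 dev vlan2107
ip link set vlan2120 up
ip link set vlan2107 up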
neutron operations
- define new subnet object
- update external fixed IP address, now using an address from vlan 2107 cloud-gw-transport-codfw (185.15.57.8/30)
- disable SNAT (now done in cloudgw)
root@cloudcontrol2001-dev:~# openstack router show cloudinstances2b-gw -f yaml
admin_state_up: UP
availability_zone_hints: ''
availability_zones: nova
created_at: '2018-03-29T14:18:50Z'
description: ''
distributed: false
external_gateway_info: '{"network_id": "57017d7c-3817-429a-8aa3-b028de82cdcc", "enable_snat":
true, "external_fixed_ips": [{"subnet_id": "31214392-9ca5-4256-bff5-1e19a35661de",
"ip_address": "208.80.153.190"}]}'
flavor_id: null
ha: true
id: 5712e22e-134a-40d3-a75a-1c9b441717ad
interfaces_info: '[{"port_id": "21e10025-d464-45a6-82ac-25894e9164e4", "ip_address":
"172.16.128.1", "subnet_id": "7adfcebe-b3d0-4315-92fe-e8365cc80668"}, {"port_id":
"5dc9c3b7-245f-43f7-8db1-baf7bdf175fd", "ip_address": "169.254.192.4", "subnet_id":
"651250de-53ca-4487-97ce-e6f65dc4b8ec"}, {"port_id": "727a378d-3558-4132-933a-e2e72c28e532",
"ip_address": "169.254.192.5", "subnet_id": "651250de-53ca-4487-97ce-e6f65dc4b8ec"}]'
name: cloudinstances2b-gw
project_id: admin
revision_number: 2
routes: ''
status: ACTIVE
tags: ''
updated_at: '2019-10-02T10:30:11Z'
root@cloudcontrol2001-dev:~# openstack subnet create --network wan-transport-codfw --gateway 185.15.57.9 --no-dhcp --subnet-range 185.15.57.8/30 cloud-gw-transport-codfw
+-------------------+--------------------------------------+
| Field | Value |
+-------------------+--------------------------------------+
| allocation_pools | 185.15.57.10-185.15.57.10 |
| cidr | 185.15.57.8/30 |
| created_at | 2020-10-09T08:48:11Z |
| description | |
| dns_nameservers | |
| enable_dhcp | False |
| gateway_ip | 185.15.57.9 |
| host_routes | |
| id | 2596edb4-5a40-41b9-9e67-f1f9e40e329c |
| ip_version | 4 |
| ipv6_address_mode | None |
| ipv6_ra_mode | None |
| name | cloud-gw-transport-codfw |
| network_id | 57017d7c-3817-429a-8aa3-b028de82cdcc |
| project_id | admin |
| revision_number | 0 |
| segment_id | None |
| service_types | |
| subnetpool_id | None |
| tags | |
| updated_at | 2020-10-09T08:48:11Z |
+-------------------+--------------------------------------+
root@cloudcontrol2001-dev:~# openstack router set --external-gateway wan-transport-codfw --fixed-ip subnet=cloud-gw-transport-codfw,ip-address=185.15.57.10 cloudinstances2b-gw
root@cloudcontrol2001-dev:~# openstack subnet delete cloud-instances-transport1-b-codfw
root@cloudcontrol2001-dev:~# openstack router set --disable-snat cloudinstances2b-gw --external-gateway wan-transport-codfw
That command disables both routing_source_ip and dmz_cidr, as shown in this diff (note the specific rules are now missing):
iptables-save diff:
--- enabled.txt 2020-09-23 10:20:45.373366952 +0000
+++ disabled.txt 2020-09-23 10:20:56.397471627 +0000
@@ -1,4 +1,4 @@
-# Generated by iptables-save v1.8.5 on Wed Sep 23 10:20:45 2020
+# Generated by iptables-save v1.8.5 on Wed Sep 23 10:20:56 2020
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
@@ -19,9 +19,10 @@
-A neutron-l3-agent-INPUT -m mark --mark 0x1/0xffff -j ACCEPT
-A neutron-l3-agent-INPUT -p tcp -m tcp --dport 9697 -j DROP
-A neutron-l3-agent-scope -o qr-21e10025-d4 -m mark ! --mark 0x4010000/0xffff0000 -j DROP
+-A neutron-l3-agent-scope -o qg-1290224c-b1 -m mark ! --mark 0x4000000/0xffff0000 -j DROP
COMMIT
-# Completed on Wed Sep 23 10:20:45 2020
-# Generated by iptables-save v1.8.5 on Wed Sep 23 10:20:45 2020
+# Completed on Wed Sep 23 10:20:56 2020
+# Generated by iptables-save v1.8.5 on Wed Sep 23 10:20:56 2020
*mangle
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
@@ -45,7 +46,6 @@
-A neutron-l3-agent-FORWARD -s 172.16.128.19/32 -j neutron-l3-agent-float-snat
-A neutron-l3-agent-FORWARD -s 172.16.128.20/32 -j neutron-l3-agent-float-snat
-A neutron-l3-agent-FORWARD -s 172.16.128.26/32 -j neutron-l3-agent-float-snat
--A neutron-l3-agent-POSTROUTING -o qg-1290224c-b1 -m connmark --mark 0x0/0xffff0000 -j CONNMARK --save-mark --nfmask 0xffff0000 --ctmask 0xffff0000
-A neutron-l3-agent-PREROUTING -j neutron-l3-agent-mark
-A neutron-l3-agent-PREROUTING -j neutron-l3-agent-scope
-A neutron-l3-agent-PREROUTING -m connmark ! --mark 0x0/0xffff0000 -j CONNMARK --restore-mark --nfmask 0xffff0000 --ctmask 0xffff0000
@@ -56,12 +56,11 @@
-A neutron-l3-agent-floatingip -d 185.15.57.2/32 -j MARK --set-xmark 0x4010000/0xffff0000
-A neutron-l3-agent-floatingip -d 185.15.57.4/32 -j MARK --set-xmark 0x4010000/0xffff0000
-A neutron-l3-agent-floatingip -d 185.15.57.6/32 -j MARK --set-xmark 0x4010000/0xffff0000
--A neutron-l3-agent-mark -i qg-1290224c-b1 -j MARK --set-xmark 0x2/0xffff
-A neutron-l3-agent-scope -i qr-21e10025-d4 -j MARK --set-xmark 0x4010000/0xffff0000
-A neutron-l3-agent-scope -i qg-1290224c-b1 -j MARK --set-xmark 0x4000000/0xffff0000
COMMIT
-# Completed on Wed Sep 23 10:20:45 2020
-# Generated by iptables-save v1.8.5 on Wed Sep 23 10:20:45 2020
+# Completed on Wed Sep 23 10:20:56 2020
+# Generated by iptables-save v1.8.5 on Wed Sep 23 10:20:56 2020
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
@@ -80,10 +79,6 @@
-A neutron-l3-agent-OUTPUT -d 185.15.57.2/32 -j DNAT --to-destination 172.16.128.19
-A neutron-l3-agent-OUTPUT -d 185.15.57.4/32 -j DNAT --to-destination 172.16.128.20
-A neutron-l3-agent-OUTPUT -d 185.15.57.6/32 -j DNAT --to-destination 172.16.128.26
--A neutron-l3-agent-POSTROUTING -s 208.80.153.190/32 -j ACCEPT
--A neutron-l3-agent-POSTROUTING -s 172.16.128.0/24 -d 10.0.0.0/8 -j ACCEPT
--A neutron-l3-agent-POSTROUTING -s 172.16.128.0/24 -d 208.80.152.0/22 -j ACCEPT
--A neutron-l3-agent-POSTROUTING ! -i qg-1290224c-b1 ! -o qg-1290224c-b1 -m conntrack ! --ctstate DNAT -j ACCEPT
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
-A neutron-l3-agent-PREROUTING -d 185.15.57.2/32 -j DNAT --to-destination 172.16.128.19
-A neutron-l3-agent-PREROUTING -d 185.15.57.4/32 -j DNAT --to-destination 172.16.128.20
@@ -92,12 +87,10 @@
-A neutron-l3-agent-float-snat -s 172.16.128.20/32 -j SNAT --to-source 185.15.57.4 --random-fully
-A neutron-l3-agent-float-snat -s 172.16.128.26/32 -j SNAT --to-source 185.15.57.6 --random-fully
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
--A neutron-l3-agent-snat -o qg-1290224c-b1 -j SNAT --to-source 185.15.57.1 --random-fully
--A neutron-l3-agent-snat -m mark ! --mark 0x2/0xffff -m conntrack --ctstate DNAT -j SNAT --to-source 208.80.153.190 --random-fully
-A neutron-postrouting-bottom -m comment --comment "Perform source NAT on outgoing traffic." -j neutron-l3-agent-snat
COMMIT
-# Completed on Wed Sep 23 10:20:45 2020
-# Generated by iptables-save v1.8.5 on Wed Sep 23 10:20:45 2020
+# Completed on Wed Sep 23 10:20:56 2020
+# Generated by iptables-save v1.8.5 on Wed Sep 23 10:20:56 2020
*raw
:PREROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
@@ -106,4 +99,4 @@
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
COMMIT
-# Completed on Wed Sep 23 10:20:45 2020
+# Completed on Wed Sep 23 10:20:56 2020
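After these operations, the router can be inspected again to confirm that enable_snat is now false and that the external fixed IP comes from the cloud-gw-transport-codfw subnet (output omitted here):
root@cloudcontrol2001-dev:~# openstack router show cloudinstances2b-gw -f yaml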
stage 1
: validation checklist
This list should help validate that the new model works as expected; a few example spot-check commands are sketched after the list.
- The cloudgw deployment is reproducible; servers can be properly reimaged
- Traffic is isolated between data plane and control plane networks
- VM (no floating IP) contacting the internet gets NATed using routing_source_ip
- VM (no floating IP) contacting an address covered by dmz_cidr doesn't get NATed using routing_source_ip
- VM (using floating IP) isn't affected by either routing_source_ip or dmz_cidr
- VM (no floating IP) can contact auth DNS server
- VM (no floating IP) can contact rec DNS server
- VM (using floating IP) can contact auth DNS server
- VM (using floating IP) can contact rec DNS server
- VM (no floating IP) can contact LDAP server
- VM (using floating IP) can contact LDAP server
- VM (no floating IP) can mount NFS (dumps)
- VM (floating IP) can mount NFS (dumps)
- VM (no floating IP) can mount NFS (scratch)
- VM (floating IP) can mount NFS (scratch)
- VM (no floating IP) can mount NFS (maps)
- VM (floating IP) can mount NFS (maps)
- VM (no floating IP) can mount NFS (tools)
- VM (floating IP) can mount NFS (tools)
- VM (no floating IP) can connect to wiki-replicas
- VM (floating IP) can connect to wiki-replicas
- VM (no floating IP) can use openstack endpoints
- VM (floating IP) can use openstack endpoints
- All VMs can use the internal openstack metadata address
- puppetmaster VMs work as expected
- security groups work as expected
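A few example spot-checks for the list above, run from inside a test VM; the resolver, LDAP and NFS endpoints are deployment-specific, so placeholder names and addresses are used here:
# NAT behaviour: a VM without a floating IP should show up as routing_source_ip
curl -4 https://ifconfig.me
# DNS reachability (placeholder resolver address)
dig +short wikipedia.org @192.0.2.53
# LDAP connectivity (placeholder server name)
ldapsearch -x -H ldap://ldap.example.org -b '' -s base
# NFS exports visible (placeholder server name)
showmount -e nfs.example.org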
stage 2
: enable L3 routing on cloudsw nodes
Given we don't have hardware for testing the cloudsw setup in codfw, we assume we are working with core routers and asw.
Therefore this stage is a NOOP in the codfw datacenter.
stage 3
: enable L3 routing on cloudgw nodes
All the codfw changes were done in stage 1, therefore this stage is a NOOP in the codfw datacenter.