Liberica

From Wikitech

Liberica is the next-generation load balancer replacing PyBal. It is designed with a modular approach and written in Go. The source code is available on GitLab.

Daemons

Control Plane

The control plane (cp) daemon orchestrates the rest of the Liberica daemons. It uses a configuration file to define a list of services and relies on etcd for real server discovery. Communication with other Liberica daemons and gobgpd occurs over gRPC.
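
The exact configuration and API schemas live in the GitLab repository; purely as an illustration of etcd-based real server discovery, a watch loop in Go could look like the following sketch (the endpoint and key prefix are invented for the example, not Liberica's actual layout):

package main

import (
    "context"
    "log"
    "time"

    clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
    // Hypothetical etcd endpoint; the real one comes from the cp
    // configuration file.
    cli, err := clientv3.New(clientv3.Config{
        Endpoints:   []string{"https://etcd.example.org:2379"},
        DialTimeout: 5 * time.Second,
    })
    if err != nil {
        log.Fatal(err)
    }
    defer cli.Close()

    // Watch every key under an (assumed) pool prefix and react to
    // real servers being added, updated or removed.
    for resp := range cli.Watch(context.Background(), "/pools/", clientv3.WithPrefix()) {
        for _, ev := range resp.Events {
            log.Printf("%s %s -> %s", ev.Type, ev.Kv.Key, ev.Kv.Value)
        }
    }
}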

Forwarding Plane

The forwarding plane (fp) daemon exposes the following forwarding planes to Liberica’s control plane via its gRPC API:

  • IPVS
  • Katran
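
The gRPC schema itself is defined in the source repository; as a rough sketch of what the modular design implies (all names below are invented for illustration), both backends can be seen as implementations of one small interface that the control plane drives through the fp daemon:

package fp

import "net/netip"

// ForwardingPlane is a hypothetical abstraction over the supported
// backends; Liberica's actual interface lives in its sources.
type ForwardingPlane interface {
    // AddRealServer pools a real server behind a virtual service.
    AddRealServer(service netip.AddrPort, real netip.Addr, weight uint32) error
    // DelRealServer depools a real server.
    DelRealServer(service netip.AddrPort, real netip.Addr) error
}

// ipvsPlane would wrap the IPVS netlink interface, katranPlane the
// Katran eBPF forwarding plane.
type ipvsPlane struct{}

func (ipvsPlane) AddRealServer(netip.AddrPort, netip.Addr, uint32) error { return nil }
func (ipvsPlane) DelRealServer(netip.AddrPort, netip.Addr) error         { return nil }

type katranPlane struct{}

func (katranPlane) AddRealServer(netip.AddrPort, netip.Addr, uint32) error { return nil }
func (katranPlane) DelRealServer(netip.AddrPort, netip.Addr) error         { return nil }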

Healthcheck Forwarder

The healthcheck forwarder (hcforwarder) daemon forwards health check traffic to real servers in the same way production traffic reaches them (using IPIP encapsulation). It consists of two components: a Go daemon that exposes a gRPC API and Prometheus metrics, and an eBPF program that handles the network traffic. The healthcheck daemon targets a specific real server by setting a socket mark (SO_MARK) on the health check traffic; the hcforwarder uses this mark to identify the real server's IP and perform the IPIP encapsulation.
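
As a minimal illustration of the SO_MARK side of this (not Liberica's actual code; the mark value and address are placeholders), a Go health checker can set the mark on its outgoing connection through a dialer Control hook:

package main

import (
    "log"
    "net"
    "syscall"

    "golang.org/x/sys/unix"
)

func main() {
    // Placeholder mark; in Liberica the mark identifies the real
    // server the health check should be forwarded to.
    const mark = 0x2a

    d := net.Dialer{
        // Control runs on the raw socket before connect(2), which is
        // where SO_MARK has to be set (requires CAP_NET_ADMIN).
        Control: func(network, address string, c syscall.RawConn) error {
            var serr error
            if err := c.Control(func(fd uintptr) {
                serr = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, unix.SO_MARK, mark)
            }); err != nil {
                return err
            }
            return serr
        },
    }

    // Placeholder destination; the eBPF program matches the mark and
    // IPIP-encapsulates the traffic towards the chosen real server.
    conn, err := d.Dial("tcp", "198.51.100.10:80")
    if err != nil {
        log.Fatal(err)
    }
    conn.Close()
}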

Healthcheck

The Liberica healthcheck daemon performs health checks and tracks their results. It notifies subscribers (usually the control plane daemon) of any changes in the state of real servers. Additionally, it exposes health check results as Prometheus metrics.
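
As a sketch of that subscription model (the types and channel mechanics below are invented for illustration), state changes can be fanned out to subscribers like this:

package health

import "sync"

// Event is a hypothetical notification about a real server changing
// health state.
type Event struct {
    Service string
    Real    string
    Healthy bool
}

// Notifier fans Events out to all subscribers, e.g. the control
// plane daemon.
type Notifier struct {
    mu   sync.Mutex
    subs []chan Event
}

// Subscribe returns a channel delivering future state changes.
func (n *Notifier) Subscribe() <-chan Event {
    n.mu.Lock()
    defer n.mu.Unlock()
    ch := make(chan Event, 16)
    n.subs = append(n.subs, ch)
    return ch
}

// Publish delivers an event to every subscriber, dropping it for
// subscribers whose buffer is full instead of blocking health checks.
func (n *Notifier) Publish(ev Event) {
    n.mu.Lock()
    defer n.mu.Unlock()
    for _, ch := range n.subs {
        select {
        case ch <- ev:
        default:
        }
    }
}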

Operating Liberica

Liberica provides a CLI tool called liberica for fetching the current state of its various daemons. This tool uses the same gRPC API employed by the Liberica control plane daemon to gather insights from different components.

Pooling a Liberica instance

The Liberica control plane automatically pools the load balancer when it starts, so pooling an instance manually is just a matter of starting the service with systemctl:

systemctl start liberica-cp.service
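
The unit state can be confirmed with the usual systemd tooling:

systemctl status liberica-cp.service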

On a successful start, Liberica should eventually configure the BGP paths for all configured services, and the journal should contain entries like these:

time=2025-01-28T14:03:20.036Z level=INFO msg="bootstrapping completed, configuring BGP path" service=ncredirlb6_80
time=2025-01-28T14:03:20.037Z level=INFO msg="bootstrapping completed, configuring BGP path" service=ncredir-httpslb_443
time=2025-01-28T14:03:20.037Z level=INFO msg="bootstrapping completed, configuring BGP path" service=ncredirlb_80
time=2025-01-28T14:03:20.037Z level=INFO msg="bootstrapping completed, configuring BGP path" service=ncredir-httpslb6_443
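
These messages can also be followed live with journalctl while the daemon bootstraps:

$ sudo journalctl -fu liberica-cp.service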

Additionally, BGP status can be checked using the gobgp CLI:

$ gobgp neighbor # list BGP neighbors
Peer           AS  Up/Down State       |#Received  Accepted
10.64.130.1 64810 00:26:03 Establ      |        0         0

$ gobgp neighbor 10.64.130.1 adj-out # check exported IPv4 prefixes 
   ID  Network              Next Hop             AS_PATH              Attrs
   4   208.80.154.232/32    10.64.130.16         64600                [{Origin: i} {Communities: 14907:11}]
   
$ gobgp neighbor 10.64.130.1 adj-out -a ipv6 # check exported IPv6 prefixes
   ID  Network                Next Hop                    AS_PATH              Attrs
   4   2620:0:861:ed1a::9/128 2620:0:861:109:10:64:130:16 64600                [{Origin: i} {Communities: 14907:11}]

Depooling a Liberica instance

Depooling a Liberica-based load balancer is as simple as stopping the control plane with the following command:

systemctl stop liberica-cp.service

After a successful depool, gobgp should show an empty list of neighbors. This can be verified using the gobgp CLI:

$ sudo -i gobgp neighbor 
Peer AS Up/Down State       |#Received  Accepted

Alerts

LibericaDiffFPCheck

This alert is triggered when there is a mismatch between the real servers that should be pooled according to the control plane and the ones actually pooled on the forwarding plane. Both can be queried using the liberica CLI tool:

vgutierrez@lvs4009:~$ liberica cp services  # control plane status
upload-httpslb6_443:
        2620:0:863:101:10:128:0:12      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:14      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:35      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:21      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:36      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:24      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:10      1 healthy: true | pooled: yes
        2620:0:863:101:10:128:0:37      1 healthy: true | pooled: yes
[...]
vgutierrez@lvs4009:~$ liberica fp services  # forwarding plane status
[...]
2620:0:863:ed1a::2:b:443 mh
        2620:0:863:101:10:128:0:37      1
        2620:0:863:101:10:128:0:10      1
        2620:0:863:101:10:128:0:24      1
        2620:0:863:101:10:128:0:36      1
        2620:0:863:101:10:128:0:21      1
        2620:0:863:101:10:128:0:35      1
        2620:0:863:101:10:128:0:14      1
        2620:0:863:101:10:128:0:12      1
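
A quick way to spot the mismatch is to diff the pooled sets from both commands; the following one-liner is a rough sketch that assumes the output formats shown above:

$ diff <(liberica cp services | awk '$NF == "yes" {print $1}' | sort) \
       <(liberica fp services | awk '/^[[:space:]]/ {print $1}' | sort)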

This alert usually indicates a bug or a transient issue in Liberica. The easiest way to fix it is to restart Liberica using the sre.loadbalancer.upgrade cookbook from a cumin host:

$ sudo cookbook sre.loadbalancer.upgrade --query "P{lvs4009.ulsfo.wmnet}" --reason "clear LibericaDiffFPCheck alert" restart

LibericaStaleConfig

This alert is triggered if a configuration deployed by Puppet hasn't been loaded within an hour. It can be fixed by reloading the Liberica configuration using the sre.loadbalancer.admin cookbook from a cumin host:

$ sudo cookbook sre.loadbalancer.admin --query "P{lvs4009.ulsfo.wmnet}" --reason "clear LibericaStaleConfig alert" config_reload
