»WAN Federation via Mesh Gateways

1.8.0+: This feature is available in Consul versions 1.8.0 and higher

This topic requires familiarity with mesh gateways.

WAN federation via mesh gateways allows for Consul servers in different datacenters to be federated exclusively through mesh gateways.

When setting up a multi-datacenter Consul cluster, operators must ensure that every Consul server in every datacenter is directly reachable from all other servers over its WAN-advertised network address.

This requires that operators setting up the virtual machines or containers hosting the servers take additional steps to put the necessary routing and firewall rules in place so that the servers can speak to each other over the WAN.

[Figure: WAN federation without mesh gateways]

Sometimes this prerequisite is difficult or undesirable to meet:

  • Difficult: The datacenters may exist in multiple Kubernetes clusters that unfortunately have overlapping pod IP subnets, or may exist in different cloud provider VPCs that have overlapping subnets.

  • Undesirable: Network security teams may not approve of granting so many firewall rules. When using platform autoscaling, keeping rules up to date becomes untenable.

Operators looking to simplify their WAN deployment and minimize the exposed security surface area can elect to join these datacenters together using mesh gateways.

[Figure: WAN federation with mesh gateways]

»Architecture

There are two main kinds of communication that occur over the WAN link spanning the gulf between disparate Consul datacenters:

  • WAN gossip: Consul uses the serf and memberlist libraries to gossip failure-detection information about the Consul servers in each datacenter. By default, this operates point to point between servers over 8302/udp, with a fallback to 8302/tcp (which logs a warning indicating that the network is misconfigured).

  • Cross-datacenter RPCs: Consul servers expose a special multiplexed port over 8300/tcp. Several distinct kinds of messages can be received on this port, such as RPC requests forwarded from servers in other datacenters.

In this network topology, individual Consul client agents on a LAN in one datacenter never need to directly dial servers in other datacenters. This means you could introduce a set of firewall rules prohibiting 10.0.0.0/24 from sending any traffic at all to 10.1.2.0/24 for security isolation.

You may already have configured mesh gateways to allow for services in the service mesh to freely connect between datacenters regardless of the lateral connectivity of the nodes hosting the Consul client agents.

When WAN federation via mesh gateways is activated, the servers can similarly use the existing mesh gateways to reach each other without being directly reachable themselves.

»Configuration

»TLS

All Consul servers in all datacenters should have TLS configured with certificates containing these SAN fields:

server.<this_datacenter>.<domain>              (normal)
<node_name>.server.<this_datacenter>.<domain>  (needed for wan federation)

This can be achieved using any number of tools, including consul tls cert create with the -node flag.
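As an illustration of the naming pattern above, here is a minimal Python sketch that builds the two SAN entries for a server and reports which of them a certificate is missing. The function names, the example node name server1, and the datacenter dc1 are hypothetical; the default domain of consul is Consul's default DNS domain. The sketch only compares strings and does not parse certificates.

```python
from typing import List


def required_sans(node_name: str, datacenter: str, domain: str = "consul") -> List[str]:
    """Return the two DNS SANs a server certificate should contain."""
    return [
        f"server.{datacenter}.{domain}",              # normal
        f"{node_name}.server.{datacenter}.{domain}",  # needed for WAN federation
    ]


def missing_sans(cert_sans: List[str], node_name: str, datacenter: str,
                 domain: str = "consul") -> List[str]:
    """List the required SANs that are absent from a certificate's SAN list."""
    # DNS names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name in cert_sans}
    return [
        san for san in required_sans(node_name, datacenter, domain)
        if san.lower() not in present
    ]


# Hypothetical example: a certificate issued without the per-node name.
print(missing_sans(["server.dc1.consul", "localhost"], "server1", "dc1"))
```

A check like this could run against the SAN list read from each server certificate before WAN federation via mesh gateways is enabled.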

»Mesh Gateways

At least one mesh gateway must opt in to exposing the servers in its configuration. When using the consul connect envoy CLI, this is done with the -expose-servers flag. The flag simply registers the mesh gateway into the catalog with the additional service metadata {"consul-wan-federation":"1"}. If you register the mesh gateways into the catalog out of band, you can add this metadata to your existing registration payload.

Before activating the feature on an existing cluster, ensure that each datacenter has at least one registered mesh gateway prepared to expose the servers; otherwise the WAN will become only partly connected.
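For out-of-band registration, the following Python sketch shows one way to add the metadata to an existing payload. It assumes the payload is a dictionary using the capitalized field names of the agent service registration API (Name, Kind, Port, Meta); the helper name expose_servers and the example gateway values are hypothetical.

```python
import copy
import json

WAN_FEDERATION_META_KEY = "consul-wan-federation"


def expose_servers(registration: dict) -> dict:
    """Return a copy of a mesh gateway registration with the WAN federation metadata set."""
    updated = copy.deepcopy(registration)   # leave the caller's payload untouched
    meta = updated.get("Meta") or {}        # keep any metadata that is already present
    meta[WAN_FEDERATION_META_KEY] = "1"
    updated["Meta"] = meta
    return updated


# Hypothetical existing payload for a mesh gateway (values are assumptions).
gateway = {"Name": "mesh-gateway", "Kind": "mesh-gateway", "Port": 8443}
print(json.dumps(expose_servers(gateway), indent=2))
```

Returning a copy rather than mutating the input keeps the original payload available for comparison or rollback.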

»Consul Server Options

A few additional pieces of configuration are necessary beyond those required for standing up a multi-datacenter Consul cluster.

Consul servers in the primary datacenter should add this snippet to the configuration file:

connect {
  enabled = true
  enable_mesh_gateway_wan_federation = true
}

Consul servers in all secondary datacenters should add this snippet to the configuration file:

primary_gateways = [ "<primary-mesh-gateway-ip>:<primary-mesh-gateway-port>", ... ]
connect {
  enabled = true
  enable_mesh_gateway_wan_federation = true
}

The start_join_wan and retry_join_wan options are used only for the traditional federation process. They must be omitted when federating Consul servers via gateways.

The primary_gateways configuration can also use go-discover syntax just like retry_join_wan.
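The server options above can be sanity-checked before a restart. The Python sketch below assumes the server configuration has been loaded into a dictionary, for example from a JSON configuration file, and uses the same key names as the snippets above. The function name and the example addresses (taken from a documentation-only IP range) are hypothetical, and the check covers only the settings described in this section.

```python
from typing import List


def federation_config_problems(config: dict, is_primary: bool) -> List[str]:
    """Return a list of problems found in a server config for WAN federation via mesh gateways."""
    problems = []
    connect = config.get("connect") or {}
    if connect.get("enabled") is not True:
        problems.append("connect.enabled must be true")
    if connect.get("enable_mesh_gateway_wan_federation") is not True:
        problems.append("connect.enable_mesh_gateway_wan_federation must be true")
    # Only secondaries need to know how to reach the primary's mesh gateways.
    if not is_primary and not config.get("primary_gateways"):
        problems.append("secondary datacenters must set primary_gateways")
    # The traditional WAN join options must not be combined with this feature.
    for key in ("start_join_wan", "retry_join_wan"):
        if config.get(key):
            problems.append(f"{key} must be omitted when federating via mesh gateways")
    return problems


# Hypothetical secondary config that still carries a traditional WAN join setting.
secondary = {
    "primary_gateways": ["198.51.100.10:8443"],
    "connect": {"enabled": True, "enable_mesh_gateway_wan_federation": True},
    "retry_join_wan": ["198.51.100.20"],
}
print(federation_config_problems(secondary, is_primary=False))
```

An empty list means none of the conditions described in this section were violated; it does not validate the rest of the configuration.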

»Bootstrapping

For ease of debugging (such as avoiding a flurry of misleading error messages), it is best to follow this general procedure when activating WAN federation via mesh gateways:

»New secondary

  1. Upgrade to the desired version of the consul binary for all servers, clients, and CLI.
  2. Start all consul servers and clients on the new version in the primary datacenter.
  3. Ensure the primary datacenter has at least one running, registered mesh gateway with the service metadata key of {"consul-wan-federation":"1"} set.
  4. Ensure you are prepared to launch corresponding mesh gateways in all secondaries. When ACLs are enabled, registering these gateways requires upstream connectivity to the primary datacenter to authorize catalog registration.
  5. Ensure all servers in the primary datacenter have updated configuration and restart.
  6. Ensure all servers in the secondary datacenter have updated configuration.
  7. Start all consul servers and clients on the new version in the secondary datacenter.
  8. When ACLs are enabled, shortly afterwards it should become possible to resolve ACL tokens from the secondary, at which time it should be possible to launch the mesh gateways in the secondary datacenter.

»Existing secondary

  1. Upgrade to the desired version of the consul binary for all servers, clients, and CLI.
  2. Restart all consul servers and clients on the new version.
  3. Ensure each datacenter has at least one running, registered mesh gateway with the service metadata key of {"consul-wan-federation":"1"} set.
  4. Ensure all servers in the primary datacenter have updated configuration and restart.
  5. Ensure all servers in the secondary datacenter have updated configuration and restart.

»Verification

From any two datacenters joined together, double-check that the following give the expected results:

  • Check that consul members -wan lists all servers in all datacenters with their local IP addresses and that each is listed as alive.

  • Ensure that any API request that activates datacenter request forwarding, such as /v1/catalog/services?dc=<OTHER_DATACENTER_NAME>, succeeds.
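The two checks above can be partly scripted. The Python sketch below assumes that the output of consul members -wan has been captured as text and that its first three columns are Node, Address, and Status; the sample output, node names, and function names are hypothetical. It does not contact Consul itself.

```python
from typing import List, Tuple
from urllib.parse import quote


def non_alive_members(members_output: str) -> List[Tuple[str, str]]:
    """Return (node, status) pairs for every WAN member not reported as alive."""
    rows = [line.split() for line in members_output.strip().splitlines()]
    problems = []
    for fields in rows[1:]:          # skip the header row
        if len(fields) < 3:
            continue                 # ignore blank or malformed lines
        node, _address, status = fields[:3]
        if status != "alive":
            problems.append((node, status))
    return problems


def forwarding_check_path(other_datacenter: str) -> str:
    """Build the API path whose success shows that request forwarding works."""
    return "/v1/catalog/services?dc=" + quote(other_datacenter, safe="")


# Hypothetical captured output; real output may contain additional columns.
sample = """\
Node         Address         Status  Type    Build   Protocol  DC
server1.dc1  10.0.0.10:8302  alive   server  1.12.0  2         dc1
server1.dc2  10.1.2.10:8302  failed  server  1.12.0  2         dc2
"""
print(non_alive_members(sample))
print(forwarding_check_path("dc2"))
```

An empty list from non_alive_members, combined with a successful request to the generated path against a local agent, corresponds to both checks passing.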
