»Connect Custom Proxy Integration
Any proxy can be extended to support Connect. Consul ships with a built-in proxy that provides a good development and out-of-the-box experience, but production users will require other proxy solutions.
A proxy must serve one or both of the following roles: accepting inbound connections, or establishing outbound connections identified as a particular service. Which roles are implemented depends on the use case, although both are generally required for full sidecar functionality.
There are also two levels of compatibility as a sidecar: L4 and L7. L4 integration is simpler and is adequate to secure all traffic, but it treats all traffic as TCP, so no advanced routing or metrics features can be supported. Full L7 support builds on L4 support and includes most or all of the L7 traffic routing features in Connect, achieved by dynamically configuring routing, retries, and other L7 features. Currently, the built-in proxy supports only L4, while Envoy supports the full L7 feature set.
Places where the integration approach diverges for L4 and L7 support are indicated below.
»Accepting Inbound Connections
For inbound connections, the proxy must accept TLS connections on some port.
The certificate served should be obtained from the
/v1/agent/connect/ca/leaf/ API endpoint. The client certificate should be
validated against the root certificates provided by the
/v1/agent/connect/ca/roots endpoint. After validating the client
certificate from the caller, depending upon the protocol of the proxied
service, the proxy must either authorize the entire connection (L4) or
each request (L7).
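The inbound TLS setup described above can be sketched in Go using only the standard library. This is a minimal illustration, not the built-in proxy's implementation: the function name `inboundTLSConfig` is hypothetical, and it assumes the caller has already fetched the PEM data (the `CertPEM` and `PrivateKeyPEM` fields of the leaf response and the `RootCert` field of each root). A real proxy would also need to replace the configuration whenever the leaf or roots change.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// inboundTLSConfig (hypothetical helper) builds a server-side TLS config.
// certPEM and keyPEM are assumed to come from the leaf endpoint response;
// rootPEMs are assumed to come from the roots endpoint response.
func inboundTLSConfig(certPEM, keyPEM string, rootPEMs []string) (*tls.Config, error) {
	leaf, err := tls.X509KeyPair([]byte(certPEM), []byte(keyPEM))
	if err != nil {
		return nil, fmt.Errorf("loading leaf certificate: %w", err)
	}

	roots := x509.NewCertPool()
	for _, rootPEM := range rootPEMs {
		if !roots.AppendCertsFromPEM([]byte(rootPEM)) {
			return nil, errors.New("root certificate is not valid PEM")
		}
	}

	return &tls.Config{
		// Certificate presented to callers.
		Certificates: []tls.Certificate{leaf},
		// Client certificates must chain to one of the Connect roots.
		ClientCAs:  roots,
		ClientAuth: tls.RequireAndVerifyClientCert,
		MinVersion: tls.VersionTLS12,
	}, nil
}

func main() {
	// Placeholder input: the helper reports an error instead of
	// returning a partially built config.
	_, err := inboundTLSConfig("not a cert", "not a key", nil)
	fmt.Println("got error:", err != nil)
}
```

Verifying the chain only authenticates the caller. Authorization of the connection or request is a separate step, as described next.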
Connection authorization can be performed one of two ways:
The first is by calling the
/v1/agent/connect/authorize endpoint. The authorize endpoint is expected to be called in the connection path, so if the local Consul agent is down or unresponsive, the success rate of new connections will suffer. The agent authorizes the connection using locally cached data and typically responds in microseconds, so the latency added to the TLS handshake is typically also on the order of microseconds.
Note: This endpoint is only suited for networking layer 4 (e.g. TCP) integration. The endpoint will always treat intentions with Permissions defined (i.e., layer 7 criteria) as deny intentions during evaluation.
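As a rough sketch of the L4 authorize call, the request and response bodies can be modeled as below. The helper names are hypothetical. The field names (`Target`, `ClientCertURI`, `ClientCertSerial`, `Authorized`, `Reason`) reflect our understanding of the endpoint's JSON and should be checked against the API documentation for your Consul version. Sending the body with an HTTP POST to the local agent is left out so the example runs without an agent.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// authorizeRequest models the JSON body assumed for the authorize endpoint.
type authorizeRequest struct {
	Target           string // name of the destination service
	ClientCertURI    string // URI SAN from the caller's certificate
	ClientCertSerial string // serial number of the caller's certificate
}

// authorizeResponse models the JSON response assumed for the endpoint.
type authorizeResponse struct {
	Authorized bool
	Reason     string
}

// encodeAuthorizeRequest (hypothetical helper) renders the request body.
func encodeAuthorizeRequest(target, clientCertURI, serial string) (string, error) {
	b, err := json.Marshal(authorizeRequest{
		Target:           target,
		ClientCertURI:    clientCertURI,
		ClientCertSerial: serial,
	})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// decodeAuthorizeResponse (hypothetical helper) fails closed: any decoding
// problem is reported as "not authorized".
func decodeAuthorizeResponse(body string) (bool, string, error) {
	var resp authorizeResponse
	if err := json.Unmarshal([]byte(body), &resp); err != nil {
		return false, "", fmt.Errorf("decoding authorize response: %w", err)
	}
	return resp.Authorized, resp.Reason, nil
}

func main() {
	body, _ := encodeAuthorizeRequest("db", "spiffe://example.consul/ns/default/dc/dc1/svc/web", "")
	fmt.Println(body)

	ok, reason, err := decodeAuthorizeResponse(`{"Authorized":true,"Reason":"example reason"}`)
	fmt.Println(ok, reason, err)
}
```

Failing closed on a malformed response is a design choice of this sketch. A proxy that prefers availability over strictness would need a different policy.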
Alternatively, proxies may list intentions that match the destination by querying the intention match API endpoint, and represent them in the native configuration of the proxy itself (such as RBAC for Envoy). For performance and reliability reasons, this is the preferred method for implementing intention enforcement. The cached intentions should be consulted for each incoming connection (L4) or request (L7) to determine whether it should be accepted or rejected.
All of these API endpoints operate on agent-local data that is updated in the background. The leaf, roots, and intentions should be updated in the background by the proxy.
The leaf cert, root cert, and intentions endpoints support blocking queries, which should be used to get near-immediate updates for root key rotations, new leaf certs before expiry, and intention changes.
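The index bookkeeping behind an index-based blocking query can be sketched as follows. The helpers are hypothetical. The sketch assumes the agent returns the current index in the `X-Consul-Index` response header and accepts `index` and `wait` query parameters, and it resets to zero when the index is unparseable or goes backwards. The HTTP loop itself is omitted.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// nextIndex (hypothetical helper) picks the index for the following blocking
// query from the previous index and the X-Consul-Index header value.
// An unparseable or decreasing index resets the watch to 0, which makes the
// next query return immediately with fresh data.
func nextIndex(prev uint64, header string) uint64 {
	idx, err := strconv.ParseUint(header, 10, 64)
	if err != nil || idx < prev {
		return 0
	}
	return idx
}

// blockingURL (hypothetical helper) appends the blocking parameters to an
// endpoint URL that has no query string yet.
func blockingURL(endpoint string, index uint64, wait string) string {
	q := url.Values{}
	q.Set("index", strconv.FormatUint(index, 10))
	q.Set("wait", wait)
	return endpoint + "?" + q.Encode()
}

func main() {
	idx := nextIndex(0, "42")
	fmt.Println(blockingURL("http://127.0.0.1:8500/v1/agent/connect/ca/roots", idx, "5m"))
}
```

A proxy would run one such loop per watched resource (leaf, roots, intentions) and swap in the new data whenever a query returns with a changed index.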
Although Consul follows the SPIFFE spec for certificates, some currently supported CA providers don't allow strict adherence. For example, CA certificates may not have the correct trust-domain SPIFFE URI SAN for the cluster. If SPIFFE validation is performed in the proxy, it should be possible to opt out; otherwise, certain CA providers supported by Consul will not be compatible with that proxy. Currently, neither Envoy nor the built-in proxy validates the SPIFFE URI of the chain beyond the leaf certificate.
Authentication is based on "service identity" (TLS), and is implemented at the transport layer. Depending upon the protocol of the proxied service, authorization is performed either on a per-connection (L4) or per-request (L7) basis.
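To act on a service identity, a proxy has to extract it from the URI SAN of the caller's leaf certificate. The sketch below parses a service identity URI, assuming the form `spiffe://<trust-domain>/ns/<namespace>/dc/<datacenter>/svc/<service>`. The type and function names are hypothetical, and other URI layouts (for example, ones carrying extra path segments) are rejected rather than handled.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// serviceIdentity holds the parts of a service SPIFFE ID (hypothetical type).
type serviceIdentity struct {
	TrustDomain string
	Namespace   string
	Datacenter  string
	Service     string
}

// parseServiceID (hypothetical helper) parses a URI SAN of the assumed form
// spiffe://<trust-domain>/ns/<namespace>/dc/<datacenter>/svc/<service>.
func parseServiceID(raw string) (serviceIdentity, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return serviceIdentity{}, fmt.Errorf("parsing URI: %w", err)
	}
	if u.Scheme != "spiffe" {
		return serviceIdentity{}, fmt.Errorf("unexpected scheme %q", u.Scheme)
	}
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(parts) != 6 || parts[0] != "ns" || parts[2] != "dc" || parts[4] != "svc" {
		return serviceIdentity{}, fmt.Errorf("unexpected path %q", u.Path)
	}
	return serviceIdentity{
		TrustDomain: u.Host,
		Namespace:   parts[1],
		Datacenter:  parts[3],
		Service:     parts[5],
	}, nil
}

func main() {
	id, err := parseServiceID("spiffe://1234.consul/ns/default/dc/dc1/svc/web")
	fmt.Println(id.Service, id.Datacenter, err)
}
```

The parsed service name is what gets compared against cached intentions, per connection for L4 or per request for L7.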
Note: Features such as (local) rate limiting or maximum connections are configurations that we expect to push into proxies, which would enforce them separately from the AuthZ call, based on the state they already hold about request rates and so on.
»Persistent TCP Connections and Intentions
For a proxied service configured with a protocol of TCP, potentially long-lived TCP connections will be authorized only when they are established. Since many services (e.g. databases) typically use persistent connection pools, a change in intentions that newly denies access currently does not terminate existing connections in violation of the updated intention. In this case it may appear as if the intention is not being enforced.
Consul eventually may support a mechanism for tracking specific connections in the agent and then allow the agent to tell the proxy to close those connections when their authorization state changes, but for now that is not on the roadmap.
It is recommended therefore to do one of the following:
- Have connections terminate after a configurable maximum lifetime of, say, several hours. This balances the overhead of establishing new connections against an upper bound on how long existing connections remain open after intention changes.
- Periodically re-authorize every open connection. The AuthZ call itself is not expensive and should be a local, in-memory operation, so authorizing thousands of open connections once every minute or so is likely to add negligible overhead, while enforcing a tighter upper bound on how long it takes to apply intention changes without affecting the protocol efficiency of persistent connections.
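Both recommendations can be combined in a single periodic sweep over the proxy's open connections. The following is a sketch under stated assumptions: `trackedConn`, `sweep`, and `demoSweep` are hypothetical names, and the `authorize` callback stands in for whichever authorization method the proxy uses (the authorize endpoint or locally cached intentions).

```go
package main

import (
	"fmt"
	"io"
	"sort"
	"time"
)

// trackedConn is a hypothetical record the proxy keeps per open connection.
type trackedConn struct {
	conn      io.Closer
	clientURI string // identity taken from the client certificate
	opened    time.Time
}

// sweep closes and forgets every connection that exceeded maxLifetime or is
// no longer authorized. It returns the IDs it closed, sorted for stable output.
func sweep(conns map[string]*trackedConn, authorize func(clientURI string) bool,
	maxLifetime time.Duration, now time.Time) []string {
	var closed []string
	for id, tc := range conns {
		expired := now.Sub(tc.opened) >= maxLifetime
		if expired || !authorize(tc.clientURI) {
			tc.conn.Close()
			delete(conns, id) // deleting during range is safe in Go
			closed = append(closed, id)
		}
	}
	sort.Strings(closed)
	return closed
}

// nopCloser stands in for a real network connection in the demo.
type nopCloser struct{}

func (nopCloser) Close() error { return nil }

// demoSweep runs one sweep over three sample connections.
func demoSweep() []string {
	now := time.Now()
	conns := map[string]*trackedConn{
		"conn-ok":     {conn: nopCloser{}, clientURI: "web", opened: now.Add(-time.Minute)},
		"conn-old":    {conn: nopCloser{}, clientURI: "web", opened: now.Add(-5 * time.Hour)},
		"conn-denied": {conn: nopCloser{}, clientURI: "batch", opened: now.Add(-time.Minute)},
	}
	authorize := func(clientURI string) bool { return clientURI == "web" }
	return sweep(conns, authorize, 4*time.Hour, now)
}

func main() {
	fmt.Println(demoSweep())
}
```

In a real proxy, `sweep` would be driven by a `time.Ticker` at a configurable interval, and the connection map would need to be guarded by a mutex.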
»Certificate Serial in AuthZ
Intentions currently utilize TLS' URI Subject Alternative Name (SAN) for
enforcement. In the future, Consul will support revoking specific certificates
by serial number. The AuthZ API in the Go SDK has a field to pass the serial
consul/connect/tls.go). Proxies may provide this value during
»Establishing Outbound Connections
For outbound connections, the proxy should communicate to a Connect-capable
endpoint for a service and provide a client certificate from the
/v1/agent/connect/ca/leaf/ API endpoint. The certificate served by the
remote endpoint may be verified against the root certificates from the
Any proxy can discover proxy configuration registered with a local service
instance using the
This endpoint supports hash-based blocking, enabling long-polling for changes
to the registration/configuration. Any changes to the registration/config will
result in the new config being returned immediately. An example implementation
may be found in our built-in proxy which
utilizes our Go SDK, and uses the HTTP "pull" API (via our
The discovery chain for each upstream service should be fetched from the
API endpoint. This will return a compiled graph of configurations needed by
sidecars for a particular upstream service. If you are only implementing L4
support in your proxy, set the
OverrideProtocol value to "tcp" when
fetching the discovery chain so that L7 features such as HTTP routing rules are
For each target in the resulting
discovery chain, a list of healthy, Connect-capable endpoints may be fetched
/v1/health/connect/:service_id API endpoint per the Service
Discovery section below.
The rest of the nodes in the chain include configurations that should be translated into the nearest equivalent for things like HTTP routing, connection timeouts, connection pool settings, rate limits, etc. See the full discovery chain documentation and relevant config entry documentation for details of supported configuration parameters.
We expect config here to evolve reasonably rapidly. While we do not intend to make backwards incompatible API changes, there are likely to be new configurations and features added regularly. Some proxies may not be able to support all features or may have differing semantics with the way they support them. We intend to find a suitable format to document the behavior differences between proxy implementations as they mature.
Proxies can use Consul's service discovery API
/v1/health/connect/:service_id to return all available, Connect-capable
endpoints for a given service. This endpoint supports a
which makes use of agent caching and thus has
performance benefits. The API package provides a
UseCache query option to
leverage this. In addition to performance improvements, use of the cache makes
the mesh more resilient to Consul server outages - the mesh "fails static" with
the last known set of service instances still used rather than errors on new
Proxies can decide whether to perform just-in-time queries to the API when a new connection needs to be routed, or to use blocking queries to load the current set of endpoints for a service and keep that list updated. The SDK and built-in proxy currently use just-in-time resolution; however, many existing proxies are likely to find it easier to integrate by pulling the set of endpoints and maintaining it in local memory using blocking queries.
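For proxies that keep the endpoint list in memory, each response from the service discovery endpoint has to be turned into dialable addresses. The sketch below assumes the response is a JSON array whose entries carry `Node.Address`, `Service.Address`, and `Service.Port`, and that an empty service address means the node address should be used. The names `healthEntry` and `parseEndpoints` are hypothetical, and health filtering is assumed to have been requested in the query itself.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"strconv"
)

// healthEntry models only the response fields this sketch needs.
type healthEntry struct {
	Node struct {
		Address string
	}
	Service struct {
		Address string
		Port    int
	}
}

// parseEndpoints (hypothetical helper) converts a response body into a list
// of "host:port" strings, falling back to the node address when the service
// registration has no address of its own.
func parseEndpoints(body string) ([]string, error) {
	var entries []healthEntry
	if err := json.Unmarshal([]byte(body), &entries); err != nil {
		return nil, fmt.Errorf("decoding endpoints: %w", err)
	}
	addrs := make([]string, 0, len(entries))
	for _, e := range entries {
		host := e.Service.Address
		if host == "" {
			host = e.Node.Address
		}
		addrs = append(addrs, net.JoinHostPort(host, strconv.Itoa(e.Service.Port)))
	}
	return addrs, nil
}

func main() {
	body := `[
	  {"Node": {"Address": "10.0.0.1"}, "Service": {"Address": "", "Port": 21000}},
	  {"Node": {"Address": "10.0.0.2"}, "Service": {"Address": "10.1.0.2", "Port": 21001}}
	]`
	addrs, err := parseEndpoints(body)
	fmt.Println(addrs, err)
}
```

If a refresh fails, keeping the previously parsed list in place gives the same "fails static" behavior described above.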
Upstreams can be defined with Prepared Query target types. These upstreams should use Consul's prepared query API. It's worth noting that the PreparedQuery API does not support blocking, so proxies choosing to populate endpoints in memory will need to poll the endpoint at a suitable and ideally configurable frequency.
Note: Long-term the
entries are intended to replace
Prepared Queries in Consul entirely, but for now these are still used in some
Consul does not start or manage sidecar proxy processes. Proxies running on a physical host or VM are designed to be started and run by process supervisor systems such as init, systemd, supervisord, etc. If deployed within a cluster scheduler (Kubernetes, Nomad), the proxy should run as a sidecar container in the same namespace.
The proxy will use the
CONSUL_HTTP_ADDR environment variables to
contact Consul to fetch certificates, provided the
environment variable contains a Consul ACL that has the necessary permissions
to read configuration for that service. If you use our Go
api package then
those environment variables will be read and the client configured for you
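For proxies that do not use the Go `api` package, the environment handling can be approximated by hand. The sketch below is a simplification with hypothetical helper names. It assumes the ACL token is supplied in `CONSUL_HTTP_TOKEN` and sent in the `X-Consul-Token` header, defaults the address to `127.0.0.1:8500`, and ignores TLS-related variables and Unix socket addresses.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"strings"
)

// agentBaseURL (hypothetical helper) derives the agent base URL from
// CONSUL_HTTP_ADDR. getenv is injected so the logic can be exercised
// without touching the process environment.
func agentBaseURL(getenv func(string) string) string {
	addr := getenv("CONSUL_HTTP_ADDR")
	if addr == "" {
		addr = "127.0.0.1:8500" // assumed default agent HTTP address
	}
	if !strings.Contains(addr, "://") {
		addr = "http://" + addr
	}
	return strings.TrimRight(addr, "/")
}

// newAgentRequest (hypothetical helper) builds a GET request for an agent
// API path and attaches the ACL token when one is configured.
func newAgentRequest(getenv func(string) string, path string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, agentBaseURL(getenv)+path, nil)
	if err != nil {
		return nil, err
	}
	if token := getenv("CONSUL_HTTP_TOKEN"); token != "" {
		req.Header.Set("X-Consul-Token", token)
	}
	return req, nil
}

func main() {
	req, err := newAgentRequest(os.Getenv, "/v1/agent/connect/ca/roots")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```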
The ID of the proxy service comes from the user. See
envoy as an example. You may start it with the
-proxy-id flag and pass the ID of the proxy service you registered elsewhere.
A nicer UX is available for end-users using the
argument, which causes the command to query Consul for a proxy that is
registered as a sidecar for the specified
<service>. If there is exactly one
such proxy, that ID will be used to start the proxy. Your controller only needs
-proxy-id as an argument; the Consul CLI will handle resolving the
ID for the name specified in