»Autopilot Operator HTTP API

The /operator/autopilot endpoints allow for automatic operator-friendly management of Consul servers including cleanup of dead servers, monitoring the state of the Raft cluster, and stable server introduction.

Please check the Autopilot tutorial for more details.

»Read Configuration

This endpoint retrieves its latest Autopilot configuration.

MethodPathProduces
GET/operator/autopilot/configurationapplication/json

The table below shows this endpoint's support for blocking queries, consistency modes, agent caching, and required ACLs.

Blocking QueriesConsistency ModesAgent CachingACL Required
NOnonenoneoperator:read

»Parameters

  • dc (string: "") - Specifies the datacenter to query. This will default to the datacenter of the agent being queried. This is specified as part of the URL as a query string.

  • stale (bool: false) - If the cluster does not currently have a leader an error will be returned. You can use the ?stale query parameter to read the Raft configuration from any of the Consul servers.

»Sample Request

$ curl \
    http://127.0.0.1:8500/v1/operator/autopilot/configuration

»Sample Response

{
  "CleanupDeadServers": true,
  "LastContactThreshold": "200ms",
  "MaxTrailingLogs": 250,
  "ServerStabilizationTime": "10s",
  "RedundancyZoneTag": "",
  "DisableUpgradeMigration": false,
  "UpgradeVersionTag": "",
  "CreateIndex": 4,
  "ModifyIndex": 4
}

For more information about the Autopilot configuration options, see the agent configuration section.

»Update Configuration

This endpoint updates the Autopilot configuration of the cluster.

MethodPathProduces
PUT/operator/autopilot/configurationapplication/json

The table below shows this endpoint's support for blocking queries, consistency modes, agent caching, and required ACLs.

Blocking QueriesConsistency ModesAgent CachingACL Required
NOnonenoneoperator:write

»Parameters

  • dc (string: "") - Specifies the datacenter to query. This will default to the datacenter of the agent being queried. This is specified as part of the URL as a query string.

  • cas (int: 0) - Specifies to use a Check-And-Set operation. The update will only happen if the given index matches the ModifyIndex of the configuration at the time of writing.

  • CleanupDeadServers (bool: true) - Specifies automatic removal of dead server nodes periodically and whenever a new server is added to the cluster.

  • LastContactThreshold (string: "200ms") - Specifies the maximum amount of time a server can go without contact from the leader before being considered unhealthy. Must be a duration value such as 10s.

  • MaxTrailingLogs (int: 250) specifies the maximum number of log entries that a server can trail the leader by before being considered unhealthy.

  • MinQuorum (int: 0) - specifies the minimum number of servers needed before Autopilot can prune dead servers.

  • ServerStabilizationTime (string: "10s") - Specifies the minimum amount of time a server must be stable in the 'healthy' state before being added to the cluster. Only takes effect if all servers are running Raft protocol version 3 or higher. Must be a duration value such as 30s.

  • RedundancyZoneTag (string: "") - Controls the node-meta key to use when Autopilot is separating servers into zones for redundancy. Only one server in each zone can be a voting member at one time. If left blank, this feature will be disabled.

  • DisableUpgradeMigration (bool: false) - Disables Autopilot's upgrade migration strategy in Consul Enterprise of waiting until enough newer-versioned servers have been added to the cluster before promoting any of them to voters.

  • UpgradeVersionTag (string: "") - Controls the node-meta key to use for version info when performing upgrade migrations. If left blank, the Consul version will be used.

»Sample Payload

{
  "CleanupDeadServers": true,
  "LastContactThreshold": "200ms",
  "MaxTrailingLogs": 250,
  "MinQuorum": 3,
  "ServerStabilizationTime": "10s",
  "RedundancyZoneTag": "",
  "DisableUpgradeMigration": false,
  "UpgradeVersionTag": "",
  "CreateIndex": 4,
  "ModifyIndex": 4
}

»Read Health

This endpoint queries the health of the autopilot status.

MethodPathProduces
GET/operator/autopilot/healthapplication/json

The table below shows this endpoint's support for blocking queries, consistency modes, agent caching, and required ACLs.

Blocking QueriesConsistency ModesAgent CachingACL Required
NOnonenoneoperator:read

»Parameters

  • dc (string: "") - Specifies the datacenter to query. This will default to the datacenter of the agent being queried. This is specified as part of the URL as a query string.

»Sample Request

$ curl \
    http://127.0.0.1:8500/v1/operator/autopilot/health

»Sample response

{
  "Healthy": true,
  "FailureTolerance": 0,
  "Servers": [
    {
      "ID": "e349749b-3303-3ddf-959c-b5885a0e1f6e",
      "Name": "node1",
      "Address": "127.0.0.1:8300",
      "SerfStatus": "alive",
      "Version": "0.7.4",
      "Leader": true,
      "LastContact": "0s",
      "LastTerm": 2,
      "LastIndex": 46,
      "Healthy": true,
      "Voter": true,
      "StableSince": "2017-03-06T22:07:51Z"
    },
    {
      "ID": "e36ee410-cc3c-0a0c-c724-63817ab30303",
      "Name": "node2",
      "Address": "127.0.0.1:8205",
      "SerfStatus": "alive",
      "Version": "0.7.4",
      "Leader": false,
      "LastContact": "27.291304ms",
      "LastTerm": 2,
      "LastIndex": 46,
      "Healthy": true,
      "Voter": false,
      "StableSince": "2017-03-06T22:18:26Z"
    }
  ]
}
  • Healthy is whether all the servers are currently healthy.

  • FailureTolerance is the number of redundant healthy servers that could be fail without causing an outage (this would be 2 in a healthy cluster of 5 servers).

  • Servers holds detailed health information on each server:

    • ID is the Raft ID of the server.

    • Name is the node name of the server.

    • Address is the address of the server.

    • SerfStatus is the SerfHealth check status for the server.

    • Version is the Consul version of the server.

    • Leader is whether this server is currently the leader.

    • LastContact is the time elapsed since this server's last contact with the leader.

    • LastTerm is the server's last known Raft leader term.

    • LastIndex is the index of the server's last committed Raft log entry.

    • Healthy is whether the server is healthy according to the current Autopilot configuration.

    • Voter is whether the server is a voting member of the Raft cluster.

    • StableSince is the time this server has been in its current Healthy state.

    The HTTP status code will indicate the health of the cluster. If Healthy is true, then a status of 200 will be returned. If Healthy is false, then a status of 429 will be returned.

»Read the Autopilot State

This endpoint queries the health of the autopilot status.

MethodPathProduces
GET/operator/autopilot/stateapplication/json

The table below shows this endpoint's support for blocking queries, consistency modes, agent caching, and required ACLs.

Blocking QueriesConsistency ModesAgent CachingACL Required
NOnonenoneoperator:read

»Parameters

  • dc (string: "") - Specifies the datacenter to query. This will default to the datacenter of the agent being queried. This is specified as part of the URL as a query string.

»Sample Request

$ curl \
    http://127.0.0.1:8500/v1/operator/autopilot/state

»Response Format

{
  "Healthy": true,
  "FailureTolerance": 1,
  "OptimisticFailureTolerance": 4,
  "Servers": {
    "5e26a3af-f4fc-4104-a8bb-4da9f19cb278": {},
    "10b71f14-4b08-4ae5-840c-f86d39e7d330": {},
    "1fd52e5e-2f72-47d3-8cfc-2af760a0c8c2": {},
    "63783741-abd7-48a9-895a-33d01bf7cb30": {},
    "6cf04fd0-7582-474f-b408-a830b5471285": {}
  },
  "Leader": "5e26a3af-f4fc-4104-a8bb-4da9f19cb278",
  "Voters": [
    "5e26a3af-f4fc-4104-a8bb-4da9f19cb278",
    "10b71f14-4b08-4ae5-840c-f86d39e7d330",
    "1fd52e5e-2f72-47d3-8cfc-2af760a0c8c2"
  ],
  "RedundancyZones": {
    "az1": {},
    "az2": {},
    "az3": {}
  },
  "ReadReplicas": [
    "63783741-abd7-48a9-895a-33d01bf7cb30",
    "6cf04fd0-7582-474f-b408-a830b5471285"
  ],
  "Upgrade": {}
}
  • Healthy is whether all the servers are currently healthy.

  • FailureTolerance is the number of redundant healthy servers that could be fail without causing an outage (this would be 2 in a healthy cluster of 5 servers).

  • OptimisticFailuretolerance

    Enterprise
    is the maximum number of servers that could fail in the right order over the right period of time without causing an outage. This value is only useful when using the Redundancy Zones feature with autopilot.

  • Servers is a mapping of server ID to an object holding detailed information about that server. The format of the detailed info is documented in its own section.

  • Leader is the server ID of current leader. This value can be used as an index into the Servers object.

  • Voters is a list of server IDs that are voters. These values can be used as indexes into the Servers object.

  • RedundancyZones

    Enterprise
    is mapping of redundancy zone name to redundancy zone information. The format of the redundancy zone information is documented in its own section.

  • ReadReplicas

    Enterprise
    is a list of server IDs that autopilot has identified as read replicas. These will never be promoted. These values can be used as indexes into the Servers map.

  • Upgrade

    Enterprise
    is an object holding all the information about any ongoing automated upgrade. The format of this object is detailed in its own section.

»Server Response Format

{
  "ID": "1c3e3278-3f88-4a97-9f6a-1058584e8058",
  "Name": "node1",
  "Address": "198.18.0.1:8300",
  "NodeStatus": "alive",
  "Version": "1.9.0+ent",
  "LastContact": "1.321ms",
  "LastTerm": 4,
  "LastIndex": 42,
  "Healthy": true,
  "StableSince": "2020-08-12T12:13:14Z",
  "RedundancyZone": "az1",
  "UpgradeVersion": "1.2.3",
  "ReadReplica": false,
  "Status": "voter",
  "Meta": {
    "build": "1.2.3",
    "zone": "az1"
  },
  "NodeType": "redundancy-zone-voter"
}
  • ID is the Raft ID of the server.

  • Name is the node name of the server.

  • Address is the address of the server.

  • NodeStatus is the SerfHealth check status for the server.

  • Version is the Consul version of the server.

  • LastContact is the time elapsed since this server's last contact with the leader.

  • LastTerm is the server's last known Raft leader term.

  • LastIndex is the index of the server's last committed Raft log entry.

  • Healthy is whether the server is healthy according to the current Autopilot configuration.

  • StableSince is the time this server has been in its current Healthy state.

  • RedundancyZone

    Enterprise
    is the name of the redundancy zone this server is within.

  • UpgradeVersion

    Enterprise
    is the version that will be used for automated upgrade calculations.

  • ReadReplica

    Enterprise
    indicates whether this server is a read replica or not.

  • Status indicates the current Raft status of this server. Possible values are: leader, voter, non-voter, or staging.

  • Meta is the node metadata of this server. Values within this map are used for determining a server's redundancy zone and upgrade version.

  • NodeType is the desired type autopilot thinks this server should have. In Consul OSS the only possible value is voter as all present servers should having voting rights. In Consul Enterprise the possible values also include read-replica, zone-voter, zone-standby and zone-extra-voter. zone-voter indicates that autopilot wants this server to be the voter for a particular redundancy zone. When a zone has no voter all nodes will be typed as this until one is promoted. When that happens the other non-voters in the zone will be typed as zone-standby. This indicates that they are currently desired to be standby servers in case the voter from the zone fails. Finally, the zone-extra-voter status indicates that autopilot wants this server to be a voter due to a failure of all servers in another zone and that when one of the servers in that failed zone are restored, this server will be demoted.

»Redundancy Zone Response Format
Enterprise

{
  "Servers": [
    "10b71f14-4b08-4ae5-840c-f86d39e7d330",
    "b007061c-6d15-4c90-b3d6-2fef276a0650"
  ],
  "Voters": ["b007061c-6d15-4c90-b3d6-2fef276a0650"],
  "FailureTolerance": 1
}

Each zone in the responses RedundancyZones mapping will have this structure.

  • Servers is a list of server IDs of all the servers in this zone. These values can be used as indexes into the top level response's Servers mapping.

  • Voters is a list of server IDs of all servers in this zone that have voting rights. Typically this will be a list with 1 value but in some failure scenarios or upgrade scenarios the size could increase. These values can be used as indexes into the top level response's Servers mapping.

  • FailureTolerance is the number of servers in this zone that could fail without causing a total zone failure and subsequent promotion of a server from another zone as a fallback.

»Upgrade Information Response Format
Enterprise

{
  "Status": "awaiting-new-servers",
  "TargetVersion": "1.9.1+ent",
  "TargetVersionVoters": ["f0344689-3e1f-4125-b55d-e888d3abf514"],
  "TargetVersionNonVoters": [
    "619a4ba6-1a0b-476e-8a1a-28aeee7735a2",
    "fd683fe6-541f-4ebf-bc5a-6eae51571ddb"
  ],
  "TargetVersionReadReplicas": ["9f1e27ae-1129-45ef-97dd-6d8c3ec47e6a"],
  "OtherVersionVoters": [
    "0cbdd493-235f-48f2-98d9-1bf2443b9d72",
    "21812bd7-2f21-4565-9892-2fdd3d4e1a99",
    "c654ba5c-cc76-4056-a5ca-6e78d95f27ad"
  ],
  "OtherVersionNonVoters": [
    "6d973f11-6bdb-4f7d-8a90-c1300066da4c",
    "6241ab45-371e-4b2a-a0f1-d847c3b7b1b0"
  ],
  "OtherVersionReadReplicas": ["42d10fc3-581b-4403-832d-945b3a0d8841"]
}
  • Status is the automated upgrade status. Possible values are:

    • disabled indicates that automated upgrades are disabled either from user configuration or due to being unlicensed.

    • idle indicates that there is no ongoing upgrade and that all servers are running the same Consul version.

    • await-new-voters indicates that a newer versioned server has been added but that autopilot is waiting for more servers of that version to be added before proceeding with the upgrade.

    • promoting indicates that enough servers of the target version have been added and autopilot will now promote them to voters.

    • demoting indicates that autopilot is currently demoting the servers not running the target version.

    • leader-transfer indicates that autopilot is in the process of transferring leadership to a server running the target version.

    • await-new-servers indicates that the majority of the upgrade is complete but that more servers running the target version need to be added to completely replace all of the previous servers.

    • await-server-removal indicates that the upgrade is complete and it is now safe to remove the previous servers.

  • TargetVersion is the version that Autopilot is upgrading to. This will be the maximum version of all servers UpgradeVersion field in the top level Servers mapping.

  • TargetVersionVoters is a list of IDs of servers running the target version and that currently have voting rights.

  • TargetVersionNonVoters is a list of IDs of servers running the target version and that currently do not have voting rights.

  • TargetVersionReadReplicas is a list of IDs of servers running the target version and are read replicas.

  • OtherVersionVoters is a list of IDs of servers not running the target version and that currently have voting rights.

  • OtherVersionNonVoters is a list of IDs of servers not running the target version and that currently do not have voting rights.

  • OtherVersionReadReplicas is a list of IDs of servers not running the target version and are read replicas.