Autoscale

This document describes the KraftCloud Services Autoscale REST API (v1) for configuring and monitoring service group autoscale.

Autoscale Basics

Service groups allow you to load balance traffic for an Internet-facing service like a webserver by creating multiple instances within the same service group. While you can add instances to or remove them from a service group to scale your service, doing this manually makes it hard to react to changes in service load. On the other hand, always keeping a large number of instances running to cope with bursts is not an option either. This is where autoscale comes into play. With autoscale enabled, KraftCloud takes on the heavy lifting of constantly monitoring the load of your service and automatically creates or deletes instances as needed.

To enable autoscale, a typical workflow looks like this:

Create a new service group with the desired properties (e.g., published ports, DNS name).
Create a new instance of your application and assign it to the service group. This instance becomes the autoscale master, which KraftCloud clones to scale your service.
Create an autoscale configuration for the service group and set the instance as master. The configuration allows you to define the metrics and policies based on which KraftCloud performs autoscale. It also specifies the desired minimum and maximum number of instances as well as warmup and cooldown periods.

Warmup and Cooldown

When KraftCloud decides to scale out your service, it grants new instances a grace period in which they have time to complete boot, warm up caches, and start having an effect on the load level. Only after this warmup phase do new instances contribute to the evaluation of the autoscale metric. This lets the effects of the new instances on the service stabilize and prevents excessive scale out. Note that new instances already receive traffic and serve load while they are still warming up.
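The warmup rule can be pictured with a small sketch. It assumes, purely for illustration, that the autoscale metric is the plain average of per-instance CPU readings and that each instance carries a start timestamp; the text above specifies neither. `Instance` and `evaluated_cpu` are hypothetical names, not part of the API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    started_at_ms: int   # hypothetical: time the instance was created
    cpu_millicores: int  # current per-instance CPU reading


def evaluated_cpu(instances: list[Instance], now_ms: int,
                  warmup_time_ms: int) -> Optional[float]:
    """Average CPU over instances that have finished warming up.

    Averaging is an assumption; the documentation only states that
    warming instances do not contribute to the metric evaluation.
    """
    warm = [i for i in instances
            if now_ms - i.started_at_ms >= warmup_time_ms]
    if not warm:
        return None  # nothing to evaluate yet
    return sum(i.cpu_millicores for i in warm) / len(warm)


group = [
    Instance(started_at_ms=0, cpu_millicores=800),
    Instance(started_at_ms=0, cpu_millicores=600),
    Instance(started_at_ms=9_500, cpu_millicores=100),  # still warming up
]
print(evaluated_cpu(group, now_ms=10_000, warmup_time_ms=1_000))  # 700.0
```

The third instance already serves traffic, but its low reading does not pull the evaluated value down until its warmup phase has passed.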

Conversely, KraftCloud uses a cooldown phase to control scale in. During this phase, instances selected for scale in are given a chance to drain existing connections while already being excluded from the number of active instances in the service group. New connections or HTTP requests on existing connections¹ are assigned to different instances. If there are still open connections after the cooldown phase, the remaining connections are forcefully closed.

For scale-to-zero the cooldown time defines the window until KraftCloud shuts down the master instance if the service was idle for the entire time.

¹Only if the http connection handler has been set.

Autoscale Policies

With autoscale policies you define under which circumstances KraftCloud should scale your service and what metric (e.g., CPU utilization) should be used for the decision.

KraftCloud currently supports the following autoscale policies:

Policy	Description
Step	Defines concrete adjustments for selected value ranges in the metric

An autoscale configuration usually comprises multiple policies, for example, to control scale in and scale out in separate policies. When KraftCloud performs autoscale decisions it always evaluates all policies and does not stop at the first applicable policy. If none of the policies apply, KraftCloud maintains the current number of instances.

Step Policy

A step policy consists of a set of steps that define

a lower bound,
an upper bound,
and an adjustment.

The lower and upper bounds are always in the dimension of the selected metric. A bound can be set to null (or not provided at all) to make the step unbounded in the respective direction. The interpretation of the adjustment depends on how the step policy is configured. Positive values increase the number of instances, negative values decrease the number of instances.

Adjustment type	Description
`change`	Relative change in the number of instances (e.g., +2 instances)
`exact`	Absolute target number of instances in the service group (e.g., 10 instances)
`percent`	Change by percentage of the current number of instances in the service group (e.g., +50%)
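As a rough sketch of how the three adjustment types could translate into a target instance count: `apply_adjustment` is a hypothetical helper, and both the truncation of fractional percentages and the clamping to `min_size`/`max_size` are assumptions rather than documented behavior.

```python
def apply_adjustment(current: int, adjustment: int, adjustment_type: str,
                     min_size: int, max_size: int) -> int:
    """Return the target instance count for one step adjustment."""
    if adjustment_type == "change":
        # Relative change, e.g. +2 instances.
        target = current + adjustment
    elif adjustment_type == "exact":
        # Absolute target, e.g. exactly 10 instances.
        target = adjustment
    elif adjustment_type == "percent":
        # Assumption: fractional results are truncated toward zero.
        target = current + int(current * adjustment / 100)
    else:
        raise ValueError(f"unknown adjustment type: {adjustment_type}")
    # Assumption: the result is clamped to the configured size limits.
    return max(min_size, min(max_size, target))


print(apply_adjustment(4, 2, "change", 1, 10))    # 6
print(apply_adjustment(4, 10, "exact", 1, 10))    # 10
print(apply_adjustment(4, 50, "percent", 1, 10))  # 6
```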

An example step policy for scaling out may look like this:

{
  "name": "scale-out-policy",
  "type": "step",
  "metric": "cpu",
  "adjustment_type": "percent",
  "steps": [
    { "lower_bound": 500, "upper_bound": 700,  "adjustment": 50  },
    { "lower_bound": 700, "upper_bound": null, "adjustment": 100 }
  ]
}

If the CPU utilization per instance is below 500 millicores, no scaling action happens. If the CPU utilization is between 500 and 700 millicores, the policy instructs KraftCloud to increase the number of instances by 50%. If the CPU utilization exceeds 700 millicores, the number of instances is doubled. Thus, if the per-instance CPU load is at 600 millicores (i.e., 60%) and the current number of instances in the service group is 4, KraftCloud will create 2 additional instances.
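The worked example above can be retraced in a few lines. This is only an illustration: `matching_adjustment` is a hypothetical helper, and treating each range as half-open (lower bound inclusive, upper bound exclusive) is an assumption, since the text does not say which step applies at exactly 700 millicores.

```python
from typing import Optional

STEPS = [
    {"lower_bound": 500, "upper_bound": 700, "adjustment": 50},
    {"lower_bound": 700, "upper_bound": None, "adjustment": 100},
]


def matching_adjustment(steps: list[dict], metric: int) -> Optional[int]:
    """Return the adjustment of the step whose range contains metric."""
    for step in steps:
        lower, upper = step.get("lower_bound"), step.get("upper_bound")
        # Assumption: ranges are half-open, [lower, upper).
        above = lower is None or metric >= lower
        below = upper is None or metric < upper
        if above and below:
            return step["adjustment"]
    return None  # no step applies, so the instance count is unchanged


current = 4
percent = matching_adjustment(STEPS, 600)  # first step matches: 50
additional = current * percent // 100      # 50% of 4 instances
print(percent, additional)                 # 50 2
```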

There is a set of rules that steps of the same policy must adhere to:

The lower bound must be smaller than the upper bound
The lower and upper bound cannot be null in the same step
Steps must not overlap
Steps must be sorted in ascending order
There must be no gaps between individual steps
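These rules can be checked on the client side before a policy is submitted. The sketch below is one possible reading of them: it treats "no overlap, sorted, no gaps" as requiring each step's lower bound to equal the previous step's upper bound, as in the example policy. `validate_steps` is a hypothetical helper, and the error messages are made up for illustration.

```python
def validate_steps(steps: list[dict]) -> list[str]:
    """Return a list of rule violations; an empty list means valid."""
    errors = []
    for i, step in enumerate(steps):
        lower, upper = step.get("lower_bound"), step.get("upper_bound")
        if lower is None and upper is None:
            errors.append(f"step {i}: both bounds are null")
        elif lower is not None and upper is not None and lower >= upper:
            errors.append(f"step {i}: lower bound must be smaller than upper bound")
    for i in range(1, len(steps)):
        prev_upper = steps[i - 1].get("upper_bound")
        lower = steps[i].get("lower_bound")
        if prev_upper is None or lower is None:
            # An open end between two steps overlaps its neighbor.
            errors.append(f"steps {i - 1} and {i}: unbounded edge between steps")
        elif lower < prev_upper:
            errors.append(f"steps {i - 1} and {i}: overlap or wrong order")
        elif lower > prev_upper:
            errors.append(f"steps {i - 1} and {i}: gap")
    return errors


example = [
    {"lower_bound": 500, "upper_bound": 700, "adjustment": 50},
    {"lower_bound": 700, "upper_bound": None, "adjustment": 100},
]
print(validate_steps(example))  # []
```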

Autoscale Metrics

You can base autoscale decisions on different metrics. Currently, KraftCloud supports the following metrics:

Metric	Description
`cpu`	Per-instance CPU utilization measured in millicores (e.g., 100 millicores corresponds to 10% CPU utilization)
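Since step bounds are given in millicores, a small conversion helper can make policies easier to read. The function name is hypothetical; the factor follows from the table above (100 millicores correspond to 10%).

```python
def millicores_to_percent(millicores: int) -> float:
    """Convert millicores to percent CPU utilization (100 millicores = 10%)."""
    return millicores / 10


print(millicores_to_percent(100))  # 10.0
print(millicores_to_percent(600))  # 60.0
```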

Scale-To-Zero

With conventional cloud platforms you need to keep at least one instance running at all times to be able to respond to incoming requests. Performing a just-in-time cold boot is simply too time-consuming and would create a response latency of multiple seconds. This is not the case with KraftCloud. Being based on lightweight unikernel technology, instances on KraftCloud are able to cold boot within milliseconds. This allows us to perform low-latency scale-to-zero.

To enable scale-to-zero it is sufficient to set min_size to 0 in the autoscale configuration. You do not need to specify autoscale policies. KraftCloud will then terminate the master instance if there is no traffic to your service within the window of a cooldown period. Subsequent requests are queued until the master instance has been started again.

There is a shortcut for enabling scale-to-zero when you create a new instance with POST /instances. Just add "features": ["scale-to-zero"] to your instance description and KraftCloud will configure the new instance to shut down when there are no incoming requests. The autoscale configuration uses the default parameters (see here) with min_size and max_size set to 0 and 1, respectively.
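For reference, a request body for an explicit scale-to-zero configuration might be assembled as follows. This is a sketch: `scale_to_zero_config` is a hypothetical helper, the field names are those of the autoscale configuration endpoint, and the 1000 ms cooldown is the default listed for `cooldown_time_ms`.

```python
import json


def scale_to_zero_config(service_group: str, master_instance: str,
                         cooldown_time_ms: int = 1000) -> dict:
    """Build an autoscale configuration that allows scaling to zero."""
    return {
        "name": service_group,
        "min_size": 0,  # 0 enables scale-to-zero
        "max_size": 1,
        "master": {"name": master_instance},
        # Idle window after which the master instance is shut down.
        "cooldown_time_ms": cooldown_time_ms,
        # No "policies" key: scale-to-zero does not need any policies.
    }


cfg = scale_to_zero_config("my-service-group", "my-instance")
print(json.dumps(cfg, indent=2))
```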

API Endpoints

The KraftCloud Services Autoscale REST API provides the following endpoints:

Method	Endpoint	Purpose and Description
`POST`	`/v1/services/autoscale`	Creates an autoscale configuration for one or more service groups
`GET`	`/v1/services/autoscale`	Returns the current autoscale configuration of service groups
`DELETE`	`/v1/services/autoscale`	Deletes the autoscale configuration for the specified service groups
`POST`	`/v1/services/<UUID>/autoscale/policies`	Adds one or more autoscale policies to the given service group
`GET`	`/v1/services/<UUID>/autoscale/policies`	Gets the configuration of existing autoscale policies
`DELETE`	`/v1/services/<UUID>/autoscale/policies`	Deletes one or more autoscale policies from the given service group

In the following, the API endpoints are specified relative to this base URL:

https://api.X.kraft.cloud/

With X being the IATA metro code. We use fra0 as an example in the documentation. See the introduction for more information on how to connect to the API.
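As an illustration of how the base URL and the endpoint paths combine, the following sketch builds the autoscale URLs for a given metro. The helper names are hypothetical; the paths are the ones listed in the endpoint table.

```python
def autoscale_url(metro: str, service_group_uuid: str = "") -> str:
    """URL of the autoscale configuration endpoint for a metro."""
    base = f"https://api.{metro}.kraft.cloud"
    if service_group_uuid:
        return f"{base}/v1/services/{service_group_uuid}/autoscale"
    return f"{base}/v1/services/autoscale"


def policies_url(metro: str, service_group_uuid: str) -> str:
    """URL of the autoscale policies endpoint of one service group."""
    return autoscale_url(metro, service_group_uuid) + "/policies"


print(autoscale_url("fra0"))
print(policies_url("fra0", "3b5b4c36-2c9b-46e4-80c6-7e5b561938c2"))
```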

Creating an Autoscale Configuration

Creates an autoscale configuration for the specified service group.

Request

Endpoints:
POST /v1/services/autoscale
POST /v1/services/<UUID>/autoscale

Parameter	Type	Default	Required	Description
`uuid` \| `name`¹ ²	UUID \| Name		✔️	UUID or name of the service group for which to create a configuration
`min_size`	int	1		Minimum number of instances
`max_size`	int	4		Maximum number of instances
`warmup_time_ms`	int	1000		Length of warmup phase in milliseconds
`cooldown_time_ms`	int	1000		Length of cooldown phase in milliseconds
`master`	object		✔️
`uuid` \| `name`²	UUID \| Name		✔️	UUID or name of instance to use as autoscale master
`policies`	array of objects			Description of autoscale policies. See policy creation endpoint

¹ Not allowed in local scope.
² You need to specify either uuid or name.

curl -X POST \
     -H "Authorization: Bearer ${KRAFTCLOUD_TOKEN}" \
     -H "Content-Type: application/json" \
     "https://api.fra0.kraft.cloud/v1/services/autoscale" \
     -d '{
        "name": "my-service-group",
        "min_size": 0,
        "max_size": 1,
        "master": {
          "name": "my-instance"
        },
        "warmup_time_ms": 500,
        "cooldown_time_ms": 500
     }'
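The same request can be prepared from a script. The sketch below uses only the Python standard library and mirrors the curl call above; `build_create_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
import os
import urllib.request


def build_create_request(metro: str, token: str,
                         config: dict) -> urllib.request.Request:
    """Prepare the POST request that creates an autoscale configuration."""
    return urllib.request.Request(
        url=f"https://api.{metro}.kraft.cloud/v1/services/autoscale",
        data=json.dumps(config).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


config = {
    "name": "my-service-group",
    "min_size": 0,
    "max_size": 1,
    "master": {"name": "my-instance"},
    "warmup_time_ms": 500,
    "cooldown_time_ms": 500,
}
request = build_create_request(
    "fra0", os.environ.get("KRAFTCLOUD_TOKEN", ""), config)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```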

Response

The response is embedded in a JSON object as described in API Responses.

Field	Type	Description
`status`	string	`success` on success, or `error` if the request failed
`uuid`	UUID	UUID of the service group
`name`	Name	Name of the service group

{
  "status": "success",
  "data": {
    "service_groups": [
      {
        "status": "success",
        "uuid": "3b5b4c36-2c9b-46e4-80c6-7e5b561938c2",
        "name": "my-service-group"
      }
    ]
  }
}

Getting an Existing Autoscale Configuration

Returns the current autoscale configuration of a service group.

Request

Endpoints:
GET /v1/services/autoscale
GET /v1/services/<UUID>/autoscale

Parameter	Type	Default	Required	Description
`uuid` \| `name`¹	UUID \| Name		✔️	UUID or name of the service group to get the autoscale configuration for

¹ Not allowed in local scope. You need to specify either uuid or name.

curl -X GET \
     -H "Authorization: Bearer ${KRAFTCLOUD_TOKEN}" \
     "https://api.fra0.kraft.cloud/v1/services/3b5b4c36-2c9b-46e4-80c6-7e5b561938c2/autoscale"

Response

The response is embedded in a JSON object as described in API Responses.

Field	Type	Description
`status`	string	`success` on success, `unconfigured` if autoscale is not configured, or `error` if the request failed
`uuid`	UUID	UUID of the service group
`name`	Name	Name of the service group
`enabled`	bool	Whether autoscale is enabled
`min_size`	int	Minimum number of instances
`max_size`	int	Maximum number of instances
`warmup_time_ms`	int	Length of warmup phase in milliseconds
`cooldown_time_ms`	int	Length of cooldown phase in milliseconds
`master`	object
`uuid`	UUID	UUID of autoscale master
`name`	Name	Name of autoscale master
`policies`	array of objects	Description of autoscale policies. See policy creation endpoint

{
  "status": "success",
  "data": {
    "service_groups": [
      {
        "status": "success",
        "uuid": "3b5b4c36-2c9b-46e4-80c6-7e5b561938c2",
        "name": "my-service-group",
        "enabled": true,
        "min_size": 0,
        "max_size": 1,
        "warmup_time_ms": 500,
        "cooldown_time_ms": 500,
        "master": {
          "uuid": "77d0316a-fbbe-488d-8618-5bf7a612477a",
          "name": "my-instance"
        },
        "policies": []
      }
    ]
  }
}

{
  "status": "success",
  "data": {
    "service_groups": [
      {
        "status": "unconfigured",
        "uuid": "3b5b4c36-2c9b-46e4-80c6-7e5b561938c2",
        "name": "my-service-group"
      }
    ]
  }
}
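Note that the top-level `status` can be `success` while an individual service group reports `unconfigured`. A client may therefore want to inspect the per-group status, as in this sketch. `autoscale_states` is a hypothetical helper that assumes the response layout shown in the examples above.

```python
def autoscale_states(response: dict) -> dict[str, str]:
    """Map each service group name to its per-group status."""
    if response.get("status") != "success":
        raise RuntimeError("request failed")
    groups = response.get("data", {}).get("service_groups", [])
    return {group["name"]: group["status"] for group in groups}


response = {
    "status": "success",
    "data": {
        "service_groups": [
            {
                "status": "unconfigured",
                "uuid": "3b5b4c36-2c9b-46e4-80c6-7e5b561938c2",
                "name": "my-service-group",
            }
        ]
    },
}
print(autoscale_states(response))  # {'my-service-group': 'unconfigured'}
```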

Deleting an Autoscale Configuration

Deletes the autoscale configuration for the specified service group. KraftCloud will immediately drain all connections from all instances that have been created by autoscale and delete the instances afterwards. The draining phase is allowed to take at most cooldown_time_ms milliseconds after which remaining connections are forcefully closed. The master instance is never deleted. However, deleting the autoscale configuration causes the master instance to start if it is stopped.

Request

Endpoints:
DELETE /v1/services/autoscale
DELETE /v1/services/<UUID>/autoscale

Parameter	Type	Default	Required	Description
`uuid` \| `name`¹	UUID \| Name		✔️	UUID or name of the service group for which to delete the autoscale configuration

¹ Not allowed in local scope. You need to specify either uuid or name.

curl -X DELETE \
     -H "Authorization: Bearer ${KRAFTCLOUD_TOKEN}" \
     "https://api.fra0.kraft.cloud/v1/services/3b5b4c36-2c9b-46e4-80c6-7e5b561938c2/autoscale"

Response

The response is embedded in a JSON object as described in API Responses.

Field	Type	Description
`status`	string	`success` on success, or `error` if the request failed
`uuid`	UUID	UUID of the service group
`name`	Name	Name of the service group

{
  "status": "success",
  "data": {
    "service_groups": [
      {
        "status": "success",
        "uuid": "3b5b4c36-2c9b-46e4-80c6-7e5b561938c2",
        "name": "my-service-group"
      }
    ]
  }
}

Adding an Autoscale Policy

Adds a new autoscale policy to the existing autoscale configuration of the specified service group.

Request

Endpoints:
POST /v1/services/<UUID>/autoscale/policies

The available fields depend on the policy type. The following properties are common to all policies:

Parameter	Type	Default	Required	Description
`name`¹	Name		✔️	Name of the policy
`type`	Policy		✔️	Type of autoscale policy

¹Policy names are subject to the same restrictions as object names in general (see here). In addition, policy names cannot be longer than 31 characters.

Step Policy

Additional properties for step policies are:

Parameter	Type	Default	Required	Description
`metric`	Metric	`cpu`		Metric to monitor
`adjustment_type`	Adjustment Type	`change`		Type of adjustment specified in the steps
`steps`	array of objects		✔️	Steps of the step policy
`lower_bound`	int		✔️²	Lower bound of the step range. In dimension of selected metric
`upper_bound`	int		✔️²	Upper bound of the step range. In dimension of selected metric
`adjustment`	int		✔️	Adjustment to take if metric is in range

² Only one of lower_bound and upper_bound can be null or not specified. See the description of the step policy for more information on defining steps.

curl -X POST \
     -H "Authorization: Bearer ${KRAFTCLOUD_TOKEN}" \
     -H "Content-Type: application/json" \
     "https://api.fra0.kraft.cloud/v1/services/3b5b4c36-2c9b-46e4-80c6-7e5b561938c2/autoscale/policies" \
     -d '[
        {
          "name": "scale-out-policy",
          "type": "step",
          "metric": "cpu",
          "adjustment_type": "percent",
          "steps": [
            { "lower_bound": 500, "upper_bound": 700,  "adjustment": 50  },
            { "lower_bound": 700, "upper_bound": null, "adjustment": 100 }
          ]
        },
        {
          "name": "scale-in-policy",
          "type": "step",
          "metric": "cpu",
          "adjustment_type": "percent",
          "steps": [
            { "lower_bound": null, "upper_bound": 40, "adjustment": -20 },
            { "lower_bound": 40,   "upper_bound": 50, "adjustment": -10 }
          ]
        }
      ]'

Response

The response is embedded in a JSON object as described in API Responses.

Field	Type	Description
`status`	string	`success` on success, or `error` if the request failed
`name`	Name	Name of the policy

{
  "status": "success",
  "data": {
    "policies": [
      {
        "status": "success",
        "name": "scale-out-policy"
      },
      {
        "status": "success",
        "name": "scale-in-policy"
      }
    ]
  }
}

Getting the Configuration of an Autoscale Policy

Returns the configuration of the specified autoscale policy.

Request

Endpoints:
GET /v1/services/<UUID>/autoscale/policies
GET /v1/services/<UUID>/autoscale/policies/<NAME>

Parameter	Type	Default	Required	Description
`name`¹	Name		✔️	Name of the autoscale policy

¹ Not allowed in local scope.

curl -X GET \
     -H "Authorization: Bearer ${KRAFTCLOUD_TOKEN}" \
     "https://api.fra0.kraft.cloud/v1/services/3b5b4c36-2c9b-46e4-80c6-7e5b561938c2/autoscale/policies/scale-out-policy"

Response

The response is embedded in a JSON object as described in API Responses.

The properties returned depend on the policy type. The following properties are common to all policies:

Field	Type	Description
`status`	string	`success` on success, or `error` if the request failed
`name`	Name	Name of the policy
`type`	Policy	Type of autoscale policy
`enabled`	bool	Whether the autoscale policy is enabled

Step Policy

Additional properties for step policies are:

Field	Type	Description
`metric`	Metric	Metric to monitor
`adjustment_type`	Adjustment Type	Type of adjustment specified in the steps
`steps`	array of objects	Steps of the step policy
`lower_bound`	int	Lower bound of the step range. In dimension of selected metric
`upper_bound`	int	Upper bound of the step range. In dimension of selected metric
`adjustment`	int	Adjustment to take if metric is in range

{
  "status": "success",
  "data": {
    "policies": [
      {
        "status": "success",
        "name": "scale-out-policy",
        "type": "step",
        "enabled": true,
        "metric": "cpu",
        "adjustment_type": "percent",
        "steps": [
          { "lower_bound": 500, "upper_bound": 700,  "adjustment": 50  },
          { "lower_bound": 700, "upper_bound": null, "adjustment": 100 }
        ]
      }
    ]
  }
}

Deleting an Autoscale Policy

Deletes the specified autoscale policy.

Request

Endpoints:
DELETE /v1/services/<UUID>/autoscale/policies
DELETE /v1/services/<UUID>/autoscale/policies/<NAME>

Parameter	Type	Default	Required	Description
`name`¹	Name		✔️	Name of the autoscale policy

¹ Not allowed in local scope.

curl -X DELETE \
     -H "Authorization: Bearer ${KRAFTCLOUD_TOKEN}" \
     "https://api.fra0.kraft.cloud/v1/services/3b5b4c36-2c9b-46e4-80c6-7e5b561938c2/autoscale/policies/scale-out-policy"

Response

The response is embedded in a JSON object as described in API Responses.

Field	Type	Description
`status`	string	`success` on success, or `error` if the request failed
`name`	Name	Name of the policy

{
  "status": "success",
  "data": {
    "policies": [
      {
        "status": "success",
        "name": "scale-out-policy"
      }
    ]
  }
}