Automatic Scaling Doris Cluster
This document describes how to enable automatic horizontal scaling for CN (Compute Node) in a Doris cluster on Kubernetes, which based on the actual workload.
Prerequisites
Automatic scaling with the Doris Operator requires Kubernetes version 1.22+ and the installation of Metric Server on the Kubernetes cluster.
Configuring DorisAutoscaler
You can configure the automatic scaling behavior of the Doris cluster by configuring the DorisInitializer
custom
resource (CR).
A DorisAutoscaler CR sample
apiVersion: al-assad.github.io/v1beta1
kind: DorisAutoscaler
metadata:
name: basic-autoscale
spec:
# The doris cluster name to be scaled
cluster: basic
# Whether to disable the behavior of scaling down.
# disableScaleDown: false
# The period of time in seconds for each scaling operation.
# scalePeriodSeconds:
# scaleUp: 60
# scaleDown: 60
cn:
# The maximum and minimum CN replicas of automatic scaling.
replicas:
min: 1
max: 5
# Metrics rules for scaling
rules:
# Use CPU metrics as scaling rules (optional)
# The maximum and minimum CPU utilization of CN, the value is a percentage, such as 80 represents 80%.
#
# When the average overall cpu usage of a CN cluster is greater than the max value for a period of time,
# one replica would be automatically added until the next round of computation is below this max value.
# When the average overall cpu usage of a CN cluster is less than the min value for a period of time,
# one replica would be automatically removed until the next round of computation is above this min value.
cpu:
max: 90
min: 20
# Use Memory metrics as scaling rules (optional)
# The maximum and minimum CPU utilization of CN, the value is a percentage, such as 80 represents 80%.
memory:
max: 80
min: 20
${cluster_name}
directory and save it
as ${cluster_name}/doris-autoscaler.yaml
.Replica Limit
spec.cn.replicas
defines the maximum and minimum replica limits for CN’s automatic scaling. In the following example,
the maximum number of replicas for CN scaling is limited to 5, and the minimum for scaling ins is 1.
spec:
cn:
# ...
replicas:
min: 1
max: 5
Scaling Rules
spec.cn.rules
defines the scaling rules for CN based on CPU and memory metrics.
spec:
cn:
# ...
rules:
cpu:
max: 90
min: 20
memory:
max: 80
min: 20
In the above example, max
and min
values are in percentages, e.g., 90 represents 90%. The DorisAutoscaler
dynamically scales in or out based on the overall CPU and memory utilization of CN nodes. Taking CPU as an example:
- When the overall average CPU usage of the CN cluster continuously exceeds
cpu.max
for a period, it automatically adds a replica until the next assessment does not exceedcpu.max
. - When the overall average CPU usage of the CN cluster falls below
cpu.min
for a period, it automatically removes a replica until the next assessment shows a CPU usage abovecpu.min
.
Apply DorisAutoscaler
kubectl apply -f ${cluster_name}/doris-autoscale.yaml --namespace=${namespace}
View the status of the DorisAutoscaler:
kubectl get dorisautoscaler ${dorisautoscaler_name} -n ${namespace} -o yaml
Delete DorisAutoScaler
kubectl delete -f ${cluster_name}/doris-autoscale.yaml --namespace=${namespace}