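This document describes how to make the CN compute nodes of a Doris cluster on Kubernetes scale automatically according to the actual load.
Autoscaling with Doris Operator requires Kubernetes 1.22 or later, and Metrics Server must already be installed in the Kubernetes cluster.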
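The two prerequisites can be checked before creating any resources. The following is a minimal sketch, not part of Doris Operator: k8s_version_ok is a hypothetical helper name, and the commented usage assumes that kubectl is configured for the target cluster and that jq is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not provided by Doris Operator).
# Returns 0 if a Kubernetes version string such as "v1.24.3" is 1.22 or later.
k8s_version_ok() {
  local v="${1#v}"            # strip the leading "v"
  local major="${v%%.*}"      # text before the first dot
  local rest="${v#*.}"        # text after the first dot
  local minor="${rest%%.*}"   # text before the next dot
  minor="${minor%%[!0-9]*}"   # drop suffixes such as "+" in "22+"
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 22 ]; }
}

# Usage against a live cluster (requires kubectl and jq):
#   server_version="$(kubectl version -o json | jq -r '.serverVersion.gitVersion')"
#   k8s_version_ok "$server_version" && echo "Kubernetes version is supported"
#   kubectl top nodes   # succeeds only when Metrics Server is serving metrics
```

If kubectl top nodes reports that the Metrics API is not available, Metrics Server is most likely missing or not yet ready.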
Configure the autoscaling behavior of a Doris cluster through a DorisAutoscaler CR, for example:
apiVersion: al-assad.github.io/v1beta1
kind: DorisAutoscaler
metadata:
  name: basic-autoscale
spec:
  # The name of the Doris cluster to be scaled
  cluster: basic
  # Whether to disable scaling down.
  # disableScaleDown: false
  # The period of time in seconds for each scaling operation.
  # scalePeriodSeconds:
  #   scaleUp: 60
  #   scaleDown: 60
  cn:
    # The maximum and minimum number of CN replicas for automatic scaling.
    replicas:
      min: 1
      max: 5
    # Metrics rules for scaling
    rules:
      # Use CPU metrics as scaling rules (optional)
      # The maximum and minimum CPU utilization of CN; the value is a percentage, for example 80 represents 80%.
      #
      # When the average overall CPU usage of the CN cluster is greater than the max value for a period of time,
      # one replica is automatically added until the next round of computation is below this max value.
      # When the average overall CPU usage of the CN cluster is less than the min value for a period of time,
      # one replica is automatically removed until the next round of computation is above this min value.
      cpu:
        max: 90
        min: 20
      # Use memory metrics as scaling rules (optional)
      # The maximum and minimum memory utilization of CN; the value is a percentage, for example 80 represents 80%.
      memory:
        max: 80
        min: 20
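Organize the Doris cluster configuration under the ${cluster_name} directory, and save this manifest as ${cluster_name}/doris-autoscale.yaml.
spec.cn.replicas sets the upper and lower bounds on the number of CN replicas during autoscaling. The following example caps scale-up at 5 replicas and limits scale-down to no fewer than 1 replica: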
spec:
  cn:
    # ...
    replicas:
      min: 1
      max: 5
spec.cn.rules defines the rules that drive CN scaling. Rules can be based on CPU and memory metrics:
spec:
  cn:
    # ...
    rules:
      cpu:
        max: 90
        min: 20
      memory:
        max: 80
        min: 20
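In the example above, the max and min values are percentages; for example, 90 means 90%. DorisAutoscaler scales the CN nodes dynamically based on their overall CPU utilization and memory utilization, each evaluated separately. Taking CPU as an example:
When the average overall CPU usage of the CN nodes stays above cpu.max for a period of time, one replica is added automatically, until the next round of evaluation is no greater than cpu.max.
When the average overall CPU usage of the CN nodes stays below cpu.min for a period of time, one replica is removed automatically, until the CPU usage in the next round of evaluation is above cpu.min.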
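The threshold rule described here can be illustrated with a small shell function. This is only a sketch of the rule as stated in this document, not the operator's actual implementation: next_replicas is a hypothetical helper, usage values are whole-number percentages, and the "for a period of time" condition, scalePeriodSeconds, and the interaction between CPU and memory rules are deliberately left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating one evaluation round for a single metric.
# Args: current replicas, average usage (%), rule min (%), rule max (%),
#       replicas min, replicas max
next_replicas() {
  local current="$1" usage="$2" rule_min="$3" rule_max="$4" rep_min="$5" rep_max="$6"
  local next="$current"
  if [ "$usage" -gt "$rule_max" ]; then
    next=$((current + 1))     # above max: add one replica
  elif [ "$usage" -lt "$rule_min" ]; then
    next=$((current - 1))     # below min: remove one replica
  fi
  # Keep the result within the spec.cn.replicas bounds.
  if [ "$next" -gt "$rep_max" ]; then next="$rep_max"; fi
  if [ "$next" -lt "$rep_min" ]; then next="$rep_min"; fi
  echo "$next"
}

# With cpu.min=20, cpu.max=90 and replicas min=1, max=5:
next_replicas 3 95 20 90 1 5   # prints 4 (usage above cpu.max)
next_replicas 3 10 20 90 1 5   # prints 2 (usage below cpu.min)
next_replicas 5 95 20 90 1 5   # prints 5 (already at replicas.max)
```

Each round changes the replica count by at most one, so a sustained load spike is absorbed over several rounds rather than in a single jump.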
Apply the DorisAutoscaler configuration:
kubectl apply -f ${cluster_name}/doris-autoscale.yaml --namespace=${namespace}
Check the status of the DorisAutoscaler:
kubectl get dorisautoscaler ${dorisautoscaler_name} -n ${namespace} -o yaml
Delete the DorisAutoscaler:
kubectl delete -f ${cluster_name}/doris-autoscale.yaml --namespace=${namespace}