Configure a Doris Cluster
This document describes how to configure a production-ready Doris cluster.
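Resource configuration
Before deployment, configure resources for each component of the Doris cluster according to your actual situation and requirements. FE, BE, CN, and Broker are the core service components of a Doris cluster. In production, their resources must also be specified according to each component's requirements. For details, see: Recommended resource configuration.
To ensure that the components of the Doris cluster are scheduled properly and run stably in Kubernetes, it is recommended to give them the Guaranteed QoS class, which is achieved by setting limits equal to requests when configuring resources. For details, see: Configure QoS.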
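As a minimal sketch of the Guaranteed QoS setup, the fragment below sets `limits` equal to `requests` for CPU and memory on the FE component. It assumes the `requests` / `limits` layout used in the complete DorisCluster CR example in this document. The sizes are illustrative, not recommendations.

```yaml
spec:
  fe:
    requests:
      cpu: '8'
      memory: 8Gi
      storage: 100Gi
    # CPU and memory limits match the requests above. Kubernetes assigns the
    # Guaranteed QoS class only if every container in the Pod is configured this way.
    limits:
      cpu: '8'
      memory: 8Gi
```

The same pattern would apply to `be`, `cn`, and `broker`.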
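If you use CPUs with a NUMA architecture, enable the static CPU management policy on the nodes for better performance. For Doris cluster components to have exclusive use of the corresponding CPU resources, in addition to the Guaranteed QoS class described above, the CPU quota must be an integer greater than or equal to 1. For details, see: CPU management policy.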
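The static policy is a kubelet setting on each node, not part of the DorisCluster CR. A minimal sketch of the relevant kubelet configuration entries follows. The reserved CPU set is an assumption for illustration, and how the file reaches your nodes depends on how they are provisioned.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Give Guaranteed Pods with integer CPU requests exclusive cores.
cpuManagerPolicy: static
# The static policy needs a non-zero CPU reservation for system daemons.
# Core 0 is an assumed value; choose a set that fits your nodes.
reservedSystemCPUs: "0"
```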
Deployment configuration
Configure the Doris cluster through the DorisCluster CR:
Basic DorisCluster CR example
# IT IS NOT SUITABLE FOR PRODUCTION USE.
# This YAML describes a basic Doris cluster with minimum resource requirements,
# which should be able to run in any Kubernetes cluster with storage support.
apiVersion: al-assad.github.io/v1beta1
kind: DorisCluster
metadata:
  name: basic
spec:
  # Image tag of fe, be, cn and broker components.
  version: 2.0.3

  ## Doris FE configuration.
  # When this "fe" configuration key is not set, the Doris FE component will not be deployed,
  # and the FE components on the cluster will be deleted (but the pvc for fe persistent data
  # will be retained).
  fe:
    baseImage: ghcr.io/linsoss/doris-fe
    # The number of FE replicas must be odd; 3 is recommended in production.
    replicas: 1
    # Extra FE config, see: https://doris.apache.org/docs/dev/admin-manual/config/fe-config/
    config:
      prefer_compute_node_for_external_table: "true"
    # The resource requirements. For production environments,
    # please refer to: https://doris.apache.org/docs/dev/install/standard-deployment/#production-environment
    requests:
      cpu: 500m
      memory: 1Gi
      storage: 2Gi
    # If storageClassName is not set, the default Storage Class of the Kubernetes cluster will be used.
    # storageClassName: local

  ## Doris BE (mixed mode) configuration.
  # When this "be" configuration key is not set, the Doris BE component will not be deployed,
  # and the BE components on the cluster will be deleted (but the pvc for be persistent data
  # will be retained).
  be:
    baseImage: ghcr.io/linsoss/doris-be
    replicas: 1
    # Extra BE config, see: https://doris.apache.org/docs/dev/admin-manual/config/be-config
    config: { }
    # The resource requirements. For production environments, please
    # refer to: https://doris.apache.org/docs/dev/install/standard-deployment/#production-environment
    requests:
      cpu: 500m
      memory: 1Gi
      storage: 5Gi
    # If storageClassName is not set, the default Storage Class of the Kubernetes cluster will be used.
    # storageClassName: local

  ## Doris BE (computation mode) configuration.
  # When this "cn" configuration key is not set, the Doris CN component will not be deployed,
  # and the CN components on the cluster will be deleted.
  cn:
    baseImage: ghcr.io/linsoss/doris-cn
    replicas: 1
    # Extra BE config, see: https://doris.apache.org/docs/dev/admin-manual/config/be-config
    config: { }
    # The resource requirements. For production environments, please
    # refer to: https://doris.apache.org/docs/dev/install/standard-deployment/#production-environment
    requests:
      cpu: 500m
      memory: 1Gi

  ## Doris Broker configuration
  # When this "broker" configuration key is not set, the Doris broker component will not be deployed,
  # and the broker components on the cluster will be deleted.
  broker:
    baseImage: ghcr.io/linsoss/doris-broker
    replicas: 1
    # Extra Broker config.
    config: { }
    # The resource requirements. For production environments, please
    # refer to: https://doris.apache.org/docs/dev/install/standard-deployment/#production-environment
    requests:
      cpu: 500m
      memory: 512Mi
Complete DorisCluster CR example
apiVersion: al-assad.github.io/v1beta1
kind: DorisCluster
metadata:
  name: basic
spec:
  # Image tag of fe, be, cn and broker components.
  version: 2.0.3

  ###############################
  # Cluster Global Configuration #
  ###############################

  ## ImagePullPolicy of Doris Cluster Pods
  ## Ref: https://kubernetes.io/docs/concepts/configuration/overview/#container-images
  # imagePullPolicy: IfNotPresent

  ## Ref: https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
  # imagePullSecrets:
  #   - name: secretName

  ## Customized busybox image for init container used by BE and CN.
  # busyBoxImage: busybox:1.36

  ## Specifies the service account for FE/BE/CN/Broker components.
  # serviceAccount: ""

  ## NodeSelector of pods.
  ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
  # nodeSelector:
  #   node-role.kubernetes.io/doris: "true"

  ## Affinity for pod scheduling
  ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
  # affinity:
  #   podAntiAffinity:
  #     # require not to run FE pods on nodes where there's already a FE pod running
  #     # if setting this, you must ensure that at least `replicas` nodes are available in the cluster
  #     requiredDuringSchedulingIgnoredDuringExecution:
  #       - labelSelector:
  #           matchExpressions:
  #             - key: app.kubernetes.io/component
  #               operator: In
  #               values:
  #                 - fe
  #         topologyKey: kubernetes.io/hostname

  ## Tolerations are applied to Doris cluster pods, allowing (but not requiring) pods to be scheduled onto nodes
  ## with matching taints.
  ## This cluster-level `tolerations` only takes effect when no component-level `tolerations` are set.
  ## E.g., if `fe.tolerations` is not empty, `tolerations` here will be ignored.
  ## Ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
  # tolerations:
  #   - effect: NoSchedule
  #     key: dedicated
  #     operator: Equal
  #     value: doris

  ## Specify pod priorities of pods in DorisCluster, default to empty.
  ## Can be overwritten by component settings.
  ## Ref: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/
  # priorityClassName: system-cluster-critical

  ## The update strategy of the StatefulSet; can be overwritten by the setting of each component.
  ## Defaults to RollingUpdate.
  ## Ref: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#update-strategies
  # statefulSetUpdateStrategy: RollingUpdate

  ## Hadoop configuration that is injected into FE, BE, CN and Broker pods.
  # hadoopConf:
  #   ## Host name and IP address of Hadoop cluster
  #   hosts:
  #     - ip: 10.233.123.189
  #       name: hadoop-01
  #     - ip: 10.233.123.179
  #       name: hadoop-02
  #     - ip: 10.233.123.179
  #       name: hadoop-03
  #   ## Hadoop conf files
  #   configs:
  #     hdfs-site.xml: |
  #       <configuration>
  #       ...
  #       </configuration>
  #     hive-site.xml: |
  #       <configuration>
  #       ...
  #       </configuration>

  ###################
  # FE Configuration #
  ###################

  # When this "fe" configuration key is not set, the Doris FE component will not be deployed,
  # and the FE components on the cluster will be deleted (but the pvc for fe persistent data
  # will be retained).
  fe:
    #########################
    # FE Basic Configuration #
    #########################

    ## Base image of the FE component
    baseImage: ghcr.io/linsoss/doris-fe

    ## The number of FE replicas must be odd; 3 is recommended in production.
    replicas: 3

    ## Extra FE config, see: https://doris.apache.org/docs/dev/admin-manual/config/fe-config/
    # config:
    #   prefer_compute_node_for_external_table: 'true'
    #   qe_max_connection: '2048'

    ## Describe the resource requirements. For production environments, please refer to: https://doris.apache.org/docs/dev/install/standard-deployment/#production-environment
    ## Ref: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
    requests:
      cpu: '8'
      memory: 8Gi
      storage: 100Gi
    # limits:
    #   cpu: '32'
    #   memory: 64Gi

    ## The storageClassName of the persistent volume for FE persistent data.
    ## If storageClassName is not set, the default Storage Class of the Kubernetes cluster will be used.
    # storageClassName: local

    ## Defines Kubernetes service for doris-fe
    # service:
    #   ## service type, only ClusterIP and NodePort support is available.
    #   type: NodePort
    #   ## Expose the FE query port to the Node Port, default 0 is a random port.
    #   queryPort: 0
    #   ## Expose the FE http port to the Node Port, default 0 is a random port.
    #   httpPort: 0

    ############################
    # FE Advanced Configuration #
    ############################

    ## Annotations for FE pods
    # annotations: {}

    ## Host aliases for FE pods, it will be merged with the hadoopConf field
    ## Ref: https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/
    # hostAliases:
    #   - ip: 10.233.123.122
    #     hostnames:
    #       - bg01
    #   - ip: 10.233.123.123
    #     hostnames:
    #       - bg02

    ## List of environment variables to set in the container
    ## Ref: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
    # additionalEnvs:
    #   - name: MY_ENV_1
    #     value: value1
    #   - name: MY_ENV_2
    #     valueFrom:
    #       fieldRef:
    #         fieldPath: status.myEnv2

    ## Custom sidecar containers can be injected into the FE pods,
    ## which can act as a tracing agent or for any other use case
    # additionalContainers:
    #   - name: myCustomContainer
    #     image: ubuntu

    ## Custom additional volumes in FE pods.
    ## Ref: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes
    # additionalVolumes:
    #   - name: nfs
    #     nfs:
    #       server: 192.168.0.2
    #       path: /nfs

    ## Custom additional volume mounts in FE pods.
    # additionalVolumeMounts:
    #   # this must match `name` in `additionalVolumes`
    #   - name: nfs
    #     mountPath: /nfs

    ## The following block overwrites cluster-level configurations in `spec`
    # serviceAccount: ""
    # affinity: {}
    # tolerations: {}
    # priorityClassName: ""
    # statefulSetUpdateStrategy: RollingUpdate
    # nodeSelector:
    #   app.kubernetes.io/component: fe

  ###############################
  # BE(mixed mode) Configuration #
  ###############################

  # When this "be" configuration key is not set, the Doris BE component will not be deployed,
  # and the BE components on the cluster will be deleted (but the pvc for be persistent data
  # will be retained).
  be:
    #########################
    # BE Basic Configuration #
    #########################

    ## Base image of the BE component
    baseImage: ghcr.io/linsoss/doris-be

    ## The replica of the BE component
    replicas: 3

    ## Extra BE config, see: https://doris.apache.org/docs/dev/admin-manual/config/be-config
    # config:
    #   external_table_connect_timeout_sec: '30 seconds'
    #   mem_limit: '90%'

    ## Describes the resource requirements. For production environments, please refer to: https://doris.apache.org/docs/dev/install/standard-deployment/#production-environment
    ## Ref: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
    requests:
      cpu: '8'
      memory: 32Gi
      storage: 500Gi
    # limits:
    #   cpu: '32'
    #   memory: 64Gi

    ## The storageClassName of the persistent volume for BE persistent data.
    ## If storageClassName is not set, the default Storage Class of the Kubernetes cluster will be used.
    # storageClassName: local

    ############################
    # BE Advanced Configuration #
    ############################

    ## The custom storage of BE used to support cold and hot storage separation.
    ## Ref: https://doris.apache.org/docs/1.2/install/standard-deployment/#deploy-be
    ##   name: custom storage name
    ##   medium: storage medium, SSD(hot storage) or HDD(cold storage)
    ##   request: storage capacity, e.g. "500Gi"
    ##   storageClassName: k8s storage class name for the pvc
    # storage:
    #   - name: storage-cold-1
    #     medium: HDD
    #     request: 500Gi
    #     storageClassName: hdd-pool
    #   - name: storage-cold-2
    #     medium: HDD
    #     request: 500Gi
    #     storageClassName: hdd-pool
    #   - name: storage-hot
    #     medium: SSD
    #     request: 200Gi
    #     storageClassName: ssd-pool

    ## Whether to retain the default data storage mount for BE, which is located at be/storage.
    # retainDefaultStorage: false

    ## Annotations for BE pods
    # annotations: {}

    ## Host aliases for BE pods, it will be merged with the hadoopConf field
    ## Ref: https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/
    # hostAliases:
    #   - ip: 10.233.123.122
    #     hostnames:
    #       - bg01
    #   - ip: 10.233.123.123
    #     hostnames:
    #       - bg02

    ## List of environment variables to set in the container
    ## Ref: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
    # additionalEnvs:
    #   - name: MY_ENV_1
    #     value: value1
    #   - name: MY_ENV_2
    #     valueFrom:
    #       fieldRef:
    #         fieldPath: status.myEnv2

    ## Custom sidecar containers can be injected into the BE pods,
    ## which can act as a tracing agent or for any other use case
    # additionalContainers:
    #   - name: myCustomContainer
    #     image: ubuntu

    ## Custom additional volumes in BE pods.
    ## Ref: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes
    # additionalVolumes:
    #   - name: nfs
    #     nfs:
    #       server: 192.168.0.2
    #       path: /nfs

    ## Custom additional volume mounts in BE pods.
    # additionalVolumeMounts:
    #   # this must match `name` in `additionalVolumes`
    #   - name: nfs
    #     mountPath: /nfs

    ## The following block overwrites cluster-level configurations in `spec`
    # serviceAccount: ""
    # affinity: {}
    # tolerations: {}
    # priorityClassName: ""
    # statefulSetUpdateStrategy: RollingUpdate
    # nodeSelector:
    #   app.kubernetes.io/component: be

  #####################################
  # BE(computation mode) Configuration #
  #####################################

  # When this "cn" configuration key is not set, the Doris CN component will not be deployed,
  # and the CN components on the cluster will be deleted.
  cn:
    #########################
    # CN Basic Configuration #
    #########################

    ## Base image of the CN component
    baseImage: ghcr.io/linsoss/doris-cn

    ## The replica of the CN component
    ## When there is a DorisAutoscaler bound to the current DorisCluster, cn.replicas will not take effect.
    ## The actual number of replicas adjusted by DorisAutoscaler shall prevail.
    replicas: 2

    ## Extra BE config, see: https://doris.apache.org/docs/dev/admin-manual/config/be-config
    # config:
    #   external_table_connect_timeout_sec: '30 seconds'
    #   mem_limit: '90%'

    ## The resource requirements. For production environments, please refer to: https://doris.apache.org/docs/dev/install/standard-deployment/#production-environment
    ## Ref: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
    requests:
      cpu: '8'
      memory: 32Gi
    # limits:
    #   cpu: '32'
    #   memory: 64Gi

    ############################
    # CN Advanced Configuration #
    ############################

    ## Annotations for CN pods
    # annotations: {}

    ## Host aliases for CN pods, it will be merged with the hadoopConf field
    ## Ref: https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/
    # hostAliases:
    #   - ip: 10.233.123.122
    #     hostnames:
    #       - bg01
    #   - ip: 10.233.123.123
    #     hostnames:
    #       - bg02

    ## List of environment variables to set in the container
    ## Ref: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
    # additionalEnvs:
    #   - name: MY_ENV_1
    #     value: value1
    #   - name: MY_ENV_2
    #     valueFrom:
    #       fieldRef:
    #         fieldPath: status.myEnv2

    ## Custom sidecar containers can be injected into the CN pods,
    ## which can act as a tracing agent or for any other use case
    # additionalContainers:
    #   - name: myCustomContainer
    #     image: ubuntu

    ## Custom additional volumes in CN pods.
    ## Ref: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes
    # additionalVolumes:
    #   - name: nfs
    #     nfs:
    #       server: 192.168.0.2
    #       path: /nfs

    ## Custom additional volume mounts in CN pods.
    # additionalVolumeMounts:
    #   # this must match `name` in `additionalVolumes`
    #   - name: nfs
    #     mountPath: /nfs

    ## The following block overwrites cluster-level configurations in `spec`
    # serviceAccount: ""
    # affinity: {}
    # tolerations: {}
    # priorityClassName: ""
    # statefulSetUpdateStrategy: RollingUpdate
    # nodeSelector:
    #   app.kubernetes.io/component: cn

  #######################
  # Broker Configuration #
  #######################

  # When this "broker" configuration key is not set, the Doris broker component will not be deployed,
  # and the broker components on the cluster will be deleted.
  broker:
    #############################
    # Broker Basic Configuration #
    #############################

    ## Base image of the Broker component
    baseImage: ghcr.io/linsoss/doris-broker

    ## The replica of the Broker component
    replicas: 1

    ## Extra Broker config.
    config: { }

    ## Describe the resource requirements. For production environments, please refer to: https://doris.apache.org/docs/dev/install/standard-deployment/#production-environment.
    ## Ref: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
    requests:
      cpu: '1'
      memory: 2Gi
    # limits:
    #   cpu: '8'
    #   memory: 16Gi

    ################################
    # Broker Advanced Configuration #
    ################################

    ## Annotations for Broker pods
    # annotations: {}

    ## Host aliases for Broker pods, it will be merged with the hadoopConf field
    ## Ref: https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/
    # hostAliases:
    #   - ip: 10.233.123.122
    #     hostnames:
    #       - bg01
    #   - ip: 10.233.123.123
    #     hostnames:
    #       - bg02

    ## List of environment variables to set in the container
    ## Ref: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
    # additionalEnvs:
    #   - name: MY_ENV_1
    #     value: value1
    #   - name: MY_ENV_2
    #     valueFrom:
    #       fieldRef:
    #         fieldPath: status.myEnv2

    ## Custom sidecar containers can be injected into the Broker pods,
    ## which can act as a tracing agent or for any other use case
    # additionalContainers:
    #   - name: myCustomContainer
    #     image: ubuntu

    ## Custom additional volumes in Broker pods.
    ## Ref: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes
    # additionalVolumes:
    #   - name: nfs
    #     nfs:
    #       server: 192.168.0.2
    #       path: /nfs

    ## Custom additional volume mounts in Broker pods.
    # additionalVolumeMounts:
    #   # this must match `name` in `additionalVolumes`
    #   - name: nfs
    #     mountPath: /nfs

    ## The following block overwrites cluster-level configurations in `spec`
    # serviceAccount: ""
    # affinity: {}
    # tolerations: {}
    # priorityClassName: ""
    # statefulSetUpdateStrategy: RollingUpdate
    # nodeSelector:
    #   app.kubernetes.io/component: broker
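Organize the Doris cluster configuration under the ${cluster_name} directory and save it as ${cluster_name}/doris-cluster.yaml. After you modify the configuration and submit it, the changes are applied to the Doris cluster automatically.
Cluster name
Configure the cluster name by changing metadata.name in the DorisCluster CR.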
Version
Normally, all components in a cluster should use the same version, so it is generally enough to configure spec.<fe/be/cn/broker>.baseImage together with spec.version.
The formats of the related parameters are as follows:
- spec.version: an image tag, for example 2.0.3
- spec.<fe/be/cn/broker>.baseImage: an image name, for example ghcr.io/linsoss/doris-fe
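A short sketch of how the two fields fit together follows. It assumes the operator joins them as `baseImage:version`, so the FE image below would resolve to `ghcr.io/linsoss/doris-fe:2.0.3`. This document does not spell out that joining rule, so treat it as an inference from the field formats.

```yaml
spec:
  # Image tag shared by all components.
  version: 2.0.3
  fe:
    # Image name without a tag.
    baseImage: ghcr.io/linsoss/doris-fe
  be:
    baseImage: ghcr.io/linsoss/doris-be
```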
Note that you must use Doris component images built with doris-operator/images. You can also directly use the Doris component images published by linsoss:
Component | Image |
---|---|
FE | ghcr.io/linsoss/doris-fe |
BE | ghcr.io/linsoss/doris-be |
CN | ghcr.io/linsoss/doris-cn |
Broker | ghcr.io/linsoss/doris-broker |
Storage
To set the storage type, modify the storageClassName field of each component in ${cluster_name}/doris-cluster.yaml.
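For illustration, the fragment below assigns a different storage class to FE and BE. The class names `ssd-pool` and `hdd-pool` are placeholders and must match StorageClass objects that exist in your Kubernetes cluster. CPU and memory requests are omitted for brevity.

```yaml
spec:
  fe:
    requests:
      storage: 100Gi
    # Placeholder class name for FE persistent data.
    storageClassName: ssd-pool
  be:
    requests:
      storage: 500Gi
    # Placeholder class name for BE persistent data.
    storageClassName: hdd-pool
```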
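Different components of a Doris cluster have different disk requirements. Before deploying the cluster, refer to the storage configuration document and choose a suitable storage type for each component, based on the storage types your Kubernetes cluster supports and on your use case.
To configure hot and cold storage separation for Doris BE, see: Configure hot and cold storage separation for Doris BE.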
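Doris component configuration parameters
Use spec.<fe/be/cn/broker>.config to configure the parameters of each component.
For example, to change the following FE parameters:
prefer_compute_node_for_external_table=true
enable_spark_load=true
modify the DorisCluster configuration as follows: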
spec:
  fe:
    config:
      prefer_compute_node_for_external_table: 'true'
      enable_spark_load: 'true'
Configure the Doris service
Use spec.fe.service to choose the Service type, such as ClusterIP or NodePort. By default, Doris Operator creates an additional ClusterIP Service for FE.
ClusterIP
ClusterIP exposes the service on an internal cluster IP. A Service of this type is reachable only from inside the cluster, through the ClusterIP or the Service domain name (${cluster_name}-fe.${namespace}).
spec:
  fe:
    service:
      type: ClusterIP
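NodePort
For local testing, you can expose the service through NodePort. Doris Operator binds the FE SQL query port and Web UI port to node ports.
NodePort exposes the service on each node's IP at a static port. You can reach a NodePort service from outside the cluster by requesting NodeIP + NodePort.
spec:
  fe:
    service:
      type: NodePort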
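If fixed ports are preferred over random ones, the `queryPort` and `httpPort` fields shown in the complete DorisCluster CR example can be set explicitly. The sketch below uses two arbitrary port numbers as assumptions. They must be free on the nodes and fall inside the node port range of your cluster (30000-32767 by default in Kubernetes).

```yaml
spec:
  fe:
    service:
      type: NodePort
      # Node port for the FE query port (arbitrary example value; 0 means random).
      queryPort: 30930
      # Node port for the FE http port (arbitrary example value; 0 means random).
      httpPort: 30803
```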
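Hadoop connection configuration
When a Doris cluster needs to connect to Hadoop, the relevant Hadoop configuration files are required. The spec.hadoopConf field provides a convenient way to inject Hadoop configuration into FE, BE, CN, and Broker.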
spec:
  hadoopConf:
    # Hostname-to-IP mappings of the Hadoop cluster
    hosts:
      - ip: 10.233.123.189
        name: hadoop-01
      - ip: 10.233.123.179
        name: hadoop-02
      - ip: 10.233.123.179
        name: hadoop-03
    # Contents of the Hadoop configuration files
    configs:
      hdfs-site.xml: |
        <configuration>
        ...
        </configuration>
      hive-site.xml: |
        <configuration>
        ...
        </configuration>
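High availability of the physical topology
Doris is a distributed database. The following describes three ways to maintain the high availability of Doris's physical topology on Kubernetes.
Constrain scheduling with nodeSelector
The nodeSelector field of each component restricts that component's instances to specific nodes. For more about nodeSelector, see nodeSelector.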
apiVersion: al-assad.github.io/v1beta1
kind: DorisCluster
# ...
spec:
  fe:
    nodeSelector:
      # Label values must be strings, so the value is quoted.
      node-role.kubernetes.io/fe: "true"
    # ...
  be:
    nodeSelector:
      node-role.kubernetes.io/be: "true"
    # ...
  cn:
    nodeSelector:
      node-role.kubernetes.io/cn: "true"
    # ...
  broker:
    nodeSelector:
      node-role.kubernetes.io/broker: "true"
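Schedule instances with tolerations
The tolerations field of each component allows that component's instances to be scheduled onto nodes with matching taints. For more about taints and tolerations, see Taints and Tolerations.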
apiVersion: al-assad.github.io/v1beta1
kind: DorisCluster
# ...
spec:
  fe:
    tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: fe
    # ...
  be:
    tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: be
    # ...
  cn:
    tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: cn
    # ...
  broker:
    tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: broker
    # ...
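According to the comments in the complete DorisCluster CR example, a cluster-level `spec.tolerations` list acts only as a fallback: it is ignored for any component that defines its own `tolerations`. The sketch below illustrates that behavior as described there. The taint values are the same illustrative ones used elsewhere in this document.

```yaml
spec:
  # Cluster-level fallback, used by components that define no tolerations of their own.
  tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: doris
  fe:
    # Component-level list; for FE, the cluster-level list above is ignored.
    tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: fe
```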
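Schedule instances with affinity
Configuring PodAntiAffinity keeps different instances of the same component off the same physical topology node as far as possible, which improves availability. For usage details, see Affinity & AntiAffinity.
The following example prevents FE instances from being scheduled onto the same physical node: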
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: app.kubernetes.io/component
              operator: In
              values:
                - fe
            - key: app.kubernetes.io/instance
              operator: In
              values:
                - ${name}
        topologyKey: kubernetes.io/hostname
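The `required...` rule is a hard constraint: if there are fewer eligible nodes than FE replicas, the extra Pods stay Pending. Where a best-effort spread is acceptable, a soft variant using the standard Kubernetes `preferredDuringSchedulingIgnoredDuringExecution` field may be a better fit. The sketch below is such a variant, not taken from the original example. The weight of 100 is an arbitrary choice.

```yaml
affinity:
  podAntiAffinity:
    # Soft rule: the scheduler tries to spread FE pods across nodes,
    # but still schedules them if that is not possible.
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/component
                operator: In
                values:
                  - fe
          topologyKey: kubernetes.io/hostname
```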