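A Complete Guide to Deploying and Operating Redis Operator on Kubernetes
Redis Operator is a tool in the Kubernetes ecosystem for automating the management of Redis clusters. It uses the Operator pattern to deploy and operate a complex stateful application. This article covers its core features, including standalone and cluster deployments, built-in monitoring, and dynamic storage provisioning. After installing the Operator with Helm, you can deploy a single-node Redis instance quickly or customize the configuration through YAML files.
1. Operator Overview
1.1 Why use the Operator pattern to deploy a Redis cluster
In the Kubernetes ecosystem, the Operator pattern introduces Custom Resource Definitions (CRDs) and custom controllers to encode application-specific operational knowledge in software, which automates the management of complex stateful applications. Compared with a Helm chart, which usually focuses on initial deployment and configuration, an Operator provides deeper lifecycle management, including automatic scaling, version upgrades, failure recovery, and backups.
For a distributed stateful service such as Redis Cluster, an Operator provides:
- Simpler deployment and management: a single YAML file (a Custom Resource) defines and deploys the whole cluster.
- Automated operations: the Operator continuously watches the cluster state and handles node failures, data synchronization, and similar issues automatically.
- High availability: the Redis service fails over and recovers automatically when a node fails, which keeps the business running.
- Declarative configuration: you declare the desired state, and the Operator makes it happen.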
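The declarative idea in the list above can be sketched as a reconcile step that compares the desired state with the observed state and derives actions. This is a toy model only: `ClusterState` and `reconcile` are hypothetical names for illustration and are not part of the Redis Operator code base.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Toy description of a Redis cluster: how many pods of each role exist."""
    leaders: int
    followers: int


def reconcile(desired: ClusterState, observed: ClusterState) -> list[str]:
    """Return the actions needed to move the observed state toward the desired one."""
    actions = []
    for role in ("leaders", "followers"):
        want, have = getattr(desired, role), getattr(observed, role)
        if have < want:
            actions.append(f"create {want - have} {role}")
        elif have > want:
            actions.append(f"delete {have - want} {role}")
    return actions


desired = ClusterState(leaders=3, followers=3)
observed = ClusterState(leaders=3, followers=2)  # one follower pod was lost
print(reconcile(desired, observed))  # ['create 1 followers']
```

A real controller runs this comparison repeatedly, so the same declaration also repairs drift that appears after the initial deployment.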
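[Figure: Operator pattern architecture diagram]

1.2 Redis Operator overview
This guide uses the Redis Operator developed by Opstree. It is a Kubernetes operator written in Golang that deploys and manages Redis in standalone mode and cluster mode on Kubernetes, on both cloud and bare-metal environments. It creates Redis clusters according to best practices and ships with a built-in Redis Exporter for monitoring.
[Figure: Redis Operator architecture diagram]

The Redis Operator supports the following features:
- ✅ Redis deployments in cluster mode and standalone mode
- ✅ Built-in Prometheus exporter (redis-exporter)
- ✅ Dynamic storage provisioning (based on PVC templates)
- ✅ Resource requests and limits (CPU, memory, and so on)
- ✅ Redis instances with or without a password
- ✅ Node scheduling (nodeSelector) and affinity configuration
- ✅ Priority Class settings to control Pod priority
- ✅ SecurityContext settings for kernel parameters and permission management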
2. Deploying the Redis Operator
# Add the Opstree Helm repository
helm repo add ot-helm https://ot-container-kit.github.io/helm-charts/
# Update the local Helm repository index
helm repo update
# Install the Redis Operator into the 'ot-operator' namespace (created if it does not exist)
$ helm install redis-operator ot-helm/redis-operator -n ot-operator --create-namespace
NAME: redis-operator01
LAST DEPLOYED: Tue Jul 22 11:47:11 2025
NAMESPACE: redis-operator
STATUS: deployed
REVISION: 1
TEST SUITE: None
...
After installation, verify that the Operator Pod is running:
root@k8s-master01:~# kubectl get pod -n ot-operator
NAME READY STATUS RESTARTS AGE
pod/redis-operator-57f979468d-qkn8p 1/1 Running 0 43s
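3. Usage
The Redis Operator can create the following four kinds of resources:
- Redis
- RedisCluster
- RedisReplication
- RedisSentinel
3.1 Standalone Redis
A standalone Redis instance is a single Pod running a single Redis process.

3.1.1 Deploying standalone Redis
Install with Helm: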
$ helm install redis ot-helm/redis --namespace redis-server --create-namespace
NAME: redis
LAST DEPLOYED: Tue Jul 22 14:36:22 2025
NAMESPACE: redis-server
STATUS: deployed
REVISION: 1
TEST SUITE: None
Verify the standalone Redis setup with kubectl:
root@k8s-master01:~/redis# kubectl get all -n redis-server
NAME READY STATUS RESTARTS AGE
pod/redis-0 1/1 Running 0 6s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/redis ClusterIP 10.244.22.26 <none> 6379/TCP 6s
service/redis-additional ClusterIP 10.244.9.163 <none> 6379/TCP 6s
service/redis-headless ClusterIP None <none> 6379/TCP 6s
NAME READY AGE
statefulset.apps/redis 1/1 6s
Install with YAML
Create the manifest standalone.yaml:
root@k8s-master01:~/redis# cat standalone.yaml
---
apiVersion: redis.redis.opstreelabs.in/v1beta2
kind: Redis
metadata:
  name: redis-standalone
spec:
  kubernetesConfig:
    image: quay.io/opstree/redis:v7.0.15
    imagePullPolicy: IfNotPresent
  storage:
    volumeClaimTemplate:
      spec:
        # storageClassName: standard
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
        storageClassName: nfs-client
  securityContext:
    runAsUser: 1000
    fsGroup: 1000
Deploy it into the same namespace that is checked below:
kubectl apply -f standalone.yaml -n redis-server
Check the deployment:
root@k8s-master01:~/redis# kubectl get all -n redis-server
NAME READY STATUS RESTARTS AGE
pod/redis-0 1/1 Running 0 6s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/redis ClusterIP 10.244.22.26 <none> 6379/TCP 6s
service/redis-additional ClusterIP 10.244.9.163 <none> 6379/TCP 6s
service/redis-headless ClusterIP None <none> 6379/TCP 6s
NAME READY AGE
statefulset.apps/redis 1/1 6s
3.1.2 Configuration parameters
The Helm values for the Redis chart are listed below:
| Key | Type | Default | Description |
|---|---|---|---|
| TLS.ca | string | "ca.key" | |
| TLS.cert | string | "tls.crt" | |
| TLS.key | string | "tls.key" | |
| TLS.secret.secretName | string | "" | |
| acl.secret.secretName | string | "" | |
| affinity | object | {} | |
| env | list | [] | |
| externalConfig.data | string | "tcp-keepalive 400\nslowlog-max-len 158\nstream-node-max-bytes 2048\n" | |
| externalConfig.enabled | bool | false | |
| externalService.enabled | bool | false | |
| externalService.port | int | 6379 | |
| externalService.serviceType | string | "NodePort" | |
| initContainer.args | list | [] | |
| initContainer.command | list | [] | |
| initContainer.enabled | bool | false | |
| initContainer.env | list | [] | |
| initContainer.image | string | "" | |
| initContainer.imagePullPolicy | string | "IfNotPresent" | |
| initContainer.resources | object | {} | |
| labels | object | {} | |
| nodeSelector | object | {} | |
| podSecurityContext.fsGroup | int | 1000 | |
| podSecurityContext.runAsUser | int | 1000 | |
| priorityClassName | string | "" | |
| redisExporter.enabled | bool | false | |
| redisExporter.env | list | [] | |
| redisExporter.image | string | "quay.io/opstree/redis-exporter" | |
| redisExporter.imagePullPolicy | string | "IfNotPresent" | |
| redisExporter.resources | object | {} | |
| redisExporter.tag | string | "v1.44.0" | |
| redisStandalone.ignoreAnnotations | list | [] | |
| redisStandalone.image | string | "quay.io/opstree/redis" | |
| redisStandalone.imagePullPolicy | string | "IfNotPresent" | |
| redisStandalone.imagePullSecrets | list | [] | |
| redisStandalone.minReadySeconds | int | 0 | |
| redisStandalone.name | string | "" | |
| redisStandalone.recreateStatefulSetOnUpdateInvalid | bool | false | Some fields of a StatefulSet are immutable, such as volumeClaimTemplates. When set to true, the operator deletes the StatefulSet and recreates it. Default is false. |
| redisStandalone.redisSecret.secretKey | string | "" | |
| redisStandalone.redisSecret.secretName | string | "" | |
| redisStandalone.resources | object | {} | |
| redisStandalone.serviceType | string | "ClusterIP" | |
| redisStandalone.tag | string | "v7.0.15" | |
| securityContext | object | {} | |
| serviceAccountName | string | "" | |
| serviceMonitor.enabled | bool | false | |
| serviceMonitor.interval | string | "30s" | |
| serviceMonitor.namespace | string | "monitoring" | |
| serviceMonitor.scrapeTimeout | string | "10s" | |
| sidecars.env | list | [] | |
| sidecars.image | string | "" | |
| sidecars.imagePullPolicy | string | "IfNotPresent" | |
| sidecars.name | string | "" | |
| sidecars.resources.limits.cpu | string | "100m" | |
| sidecars.resources.limits.memory | string | "128Mi" | |
| sidecars.resources.requests.cpu | string | "50m" | |
| sidecars.resources.requests.memory | string | "64Mi" | |
| storageSpec.volumeClaimTemplate.spec.accessModes[0] | string | "ReadWriteOnce" | |
| storageSpec.volumeClaimTemplate.spec.resources.requests.storage | string | "1Gi" | |
| tolerations | list | [] | |
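3.2 Redis Cluster
Redis Cluster is essentially a data sharding strategy that automatically distributes data across multiple Redis nodes. It is an advanced Redis feature that provides distributed storage and avoids a single point of failure.
When a leader node fails, one of its followers (a follower pod) is automatically promoted to leader. When the failed node comes back online, it rejoins as a follower.
- A sharded cluster that contains only leaders needs at least 3 nodes.
- A cluster that also contains followers needs at least 6 Redis Pods / processes (typically 3 leaders and 3 followers).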
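To make the sharding idea concrete, the sketch below computes the hash slot of a key the way Redis Cluster does (CRC16 of the key, modulo 16384, with hash-tag support) and maps it to a leader. The `SLOT_RANGES` table is an assumption for illustration: it uses the even three-way split that a 3-leader cluster typically ends up with, and the pod names are only labels.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM variant): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the cluster hash slot (0..16383) for a key."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty {tag} restricts hashing to the tag content.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


# Assumed slot layout for three leaders (hypothetical, for illustration).
SLOT_RANGES = {
    "redis-cluster-leader-0": range(0, 5461),
    "redis-cluster-leader-1": range(5461, 10923),
    "redis-cluster-leader-2": range(10923, 16384),
}


def owner(key: str) -> str:
    """Name of the leader whose slot range contains the key's slot."""
    slot = key_slot(key)
    return next(name for name, slots in SLOT_RANGES.items() if slot in slots)


print(key_slot("foo"))  # 12182
print(owner("foo"))     # redis-cluster-leader-2
```

Keys that share a hash tag, such as {user1000}.following and {user1000}.followers, land in the same slot, which is what makes multi-key operations on them possible in a cluster.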

3.2.1 Deploying Redis Cluster
Install with Helm:
$ helm install redis-cluster ot-helm/redis-cluster \
--set redisCluster.clusterSize=3 --namespace ot-operators
...
Release "redis-cluster" does not exist. Installing it now.
NAME: redis-cluster
LAST DEPLOYED: Sun May 2 16:11:38 2021
NAMESPACE: ot-operators
STATUS: deployed
REVISION: 1
TEST SUITE: None
Verify the Pods after installation:
root@k8s-master01:~/redis# kubectl get pod -n redis-server
NAME READY STATUS RESTARTS AGE
redis-cluster-follower-0 1/1 Running 0 37s
redis-cluster-follower-1 1/1 Running 0 34s
redis-cluster-follower-2 1/1 Running 0 31s
redis-cluster-leader-0 1/1 Running 0 82s
redis-cluster-leader-1 1/1 Running 0 61s
redis-cluster-leader-2 1/1 Running 0 58s
Check the cluster state with redis-cli:
root@k8s-master01:~/redis# kubectl exec -it redis-cluster-leader-0 -n redis-server -- redis-cli -a Opstree@1234 cluster nodes
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
AUTH failed: ERR AUTH <password> called without any password configured for the default user. Are you sure your configuration is correct?
6e7492c5979f60305003a159e6dbcdfec0a74926 10.244.96.156:6379@16379,redis-cluster-follower-0 slave e988ec0596613ded3f6ac8f241851596f4414814 0 1753167523762 1 connected
e988ec0596613ded3f6ac8f241851596f4414814 10.244.96.155:6379@16379,redis-cluster-leader-0 myself,master - 0 1753167524000 1 connected 0-5460
b9dc458eafb1a8e7aff36df09ebad220d3ca528d 10.244.95.218:6379@16379,redis-cluster-follower-1 slave 0b4d5d5b62e5d3cbb4dcf5d668844b5c7d657682 0 1753167524767 2 connected
ae32c2492600c36453725c2e5ce2d5d24e99f68a 10.244.66.37:6379@16379,redis-cluster-follower-2 slave 1d1e17f99792b1e6087540dd0de433ad385b4aa5 0 1753167523000 3 connected
1d1e17f99792b1e6087540dd0de433ad385b4aa5 10.244.66.36:6379@16379,redis-cluster-leader-2 master - 0 1753167523000 3 connected 10923-16383
0b4d5d5b62e5d3cbb4dcf5d668844b5c7d657682 10.244.95.205:6379@16379,redis-cluster-leader-1 master - 0 1753167524566 2 connected 5461-10922
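The `cluster nodes` output is easier to check in scripts once it is parsed. The following is a minimal sketch, assuming the whitespace-separated field layout seen above (node ID, address, flags, master ID, ping, pong, epoch, link state, slots); `parse_cluster_nodes` is a hypothetical helper, and the sample reuses three lines from the output above.

```python
import string

HEX_DIGITS = set(string.hexdigits)

SAMPLE = """\
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
6e7492c5979f60305003a159e6dbcdfec0a74926 10.244.96.156:6379@16379,redis-cluster-follower-0 slave e988ec0596613ded3f6ac8f241851596f4414814 0 1753167523762 1 connected
e988ec0596613ded3f6ac8f241851596f4414814 10.244.96.155:6379@16379,redis-cluster-leader-0 myself,master - 0 1753167524000 1 connected 0-5460
0b4d5d5b62e5d3cbb4dcf5d668844b5c7d657682 10.244.95.205:6379@16379,redis-cluster-leader-1 master - 0 1753167524566 2 connected 5461-10922
"""


def parse_cluster_nodes(text: str) -> list[dict]:
    """Parse CLUSTER NODES output into one dict per node line."""
    nodes = []
    for line in text.splitlines():
        parts = line.split()
        # Node lines start with a 40-character hex node ID; skip warnings etc.
        if len(parts) < 8 or len(parts[0]) != 40 or not set(parts[0]) <= HEX_DIGITS:
            continue
        endpoint, _, hostname = parts[1].partition(",")
        flags = parts[2].split(",")
        nodes.append({
            "id": parts[0],
            "addr": endpoint.split("@")[0],  # drop the cluster bus port
            "hostname": hostname,
            "role": "master" if "master" in flags else "slave",
            "master_id": None if parts[3] == "-" else parts[3],
            "link": parts[7],
            "slots": parts[8:],
        })
    return nodes


for node in parse_cluster_nodes(SAMPLE):
    print(node["hostname"], node["role"], node["slots"])
```

A health check could then assert, for example, that every master owns at least one slot range and that every link state is connected.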
Check the Service status:
root@k8s-master01:~/redis# kubectl get svc -n redis-server
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
redis-cluster-follower ClusterIP 10.244.49.246 <none> 6379/TCP 81m
redis-cluster-follower-additional ClusterIP 10.244.2.131 <none> 6379/TCP 81m
redis-cluster-follower-headless ClusterIP None <none> 6379/TCP 81m
redis-cluster-leader ClusterIP 10.244.37.225 <none> 6379/TCP 82m
redis-cluster-leader-additional ClusterIP 10.244.22.8 <none> 6379/TCP 82m
redis-cluster-leader-headless ClusterIP None <none> 6379/TCP 82m
redis-cluster-master ClusterIP 10.244.38.178 <none> 6379/TCP 82m
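Key Services:
- redis-cluster-leader: the main connection entry point, pointing at the current leader/master nodes. Clients should connect to this Service.
- redis-cluster-follower: points at the follower/slave nodes and suits read-heavy workloads (this requires client-side configuration).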
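3.2.2 Configuring Redis access credentials
The AUTH failed message in the redis-cli output above shows that the cluster, as deployed with the chart defaults, has no password configured. The steps below set one.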
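1) First, create a Secret
# redis-secret.yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: redis-secret # this name is referenced later in the redis-cluster values
  namespace: redis-server # a dedicated namespace for the Redis cluster is recommended
data:
  password: UEBzc3cwcmQ= # Base64 encoding of "P@ssw0rd"
type: Opaque
Apply the Secret:
kubectl apply -f redis-secret.yaml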
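Values under `data` in a Secret must be Base64-encoded. A small sketch for producing and checking such a value, using only the Python standard library (the helper names are hypothetical):

```python
import base64


def encode_secret_value(plain: str) -> str:
    """Base64-encode a plain-text value for the `data` field of a Secret."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Decode a Base64 value taken from a Secret back to plain text."""
    return base64.b64decode(encoded).decode("utf-8")


print(encode_secret_value("P@ssw0rd"))      # UEBzc3cwcmQ=
print(decode_secret_value("UEBzc3cwcmQ="))  # P@ssw0rd
```

Encoding the exact string matters: a trailing newline, which some shell pipelines add, would become part of the password.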
2) Export the current values of the redis-cluster release
$ helm get values -n redis-server redis-cluster --all -o yaml > values.yaml
3) Edit values.yaml
$ vim values.yaml
redisCluster:
  redisSecret:
    secretKey: "password" # key name inside the Secret
    secretName: "redis-secret" # name of the Secret created above
4) Upgrade the Helm release
root@k8s-master01:~/redis# helm upgrade -n redis-server redis-cluster ot-helm/redis-cluster -f values.yaml
Release "redis-cluster" has been upgraded. Happy Helming!
NAME: redis-cluster
LAST DEPLOYED: Tue Jul 22 15:11:40 2025
NAMESPACE: redis-server
STATUS: deployed
REVISION: 2
TEST SUITE: None
A quick note on helm upgrade. The command format is:
Usage: helm upgrade [RELEASE] [CHART] [flags]
RELEASE is the release name.
CHART is the chart name.
The chart name can be looked up with the following command:
$ helm search repo ot-helm | grep redis-cluster
ot-helm/redis-cluster 0.17.0 0.17.0 Provides easy redis setup definitions for Kuber...
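5) Query the cluster state with the new password to verify that the change took effect
$ kubectl exec -it redis-cluster-leader-0 -n redis-server -- redis-cli -a 'P@ssw0rd' cluster nodes
Output like the following, without an AUTH error, confirms that the password was updated: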
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
b9dc458eafb1a8e7aff36df09ebad220d3ca528d 10.244.95.220:6379@16379,redis-cluster-follower-1 master - 0 1753168734000 5 connected 5461-10922
1d1e17f99792b1e6087540dd0de433ad385b4aa5 10.244.66.38:6379@16379,redis-cluster-leader-2 slave ae32c2492600c36453725c2e5ce2d5d24e99f68a 0 1753168734000 4 connected
6e7492c5979f60305003a159e6dbcdfec0a74926 10.244.96.158:6379@16379,redis-cluster-follower-0 master - 0 1753168735135 6 connected 0-5460
0b4d5d5b62e5d3cbb4dcf5d668844b5c7d657682 10.244.95.219:6379@16379,redis-cluster-leader-1 slave b9dc458eafb1a8e7aff36df09ebad220d3ca528d 0 1753168734000 5 connected
e988ec0596613ded3f6ac8f241851596f4414814 10.244.96.157:6379@16379,redis-cluster-leader-0 myself,slave 6e7492c5979f60305003a159e6dbcdfec0a74926 0 1753168734000 6 connected
ae32c2492600c36453725c2e5ce2d5d24e99f68a 10.244.66.39:6379@16379,redis-cluster-follower-2 master - 0 1753168734131 4 connected 10923-16383
3.2.3 Redis Cluster configuration parameters
The Helm values for the Redis Cluster chart are listed below:
| Key | Type | Default | Description |
|---|---|---|---|
| TLS.ca | string | "ca.key" | |
| TLS.cert | string | "tls.crt" | |
| TLS.key | string | "tls.key" | |
| TLS.secret.secretName | string | "" | |
| acl.secret.secretName | string | "" | |
| env | list | [] | |
| externalConfig.data | string | "tcp-keepalive 400\nslowlog-max-len 158\nstream-node-max-bytes 2048\n" | |
| externalConfig.enabled | bool | false | |
| externalService.enabled | bool | false | |
| externalService.port | int | 6379 | |
| externalService.serviceType | string | "LoadBalancer" | |
| initContainer.args | list | [] | |
| initContainer.command | list | [] | |
| initContainer.enabled | bool | false | |
| initContainer.env | list | [] | |
| initContainer.image | string | "" | |
| initContainer.imagePullPolicy | string | "IfNotPresent" | |
| initContainer.resources | object | {} | |
| labels | object | {} | |
| podSecurityContext.fsGroup | int | 1000 | |
| podSecurityContext.runAsUser | int | 1000 | |
| priorityClassName | string | "" | |
| redisCluster.clusterSize | int | 3 | |
| redisCluster.clusterVersion | string | "v7" | |
| redisCluster.follower.affinity | string | nil | |
| redisCluster.follower.nodeSelector | string | nil | |
| redisCluster.follower.pdb.enabled | bool | false | |
| redisCluster.follower.pdb.maxUnavailable | int | 1 | |
| redisCluster.follower.pdb.minAvailable | int | 1 | |
| redisCluster.follower.replicas | int | 3 | |
| redisCluster.follower.securityContext | object | {} | |
| redisCluster.follower.serviceType | string | "ClusterIP" | |
| redisCluster.follower.tolerations | list | [] | |
| redisCluster.image | string | "quay.io/opstree/redis" | |
| redisCluster.imagePullPolicy | string | "IfNotPresent" | |
| redisCluster.imagePullSecrets | object | {} | |
| redisCluster.leader.affinity | object | {} | |
| redisCluster.leader.nodeSelector | string | nil | |
| redisCluster.leader.pdb.enabled | bool | false | |
| redisCluster.leader.pdb.maxUnavailable | int | 1 | |
| redisCluster.leader.pdb.minAvailable | int | 1 | |
| redisCluster.leader.replicas | int | 3 | |
| redisCluster.leader.securityContext | object | {} | |
| redisCluster.leader.serviceType | string | "ClusterIP" | |
| redisCluster.leader.tolerations | list | [] | |
| redisCluster.minReadySeconds | int | 0 | |
| redisCluster.name | string | "" | |
| redisCluster.persistenceEnabled | bool | true | |
| redisCluster.recreateStatefulSetOnUpdateInvalid | bool | false | Some fields of a StatefulSet are immutable, such as volumeClaimTemplates. When set to true, the operator deletes the StatefulSet and recreates it. Default is false. |
| redisCluster.redisSecret.secretKey | string | "" | |
| redisCluster.redisSecret.secretName | string | "" | |
| redisCluster.resources | object | {} | |
| redisCluster.tag | string | "v7.0.15" | |
| redisExporter.enabled | bool | false | |
| redisExporter.env | list | [] | |
| redisExporter.image | string | "quay.io/opstree/redis-exporter" | |
| redisExporter.imagePullPolicy | string | "IfNotPresent" | |
| redisExporter.resources | object | {} | |
| redisExporter.tag | string | "v1.44.0" | |
| serviceAccountName | string | "" | |
| serviceMonitor.enabled | bool | false | |
| serviceMonitor.interval | string | "30s" | |
| serviceMonitor.namespace | string | "monitoring" | |
| serviceMonitor.scrapeTimeout | string | "10s" | |
| sidecars.env | object | {} | |
| sidecars.image | string | "" | |
| sidecars.imagePullPolicy | string | "IfNotPresent" | |
| sidecars.name | string | "" | |
| sidecars.resources.limits.cpu | string | "100m" | |
| sidecars.resources.limits.memory | string | "128Mi" | |
| sidecars.resources.requests.cpu | string | "50m" | |
| sidecars.resources.requests.memory | string | "64Mi" | |
| storageSpec.nodeConfVolume | bool | true | |
| storageSpec.nodeConfVolumeClaimTemplate.spec.accessModes[0] | string | "ReadWriteOnce" | |
| storageSpec.nodeConfVolumeClaimTemplate.spec.resources.requests.storage | string | "1Gi" | |
| storageSpec.volumeClaimTemplate.spec.accessModes[0] | string | "ReadWriteOnce" | |
| storageSpec.volumeClaimTemplate.spec.resources.requests.storage | string | "1Gi" | |
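3.3 Redis Replication
3.3.1 Deploying Redis Replication
Redis replication is the process of synchronizing data from one Redis leader node to one or more follower nodes.
The leader accepts write requests and propagates the resulting data changes to the followers. Each follower receives those changes and applies them locally, which produces a copy of the leader's dataset.
Redis replication is asynchronous: the leader does not wait for followers to apply a change after sending it. Instead, followers catch up with the leader as fast as the available network bandwidth and hardware allow.

Deploy with Helm: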
$ helm install redis-replication ot-helm/redis-replication \
--set redisReplication.clusterSize=3 --namespace ot-operators
...
NAME: redis-replication
LAST DEPLOYED: Tue Mar 21 22:47:44 2023
NAMESPACE: ot-operators
STATUS: deployed
REVISION: 1
TEST SUITE: None
Verify Redis Replication by checking the Pod status:
root@k8s-master01:~/redis# kubectl get all -n redis-server
NAME READY STATUS RESTARTS AGE
pod/redis-replication-0 1/1 Running 0 26s
pod/redis-replication-1 1/1 Running 0 23s
pod/redis-replication-2 1/1 Running 0 20s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/redis-replication ClusterIP 10.244.4.100 <none> 6379/TCP 26s
service/redis-replication-additional ClusterIP 10.244.18.241 <none> 6379/TCP 26s
service/redis-replication-headless ClusterIP None <none> 6379/TCP 26s
service/redis-replication-master ClusterIP 10.244.53.248 <none> 6379/TCP 26s
service/redis-replication-replica ClusterIP 10.244.2.65 <none> 6379/TCP 26s
NAME READY AGE
statefulset.apps/redis-replication 3/3 26s
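3.3.2 Redis Replication configuration parameters
The configuration parameters are listed below. Note that the redisSentinel.* and redisSentinelConfig.* keys in this table belong to the redis-sentinel chart.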
| Key | Type | Default | Description |
|---|---|---|---|
| TLS.ca | string | "ca.key" | |
| TLS.cert | string | "tls.crt" | |
| TLS.key | string | "tls.key" | |
| TLS.secret.secretName | string | "" | |
| affinity | object | {} | |
| env | list | [] | |
| externalConfig.data | string | "tcp-keepalive 400\nslowlog-max-len 158\nstream-node-max-bytes 2048\n" | |
| externalConfig.enabled | bool | false | |
| externalService.enabled | bool | false | |
| externalService.port | int | 26379 | |
| externalService.serviceType | string | "NodePort" | |
| initContainer.args | list | [] | |
| initContainer.command | list | [] | |
| initContainer.enabled | bool | false | |
| initContainer.env | list | [] | |
| initContainer.image | string | "" | |
| initContainer.imagePullPolicy | string | "IfNotPresent" | |
| initContainer.resources | object | {} | |
| labels | object | {} | |
| livenessProbe.failureThreshold | int | 3 | |
| livenessProbe.initialDelaySeconds | int | 1 | |
| livenessProbe.periodSeconds | int | 10 | |
| livenessProbe.successThreshold | int | 1 | |
| livenessProbe.timeoutSeconds | int | 1 | |
| nodeSelector | object | {} | |
| pdb.enabled | bool | false | |
| pdb.maxUnavailable | string | nil | |
| pdb.minAvailable | int | 1 | |
| podSecurityContext.fsGroup | int | 1000 | |
| podSecurityContext.runAsUser | int | 1000 | |
| priorityClassName | string | "" | |
| readinessProbe.failureThreshold | int | 3 | |
| readinessProbe.initialDelaySeconds | int | 1 | |
| readinessProbe.periodSeconds | int | 10 | |
| readinessProbe.successThreshold | int | 1 | |
| readinessProbe.timeoutSeconds | int | 1 | |
| redisExporter.enabled | bool | false | |
| redisExporter.env | list | [] | |
| redisExporter.image | string | "quay.io/opstree/redis-exporter" | |
| redisExporter.imagePullPolicy | string | "IfNotPresent" | |
| redisExporter.resources | object | {} | |
| redisExporter.tag | string | "v1.44.0" | |
| redisSentinel.clusterSize | int | 3 | |
| redisSentinel.ignoreAnnotations | list | [] | |
| redisSentinel.image | string | "quay.io/opstree/redis-sentinel" | |
| redisSentinel.imagePullPolicy | string | "IfNotPresent" | |
| redisSentinel.imagePullSecrets | list | [] | |
| redisSentinel.minReadySeconds | int | 0 | |
| redisSentinel.name | string | "" | |
| redisSentinel.recreateStatefulSetOnUpdateInvalid | bool | false | Some fields of a StatefulSet are immutable, such as volumeClaimTemplates. When set to true, the operator deletes the StatefulSet and recreates it. Default is false. |
| redisSentinel.redisSecret.secretKey | string | "" | |
| redisSentinel.redisSecret.secretName | string | "" | |
| redisSentinel.resources | object | {} | |
| redisSentinel.serviceType | string | "ClusterIP" | |
| redisSentinel.tag | string | "v7.0.15" | |
| redisSentinelConfig.downAfterMilliseconds | string | "" | |
| redisSentinelConfig.failoverTimeout | string | "" | |
| redisSentinelConfig.masterGroupName | string | "" | |
| redisSentinelConfig.parallelSyncs | string | "" | |
| redisSentinelConfig.quorum | string | "" | |
| redisSentinelConfig.redisPort | string | "" | |
| redisSentinelConfig.redisReplicationName | string | "redis-replication" | |
| redisSentinelConfig.redisReplicationPassword.secretKey | string | "" | |
| redisSentinelConfig.redisReplicationPassword.secretName | string | "" | |
| redisSentinelConfig.resolveHostnames | string | "no" | |
| redisSentinelConfig.announceHostnames | string | "no" | |
| securityContext | object | {} | |
| serviceAccountName | string | "" | |
| serviceMonitor.enabled | bool | false | |
| serviceMonitor.interval | string | "30s" | |
| serviceMonitor.namespace | string | "monitoring" | |
| serviceMonitor.scrapeTimeout | string | "10s" | |
| sidecars.env | list | [] | |
| sidecars.image | string | "" | |
| sidecars.imagePullPolicy | string | "IfNotPresent" | |
| sidecars.name | string | "" | |
| sidecars.resources.limits.cpu | string | "100m" | |
| sidecars.resources.limits.memory | string | "128Mi" | |
| sidecars.resources.requests.cpu | string | "50m" | |
| sidecars.resources.requests.memory | string | "64Mi" | |
| tolerations | list | [] | |
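3.4 Redis Sentinel
3.4.1 Overview
Redis Sentinel is the high-availability component of Redis. Its main functions are:
- Automatic failover: when the master fails, Sentinel automatically elects a new master.
- Monitoring: it continuously checks whether the Redis master and replica nodes are healthy.
- Notification: it sends notifications when a failure or a failover occurs.
- Configuration management: it updates the cluster configuration so that replicas connect to the new master.
Sentinel is essentially a monitoring system that runs as separate processes. Sentinel instances communicate with each other and with the Redis nodes to provide high availability.

3.4.2 Deployment
1) Deploy with Helm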
$ helm install redis-sentinel ot-helm/redis-sentinel \
--set redisSentinel.clusterSize=3 --namespace ot-operators \
--set redisSentinelConfig.redisReplicationName="redis-replication"
...
NAME: redis-sentinel
LAST DEPLOYED: Tue Mar 21 23:11:57 2023
NAMESPACE: ot-operators
STATUS: deployed
REVISION: 1
TEST SUITE: None
Parameter notes:
- redis-sentinel: the release name.
- clusterSize=3: deploys 3 Sentinel instances, enough to form a quorum that can hold elections.
- redisSentinelConfig.redisReplicationName="redis-replication": a key parameter. It must be the name of an existing RedisReplication resource, which Sentinel then monitors.
If the operator cannot find a master pod for the referenced RedisReplication, its log reports an error such as "no real master pod found":
{"level":"error","ts":"2025-07-28T03:55:46Z","msg":"","controller":"redissentinel","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisSentinel","RedisSentinel":{"name":"redis-sentinel","namespace":"juicefs"},"namespace":"juicefs","name":"redis-sentinel","reconcileID":"81a58023-46ae-4084-8aeb-088ffdc088ee","error":"no real master pod found","stacktrace":"github.com/OT-CONTAINER-KIT/redis-operator/internal/k8sutils.getRedisReplicationMasterPod\n\t/workspace/internal/k8sutils/redis-sentinel.go:363\ngithub.com/OT-CONTAINER-KIT/redis-operator/internal/k8sutils.getRedisReplicationMasterIP\n\t/workspace/internal/k8sutils/redis-sentinel.go:374\ngithub.com/OT-CONTAINER-KIT/redis-operator/internal/k8sutils.IsRedisReplicationReady\n\t/workspace/internal/k8sutils/redis-replication.go:242\ngithub.com/OT-CONTAINER-KIT/redis-operator/internal/controller/redissentinel.(*RedisSentinelReconciler).reconcileReplication\n\t/workspace/internal/controller/redissentinel/redissentinel_controller.go:93\ngithub.com/OT-CONTAINER-KIT/redis-operator/internal/controller/redissentinel.(*RedisSentinelReconciler).Reconcile\n\t/workspace/internal/controller/redissentinel/redissentinel_controller.go:60\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:227"}
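Why three Sentinel instances? In Redis Sentinel, a master is flagged as objectively down once at least `quorum` Sentinels agree, and the failover itself must then be authorized by a majority of all Sentinel processes. The sketch below models just that arithmetic; the quorum value of 2 is an assumption for illustration (the chart default shown in this article is an empty string), and the function names are hypothetical.

```python
def majority(total_sentinels: int) -> int:
    """Smallest number of Sentinels that forms a strict majority."""
    return total_sentinels // 2 + 1


def can_failover(total_sentinels: int, reachable: int, quorum: int) -> bool:
    """True if the reachable Sentinels can both detect the failure and authorize a failover.

    Detection needs `quorum` agreeing Sentinels; authorization needs a
    majority of all configured Sentinels.
    """
    return reachable >= quorum and reachable >= majority(total_sentinels)


print(can_failover(total_sentinels=3, reachable=2, quorum=2))  # True
print(can_failover(total_sentinels=3, reachable=1, quorum=2))  # False
print(can_failover(total_sentinels=2, reachable=1, quorum=1))  # False
```

The last case shows why two Sentinels are not enough: losing one of them leaves no majority, so no failover can be authorized.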
Check the Pod status:
$ kubectl get pod -n ot-operators
NAME READY STATUS RESTARTS AGE
redis-replication-0 1/1 Running 0 107s
redis-replication-1 1/1 Running 0 105s
redis-replication-2 1/1 Running 0 101s
redis-sentinel-sentinel-0 1/1 Running 0 67s
redis-sentinel-sentinel-1 1/1 Running 0 59s
redis-sentinel-sentinel-2 1/1 Running 0 51s
Check the Service status:
root@k8s-master01:~/redis# kubectl get svc -n redis-server
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
redis-replication ClusterIP 10.244.45.240 <none> 6379/TCP 113s
redis-replication-additional ClusterIP 10.244.7.159 <none> 6379/TCP 113s
redis-replication-headless ClusterIP None <none> 6379/TCP 113s
redis-replication-master ClusterIP 10.244.51.62 <none> 6379/TCP 113s
redis-replication-replica ClusterIP 10.244.9.249 <none> 6379/TCP 113s
redis-sentinel-sentinel ClusterIP 10.244.6.134 <none> 26379/TCP 44s
redis-sentinel-sentinel-additional ClusterIP 10.244.12.179 <none> 26379/TCP 43s
redis-sentinel-sentinel-headless ClusterIP None <none> 26379/TCP 44s
Notes:
- Clients obtain the current master's IP and port from Sentinel, so clients should connect to the Sentinel address.
- When the master fails, Sentinel automatically elects a new master, and clients get the up-to-date master information through Sentinel.
2) Deploy with YAML
The following is a minimal Redis Sentinel YAML example:
apiVersion: redis.redis.opstreelabs.in/v1beta2
kind: RedisSentinel
metadata:
  name: redis-sentinel
spec:
  clusterSize: 3
  podSecurityContext:
    runAsUser: 1000
    fsGroup: 1000
  redisSentinelConfig:
    redisReplicationName: redis-replication # must be the name of an existing RedisReplication
  kubernetesConfig:
    image: quay.io/opstree/redis-sentinel:v7.0.15
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        cpu: 101m
        memory: 128Mi
      limits:
        cpu: 101m
        memory: 128Mi
Apply the YAML file:
kubectl apply -f sentinel.yaml
Important:
- The redisReplicationName field must match the name of an existing RedisReplication resource.
- RedisSentinel monitors the Redis master and replica nodes created by RedisReplication.
- Therefore, the RedisReplication resource must be deployed before Sentinel.
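3.4.3 Verification
Prerequisite: passwords are configured for both redis-replication and redis-sentinel.
With the redis-cli client installed on the local machine, a test connection to the Sentinel node succeeds: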
root@k8s-master01:~/redis# redis-cli -h 10.244.6.134 -p 26379 -a P@ssw0rd
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
10.244.6.134:26379>
The status shows 1 master, 2 replicas, and 3 Sentinel nodes:
10.244.6.134:26379> INFO Sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_tilt_since_seconds:-1
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=myMaster,status=ok,address=10.244.95.228:6379,slaves=2,sentinels=3
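The `master0:` line of `INFO Sentinel` packs the key facts into one comma-separated record. A minimal sketch for turning it into structured data, for example in a health-check script (`parse_sentinel_master` is a hypothetical helper; the sample line is taken from the output above):

```python
def parse_sentinel_master(line: str) -> dict:
    """Parse a `masterN:` line from INFO Sentinel into a dict."""
    _, _, body = line.strip().partition(":")
    fields = dict(item.split("=", 1) for item in body.split(","))
    host, _, port = fields["address"].rpartition(":")
    return {
        "name": fields["name"],
        "status": fields["status"],
        "host": host,
        "port": int(port),
        "replicas": int(fields["slaves"]),
        "sentinels": int(fields["sentinels"]),
    }


line = "master0:name=myMaster,status=ok,address=10.244.95.228:6379,slaves=2,sentinels=3"
info = parse_sentinel_master(line)
print(info["name"], info["host"], info["port"])  # myMaster 10.244.95.228 6379
```

A check built on this could alert when `status` is not ok or when `sentinels` drops below the expected count of 3.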
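Common commands
SENTINEL masters # list all monitored masters
SENTINEL get-master-addr-by-name <master-name> # get the address of a master
SENTINEL slaves <master-name> # list the replicas of a master
INFO Sentinel # show Sentinel's own status
3.5 Redis failure simulation test
This test uses Redis Replication together with Redis Sentinel. It simulates a master outage to check whether a replica is automatically promoted to master and whether traffic is then routed to the new master.
The replication plus Sentinel environment is prepared as follows: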
root@k8s-master01:~# kubectl get pod,svc -n juicefs | grep redis
pod/redis-operator-865fcc9887-smj5b 1/1 Running 0 45m
pod/redis-replication-0 1/1 Running 0 25m
pod/redis-replication-1 1/1 Running 0 25m
pod/redis-replication-2 1/1 Running 0 25m
pod/redis-sentinel-sentinel-0 1/1 Running 0 2m29s
pod/redis-sentinel-sentinel-1 1/1 Running 0 2m27s
pod/redis-sentinel-sentinel-2 1/1 Running 0 2m26s
service/redis-replication ClusterIP 10.244.18.194 <none> 6379/TCP 44m
service/redis-replication-additional ClusterIP 10.244.11.114 <none> 6379/TCP 44m
service/redis-replication-headless ClusterIP None <none> 6379/TCP 44m
service/redis-replication-master ClusterIP 10.244.16.33 <none> 6379/TCP 44m
service/redis-replication-replica ClusterIP 10.244.9.234 <none> 6379/TCP 44m
service/redis-sentinel-sentinel ClusterIP 10.244.16.240 <none> 26379/TCP 2m24s
service/redis-sentinel-sentinel-additional ClusterIP 10.244.1.80 <none> 26379/TCP 2m24s
service/redis-sentinel-sentinel-headless ClusterIP None <none> 26379/TCP 2m25s
Here, the redis-replication-master Service points at the master node of the replication group.
Its manifest shows that it selects Pods that carry the redis-role: master label:
root@k8s-master01:~# kubectl get svc -n juicefs redis-replication-master -o yaml
...
spec:
  selector:
    app: redis-replication
    app.kubernetes.io/component: middleware
    app.kubernetes.io/instance: redis-replication
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: redis-replication
    app.kubernetes.io/version: 0.16.7
    helm.sh/chart: redis-replication-0.16.7
    redis-role: master
    redis_setup_type: replication
    role: replication
  sessionAffinity: None
  type: ClusterIP
....
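The selector above is what lets the Service follow the master: a Service routes to every Pod whose labels contain all selector entries. The sketch below models that matching and a role change. The selector is shortened to two labels, and the replica label value `slave` is an assumption for illustration, since only the master label is visible in this article.

```python
def matches(selector: dict, labels: dict) -> bool:
    """A Pod matches when every selector key/value is present in its labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints(selector: dict, pods: dict) -> list[str]:
    """Names of the Pods a Service with this selector would route to."""
    return [name for name, labels in pods.items() if matches(selector, labels)]


selector = {"app": "redis-replication", "redis-role": "master"}
pods = {
    "redis-replication-0": {"app": "redis-replication", "redis-role": "master"},
    "redis-replication-1": {"app": "redis-replication", "redis-role": "slave"},
    "redis-replication-2": {"app": "redis-replication", "redis-role": "slave"},
}
print(endpoints(selector, pods))  # ['redis-replication-0']

# After a failover, relabeling moves the Service to the new master.
pods["redis-replication-0"]["redis-role"] = "slave"
pods["redis-replication-1"]["redis-role"] = "master"
print(endpoints(selector, pods))  # ['redis-replication-1']
```

Clients that use the Service name therefore do not need to know which Pod currently holds the master role.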
Test
The labels show that redis-replication-0 is currently the master:
root@k8s-master01:~# kubectl get pod -n juicefs --show-labels | grep master
redis-replication-0 1/1 Running 0 29m app.kubernetes.io/component=middleware,app.kubernetes.io/instance=redis-replication,app.kubernetes.io/managed-by=Helm,app.kubernetes.io/name=redis-replication,app.kubernetes.io/version=0.16.7,app=redis-replication,apps.kubernetes.io/pod-index=0,controller-revision-hash=redis-replication-5487f6cd66,helm.sh/chart=redis-replication-0.16.7,redis-role=master,redis_setup_type=replication,role=replication,statefulset.kubernetes.io/pod-name=redis-replication-0
Now take down redis-replication-0 by shutting down the node it runs on:
root@k8s-node-01:~# init 0
4. Monitoring
4.1 Redis Exporter
The Redis Operator ships with a built-in Redis Exporter that exports Redis metrics in Prometheus format.
[Figure: monitoring architecture diagram]

For Redis resources deployed with Helm, enabling the Redis Exporter only requires changing the following parameters in values.yaml:
redisExporter:
  enabled: true
  image: quay.io/opstree/redis-exporter:1.0
  imagePullPolicy: Always
or
redisExporter:
  enabled: true
  image: quay.io/opstree/redis-exporter
  imagePullPolicy: IfNotPresent
  tag: v1.44.0
Then upgrade the release:
root@k8s-master01:~/redis# helm upgrade -n redis-server redis-replication ot-helm/redis-replication -f values.yaml
Release "redis-replication" has been upgraded. Happy Helming!
NAME: redis-replication
LAST DEPLOYED: Tue Jul 22 17:10:48 2025
NAMESPACE: redis-server
STATUS: deployed
REVISION: 2
TEST SUITE: None
Check the Redis Exporter deployment
Each Pod now runs 2 containers (READY 2/2), because redis-exporter runs as a sidecar inside the Pod:
root@k8s-master01:~/redis# kubectl get pod -n redis-server
NAME READY STATUS RESTARTS AGE
redis-replication-0 2/2 Running 0 9m21s
redis-replication-1 2/2 Running 0 9m29s
redis-replication-2 2/2 Running 0 9m37s
Inspect the Pod details:
root@k8s-master01:~/redis# kubectl get pod -n redis-server redis-replication-0 -o yaml
...
status:
  containerStatuses:
  - containerID: containerd://bf31f0fddf0a0b3d43efd6f5fae1c86821d1412d23323c724ebf4e10cb0d4527
    image: quay.io/opstree/redis-exporter:v1.44.0
    imageID: quay.io/opstree/redis-exporter@sha256:a63d2b6e946f8b82467ec5f853c24ab994b7cab5eacc3c3cbaa50e49bd27f235
    lastState: {}
    name: redis-exporter
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2025-07-22T09:11:12Z"
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-ps47p
      readOnly: true
      recursiveReadOnly: Disabled
...
The Service also exposes one more port. Port 9121 is the metrics port to scrape:
root@k8s-master01:~/redis# kubectl get svc -n redis-server redis-replication -o yaml
...
spec:
  ports:
  ...
  - name: redis-exporter
    port: 9121
    protocol: TCP
    targetPort: 9121
...
Prometheus can now be configured to scrape these metrics.
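Before wiring up Prometheus, it can help to look at what the exporter returns. The exporter serves plain text in the Prometheus exposition format on port 9121. The sketch below parses such text into a dict; the sample payload and its values are made up for illustration, and the parser deliberately ignores the optional timestamp field that the format allows.

```python
SAMPLE = """\
# HELP redis_up Information about the Redis instance
# TYPE redis_up gauge
redis_up 1
# TYPE redis_connected_clients gauge
redis_connected_clients 4
"""


def parse_metrics(text: str) -> dict:
    """Parse Prometheus text exposition lines into {series: value}.

    Comment lines (# HELP, # TYPE) are skipped. Each sample line is
    assumed to be `<name>{labels} <value>` with no trailing timestamp.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        metrics[series] = float(value)
    return metrics


metrics = parse_metrics(SAMPLE)
print(metrics["redis_up"])  # 1.0
```

In a cluster, the same text could be fetched from the redis-exporter port of the Service (for example with `urllib.request.urlopen` against port 9121 and the `/metrics` path) and passed to `parse_metrics`.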
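4.2 ServiceMonitor
Once the Redis Exporter is enabled, the Redis Operator charts also support a serviceMonitor setting, which automatically creates a ServiceMonitor resource. If Prometheus Operator is running in the environment, the metrics are then collected with very little extra work.
Edit values.yaml:
serviceMonitor:
  enabled: true # change to true
  extraLabels: {}
  interval: 30s
  namespace: monitoring
  scrapeTimeout: 10s
Upgrade the release: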
root@k8s-master01:~/redis# helm upgrade -n redis-server redis-replication ot-helm/redis-replication --install -f values.yaml
Release "redis-replication" has been upgraded. Happy Helming!
NAME: redis-replication
LAST DEPLOYED: Tue Jul 22 17:53:35 2025
NAMESPACE: redis-server
STATUS: deployed
REVISION: 4
TEST SUITE: None
View the ServiceMonitor that was created:
root@k8s-master01:~/redis# kubectl get serviceMonitor -n redis-server
NAME AGE
redis-replication-prometheus-monitoring 22s
root@k8s-master01:~/redis# kubectl get serviceMonitor -n redis-server -o yaml
apiVersion: v1
items:
- apiVersion: monitoring.coreos.com/v1
  kind: ServiceMonitor
  metadata:
    annotations:
      meta.helm.sh/release-name: redis-replication
      meta.helm.sh/release-namespace: redis-server
    creationTimestamp: "2025-07-22T09:53:36Z"
    generation: 1
    labels:
      app.kubernetes.io/component: middleware
      app.kubernetes.io/instance: redis-replication
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: redis-replication
      app.kubernetes.io/version: 0.16.7
      helm.sh/chart: redis-replication-0.16.7
    name: redis-replication-prometheus-monitoring
    namespace: redis-server
    resourceVersion: "3076926"
    uid: d13aeee2-c748-479b-be9e-79f49b3882fd
  spec:
    endpoints:
    - interval: 30s
      port: redis-exporter
      scrapeTimeout: 10s
    namespaceSelector:
      matchNames:
      - redis-server
    selector:
      matchLabels:
        app: redis-replication
        redis_setup_type: replication
        role: replication
kind: List
metadata:
  resourceVersion: ""