Redis Clustering Modes Explained: Master-Slave, Sentinel, and Cluster

Published: 2023-10-10 05:28:37

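Redis offers three clustering modes:

  • Master-slave mode
  • Sentinel mode
  • Cluster mode

More information about Redis is available on the official site, https://redis.io. At the time of writing, the latest Redis release was 6.0.10.
Download link for the source package: https://download.redis.io/releases/redis-6.0.10.tar.gz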


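Redis Clustering Modes Explained

    • I. Master-slave mode
        • 1) Overview of master-slave mode
        • 2) Setting up master-slave mode
    • II. Sentinel mode
        • 1) Overview of Sentinel mode
        • 2) Setting up Sentinel mode
    • III. Cluster mode
        • 1) Overview of Cluster mode
        • 2) Setting up Cluster mode
        • 3) Cluster operations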


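I. Master-slave mode

1) Overview of master-slave mode

Redis master-slave replication works much like MySQL master-slave replication. The databases fall into two roles: the master database and the slave database.
Master-slave replication has the following characteristics:

  • The master accepts reads and writes; a slave accepts reads only (a slave can be configured to accept writes, but this is not recommended)
  • When an operation on the master changes data, the change is automatically synchronized to the slaves
  • One master can have multiple slaves, but a slave can have only one master
  • If a slave goes down, reads on the other slaves and reads and writes on the master are unaffected; once the slave is restarted, it automatically resynchronizes its data from the master
  • If the master goes down, reads on the slaves are unaffected, but Redis no longer accepts writes; write service resumes once the master is restarted
  • If the master goes down, no slave is promoted to replace it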

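How master-slave replication works:
When a slave starts, it sends a SYNC command to the master (PSYNC in Redis 2.8 and later). On receiving it, the master saves a snapshot in the background (RDB persistence) and buffers the write commands that arrive while the snapshot is being saved, then sends the snapshot file and the buffered commands to the slave. The slave loads the snapshot file and then executes the buffered commands.
After this initial synchronization, every write command the master receives is propagated to the slaves, which keeps the slave data consistent with the master. Propagation is asynchronous, so a slave can briefly lag behind.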
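
The synchronization steps can be sketched as a toy in-memory model. This is only an illustration, assuming a plain dict stands in for the data set and for the RDB snapshot; the class and method names (Master, Replica, full_sync) are made up for this example and are not Redis APIs.

```python
class Replica:
    """Toy slave: holds a copy of the master's data."""

    def __init__(self):
        self.data = {}

    def load(self, snapshot):
        # Loading the snapshot replaces whatever the slave had before.
        self.data = dict(snapshot)

    def apply(self, command):
        op, key, value = command
        if op == "SET":
            self.data[key] = value


class Master:
    """Toy master: a dict plus the list of attached slaves."""

    def __init__(self):
        self.data = {}
        self.replicas = []

    def set(self, key, value):
        # Apply the write locally, then propagate it to attached slaves.
        self.data[key] = value
        for replica in self.replicas:
            replica.apply(("SET", key, value))

    def full_sync(self, replica, writes_during_snapshot=()):
        # 1. Take a snapshot (stands in for the RDB file).
        snapshot = dict(self.data)
        # 2. Writes arriving while the snapshot is saved are applied
        #    normally and also buffered for the new slave.
        buffered = []
        for key, value in writes_during_snapshot:
            self.set(key, value)
            buffered.append(("SET", key, value))
        # 3. Send the snapshot, replay the buffer, then attach the slave
        #    so that later writes are propagated to it.
        replica.load(snapshot)
        for command in buffered:
            replica.apply(command)
        self.replicas.append(replica)


master = Master()
master.set("k1", "v1.sanchar")
slave = Replica()
master.full_sync(slave, writes_during_snapshot=[("k2", "v2.lisa")])
master.set("k3", "v3.tina")
print(slave.data == master.data)
```

The model is synchronous for simplicity; in real Redis the propagation step is asynchronous.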

Drawback of master-slave mode:
There is only one master node in master-slave mode; if it goes down, Redis can no longer accept writes.

2) Setting up master-slave mode

  • Environment
Role     Hostname   IP address
master   redis-0    192.168.1.29
slave    redis-1    192.168.1.30
slave    redis-2    192.168.1.31
  • Install the build tools on all nodes
[root@redis-0 ~]# yum -y install gcc automake autoconf libtool make
  • Install Redis on all nodes
[root@redis-0 ~]# wget -P /usr/local/src/ https://download.redis.io/releases/redis-6.0.10.tar.gz
[root@redis-0 ~]# cd /usr/local/src/ && tar -zxvf redis-6.0.10.tar.gz && mv redis-6.0.10 /usr/local/redis/ && cd /usr/local/redis/
[root@redis-0 src]# make install
  • Configure Redis as a service on all nodes
[root@redis-0 bin]# cat /usr/lib/systemd/system/redis.service
[Unit]
Description=Redis
After=network.target

[Service]
Type=forking
PIDFile=/var/run/redis_6379.pid
ExecStart=/usr/local/bin/redis-server /usr/local/redis/redis.conf
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
PrivateTmp=true

[Install]
WantedBy=multi-user.target

[root@redis-0 bin]# systemctl daemon-reload
[root@redis-0 bin]# systemctl enable redis
  • Edit the configuration files

On 192.168.1.29, edit the Redis configuration file to make this node the master:

[root@redis-0 ~]# mkdir -p /data/redis/data/
[root@redis-0 ~]# vi /usr/local/redis/redis.conf
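-------- find and modify the following settings --------
# (comments are on their own lines; redis.conf does not accept a trailing comment after a directive)
# accept connections on all interfaces
bind 0.0.0.0
# run in the background
daemonize yes
# log file path
logfile "/usr/local/redis/redis.log"
# directory for the database dump files
dir /data/redis/data/
# Redis connection password; if it is not set, the slaves do not need masterauth either
requirepass 123456
# create appendonly.aof under /data/redis/data/ and append every write request to it
appendonly yes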

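On 192.168.1.30, edit the Redis configuration file to make this node a slave:

[root@redis-1 ~]# mkdir -p /data/redis/data/
[root@redis-1 ~]# vi /usr/local/redis/redis.conf
-------- find and modify the following settings --------
# accept connections on all interfaces
bind 0.0.0.0
# run in the background
daemonize yes
# log file path
logfile "/usr/local/redis/redis.log"
# directory for the database dump files
dir /data/redis/data/
# the master this slave replicates from
replicaof 192.168.1.29 6379
# password used to connect to the master; not needed if the master has no password
masterauth 123456
# Redis connection password
requirepass 123456
appendonly yes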

On 192.168.1.31, edit the Redis configuration file to make this node a slave:

[root@redis-2 ~]# mkdir -p /data/redis/data/
[root@redis-2 ~]# vi /usr/local/redis/redis.conf
-------- find and modify the following settings --------
# accept connections on all interfaces
bind 0.0.0.0
# run in the background
daemonize yes
# log file path
logfile "/usr/local/redis/redis.log"
# directory for the database dump files
dir /data/redis/data/
# the master this slave replicates from
replicaof 192.168.1.29 6379
# password used to connect to the master; not needed if the master has no password
masterauth 123456
# Redis connection password
requirepass 123456
appendonly yes
  • Start Redis on all nodes
[root@redis-0 ~]# service redis start
  • Check the replication status
[root@redis-0 ~]# redis-cli -h 192.168.1.29 -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.1.29:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.1.30,port=6379,state=online,offset=154,lag=0
slave1:ip=192.168.1.31,port=6379,state=online,offset=154,lag=1
master_replid:683faac7a821e3a8f29c8a642ea5443b7a5d23c7
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:154
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:154
[root@redis-1 redis]# redis-cli -h 192.168.1.30 -a 123456 info replication
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:slave
master_host:192.168.1.29
master_port:6379
master_link_status:up
master_last_io_seconds_ago:8
master_sync_in_progress:0
slave_repl_offset:308
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:683faac7a821e3a8f29c8a642ea5443b7a5d23c7
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:308
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:308
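
For scripted health checks, the INFO replication output can be turned into a dictionary. The following is a minimal sketch that assumes the text has already been captured (for example from redis-cli); parse_info and parse_slave_entry are hypothetical helper names, and the sample is an excerpt of the master's output shown above.

```python
SAMPLE = """\
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.1.30,port=6379,state=online,offset=154,lag=0
slave1:ip=192.168.1.31,port=6379,state=online,offset=154,lag=1
master_repl_offset:154
"""


def parse_info(text):
    """Parse 'key:value' lines, skipping blank lines and '# Section' headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


def parse_slave_entry(value):
    """Split 'ip=...,port=...,state=...' into a dict of strings."""
    return dict(field.split("=", 1) for field in value.split(","))


info = parse_info(SAMPLE)
slaves = [
    parse_slave_entry(info["slave%d" % i])
    for i in range(int(info["connected_slaves"]))
]
for slave in slaves:
    print(slave["ip"], slave["state"], "lag", slave["lag"])
```

All values stay strings here; a caller would convert offset and lag to integers where needed.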
  • Data demo
192.168.1.29:6379> keys *
(empty array)
192.168.1.29:6379> set k1 v1.sanchar
OK
192.168.1.29:6379> set k2 v2.lisa
OK
192.168.1.29:6379> keys *
1) "k2"
2) "k1"
[root@redis-1 redis]# redis-cli -h 192.168.1.30 -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.1.30:6379> keys *
1) "k2"
2) "k1"
192.168.1.30:6379> get k1
"v1.sanchar"
192.168.1.30:6379> get k2
"v2.lisa"
192.168.1.30:6379> CONFIG GET dir
1) "dir"
2) "/data/redis/data"
192.168.1.30:6379> CONFIG GET dbfilename
1) "dbfilename"
2) "dump.rdb"
192.168.1.30:6379> set k3 v3.tina
(error) READONLY You can't write against a read only replica.

The test above shows that data written on the master is quickly synchronized to the slave, and that the slave itself rejects writes.

 

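II. Sentinel mode

1) Overview of Sentinel mode

Redis Sentinel is the high-availability solution that Redis provides. A sentinel runs as a separate process. It can monitor any number of masters together with their slaves, and it automatically performs a failover when a monitored master goes offline.

Sentinel mode has the following characteristics:

  • Sentinel mode builds on master-slave mode; with only a single Redis node, sentinel serves no purpose
  • When the master goes down, sentinel promotes one of the slaves to master and rewrites the configuration files; the other slaves' files are changed as well, for example their slaveof (replicaof) setting is pointed at the new master
  • When the old master restarts, it is no longer the master; it becomes a slave and synchronizes from the new master
  • A sentinel is itself a process and can fail, so several sentinels are started together to form a sentinel cluster
  • When several sentinels are configured, they also monitor one another automatically
  • When the master-slave setup uses a password, sentinel also writes the corresponding settings to the configuration files, so no manual change is needed
  • One sentinel or sentinel cluster can manage multiple master-slave Redis groups, and multiple sentinels can monitor the same Redis
  • Preferably do not deploy sentinel on the same machine as Redis; otherwise, when that server goes down, the sentinel goes down with it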

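How Sentinel mode works:

  • Each sentinel sends a PING command once per second to every master, slave, and other sentinel instance it knows
  • If the time since an instance's last valid reply to PING exceeds the value of the down-after-milliseconds option, the sentinel marks that instance as subjectively down (SDOWN)
  • If a master is marked subjectively down, all sentinels monitoring that master check once per second whether it really has entered the subjectively down state
  • When enough sentinels (at least the number set in the configuration file) confirm within the specified time range that the master is subjectively down, the master is marked as objectively down (ODOWN)
  • Under normal conditions, each sentinel sends an INFO command once every 10 seconds to all masters and slaves it knows
  • When a master is marked objectively down, the sentinels send INFO to all slaves of that master once per second instead of once every 10 seconds
  • If not enough sentinels agree that the master is down, its objectively down status is cleared; if the master again returns a valid reply to a sentinel's PING, its subjectively down status is cleared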
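
The two-stage down detection described in this list can be sketched as follows. This is a simplified model, not Sentinel's implementation: the function names are made up for illustration, and the n/2+1 quorum rule is the recommendation given in this article's sentinel.conf comments.

```python
def is_subjectively_down(last_valid_reply_ms, now_ms, down_after_ms=30000):
    """One sentinel's local view: no valid PING reply for longer than
    down-after-milliseconds means SDOWN."""
    return (now_ms - last_valid_reply_ms) > down_after_ms


def is_objectively_down(sdown_votes, quorum):
    """ODOWN: at least `quorum` sentinels report the master as SDOWN.
    `sdown_votes` maps a sentinel name to its SDOWN flag."""
    agreeing = sum(1 for is_down in sdown_votes.values() if is_down)
    return agreeing >= quorum


def suggested_quorum(sentinel_count):
    """The n/2+1 rule of thumb (integer division)."""
    return sentinel_count // 2 + 1


# Three sentinels; two of them have not heard from the master for 31 s.
now = 100_000
last_reply = {"sentinel-29": 69_000, "sentinel-30": 69_000, "sentinel-31": 99_000}
votes = {name: is_subjectively_down(t, now) for name, t in last_reply.items()}
print(votes)
print(is_objectively_down(votes, suggested_quorum(len(votes))))
```

With three sentinels the suggested quorum is 2, which matches the value used in the sentinel monitor line of this setup.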

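When using Sentinel mode, clients should not connect to Redis directly. Instead, they connect to a sentinel's IP and port and let the sentinel tell them which Redis instance is currently serving. When the master goes down, sentinel detects it and hands the new master to the client.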

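2) Setting up Sentinel mode

  • Environment
Role     Hostname   IP address     Sentinel port
master   redis-0    192.168.1.29   26379
slave    redis-1    192.168.1.30   26379
slave    redis-2    192.168.1.31   26379
  • Edit the configuration files

Sentinel mode is built on top of master-slave mode, so we reuse the master-slave environment set up above and only edit the sentinel configuration file.

On 192.168.1.29, 192.168.1.30, and 192.168.1.31, edit the sentinel configuration file: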

[root@redis-0 ~]# mkdir -p /data/redis/sentinel/
[root@redis-0 ~]# vi /usr/local/redis/sentinel.conf
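-------- find and modify the following settings --------
daemonize yes
logfile "/usr/local/redis/sentinel.log"
# sentinel working directory
dir /data/redis/sentinel/
# at least 2 sentinels must agree before the master is considered failed; n/2+1 is recommended, where n is the number of sentinels
sentinel monitor mymaster 192.168.1.29 6379 2
sentinel auth-pass mymaster 123456
# how long the master must be unresponsive before it is considered subjectively down; the default is 30 s
sentinel down-after-milliseconds mymaster 30000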
  • Start Sentinel on all nodes
[root@redis-0 ~]# redis-sentinel /usr/local/redis/sentinel.conf
[root@redis-0 ~]# tail -f /usr/local/redis/sentinel.log
8580:X 14 Jan 2021 03:36:46.294 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
8580:X 14 Jan 2021 03:36:46.294 # Redis version=6.0.10, bits=64, commit=00000000, modified=0, pid=8580, just started
8580:X 14 Jan 2021 03:36:46.294 # Configuration loaded
8580:X 14 Jan 2021 03:36:46.296 * Increased maximum number of open files to 10032 (it was originally set to 1024).
8580:X 14 Jan 2021 03:36:46.298 * Running mode=sentinel, port=26379.
8580:X 14 Jan 2021 03:36:46.298 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
8580:X 14 Jan 2021 03:36:46.300 # Sentinel ID is b5bebb2c0917cf00f95c2d8cb2a774003f427d23
8580:X 14 Jan 2021 03:36:46.300 # +monitor master mymaster 192.168.1.29 6379 quorum 2
8580:X 14 Jan 2021 03:36:46.301 * +slave slave 192.168.1.30:6379 192.168.1.30 6379 @ mymaster 192.168.1.29 6379
8580:X 14 Jan 2021 03:36:46.303 * +slave slave 192.168.1.31:6379 192.168.1.31 6379 @ mymaster 192.168.1.29 6379
8580:X 14 Jan 2021 03:40:19.358 * +sentinel sentinel 7668aa47a46ea60a562b3b3ffe43ae94547afc78 192.168.1.30 26379 @ mymaster 192.168.1.29 6379
8580:X 14 Jan 2021 03:40:22.063 * +sentinel sentinel f902518f9b7c57fedf5cdf832bd845c4443c6476 192.168.1.31 26379 @ mymaster 192.168.1.29 6379

The following lists the format of every message that can be received. The first word is the channel name; the rest is the format of the data:

    +reset-master <instance details> -- the master was reset.
    +slave <instance details> -- a slave was detected and added to the slave list.
    +failover-state-reconf-slaves <instance details> -- the failover state changed to reconf-slaves.
    +failover-detected <instance details> -- a failover took place.
    +slave-reconf-sent <instance details> -- the sentinel sent a SLAVEOF command to reconfigure the slave.
    +slave-reconf-inprog <instance details> -- the slave is being reconfigured as a slave of another master, but replication has not started yet.
    +slave-reconf-done <instance details> -- the slave has been reconfigured as a slave of another master and is synchronized with it.
    -dup-sentinel <instance details> -- redundant sentinels for the given master were removed (this can happen when a sentinel restarts).
    +sentinel <instance details> -- a sentinel was added for the master.
    +sdown <instance details> -- the instance entered the SDOWN state.
    -sdown <instance details> -- the instance left the SDOWN state.
    +odown <instance details> -- the instance entered the ODOWN state.
    -odown <instance details> -- the instance left the ODOWN state.
    +new-epoch <instance details> -- the current configuration version was updated.
    +try-failover <instance details> -- the failover conditions are met; waiting to be elected by the other sentinels.
    +elected-leader <instance details> -- this sentinel was elected to perform the failover.
    +failover-state-select-slave <instance details> -- starting to select a slave to promote to master.
    no-good-slave <instance details> -- no suitable slave is available to become the new master.
    +selected-slave <instance details> -- a suitable slave was found to become the new master.
    +failover-state-send-slaveof-noone <instance details> -- the selected slave is being switched to the master role.
    +failover-end-for-timeout <instance details> -- the failover failed because of a timeout.
    +failover-end <instance details> -- the failover completed successfully.
    +switch-master <master name> <oldip> <oldport> <newip> <newport> -- the master's address changed. This is usually the message clients care about most.
    +tilt -- entered Tilt mode.
    -tilt -- left Tilt mode.

The format of instance details is:
<instance-type> <name> <ip> <port> @ <master-name> <master-ip> <master-port>
If the Redis instance is itself a master, the part from @ onward is not shown.
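
A client that listens to these messages needs to split the payload into fields. Here is a minimal sketch of that parsing, assuming the payload text has already been extracted from the message or log line; the function names are hypothetical, and the sample strings are taken from the sentinel log in this article.

```python
def parse_instance_details(text):
    """Parse '<instance-type> <name> <ip> <port> [@ <master-name> <master-ip> <master-port>]'."""
    parts = text.split()
    instance = {
        "type": parts[0],
        "name": parts[1],
        "ip": parts[2],
        "port": int(parts[3]),
    }
    if "@" in parts:
        # Present only when the instance is not itself a master.
        at = parts.index("@")
        master_name, master_ip, master_port = parts[at + 1:at + 4]
        instance["master"] = {
            "name": master_name,
            "ip": master_ip,
            "port": int(master_port),
        }
    return instance


def parse_switch_master(text):
    """Parse '<master name> <oldip> <oldport> <newip> <newport>'."""
    name, old_ip, old_port, new_ip, new_port = text.split()
    return {
        "name": name,
        "old": (old_ip, int(old_port)),
        "new": (new_ip, int(new_port)),
    }


slave = parse_instance_details(
    "slave 192.168.1.30:6379 192.168.1.30 6379 @ mymaster 192.168.1.29 6379"
)
master = parse_instance_details("master mymaster 192.168.1.29 6379")
switch = parse_switch_master("mymaster 192.168.1.29 6379 192.168.1.30 6379")
print(slave["master"]["ip"], "->", switch["new"][0])
```

After a +switch-master message, a client would reconnect to the address in switch["new"].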

  • Master failure demo

Stop the Redis service on 192.168.1.29 and watch the sentinel log:

[root@redis-0 ~]# service redis stop
Redirecting to /bin/systemctl stop redis.service
[root@redis-0 ~]# tail -f /usr/local/redis/sentinel.log
8580:X 14 Jan 2021 03:36:46.303 * +slave slave 192.168.1.31:6379 192.168.1.31 6379 @ mymaster 192.168.1.29 6379
8580:X 14 Jan 2021 03:40:19.358 * +sentinel sentinel 7668aa47a46ea60a562b3b3ffe43ae94547afc78 192.168.1.30 26379 @ mymaster 192.168.1.29 6379
8580:X 14 Jan 2021 03:40:22.063 * +sentinel sentinel f902518f9b7c57fedf5cdf832bd845c4443c6476 192.168.1.31 26379 @ mymaster 192.168.1.29 6379
8580:X 14 Jan 2021 03:56:05.296 # +sdown master mymaster 192.168.1.29 6379
8580:X 14 Jan 2021 03:56:05.605 # +new-epoch 1
8580:X 14 Jan 2021 03:56:05.607 # +vote-for-leader 7668aa47a46ea60a562b3b3ffe43ae94547afc78 1
8580:X 14 Jan 2021 03:56:05.851 # +config-update-from sentinel 7668aa47a46ea60a562b3b3ffe43ae94547afc78 192.168.1.30 26379 @ mymaster 192.168.1.29 6379
8580:X 14 Jan 2021 03:56:05.851 # +switch-master mymaster 192.168.1.29 6379 192.168.1.30 6379
8580:X 14 Jan 2021 03:56:05.852 * +slave slave 192.168.1.31:6379 192.168.1.31 6379 @ mymaster 192.168.1.30 6379
8580:X 14 Jan 2021 03:56:05.852 * +slave slave 192.168.1.29:6379 192.168.1.29 6379 @ mymaster 192.168.1.30 6379
8580:X 14 Jan 2021 03:56:35.864 # +sdown slave 192.168.1.29:6379 192.168.1.29 6379 @ mymaster 192.168.1.30 6379

The log shows that the master role has moved from 192.168.1.29 to 192.168.1.30.

Check the Redis replication info on 192.168.1.30:

[root@redis-1 ~]# redis-cli -h 192.168.1.30 -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.1.30:6379> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=192.168.1.31,port=6379,state=online,offset=276967,lag=0
master_replid:b5b0f0440d68ecc751e393401d53ed0c195d2cae
master_replid2:683faac7a821e3a8f29c8a642ea5443b7a5d23c7
master_repl_offset:277106
second_repl_offset:209082
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:277106
192.168.1.30:6379> set k4 v4.sanchar
OK

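The group now has only one slave, 192.168.1.31. The master is 192.168.1.30, and 192.168.1.30 accepts writes.

The Redis configuration file on 192.168.1.31 now also contains replicaof 192.168.1.30 6379; sentinel made this change when it elected the new master.

Restart the Redis service on 192.168.1.29: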

[root@redis-0 ~]# service redis start
Redirecting to /bin/systemctl start redis.service
[root@redis-0 ~]# tail -f /usr/local/redis/sentinel.log
8580:X 14 Jan 2021 03:40:22.063 * +sentinel sentinel f902518f9b7c57fedf5cdf832bd845c4443c6476 192.168.1.31 26379 @ mymaster 192.168.1.29 6379
8580:X 14 Jan 2021 03:56:05.296 # +sdown master mymaster 192.168.1.29 6379
8580:X 14 Jan 2021 03:56:05.605 # +new-epoch 1
8580:X 14 Jan 2021 03:56:05.607 # +vote-for-leader 7668aa47a46ea60a562b3b3ffe43ae94547afc78 1
8580:X 14 Jan 2021 03:56:05.851 # +config-update-from sentinel 7668aa47a46ea60a562b3b3ffe43ae94547afc78 192.168.1.30 26379 @ mymaster 192.168.1.29 6379
8580:X 14 Jan 2021 03:56:05.851 # +switch-master mymaster 192.168.1.29 6379 192.168.1.30 6379
8580:X 14 Jan 2021 03:56:05.852 * +slave slave 192.168.1.31:6379 192.168.1.31 6379 @ mymaster 192.168.1.30 6379
8580:X 14 Jan 2021 03:56:05.852 * +slave slave 192.168.1.29:6379 192.168.1.29 6379 @ mymaster 192.168.1.30 6379
8580:X 14 Jan 2021 03:56:35.864 # +sdown slave 192.168.1.29:6379 192.168.1.29 6379 @ mymaster 192.168.1.30 6379
8580:X 14 Jan 2021 04:06:34.222 # -sdown slave 192.168.1.29:6379 192.168.1.29 6379 @ mymaster 192.168.1.30 6379
8580:X 14 Jan 2021 04:06:44.187 * +convert-to-slave slave 192.168.1.29:6379 192.168.1.29 6379 @ mymaster 192.168.1.30 6379

Check the Redis replication info on 192.168.1.29:

[root@redis-0 ~]# redis-cli -h 192.168.1.29 -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.1.29:6379> info replication
# Replication
role:slave
master_host:192.168.1.30
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:364769
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:b5b0f0440d68ecc751e393401d53ed0c195d2cae
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:364769
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:340532
repl_backlog_histlen:24238
192.168.1.29:6379> get k1
"v1.sanchar"
192.168.1.29:6379> set k5 v5.sanchar
(error) READONLY You can't write against a read only replica.

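As shown, even after the Redis service on 192.168.1.29 is restarted, it rejoins as a slave; 192.168.1.30 remains the master.

III. Cluster mode

1) Overview of Cluster mode

Sentinel mode meets most production needs and provides high availability. But when the data set grows too large to fit on one server, neither master-slave mode nor Sentinel mode is enough. The data then has to be sharded and stored across multiple Redis instances. Cluster mode exists to overcome the capacity limit of a single Redis machine by distributing the data across multiple machines according to a fixed rule.

Cluster mode can be seen as a combination of Sentinel and master-slave mode: it provides both replication and master re-election. With three shards and two copies of each shard (one master plus one slave), six Redis instances are needed. Because the data is distributed across the machines by rule, capacity can be expanded by adding machines when the data grows.

To use cluster mode, enable the cluster-enabled option in the Redis configuration file. Each cluster needs at least three masters to operate properly, and adding nodes is straightforward.

Cluster mode has the following characteristics:

  • Multiple Redis nodes are interconnected over the network and share the data set
  • Every master has one slave (one master with several slaves is also possible); the slaves serve no requests and act only as standbys
  • Multi-key operations (such as MSET/MGET) are not supported across nodes: Redis spreads keys over the nodes, and creating key-value pairs on several nodes at once under high concurrency would hurt performance and lead to unpredictable behavior. Such commands work only when all keys hash to the same slot
  • Nodes can be added and removed online
  • Clients can connect to any master node to read and write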

2) Setting up Cluster mode

  • Environment
Hostname   IP address     Ports
redis-0    192.168.1.29   7001, 7002
redis-1    192.168.1.30   7003, 7004
redis-2    192.168.1.31   7005, 7006
  • Edit the configuration files

On 192.168.1.29, edit the Redis configuration files:

[root@redis-0 ~]# mkdir -p /usr/local/redis/cluster
[root@redis-0 ~]# cp /usr/local/redis/redis.conf /usr/local/redis/cluster/redis_7001.conf
[root@redis-0 ~]# cp /usr/local/redis/redis.conf /usr/local/redis/cluster/redis_7002.conf
[root@redis-0 ~]# mkdir -p /data/redis/cluster/{redis_7001,redis_7002}
[root@redis-0 ~]# vi /usr/local/redis/cluster/redis_7001.conf
-------- find and modify the following settings --------
bind 0.0.0.0
port 7001
daemonize yes
pidfile "/var/run/redis_7001.pid"
logfile "/usr/local/redis/cluster/redis_7001.log"
dir "/data/redis/cluster/redis_7001"
# replicaof 192.168.1.30 6379
masterauth 123456
requirepass 123456
appendonly yes
cluster-enabled yes
cluster-config-file nodes-7001.conf
cluster-node-timeout 15000
[root@redis-0 ~]# vi /usr/local/redis/cluster/redis_7002.conf
-------- find and modify the following settings --------
bind 0.0.0.0
port 7002
daemonize yes
pidfile "/var/run/redis_7002.pid"
logfile "/usr/local/redis/cluster/redis_7002.log"
dir "/data/redis/cluster/redis_7002"
# replicaof 192.168.1.30 6379
masterauth 123456
requirepass 123456
appendonly yes
cluster-enabled yes
cluster-config-file nodes-7002.conf
cluster-node-timeout 15000

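The other two machines are configured the same way as 192.168.1.29; only the port-related values differ. You can copy redis_7001.conf over and, in vi command-line mode, run :1,$ s/7001/7002/g to replace every 7001 in the file with 7002 (use the target port in place of 7002).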
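
As an alternative to editing each copy by hand, the per-port settings can be generated from a template. The sketch below covers only the directives changed in this article, not a complete redis.conf; TEMPLATE and render_cluster_conf are names invented for this example.

```python
TEMPLATE = """\
bind 0.0.0.0
port {port}
daemonize yes
pidfile "/var/run/redis_{port}.pid"
logfile "/usr/local/redis/cluster/redis_{port}.log"
dir "/data/redis/cluster/redis_{port}"
masterauth 123456
requirepass 123456
appendonly yes
cluster-enabled yes
cluster-config-file nodes-{port}.conf
cluster-node-timeout 15000
"""


def render_cluster_conf(port):
    """Return the cluster-related settings for one instance."""
    return TEMPLATE.format(port=port)


# Ports per host, as listed in the environment table.
PORTS_BY_HOST = {
    "192.168.1.29": [7001, 7002],
    "192.168.1.30": [7003, 7004],
    "192.168.1.31": [7005, 7006],
}

for host, ports in PORTS_BY_HOST.items():
    for port in ports:
        text = render_cluster_conf(port)
        print("# %s redis_%d.conf: %d lines" % (host, port, len(text.splitlines())))
```

The rendered text would still need to be merged into a full configuration file, or the remaining defaults accepted, before starting redis-server with it.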

  • Start all Redis instances
[root@redis-0 ~]# redis-server /usr/local/redis/cluster/redis_7001.conf
[root@redis-0 ~]# tail -f /usr/local/redis/cluster/redis_7001.log
[root@redis-0 ~]# redis-server /usr/local/redis/cluster/redis_7002.conf
[root@redis-0 ~]# tail -f /usr/local/redis/cluster/redis_7002.log
······
  • Create the cluster
[root@redis-0 ~]# redis-cli -a 123456 --cluster create 192.168.1.29:7001 192.168.1.29:7002 192.168.1.30:7003 192.168.1.30:7004 192.168.1.31:7005 192.168.1.31:7006 --cluster-replicas 1
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 192.168.1.30:7004 to 192.168.1.29:7001
Adding replica 192.168.1.31:7006 to 192.168.1.30:7003
Adding replica 192.168.1.29:7002 to 192.168.1.31:7005
M: a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001
   slots:[0-5460] (5461 slots) master
S: 63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002
   replicates c6a7592db178cd947efe53e3581232dacef957fa
M: 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003
   slots:[5461-10922] (5462 slots) master
S: 48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004
   replicates a793d749cd104119404ea16575ef2d847f29bb85
M: c6a7592db178cd947efe53e3581232dacef957fa 192.168.1.31:7005
   slots:[10923-16383] (5461 slots) master
S: 8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006
   replicates 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410
Can I set the above configuration? (type 'yes' to accept): yes				# type yes to accept the configuration above
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
...
>>> Performing Cluster Check (using node 192.168.1.29:7001)
M: a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
M: c6a7592db178cd947efe53e3581232dacef957fa 192.168.1.31:7005
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
S: 8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006
   slots: (0 slots) slave
   replicates 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410
S: 63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002
   slots: (0 slots) slave
   replicates c6a7592db178cd947efe53e3581232dacef957fa
S: 48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004
   slots: (0 slots) slave
   replicates a793d749cd104119404ea16575ef2d847f29bb85
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

After the cluster is created, the output shows:
192.168.1.29:7001 is a master, and its slave is 192.168.1.30:7004;
192.168.1.30:7003 is a master, and its slave is 192.168.1.31:7006;
192.168.1.31:7005 is a master, and its slave is 192.168.1.29:7002.
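
The slot ranges in the creation output (0-5460, 5461-10922, 10923-16383) come from dividing the 16384 hash slots as evenly as possible among the masters. The sketch below is one way to reproduce those ranges; it is written for illustration, and allocate_slots is a hypothetical name rather than a redis-cli function.

```python
TOTAL_SLOTS = 16384


def allocate_slots(master_count):
    """Split slots 0..16383 into `master_count` contiguous ranges."""
    slots_per_node = TOTAL_SLOTS / master_count
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(master_count):
        # Round the running boundary so the remainder is spread out.
        last = round(cursor + slots_per_node - 1)
        # The final master always ends at the last slot.
        if last > TOTAL_SLOTS - 1 or i == master_count - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += slots_per_node
    return ranges


for index, (first, last) in enumerate(allocate_slots(3)):
    print("Master[%d] -> Slots %d - %d (%d slots)" % (index, first, last, last - first + 1))
```

For three masters this gives ranges of 5461, 5462, and 5461 slots, the same sizes reported in the output above.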

A nodes.conf file is generated automatically:

[root@redis-0 ~]# ls /data/redis/cluster/redis_7001/
appendonly.aof  dump.rdb  nodes-7001.conf
[root@redis-0 ~]# cat /data/redis/cluster/redis_7001/nodes-7001.conf
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 myself,master - 0 1610618769000 1 connected 0-5460
7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003@17003 master - 0 1610618771716 3 connected 5461-10922
c6a7592db178cd947efe53e3581232dacef957fa 192.168.1.31:7005@17005 master - 0 1610618772747 5 connected 10923-16383
8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006@17006 slave 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 0 1610618773770 3 connected
63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002@17002 slave c6a7592db178cd947efe53e3581232dacef957fa 0 1610618769666 5 connected
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 slave a793d749cd104119404ea16575ef2d847f29bb85 0 1610618774792 1 connected
vars currentEpoch 6 lastVoteEpoch 0

3) Cluster operations

  • Log in to the cluster
[root@redis-0 ~]# redis-cli -c -h 192.168.1.29 -p 7001 -a 123456
  • View cluster information
192.168.1.29:7001> CLUSTER INFO
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:564
cluster_stats_messages_pong_sent:595
cluster_stats_messages_sent:1159
cluster_stats_messages_ping_received:590
cluster_stats_messages_pong_received:564
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:1159
  • List the nodes
192.168.1.29:7001> CLUSTER NODES
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 myself,master - 0 1610619359000 1 connected 0-5460
7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003@17003 master - 0 1610619365948 3 connected 5461-10922
c6a7592db178cd947efe53e3581232dacef957fa 192.168.1.31:7005@17005 master - 0 1610619364922 5 connected 10923-16383
8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006@17006 slave 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 0 1610619361000 3 connected
63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002@17002 slave c6a7592db178cd947efe53e3581232dacef957fa 0 1610619362875 5 connected
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 slave a793d749cd104119404ea16575ef2d847f29bb85 0 1610619363897 1 connected

This output matches the contents of the nodes.conf file.
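
Each CLUSTER NODES line has the layout id, ip:port@bus-port, flags, master id, ping-sent, pong-received, config epoch, link state, then any slot ranges. The following is a minimal parsing sketch under that assumption; parse_cluster_nodes is a made-up helper, and the sample lines are copied from the output above.

```python
SAMPLE = """\
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 myself,master - 0 1610619359000 1 connected 0-5460
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 slave a793d749cd104119404ea16575ef2d847f29bb85 0 1610619363897 1 connected
vars currentEpoch 6 lastVoteEpoch 0
"""


def parse_cluster_nodes(text):
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        # Node lines have at least 8 fields; this also skips the
        # trailing "vars ..." line found in nodes.conf.
        if len(fields) < 8:
            continue
        node_id, address, flags, master_id = fields[:4]
        host_port, _, bus_port = address.partition("@")
        host, _, port = host_port.rpartition(":")
        nodes.append({
            "id": node_id,
            "host": host,
            "port": int(port),
            "flags": flags.split(","),
            "master_id": None if master_id == "-" else master_id,
            "config_epoch": int(fields[6]),
            "link_state": fields[7],
            "slots": fields[8:],
        })
    return nodes


nodes = parse_cluster_nodes(SAMPLE)
for node in nodes:
    role = "master" if "master" in node["flags"] else "slave"
    print(node["host"], node["port"], role, node["slots"])
```

Grouping the parsed entries by master_id is one way to recover the master-to-slave pairing listed after cluster creation.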

  • Write data
192.168.1.29:7001> set kk11 vv11.sanchar
-> Redirected to slot [15489] located at 192.168.1.31:7005
OK
192.168.1.31:7005> set kk22 vv22.lisa
-> Redirected to slot [6577] located at 192.168.1.30:7003
OK
192.168.1.30:7003> set kk33 vv33.tina
-> Redirected to slot [15009] located at 192.168.1.31:7005
OK
192.168.1.31:7005> get kk11
"vv11.sanchar"
192.168.1.31:7005> get kk22
-> Redirected to slot [6577] located at 192.168.1.30:7003
"vv22.lisa"
192.168.1.30:7003> get kk33
-> Redirected to slot [15009] located at 192.168.1.31:7005
"vv33.tina"

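This shows that Redis Cluster is decentralized: the nodes are peers, and data can be read and written through any of them.

Of course, "peers" refers to the master nodes; the slave nodes serve no requests and act only as backups of their masters.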
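
The redirections in the session above follow from how a key is mapped to a node: Redis Cluster takes CRC16 of the key modulo 16384 to get the hash slot, and the node owning that slot serves the key. Below is a minimal sketch of that mapping, assuming the slot ranges of this particular cluster and ignoring hash tags ({...}); the function names are illustrative only.

```python
def crc16_xmodem(data):
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key):
    """Hash slot of a key (hash tags are not handled in this sketch)."""
    return crc16_xmodem(key.encode()) % 16384


# Slot ownership as assigned when this article's cluster was created.
SLOT_OWNERS = [
    (0, 5460, "192.168.1.29:7001"),
    (5461, 10922, "192.168.1.30:7003"),
    (10923, 16383, "192.168.1.31:7005"),
]


def owner_of(key):
    slot = key_slot(key)
    for first, last, node in SLOT_OWNERS:
        if first <= slot <= last:
            return node
    raise ValueError("slot %d is not covered" % slot)


for key in ("kk11", "kk22", "kk33"):
    print(key, key_slot(key), owner_of(key))
```

The printed slots can be checked against the redirect messages in the session above, for example slot 15489 for kk11. A cluster-aware client performs the same computation locally so that it can contact the right node without being redirected.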

  • Add nodes

Add a node on 192.168.1.30:

[root@redis-1 ~]# mkdir -p /data/redis/cluster/redis_7007
[root@redis-1 ~]# cp /usr/local/redis/cluster/redis_7003.conf /usr/local/redis/cluster/redis_7007.conf
[root@redis-1 ~]# vi /usr/local/redis/cluster/redis_7007.conf
-------- find and modify the following settings --------
bind 0.0.0.0
port 7007
daemonize yes
pidfile "/var/run/redis_7007.pid"
logfile "/usr/local/redis/cluster/redis_7007.log"
dir "/data/redis/cluster/redis_7007"
# replicaof 192.168.1.30 6379
masterauth 123456
requirepass 123456
appendonly yes
cluster-enabled yes
cluster-config-file nodes-7007.conf
cluster-node-timeout 15000

[root@redis-1 ~]# redis-server /usr/local/redis/cluster/redis_7007.conf
[root@redis-1 ~]# tail -f /usr/local/redis/cluster/redis_7007.log

Add a node on 192.168.1.31:

[root@redis-2 ~]# mkdir -p /data/redis/cluster/redis_7008
[root@redis-2 ~]# cp /usr/local/redis/cluster/redis_7005.conf /usr/local/redis/cluster/redis_7008.conf
[root@redis-2 ~]# vi /usr/local/redis/cluster/redis_7008.conf
-------- find and modify the following settings --------
bind 0.0.0.0
port 7008
daemonize yes
pidfile "/var/run/redis_7008.pid"
logfile "/usr/local/redis/cluster/redis_7008.log"
dir "/data/redis/cluster/redis_7008"
# replicaof 192.168.1.30 6379
masterauth 123456
requirepass 123456
appendonly yes
cluster-enabled yes
cluster-config-file nodes-7008.conf
cluster-node-timeout 15000

[root@redis-2 ~]# redis-server /usr/local/redis/cluster/redis_7008.conf
[root@redis-2 ~]# tail -f /usr/local/redis/cluster/redis_7008.log

Add the new nodes to the cluster:

192.168.1.30:7003> CLUSTER MEET 192.168.1.30 7007
OK
192.168.1.30:7003> CLUSTER NODES
7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003@17003 myself,master - 0 1610620488000 3 connected 5461-10922
63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002@17002 slave c6a7592db178cd947efe53e3581232dacef957fa 0 1610620484000 5 connected
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 master - 0 1610620487317 1 connected 0-5460
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 slave a793d749cd104119404ea16575ef2d847f29bb85 0 1610620488331 1 connected
6ad2bee4592f9817e6ebeebc05cc1db0add5b327 192.168.1.30:7007@17007 master - 0 1610620487000 0 connected
8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006@17006 slave 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 0 1610620486000 3 connected
c6a7592db178cd947efe53e3581232dacef957fa 192.168.1.31:7005@17005 master - 0 1610620486297 5 connected 10923-16383
192.168.1.30:7003> CLUSTER MEET 192.168.1.31 7008
OK
192.168.1.30:7003> CLUSTER NODES
7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003@17003 myself,master - 0 1610620526000 3 connected 5461-10922
63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002@17002 slave c6a7592db178cd947efe53e3581232dacef957fa 0 1610620526000 5 connected
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 master - 0 1610620528000 1 connected 0-5460
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 slave a793d749cd104119404ea16575ef2d847f29bb85 0 1610620527038 1 connected
841ba3c63048e0e4969dc599f8dc286a5fa70d55 192.168.1.31:7008@17008 master - 0 1610620527000 0 connected
6ad2bee4592f9817e6ebeebc05cc1db0add5b327 192.168.1.30:7007@17007 master - 0 1610620529073 0 connected
8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006@17006 slave 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 0 1610620526020 3 connected
c6a7592db178cd947efe53e3581232dacef957fa 192.168.1.31:7005@17005 master - 0 1610620528061 5 connected 10923-16383

As shown, newly added nodes join the cluster as masters.

  • Change a node's role

Make the newly added node 192.168.1.31:7008 a slave of 192.168.1.30:7007:

[root@redis-2 ~]# redis-cli -c -h 192.168.1.31 -p 7008 -a 123456 cluster replicate 6ad2bee4592f9817e6ebeebc05cc1db0add5b327

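cluster replicate takes a node_id and turns the node that receives the command into a slave of that node. The change can also be made from an interactive session on the cluster.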

[root@redis-2 ~]# redis-cli -c -h 192.168.1.31 -p 7008 -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.1.31:7008> CLUSTER NODES
6ad2bee4592f9817e6ebeebc05cc1db0add5b327 192.168.1.30:7007@17007 master - 0 1610620990032 7 connected
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 slave a793d749cd104119404ea16575ef2d847f29bb85 0 1610620987983 1 connected
8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006@17006 slave 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 0 1610620988804 3 connected
63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002@17002 slave c6a7592db178cd947efe53e3581232dacef957fa 0 1610620989006 5 connected
841ba3c63048e0e4969dc599f8dc286a5fa70d55 192.168.1.31:7008@17008 myself,slave 6ad2bee4592f9817e6ebeebc05cc1db0add5b327 0 1610620987000 7 connected
7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003@17003 master - 0 1610620992072 3 connected 5461-10922
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 master - 0 1610620991056 1 connected 0-5460
c6a7592db178cd947efe53e3581232dacef957fa 192.168.1.31:7005@17005 master - 0 1610620986964 5 connected 10923-16383

The corresponding nodes.conf file has changed as well; it records the current node information of the cluster:

[root@redis-0 ~]# cat /data/redis/cluster/redis_7001/nodes-7001.conf
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 myself,master - 0 1610620873000 1 connected 0-5460
7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003@17003 master - 0 1610620880865 3 connected 5461-10922
c6a7592db178cd947efe53e3581232dacef957fa 192.168.1.31:7005@17005 master - 0 1610620878822 5 connected 10923-16383
6ad2bee4592f9817e6ebeebc05cc1db0add5b327 192.168.1.30:7007@17007 master - 0 1610620881887 7 connected
8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006@17006 slave 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 0 1610620882296 3 connected
63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002@17002 slave c6a7592db178cd947efe53e3581232dacef957fa 0 1610620877802 5 connected
841ba3c63048e0e4969dc599f8dc286a5fa70d55 192.168.1.31:7008@17008 slave 6ad2bee4592f9817e6ebeebc05cc1db0add5b327 0 1610620876000 7 connected
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 slave a793d749cd104119404ea16575ef2d847f29bb85 0 1610620879840 1 connected
vars currentEpoch 7 lastVoteEpoch 0
  • Remove nodes
192.168.1.29:7002> CLUSTER FORGET 63ac709e49b5e5234e86742d5020ceb8f40bcf2f
(error) ERR I tried hard but I can't forget myself...			# a node cannot forget itself
192.168.1.29:7002> CLUSTER FORGET c6a7592db178cd947efe53e3581232dacef957fa
(error) ERR Can't forget my master!			# a slave cannot forget its own master
192.168.1.29:7002> CLUSTER FORGET 6ad2bee4592f9817e6ebeebc05cc1db0add5b327
OK			# other master nodes can be forgotten
192.168.1.29:7002> CLUSTER NODES
8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006@17006 slave 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 0 1610621576842 3 connected
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 slave a793d749cd104119404ea16575ef2d847f29bb85 0 1610621573777 1 connected
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 master - 0 1610621574793 1 connected 0-5460
841ba3c63048e0e4969dc599f8dc286a5fa70d55 192.168.1.31:7008@17008 slave - 0 1610621575822 7 connected
7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003@17003 master - 0 1610621577863 3 connected 5461-10922
63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002@17002 myself,slave c6a7592db178cd947efe53e3581232dacef957fa 0 1610621571000 5 connected
c6a7592db178cd947efe53e3581232dacef957fa 192.168.1.31:7005@17005 master - 0 1610621576000 5 connected 10923-16383
192.168.1.29:7002> CLUSTER FORGET 841ba3c63048e0e4969dc599f8dc286a5fa70d55
OK			# other slave nodes can be forgotten
192.168.1.29:7002> CLUSTER NODES
8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006@17006 slave 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 0 1610621661472 3 connected
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 slave a793d749cd104119404ea16575ef2d847f29bb85 0 1610621664531 1 connected
6ad2bee4592f9817e6ebeebc05cc1db0add5b327 192.168.1.30:7007@17007 master - 0 1610621663514 7 connected
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 master - 0 1610621660448 1 connected 0-5460
7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003@17003 master - 0 1610621662488 3 connected 5461-10922
63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002@17002 myself,slave c6a7592db178cd947efe53e3581232dacef957fa 0 1610621650000 5 connected
c6a7592db178cd947ef
  • Save the configuration
192.168.1.29:7002> CLUSTER SAVECONFIG
OK
192.168.1.29:7002> CLUSTER NODES
8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006@17006 slave 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 0 1610621843074 3 connected
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 slave a793d749cd104119404ea16575ef2d847f29bb85 0 1610621842047 1 connected
6ad2bee4592f9817e6ebeebc05cc1db0add5b327 192.168.1.30:7007@17007 master - 0 1610621843477 7 connected
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 master - 0 1610621838984 1 connected 0-5460
841ba3c63048e0e4969dc599f8dc286a5fa70d55 192.168.1.31:7008@17008 slave 6ad2bee4592f9817e6ebeebc05cc1db0add5b327 0 1610621841031 7 connected
7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003@17003 master - 0 1610621836941 3 connected 5461-10922
63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002@17002 myself,slave c6a7592db178cd947efe53e3581232dacef957fa 0 1610621833000 5 connected
c6a7592db178cd947efe53e3581232dacef957fa 192.168.1.31:7005@17005 master - 0 1610621839595 5 connected 10923-16383
[root@redis-0 ~]# cat /data/redis/cluster/redis_7001/nodes-7001.conf
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 myself,master - 0 1610620873000 1 connected 0-5460
7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003@17003 master - 0 1610620880865 3 connected 5461-10922
c6a7592db178cd947efe53e3581232dacef957fa 192.168.1.31:7005@17005 master - 0 1610620878822 5 connected 10923-16383
6ad2bee4592f9817e6ebeebc05cc1db0add5b327 192.168.1.30:7007@17007 master - 0 1610620881887 7 connected
8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006@17006 slave 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 0 1610620882296 3 connected
63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002@17002 slave c6a7592db178cd947efe53e3581232dacef957fa 0 1610620877802 5 connected
841ba3c63048e0e4969dc599f8dc286a5fa70d55 192.168.1.31:7008@17008 slave 6ad2bee4592f9817e6ebeebc05cc1db0add5b327 0 1610620876000 7 connected
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 slave a793d749cd104119404ea16575ef2d847f29bb85 0 1610620879840 1 connected
vars currentEpoch 7 lastVoteEpoch 0

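As shown, the previously removed nodes are back. CLUSTER SAVECONFIG does not restore them by itself; it only forces the node to write its current cluster state to its nodes.conf file. The likely cause is that CLUSTER FORGET was sent to 192.168.1.29:7002 only: the other nodes (for example 192.168.1.29:7001, whose nodes-7001.conf is shown above) still list the removed nodes, so once the forgetting node's temporary ban on them expires (60 seconds, according to the Redis documentation), gossip messages re-introduce them. To remove a node permanently, CLUSTER FORGET has to be sent to every remaining node within that window.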
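Since CLUSTER FORGET only affects the node that receives it, removing a node cluster-wide means repeating the command on every other node. The following is a minimal sketch, not a definitive tool: it parses CLUSTER NODES output and prints the redis-cli commands that would need to be run. The function names `parse_cluster_nodes` and `forget_commands` are hypothetical helpers introduced for illustration, and the sample text is a shortened copy of the output above.

```python
def parse_cluster_nodes(text):
    """Parse CLUSTER NODES output into a list of dicts, one per node."""
    nodes = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip truncated or malformed lines
        # The address field looks like host:port@busport; drop the bus port.
        host, port = fields[1].split("@")[0].rsplit(":", 1)
        nodes.append({
            "id": fields[0],
            "host": host,
            "port": int(port),
            "flags": fields[2].split(","),
            "master_id": fields[3],  # "-" for masters
        })
    return nodes


def forget_commands(nodes_text, target_id):
    """Build one redis-cli CLUSTER FORGET command per node that can run it."""
    commands = []
    for node in parse_cluster_nodes(nodes_text):
        if node["id"] == target_id:
            continue  # a node cannot forget itself
        if node["master_id"] == target_id:
            continue  # a slave refuses to forget its own master
        commands.append(
            f"redis-cli -h {node['host']} -p {node['port']} "
            f"CLUSTER FORGET {target_id}"
        )
    return commands


# Shortened sample taken from the CLUSTER NODES output in this article.
SAMPLE = """
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 master - 0 1610621838984 1 connected 0-5460
6ad2bee4592f9817e6ebeebc05cc1db0add5b327 192.168.1.30:7007@17007 master - 0 1610621843477 7 connected
841ba3c63048e0e4969dc599f8dc286a5fa70d55 192.168.1.31:7008@17008 slave 6ad2bee4592f9817e6ebeebc05cc1db0add5b327 0 1610621841031 7 connected
"""

for command in forget_commands(SAMPLE, "6ad2bee4592f9817e6ebeebc05cc1db0add5b327"):
    print(command)
```

The sketch skips slaves of the target because they would answer with the "Can't forget my master!" error seen earlier; such slaves need to be removed or re-assigned first. In practice, `redis-cli --cluster del-node` is probably the more convenient way to remove a node from the whole cluster.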

  • Simulate a master node failure

Stop the Redis service listening on port 7001 on host 192.168.1.29 by killing the corresponding process:

[root@redis-0 ~]# netstat -lntp |grep 7001
tcp        0      0 0.0.0.0:17001           0.0.0.0:*               LISTEN      8745/redis-server 0 
tcp        0      0 0.0.0.0:7001            0.0.0.0:*               LISTEN      8745/redis-server 0 
[root@redis-0 ~]# kill -9 8745
192.168.1.31:7008> CLUSTER NODES
6ad2bee4592f9817e6ebeebc05cc1db0add5b327 192.168.1.30:7007@17007 master - 0 1610622184000 7 connected
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 master - 0 1610622184000 8 connected 0-5460
8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006@17006 slave 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 0 1610622184167 3 connected
63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002@17002 slave c6a7592db178cd947efe53e3581232dacef957fa 0 1610622186837 5 connected
841ba3c63048e0e4969dc599f8dc286a5fa70d55 192.168.1.31:7008@17008 myself,slave 6ad2bee4592f9817e6ebeebc05cc1db0add5b327 0 1610622180000 7 connected
7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003@17003 master - 0 1610622185809 3 connected 5461-10922
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 master,fail - 1610622153052 1610622146920 1 disconnected
c6a7592db178cd947e

In the line for 192.168.1.29:7001, the flags are now master,fail and the link state is disconnected; in the line for 192.168.1.30:7004, the former slave has become a master.
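Reading these flags by eye gets tedious on larger clusters. As a rough sketch, the CLUSTER NODES output can be summarized per node with a few lines of Python. The helper name `node_states` is hypothetical, and the field positions assume the output layout shown in this article (id, address, flags, master id, ping sent, pong received, config epoch, link state, slots).

```python
def node_states(nodes_text):
    """Map each node address to its role, failure flag, link state and slots."""
    states = {}
    for line in nodes_text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip truncated lines such as a cut-off node id
        addr = fields[1].split("@")[0]  # drop the cluster bus port
        flags = set(fields[2].split(","))
        states[addr] = {
            "role": "master" if "master" in flags else "slave",
            # "fail" is the confirmed failure flag; "fail?" would only be suspected.
            "failed": "fail" in flags,
            "link": fields[7],
            "slots": fields[8:],  # empty for slaves and for masters without slots
        }
    return states


# Sample lines copied from the output above, including one truncated line.
SAMPLE = """
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 master - 0 1610622184000 8 connected 0-5460
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 master,fail - 1610622153052 1610622146920 1 disconnected
c6a7592db178cd947e
"""

for addr, state in node_states(SAMPLE).items():
    print(addr, state)
```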

  • Restart the 192.168.1.29:7001 node
[root@redis-0 ~]# redis-server /usr/local/redis/cluster/redis_7001.conf
192.168.1.31:7008> CLUSTER NODES
6ad2bee4592f9817e6ebeebc05cc1db0add5b327 192.168.1.30:7007@17007 master - 0 1610622362000 7 connected
48d637f69399264472c6d02452d661922406a028 192.168.1.30:7004@17004 master - 0 1610622360797 8 connected 0-5460
8ec7c64f505533d99f5ba91cce6efbe8cbf3b32d 192.168.1.31:7006@17006 slave 7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 0 1610622361816 3 connected
63ac709e49b5e5234e86742d5020ceb8f40bcf2f 192.168.1.29:7002@17002 slave c6a7592db178cd947efe53e3581232dacef957fa 0 1610622362837 5 connected
841ba3c63048e0e4969dc599f8dc286a5fa70d55 192.168.1.31:7008@17008 myself,slave 6ad2bee4592f9817e6ebeebc05cc1db0add5b327 0 1610622361000 7 connected
7dbc8eb35fbbe38427f2f1e0744ba7ae96394410 192.168.1.30:7003@17003 master - 0 1610622364888 3 connected 5461-10922
a793d749cd104119404ea16575ef2d847f29bb85 192.168.1.29:7001@17001 slave 48d637f69399264472c6d02452d661922406a028 0 1610622363857 8 connected
c6a7592db178cd947efe53e3581232dacef957fa 192.168.1.31:7005@17005 master - 0 1610622365906 5 connected 10923-16383

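As shown, 192.168.1.29:7001 comes back as a slave node, specifically a slave of 192.168.1.30:7004. In other words, when a master node goes down, its slave is promoted to master and keeps serving requests; when the original master is restarted, it rejoins as a slave of the new master.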

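Note that if the test is run against 192.168.1.30:7007 instead, 192.168.1.31:7008 will not take over. This is because 192.168.1.30:7007 owns no hash slots and therefore holds no data (the cluster was not resharded after the node was added), and Redis Cluster only fails over a master that serves at least one slot. The cluster data is split into three parts; with the 16384 hash slots distributed among them, the three nodes cover the following slot ranges: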

  • Node 192.168.1.30:7004 covers 0-5460
  • Node 192.168.1.30:7003 covers 5461-10922
  • Node 192.168.1.31:7005 covers 10923-16383
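To make the slot ranges above concrete: Redis Cluster assigns a key to slot CRC16(key) mod 16384, hashing only the hash tag (the text between the first `{` and the next `}`) when a non-empty one is present. The sketch below reproduces that calculation with Python's standard `binascii.crc_hqx`, which with an initial value of 0 should match the CRC16 variant described in the Redis Cluster specification. The names `SLOT_OWNERS`, `key_slot` and `owner_of` are illustrative, and the table simply hard-codes the three ranges listed above.

```python
import binascii

# Slot ranges as listed above (after the failover of 192.168.1.29:7001).
SLOT_OWNERS = [
    (0, 5460, "192.168.1.30:7004"),
    (5461, 10922, "192.168.1.30:7003"),
    (10923, 16383, "192.168.1.31:7005"),
]


def key_slot(key):
    """Return the hash slot of a key: CRC16(key) mod 16384."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % 16384


def owner_of(key):
    """Look up which master in SLOT_OWNERS serves the key's slot."""
    slot = key_slot(key)
    for first, last, addr in SLOT_OWNERS:
        if first <= slot <= last:
            return addr
    raise ValueError(f"slot {slot} is not covered")


for key in ("foo", "bar", "{user1000}.following"):
    print(key, key_slot(key), owner_of(key))
```

The result for a given key can be cross-checked against a running cluster with `CLUSTER KEYSLOT <key>`.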

For an example of connecting Spring Boot to a Redis cluster (Sentinel mode and Cluster mode), see:
Spring Boot 2.0: connecting to Redis clusters in Sentinel and Cluster modes
