当前位置: 代码迷 >> 综合 >> mysql innodb cluster (by quqi99)
  详细解决方案

mysql innodb cluster (by quqi99)

热度:26   发布时间:2023-12-13 08:53:13.0

版权声明:可以任意转载,转载时请务必以超链接形式标明文章原始出处和作者信息及本版权声明 (作者:张华 发表于:2021-07-23)

安装步骤

在三个节点(juju deploy -n 3 ubuntu)上都运行:

apt install mysql-server mysql-router
snap install mysql-shell
sudo mysql -u root -p   #must use sudo
update mysql.user set host = '%' where user = 'root';
select host, user from mysql.user;
CREATE USER 'rep1'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON *.* TO 'rep1'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
# modify bind-address to 0.0.0.0 in all 3 nodes
sudo sed -i 's/127.0.0.1/0.0.0.0/g' /etc/mysql/mysql.conf.d/mysqld.cnf
sudo systemctl restart mysql

在其他两个节点上重复

# dba.configure_instance('rep1@juju-e57bf9-test-1:3306')
# dba.configure_instance('rep1@juju-e57bf9-test-2:3306')
root@juju-e57bf9-test-0:~# mysqlsh
mysql-py> dba.configure_instance('rep1@juju-e57bf9-test-0:3306')
Please provide the password for 'rep1@juju-e57bf9-test-0:3306': ********
...
NOTE: Some configuration options need to be fixed:
+----------------------------------------+---------------+----------------+--------------------------------------------------+
| Variable                               | Current Value | Required Value | Note                                             |
+----------------------------------------+---------------+----------------+--------------------------------------------------+
| binlog_transaction_dependency_tracking | COMMIT_ORDER  | WRITESET       | Update the server variable                       |
| enforce_gtid_consistency               | OFF           | ON             | Update read-only variable and restart the server |
| gtid_mode                              | OFF           | ON             | Update read-only variable and restart the server |
| server_id                              | 1             | <unique ID>    | Update read-only variable and restart the server |
| slave_parallel_type                    | DATABASE      | LOGICAL_CLOCK  | Update the server variable                       |
| slave_preserve_commit_order            | OFF           | ON             | Update the server variable                       |
+----------------------------------------+---------------+----------------+--------------------------------------------------+
Do you want to perform the required configuration changes? [y/n]: y
Do you want to restart the instance after configuring it? [y/n]: y
Configuring instance...
The instance 'juju-e57bf9-test-0:3306' was configured to be used in an InnoDB cluster.
Restarting MySQL...
NOTE: MySQL server at juju-e57bf9-test-0:3306 was restarted.

只在ubuntu/0上创建cluster.

mysql-py> shell.connect('rep1@juju-e57bf9-test-0:3306')
# This sets the group_replication_group_name system variable
mysql-py []> cluster = dba.create_cluster('jujuCluster')
mysql-py []> cluster.add_instance('rep1@juju-e57bf9-test-1:3306')
mysql-py []> cluster.add_instance('rep1@juju-e57bf9-test-2:3306')
mysql-py []> cluster.status()
mysql-py []> cluster.describe();
mysql-py []> cluster.rescan();
mysql-py []> cluster.check_instance_state('rep1@juju-e57bf9-test-0:3306');
mysql-py []> \sql
mysql-sql []> SELECT member_host, member_port, member_state, member_role FROM performance_schema.replication_group_members;
+--------------------+-------------+--------------+-------------+
| member_host        | member_port | member_state | member_role |
+--------------------+-------------+--------------+-------------+
| juju-e57bf9-test-1 |        3306 | ONLINE       | SECONDARY   |
| juju-e57bf9-test-2 |        3306 | ONLINE       | SECONDARY   |
| juju-e57bf9-test-0 |        3306 | ONLINE       | PRIMARY     |
+--------------------+-------------+--------------+-------------+# Persist cluster configuration
dba.configure_local_instance('rep1@juju-e57bf9-test-0:3306');
dba.configure_local_instance('rep1@juju-e57bf9-test-1:3306');
dba.configure_local_instance('rep1@juju-e57bf9-test-2:3306');# On non-leader nodes. eg:
mysqlsh rep1@juju-e57bf9-test-1:33060 -ppassword
#shell.connect('rep1@juju-e57bf9-test-1:3306')
cluster = dba.get_cluster('jujuCluster');# Use mysql router
mysqlrouter --bootstrap rep1@juju-e57bf9-test-0:3306 --directory myrouter
myrouter/start.sh
mysqlsh rep1@juju-e57bf9-test-0:6446
mysqlsh rep1@juju-e57bf9-test-0:6447  #read-only

连接cluster:

juju ssh ubuntu/0 -- mysqlsh rep1@juju-e57bf9-test-1:33060 -ppassword --cluster
juju ssh ubuntu/1 -- mysqlsh rep1@juju-e57bf9-test-1:33060 -ppassword --cluster
juju ssh ubuntu/2 -- mysqlsh rep1@juju-e57bf9-test-1:33060 -ppassword --cluster

问题

操作之前记得备份.

juju run-action --wait mysql-innodb-cluster/1 mysqldump  #/var/backups/mysql
mysqldump --all-databases --single-transaction --quick --lock-tables=false > full-backup-$(date +%F).sql -u clusteruser -pM6BsjpJVzYsy74L5gxqBHnr5V9mmwJNWdCbYBdN
juju ssh mysql-innodb-cluster/1 sudo du -ah /var/backups/mysql/ /var/lib/mysql | tee -a mysql-innodb-cluster-1-backups.txt
SHOW VARIABLES LIKE 'gtid_executed';

一般平时关一个节点,那样cluster不会脑裂, 一般下面也能恢复.

# https://lefred.be/content/mysql-innodb-cluster-how-to-manage-a-split-brain-situation/
cluster = dba.get_cluster('jujuCluster');
cluster.force_quorum_using_partition_of('rep1@juju-e57bf9-test-0:3306')
cluster.rejoin_instance('rep1@juju-e57bf9-test-0:3306')

发生outage了,三个节点同时关了,可能起来就一个cluster脑裂成3个了.这时一般通过reboot_cluster_from_complete_outage也能解决.

mysql-py []> cluster = dba.reboot_cluster_from_complete_outage();

但我三分列成3个cluster的情况下又遇到两个问题:

  • 之前的leader可能变成现在的non-leader,它再replication时容易状态变成error
  • 根据网上一篇错误文章删了重要的表mysql_innodb_cluster_metadata,它也变成error
    通过下列命令看三个节点各自为政都是offline容易通过reboot_cluster_from_complete_outage恢复,但涉及到复制导致error状态了就很难了.是不是要删除了数据目录重建呢?
\sql
SHOW STATUS LIKE 'group_replication_primary_member';
SELECT * FROM performance_schema.replication_group_members;

其他可能遇得到的命令:

\sql
SET SQL_LOG_BIN = 0;
stop group_replication;
set global super_read_only=0;
# after drop database, Access denied for user 'mysql_innodb_cluster'
# drop database mysql_innodb_cluster_metadata;
\use mysql_innodb_cluster_metadata
DELETE FROM instances WHERE address = ‘juju-e57bf9-test-1:3306’;
DELETE FROM v2_instances WHERE address = ‘juju-e57bf9-test-1:3306’;
\use mysql
DELETE FROM user WHERE Host = 'juju-e57bf9-test-1';
DELETE FROM user WHERE User = 'mysql_innodb_cluster_4097976713';
RESET MASTER;
RESET REPLICA ALL;
select * from performance_schema.replication_group_members;
SELECT @@group_replication_local_address;
#Be careful that the best practice is to shutdown the other nodes
#https://lefred.be/content/mysql-innodb-cluster-how-to-manage-a-split-brain-situation/
set global group_replication_force_members=@@group_replication_local_address;
set global group_replication_force_members=''
start group_replication;\py
cluster = dba.get_cluster('jujuCluster');
cluster = dba.get_cluster();
# since we drop mysql_innodb_cluster_metadata, so we need to first remove_instance
# then add_instance to rebuild mysql_innodb_cluster_metadata
cluster.remove_instance('rep1@juju-e57bf9-test-1:3306', {'force': 'true'});
# then remember to select 'Clone'
cluster.add_instance('rep1@juju-e57bf9-test-1:3306');

注:对于已经脑裂的节点在remove再add之前可能还需要stop replication. (尚未测试)

其他

一些重要的数据结构.

mysql-sql [mysql_innodb_cluster_metadata]> show tables;
+-----------------------------------------+
| Tables_in_mysql_innodb_cluster_metadata |
+-----------------------------------------+
| async_cluster_members                   |
| async_cluster_views                     |
| clusters                                |
| instances                               |
| router_rest_accounts                    |
| routers                                 |
| schema_version                          |
| v2_ar_clusters                          |
| v2_ar_members                           |
| v2_clusters                             |
| v2_gr_clusters                          |
| v2_instances                            |
| v2_router_rest_accounts                 |
| v2_routers                              |
| v2_this_instance                        |
+-----------------------------------------+mysql-sql [mysql_innodb_cluster_metadata]> select attributes from v2_instances \G
*************************** 1. row ***************************
attributes: {"server_id": 4097976713, "recoveryAccountHost": "%", "recoveryAccountUser": "mysql_innodb_cluster_4097976713"}
1 row in set (0.0010 sec)

20220607 - openstack db upgrade

DESC conntrack_helpers;
mysqldump -p neutron conntrack_helpers > conntrack_helpers.sql
mysql -u root -p neutron < conntrack.sqlsudo nova-manage db sync; sudo nova-manage api_db sync
sudo neutron-db-manage --config-file '/etc/neutron/neutron.conf' --config-file '/etc/neutron/plugins/ml2/ml2_conf.ini' current --verbose
sudo neutron-db-manage --config-file '/etc/neutron/neutron.conf' --config-file '/etc/neutron/plugins/ml2/ml2_conf.ini' upgrade head
sudo neutron-db-manage history --verbose
sudo neutron-db-manage branches --verbose
select * from alembic_version_dr;

20230531 - MySQL Router not yet bootstrapped

尝试解决MySQL Router not yet bootstrapped, 未果

问题:像neutron-api-mysql-router/0之类的unit上看到错误“MySQL Router not yet bootstrapped”, 进去之后看到的juju log如下:
2023-05-31 00:27:54 ERROR unit.neutron-api-mysql-router/0.juju-log server.go:316 Failed to bootstrap mysqlrouter: Error: Unable to connect to the metadata server: Error connecting to MySQL server at 10.0.0.19:0: Access denied for user 'mysqlrouteruser'@'10.0.0.19' (using password: YES) (1045)$ juju run -u mysql/leader leader-get |grep passwd
mysql-cinder.passwd: tq5zffZVFJR5qc82kGJ7Gr9SLmN43fsX
mysql-keystone.passwd: P3qHHqxx2Xsznhrz2FFPwqmdrMXHCb5X
mysql-mysqlrouteruser.passwd: PKcmBZqh8P9FXJNKJmwF6FwycNqFg5Vd
mysql-nova.passwd: rryWXGp4Z4GcnKpTj9Sndc2yLxV7Vcf9
mysql-placement.passwd: bMMGYTtrzsFkrBb3Wg6XTG9qGBYFk7cZ
mysql-vault.passwd: 78xJybbLMj7Fw6kBm8fTVYrhPRCFWp5P
mysql.passwd: Pz4JG3tV3WsXrnGnj2mWSBhFrkySmJnGqFL8W此时确实无法通过mysql-mysqlrouteruser用户登录:
sudo mysql -h 10.0.0.19 -umysql-mysqlrouteruser -pPKcmBZqh8P9FXJNKJmwF6FwycNqFg5Vd但可以通过root用户登录:
sudo mysql -h localhost -uroot -pPz4JG3tV3WsXrnGnj2mWSBhFrkySmJnGqFL8Wroot用户登录进去看到确实没看到关于10.0.0.19的ACL
mysql> SELECT User, Host, plugin FROM mysql.user WHERE User = 'mysqlrouteruser';
+-----------------+-----------+-----------------------+
| User            | Host      | plugin                |
+-----------------+-----------+-----------------------+
| mysqlrouteruser | 10.0.0.11 | caching_sha2_password |
| mysqlrouteruser | 10.0.0.16 | caching_sha2_password |
| mysqlrouteruser | 10.0.0.30 | caching_sha2_password |
| mysqlrouteruser | 10.0.0.41 | caching_sha2_password |
| mysqlrouteruser | 127.0.0.1 | caching_sha2_password |
+-----------------+-----------+-----------------------+
5 rows in set (0.00 sec)运行这个fix (https://github.com/juju/charm-helpers/pull/424/commits/ed43decf734f145ca6a29258dfa53e4cf323e5fd)中下列命令也不行, 它说You are not allowed to create a user with GRANT
GRANT SELECT, INSERT, UPDATE, DELETE, EXECUTE ON mysql_innodb_cluster_metadata.* TO 'mysqlrouteruser'@'10.0.0.19';
出现'You are not allowed to create a user with GRANT", 原来是8.0之后的mysql不支持 授权的时候就进行用户创建,所以创建 之后才能授权:mysql> create user 'mysql-mysqlrouteruser'@'10.0.0.19' identified by 'PKcmBZqh8P9FXJNKJmwF6FwycNqFg5Vd';
Query OK, 0 rows affected (0.01 sec)
mysql> grant all on *.* to 'mysql-mysqlrouteruser'@'10.0.0.19';
Query OK, 0 rows affected (0.01 sec)
mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.00 sec)这个时候就可以使用下列命令登录了:
sudo mysql -h 10.0.0.19 -umysql-mysqlrouteruser -pPKcmBZqh8P9FXJNKJmwF6FwycNqFg5Vd接着在neutron-api-plugin-ovn/0上rebootstrap mysql cluster, 重启3个mysql units后用下列命令:
juju run-action --wait mysql/leader reboot-cluster-from-complete-outage但是这么弄了之后,仍然能看到这个错误: Failed to bootstrap mysqlrouter: Error: Unable to connect to the metadata server: Error connecting to MySQL server at 10.0.0.19:0: Access denied for user 'mysqlrouteruser'@'10.0.0.19' (using password: YES) (1045)
  相关解决方案