MariaDB Galera Cluster简介及双主HA实践,mariadbgalera
MariaDB GaleraCluster简介及双主HA实践
http://blog.sina.com.cn/s/blog_6de3aa8a0102w00d.html
概述
MariaDB Galera Cluster由于其优异的特性,被广泛用于各类分布式系统中,例如OpenStack集群,freeradius集群等。本文将简要介绍GaleraCluster,并实际设计部署双主MariaDB GaleraCluster(基于freeradius认证的VPN集群系统)来验证HA(HighAvailability)特性。
GaleraCluster要素
DatabaseReplication
分布式数据库/数据库集群系统中数据库复制是十分重要的,常用的有以下两种架构:Master/SlaveReplication架构(左)和Multi-masterReplication架构(右)。
What is GaleraCluster?
Galera Cluster is awrite-set replication service provider in the formof the dlopenable library. It provides synchronous replication andsupports multi-master replication. Galera Cluster is capable ofunconstrained parallel applying (that is,“parallel replication”),multicast replication and automatic node provisioning.
The primary focus ofGalera Cluster is data consistency. Transactions are either appliedto every node or not at all. Galera Cluster is not a clustermanager, a load balancer or a cluster monitor. What it does it keepdatabases synchronized provided that they were properly configuredand synchronized in the beginning.
而GaleraCluster提供了并行的复制机制,能实现同步的,多主的复制方式。因此具有很多实用特性:(1)可以在任意节点上进行读写(2)自动剔除故障节点(3)自动加入新节点(4)真正行级别的并发复制(5)客户端连接跟操作单数据库的体验一致。
在数据复制方面有同步和异步两种方式,而同步方式的优势十分明显:(1)高可用(当节点宕机时数据不会丢失,数据复制一致性高,实现不复杂);(2)具有较高的性能;(3)跨集群。但是缺点是随着节点数量的增加,其事务的响应时间,冲突的概率和死锁率会呈指数增长。(因此像MySQL和PostgreSQL都采用异步复制方式)
为克服同步复制的的缺点,现在主要的技术包括:GroupCommunication,Write-sets,Database StateMachine和TransactionReordering。GaleraCluster使用一种称为Certification-basedReplication的策略实现集群同步制,Certification-basedReplication综合使用了GroupCommunication和TransactionReordering技术。具体来说:事务可以在任意单节点上执行或者复制,但是在事务在提交时,会运行一个协调认证程序来保证事务的全局一致性,它借助广播服务,建立一个全局的并发事务。
Certification-based Replication的主要思想是数据库中的任何事务在未到达提交点(commitpoint)时,会假设没有任何冲突。在事务commit发生之前,这个事务将会引起的所有数据库变化和所变行的主键都会被收集到一个write-set中。然后会把这个write-set发送到集群中的其它节点上。有了write-set,此时集群中所有的节点会使用主键执行一个certificationtest,如果失败则丢弃write-set并且回滚事务,如果成功则事务会被提交并将write-set应用到集群中其余的节点。而GaleraCluster中使用Certification-basedReplication策略则依赖于一个全局的事务序列。
GaleraCluster的内部架构
•DatabaseManagement System (DBMS) The databaseserver that runs on the individual node. Galera Cluster can useMySQL, MariaDB or Percona XtraDB.
•wsrepAPI The interface andthe responsibilities for the database server and replicationprovider. It consists of:
•wsrephooks The integrationwith the database server engine for write-setreplication.
•dlopen()Thefunction that makes the wsrep provider available to the wsrephooks.
•Galera ReplicationPlugin The plugin thatenables write-set replication service functionality.
•GroupCommunication plugins(GCS plugins) The various groupcommunication systems available to Galera Cluster. Forinstance,gcommand Spread..
State SnapshotTransfer (SST)
SST是集群提供给节点用来从一个节点到其它节点对一个完整数据的拷贝传输功能。GaleraCluster可以使用逻辑方法(mysqldump)和物理方法(rsync,rsync_wan,xtrabackup等)。
常用的GaleraCluster方式
如下图,对此本文设计了一个基于freeradius认证的VPN集群系统,其核心的数据库采用MariaDB双主GaleraCluster实现,同时借助keepalive和haproxy实现负载均衡。
基于freeradius认证的VPN集群系统
主要在对数据库mariadb做双主GaleraCluster架构;使用keepalive配置虚拟IP地址;实现两个freeradius服务器构成的集群(多机方式而非冗余方式);使用haproxy做httpd服务(包括daloradius和其它的辅助服务)的负载均衡;vpn-l2tpserver其规模具有线性拓展特性。
设VIP地址为192.168.10.99
节点 |
Hostname |
FQDN |
Ipaddress |
备注(CentOS7) |
Master |
radiusone |
radiusone.ctgu.cn |
192.168.10.91 |
Keepalive,haproxy,mariadb |
Slave |
radiustwo |
radiustwo.ctgu.cn |
192.168.10.92 |
Keepalive,haproxy,mariadb |
说明:由于是双机配置,相同的操作或配置以#开头,否则[root@radiusone/two~]# 开头。
系统初始化和准备yum源
# cat/etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4localhost4.localdomain4
::1 localhost localhost.localdomain localhost6localhost6.localdomain6
192.168.10.91 radiusone radiusone.ctgu.cn
192.168.10.92 radiustwo radiustwo.ctgu.cn
# yum install -yepel-release
为了安装galeramariadb-galera-common mariadb-galera-server mariadb-libs mariadbmariadb-galera-common,方便起见,这里使用RDO的源。
#wgethttps://repos.fedorapeople.org/repos/openstack/openstack-kilo/rdo-release-kilo-1.noarch.rpm
# rpm -ivhrdo-release-kilo-1.noarch.rpm
# yum update-y
# yum install -y wgetnet-tools iptables-services
# vim/etc/sysconfig/selinux
SELINUX=disabled
# setenforce0
设置防火墙
# iptables -A INPUT-p tcp -m state --state NEW -m multiport --dports3306,3307,4567,4568,4444 -j ACCEPT
#iptables-save
# systemctl enableiptables
# cat/etc/sysconfig/iptables
# sampleconfiguration for iptables service
# you can edit thismanually or use system-config-firewall
# please do not askus to add additional ports/services to this defaultconfiguration
*filter
:INPUT ACCEPT[0:0]
:FORWARD ACCEPT[0:0]
:OUTPUT ACCEPT[0:0]
-A INPUT -m state--state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -jACCEPT
-A INPUT -i lo -jACCEPT
-A INPUT -p tcp -mstate --state NEW -m tcp --dport 22 -j ACCEPT
-A INPUT -p tcp--dport 3306 -j ACCEPT
-A INPUT -p tcp--dport 3307 -j ACCEPT
-A INPUT -p tcp--dport 4567 -j ACCEPT
-A INPUT -p tcp--dport 4568 -j ACCEPT
-A INPUT -p tcp--dport 4444 -j ACCEPT
-A INPUT -j REJECT--reject-with icmp-host-prohibited
-A FORWARD -j REJECT--reject-with icmp-host-prohibited
COMMIT
# systemctl restartiptables
# systemctl stopfirewalld
# systemctl disablefirewalld
安装数据库和相关软件包
# yum install -ymariadb mariadb-libs mariadb-devel mariadb-galera-commonmariadb-galera-server galera rsync
# systemctl enable mariadb
# systemctl enable rsyncd
单节点初始化mysql数据库
[root@radiusone ~]#systemctl restart mariadb
[root@radiusone ~]#mysql_secure_installation
设置root用户远程使用密码password访问
[root@radiusone ~]#mysql -u root –p
MariaDB [(none)]>grant all privileges on *.* to 'root'@'%' identified by 'password'with grant option;
MariaDB[(none)]> select host, user frommysql.user;
MariaDB Galera双主集群设置
# vim/etc/my.cnf.d/galera.cnf +27
bind-address=0.0.0.0 --> bind-address=192.168.10.91/92
[root@radiusone ~]#vim /etc/my.cnf.d/server.cnf
[mysqld]
datadir=/var/lib/mysql
query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_provider_options="gcache.size=1G"
wsrep_cluster_address=gcomm://
#wsrep_cluster_address=gcomm://192.168.10.91,192.168.10.92
wsrep_cluster_name='dbcluster'
wsrep_node_address='192.168.10.91'
wsrep_node_name='radiusone'
#wsrep_sst_method=xtrabackup
wsrep_sst_method=rsync
wsrep_sst_auth=username:Password
字段说明
wsrep_provider – Ifyou are running on 32bit it will be in the /usr/lib otherwise on64bit systems it will be in /usr/lib64
wsrep_cluster_address– Include all theaddresses of the servers that will exist in the cluster. You canuse hostnames if you wish
wsrep_cluster_name –This is the name of the cluster. All servers in the cluster mustmatch names
wsrep_node_address– The IP address of thisserver
wsrep_node_name– The name of thisserver
wsrep_sst_auth – Thisis the username and password used to authenticate xtrabackup. Setthis the same on all machines
[root@radiustwo ~]#vim /etc/my.cnf.d/server.cnf
[mysqld]
datadir=/var/lib/mysql
query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_provider_options="gcache.size=1G"
wsrep_cluster_address=gcomm://192.168.10.91
#wsrep_cluster_address=gcomm://192.168.10.91,192.168.10.92
wsrep_cluster_name='dbcluster'
wsrep_node_address='192.168.10.92'
wsrep_node_name='radiustwo'
#wsrep_sst_method=xtrabackup
wsrep_sst_method=rsync
wsrep_sst_auth=username:Password
[root@radiusone ~]#systemctl restart mariadb
[root@radiustwo ~]#systemctl restart mariadb
【按照下文附录中的Mariadb-galeraHA集群启动顺序操作】
安装和配置keepalive
# yum install -ykeepalived
# vim/etc/sysctl.conf
net.ipv4.tcp_keepalive_intvl=1
net.ipv4.tcp_keepalive_time=5
net.ipv4.tcp_keepalive_probes=5
net.ipv4.ip_forward=1
net.ipv4.ip_nonlocal_bind=1
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-arptables=1
# sysctl-p
配置keepalive forVIP 192.168.10.99
[root@radiusone ~]#cat /etc/keepalived/keepalived.conf
! Configuration Filefor keepalived
global_defs{
notification_email {
zhangguoqingas@gmail.com
}
notification_email_from root@radiusone
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id radiusone
}
vrrp_scriptcheck_haproxy {
script "killall -0 haproxy"
interval 2 # every 2 seconds
weight 2 # add 2 points if OK
}
vrrp_instance VI_1{
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass password
}
virtual_ipaddress {
192.168.10.99
}
}
[root@radiustwo ~]#cat /etc/keepalived/keepalived.conf
! Configuration Filefor keepalived
global_defs{
notification_email {
zhangguoqingas@gmail.com
}
notification_email_from root@radiustwo
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id radiustwo
}
vrrp_scriptcheck_haproxy {
script "killall -0 haproxy"
interval 2 # every 2 seconds
weight 2 # add 2 points if OK
}
vrrp_instance VI_1{
state BACKUP
interface eth0
virtual_router_id 51
priority 80
advert_int 1
authentication {
auth_type PASS
auth_pass password
}
virtual_ipaddress {
192.168.10.99
}
}
# systemctl restartkeepalived
# systemctl enablekeepalived
可以查看当前VIP192.168.10.99在radiusone[MASTER]主机上
[root@radiusone ~]#ip a
2: eth0: mtu 1500qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:fb:94:ea brd ff:ff:ff:ff:ff:ff
inet 192.168.10.91/24 brd 192.168.10.255 scope globaleth0
valid_lft forever preferred_lft forever
inet 192.168.10.99/32 scope globaleth0
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fefb:94ea/64 scope link
valid_lft forever preferred_lft forever
[root@radiustwo ~]#ip a
2: eth0: mtu 1500qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:33:3d:f8 brd ff:ff:ff:ff:ff:ff
inet 192.168.10.92/24 brd 192.168.10.255 scope globaleth0
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fe33:3df8/64 scope link
valid_lft forever preferred_lft foreve
当停止radiusone上的keepalive时,VIP会自动漂移至radiustwo上
[root@radiusone ~]#systemctl stop keepalived
[root@radiustwokeepalived]# ip a
2: eth0: mtu 1500qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:33:3d:f8 brd ff:ff:ff:ff:ff:ff
inet 192.168.10.92/24 brd 192.168.10.255 scope globaleth0
valid_lft forever preferred_lft forever
inet 192.168.10.99/32 scope globaleth0
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fe33:3df8/64 scope link
valid_lft forever preferred_lft forever
[root@radiusone ~]#mysql -h 192.168.10.99 -u root -p
安装和配置haproxy
# yum install-y haproxy
# cat/etc/haproxy/haproxy.cfg
global
log 192.168.10.92local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
stats socket /var/lib/haproxy/stats
defaults
mode tcp
log global
option tcplog
option dontlognull
option redispatch
retries 3
timeouthttp-request 10s
timeout queue 1m
timeoutconnect 10s
timeoutclient 1m
timeoutserver 1m
timeout http-keep-alive 10s
timeoutcheck 10s
maxconn 3000
# systemctl starthaproxy
添加数据库服务
# vim/etc/haproxy/haproxy.cfg
listengalera_cluster
bind 192.168.10.99:3307
balance source
option tcpka
server radiusone 192.168.10.91:3306 check port 4567 inter 2000 rise2 fall 5
server radiustwo 192.168.10.92:3306 check port 4567 inter 2000 rise2 fall 5 backup#
# systemctl restarthaproxy
# systemctlenable haproxy
测试连接数据库和检查集群
重点要检查的字段
SHOW GLOBAL STATUSLIKE’wsrep_%’;
(1)SHOW GLOBAL STATUSLIKE ’wsrep_cluster_state_uuid’; 集群中每个节点的值必须一致。
(2)SHOW GLOBAL STATUSLIKE ’wsrep_cluster_size’; 显示集群的规模,此例应该显示”2”。
(3)SHOW GLOBAL STATUSLIKE ’wsrep_cluster_status’; 显示”Primary”为正常。
(4)SHOW GLOBAL STATUSLIKE ’wsrep_ready’; 显示”ON”为正常。
(5) SHOW GLOBALSTATUS LIKE ’wsrep_connected’; 显示”ON”为正常。
(6) SHOW GLOBALSTATUS LIKE ’wsrep_local_state_comment’; 显示”Joined”为正常。
至此,MariaDBGalera双主Cluster构建完成,接下来即可参考官方文档完成freeradius和daloradius以及vpn-l2tp的部署,实现基于freeradius认证的VPN集群系统。
附录
Mariadb-galeraHA集群启动顺序
https://mariadb.com/kb/en/mariadb/starting-and-stopping-mariadb/
(1)先启动radiusone节点mariadb,并且vim/etc/my.cnf.d/server.cnf配置文件中
wsrep_cluster_address=gcomm://
#wsrep_cluster_address=gcomm://192.168.10.91,192.168.10.92
(2)再启动radiustwo节点mariadb,并且vim/etc/my.cnf.d/server.cnf配置文件中
#wsrep_cluster_address=gcomm://
wsrep_cluster_address=gcomm://192.168.10.91,192.168.10.92
(3)重启radiusone节点mariadb,并且vim/etc/my.cnf.d/server.cnf配置文件中
#wsrep_cluster_address=gcomm://
wsrep_cluster_address=gcomm://192.168.10.91,192.168.10.92
(4)查看端口(VIP在radiusone上面)
[root@radiusone ~]#netstat -anp | grep 3306
tcp 0 0192.168.10.91:3306 0.0.0.0:* LISTEN 3191/mysqld
tcp 0 0192.168.10.91:3306 192.168.10.91:34365 ESTABLISHED3191/mysqld
tcp 0 0192.168.10.91:34365 192.168.10.91:3306 ESTABLISHED1309/haproxy
[root@radiusone ~]#netstat -anp | grep 3307
tcp 0 0192.168.10.99:3307 0.0.0.0:* LISTEN 1309/haproxy
tcp 0 0192.168.10.99:3307 192.168.10.1:62490 ESTABLISHED 1309/haproxy
[root@radiustwo ~]#netstat -anp | grep 3306
tcp 0 0192.168.10.92:3306 0.0.0.0:* LISTEN 3156/mysqld
[root@radiustwo ~]#netstat -anp | grep 3307
tcp 0 0192.168.10.99:3307 0.0.0.0:* LISTEN 1843/haproxy
检查mariadb集群的状态
# mysql -u root–p
MariaDB [(none)]>SHOW STATUS LIKE 'wsrep%';
https://mariadb.com/kb/en/mariadb/galera-cluster-status-variables/
HA测试
(1)关闭或者重启slave节点radiustwo,检查能否使用VIP:3307正常访问数据库;【通过】
(2)关闭或者重启master节点radiusone,检查能否使用VIP:3307正常访问数据库;【通过】
(3)将master(radiusone)和slave(radiustwo)节点全部关闭,启动后,检查能否使用VIP:3307正常访问数据库。【按照上面集群启动的顺序来操作】
集群排查错误
(1)查看日志如果有以下类似错误:
# tailf/var/log/mariadb/mariadb.log
160203 18:31:37[Warning] WSREP: no nodes coming from prim view, prim notpossible
160203 18:31:37[Note] WSREP:view(view_id(NON_PRIM,4c674ade-ca61-11e5-a9f8-7e298e07ba
33,1) memb{
4c674ade-ca61-11e5-a9f8-7e298e07ba33,0
} joined {
} left {
} partitioned{
})
160203 18:31:38[Warning] WSREP: last inactive check more than PT1.5S ago(PT3.50687S)
, skippingcheck
160203 18:32:07[Note] WSREP: view((empty))
160203 18:32:07[ERROR] WSREP: failed to open gcomm backend connection: 110: failedto
reach primaryview: 110 (Connection timed out)
at gcomm/src/pc.cpp:connect():141
160203 18:32:07[ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():202: Failed toopen
backend connection:-110 (Connection timed out)
160203 18:32:07[ERROR] WSREP: gcs/src/gcs.c:gcs_open():1291: Failed to openchannel '
dbcluster' at'gcomm://192.168.10.91': -110 (Connection timed out)
160203 18:32:07[ERROR] WSREP: gcs connect failed: Connection timed out
160203 18:32:07[ERROR] WSREP: wsrep::connect() failed: 7
160203 18:32:07[ERROR] Aborting
160203 18:32:07[Note] WSREP: Service disconnected.
160203 18:32:08[Note] WSREP: Some threads may fail to exit.
160203 18:32:08[Note] /usr/libexec/mysqld: Shutdown complete
多半是因为防火墙的问题,使用iptables-F关闭防火墙,再次重启集群,检查HA是否正常。
# sudo iptables-F
(2)Mariadb无法重启,报如下错误
160203 21:03:26[Note] Plugin 'FEEDBACK' is disabled.
160203 21:03:26[Note] Server socket created on IP: '0.0.0.0'.
160203 21:03:26[ERROR] Can't start server: listen() on TCP/IP port: Addressalready in use
160203 21:03:26[ERROR] listen() on TCP/IP failed with error 98
160203 21:03:26[ERROR] Aborting
16020321:03:26 InnoDB: Starting shutdown...
16020321:03:27 InnoDB: Shutdown completed; log sequencenumber 1600051
160203 21:03:27[Note] /usr/libexec/mysqld: Shutdown complete'
原因是3306端口被占用,使用命令netstat-anp | grep 3306查看可知被haproxy监听listengalera_cluster端口占用。
解决方法:
关闭haproxy,重启mariadb集群成功后,再开启haproxy。
# iptables-F
# systemctl stophaproxy
# systemctl restartmariadb
# systemctl restarthaproxy
参考:
http://serverfault.com/questions/623054/centos-7-freeradius-fails-to-start-on-boot-due-to-priority
或者
将haproxy监听端口改为其它端口,比如3307。但是其它应用(比如freeradius和daloradius)在配置数据库是也要将端口设为3307。再或者是将各节点的mariadb配置为其它端口,修改
listengalera_cluster
bind 192.168.10.99:3307
balance source
option tcpka
server radiusone 192.168.10.91:3306 check port 4567 inter 2000 rise2 fall 5
server radiustwo 192.168.10.92:3306 check port 4567 inter 2000 rise2 fall 5 backup
# vim/etc/my.cnf.d/galera.cnf +27
bind-address=0.0.0.0 --> bind-address=192.168.10.91/92
主要参考:http://galeracluster.com/documentation-webpages/galera-documentation.pdf
cp /usr/share/zoneinfo/Asia/Shanghai/etc/localtime