Cassandra Cluster Operations [Add & Remove Nodes]

Category: Databases · Published: 6 years ago

Maintaining a Cassandra cluster usually means scaling it out or migrating data, and both involve adding and removing nodes.

Cassandra Version: Apache Cassandra 3.0.6

Add Nodes

Virtual nodes (vnodes) greatly simplify adding nodes to an existing cluster:

Calculating tokens and assigning them to each node is no longer required.

Rebalancing a cluster is no longer necessary because a node joining the cluster assumes responsibility for an even portion of the data.

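Make sure the new node runs the same Cassandra version as the existing cluster.

[Steps]

Deploy Cassandra on the new machine, but do not start it

A common shortcut is to scp the Cassandra directory from a machine in the existing cluster to the new one. When doing so, leave out that machine's data, commitlog, and saved_caches directories so the new node does not start with another node's state.

Edit the configuration file that matches the `snitch` used by the existing cluster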

cassandra-topology.properties or the cassandra-rackdc.properties

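With PropertyFileSnitch, configure: cassandra-topology.properties
With GossipingPropertyFileSnitch, Ec2Snitch, Ec2MultiRegionSnitch, or GoogleCloudSnitch, configure: cassandra-rackdc.properties

Note: both files describe racks and data centers. If every node sits in the same rack of a single data center, they do not need to be changed.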
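The snitch-to-file mapping above is small enough to write down as a lookup. The following is a minimal Python sketch that covers only the snitches named in this article; the function name `topology_file_for` is made up for illustration and is not part of Cassandra.

```python
# Which topology file each snitch reads, per the list above.
SNITCH_TOPOLOGY_FILE = {
    "PropertyFileSnitch": "cassandra-topology.properties",
    "GossipingPropertyFileSnitch": "cassandra-rackdc.properties",
    "Ec2Snitch": "cassandra-rackdc.properties",
    "Ec2MultiRegionSnitch": "cassandra-rackdc.properties",
    "GoogleCloudSnitch": "cassandra-rackdc.properties",
}


def topology_file_for(endpoint_snitch):
    """Return the file to edit for a snitch, or None if it is not in the list."""
    # endpoint_snitch may be a fully qualified class name; keep the last part.
    short_name = endpoint_snitch.rsplit(".", 1)[-1]
    return SNITCH_TOPOLOGY_FILE.get(short_name)
```

A snitch that is not in the table (for example SimpleSnitch) returns None, which matches the note above: nothing needs to be edited in that case.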

Edit the `cassandra.yaml` file

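Parameter	Description
auto_bootstrap	Not present in the default file; when absent it defaults to true. If it is present and set to false, change it to true.
cluster_name	Name of the cluster to join.
listen_address/broadcast_address	IP used to communicate with the other nodes in the cluster, normally the machine's real IP. Do not use 127.0.0.1 or localhost.
endpoint_snitch	The snitch used to locate nodes and route requests. Must match the existing cluster.
num_tokens	Number of vnodes on this node. Keep it consistent with the existing cluster; on a more powerful machine the value can be raised proportionally so that the node takes on a larger share of the data.
seed_provider	Seed nodes. List at least one node of the existing cluster; the -seeds list names the nodes through which the new node contacts the cluster. Seed nodes cannot bootstrap, so do not list only the new node itself, and do not make every node in the cluster a seed.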
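The rules in the table can be checked before the node is started. The sketch below is one possible pre-flight check in Python, not an official tool. It assumes the relevant `cassandra.yaml` values have already been loaded into plain dicts and that the seeds string has been split into a list; the function name, dict layout, and addresses are hypothetical.

```python
LOOPBACK = {"127.0.0.1", "localhost"}


def check_new_node_settings(new_node, existing):
    """Return a list of problems found in the new node's settings.

    Both arguments are plain dicts of cassandra.yaml values; `seeds` is
    assumed to be already split into a list of addresses.
    """
    problems = []
    # auto_bootstrap defaults to true when the key is absent.
    if new_node.get("auto_bootstrap", True) is not True:
        problems.append("auto_bootstrap must be true (or absent)")
    # These values must match the cluster being joined.
    for key in ("cluster_name", "endpoint_snitch"):
        if new_node.get(key) != existing.get(key):
            problems.append(f"{key} differs from the existing cluster")
    if new_node.get("listen_address") in LOOPBACK:
        problems.append("listen_address must be a real IP, not loopback")
    # Only verifies that some seed other than the node itself is listed;
    # it cannot tell whether that seed really belongs to the cluster.
    seeds = set(new_node.get("seeds", []))
    if not seeds - {new_node.get("listen_address")}:
        problems.append("seeds must list a node other than the new node")
    return problems
```

An empty list means none of the checked rules were violated; it does not prove the node will bootstrap successfully.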

Start Cassandra on the new node

./bin/cassandra

Initializing the system tables

......
INFO  06:14:41 Initializing system.IndexInfo
INFO  06:14:42 Initializing system.batches
INFO  06:14:42 Initializing system.paxos
INFO  06:14:42 Initializing system.local
INFO  06:14:42 Initializing system.peers
INFO  06:14:42 Initializing system.peer_events
INFO  06:14:42 Initializing system.range_xfers
INFO  06:14:42 Initializing system.compaction_history
INFO  06:14:42 Initializing system.sstable_activity
INFO  06:14:42 Initializing system.size_estimates
INFO  06:14:42 Initializing system.available_ranges
INFO  06:14:42 Initializing system.views_builds_in_progress
INFO  06:14:42 Initializing system.built_views
INFO  06:14:42 Initializing system.hints
INFO  06:14:42 Initializing system.batchlog
......

Discovering the nodes of the existing cluster

INFO  06:14:44 Node /xx.xxx.xx.xx is now part of the cluster
INFO  06:14:44 Node /xx.xxx.xx.xx is now part of the cluster
INFO  06:14:44 Node /xx.xxx.xx.xx is now part of the cluster
INFO  06:14:44 Handshaking version with /xx.xxx.xx.xx
INFO  06:14:44 Handshaking version with /xx.xxx.xx.xx
INFO  06:14:44 InetAddress /xx.xxx.xx.xx is now UP
INFO  06:14:44 InetAddress /xx.xxx.xx.xx is now UP
INFO  06:14:44 InetAddress /xx.xxx.xx.xx is now UP

The new node joins the cluster

INFO  06:14:45 JOINING: waiting for ring information
INFO  06:14:45 Updating topology for all endpoints that have changed

Synchronizing the schema

INFO  06:14:49 Initializing system_traces.events
INFO  06:14:49 Initializing system_traces.sessions
INFO  06:14:49 Initializing system_distributed.parent_repair_history
INFO  06:14:49 Initializing system_distributed.repair_history
INFO  06:14:49 Initializing system_auth.resource_role_permissons_index
INFO  06:14:49 Initializing system_auth.role_members
INFO  06:14:49 Initializing system_auth.role_permissions
INFO  06:14:49 Initializing system_auth.roles
INFO  06:14:49 JOINING: waiting for schema information to complete

Streaming data from the existing nodes (bootstrap)

INFO  06:15:22 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4] Executing streaming plan for Bootstrap
INFO  06:15:22 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4] Starting streaming to /xx.xxx.xx.xx
INFO  06:15:22 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4] Starting streaming to /xx.xxx.xx.xx
INFO  06:15:22 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4] Starting streaming to /xx.xxx.xx.xx
INFO  06:15:22 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4, ID#0] Beginning stream session with /xx.xxx.xx.xx
INFO  06:15:22 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4, ID#0] Beginning stream session with /xx.xxx.xx.xx
INFO  06:15:22 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4, ID#0] Beginning stream session with /xx.xxx.xx.xx
INFO  06:15:22 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4 ID#0] Prepare completed. Receiving 48 files(358160851 bytes), sending 0 files(0 bytes)
INFO  06:15:22 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4 ID#0] Prepare completed. Receiving 35 files(132483825 bytes), sending 0 files(0 bytes)
INFO  06:15:23 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4 ID#0] Prepare completed. Receiving 46 files(174538642 bytes), sending 0 files(0 bytes)
INFO  06:16:54 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4] Session with /xx.xxx.xx.xx is complete
INFO  06:17:38 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4] Session with /xx.xxx.xx.xx is complete
INFO  06:20:28 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4] Session with /xx.xxx.xx.xx is complete
INFO  06:20:28 [Stream #317a30b0-d29d-11e8-aa92-e9ebc9b827d4] All sessions completed
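The "Prepare completed" lines announce how much data each existing node is about to send, so adding them up gives a rough idea of how much the bootstrap will transfer. A small Python sketch, assuming the log format shown above; the function name is made up for illustration.

```python
import re

# Matches e.g. "Receiving 48 files(358160851 bytes)"
RECEIVING = re.compile(r"Receiving (\d+) files\((\d+) bytes\)")


def expected_incoming(log_lines):
    """Sum the file and byte counts announced by 'Prepare completed' lines."""
    files = total_bytes = 0
    for line in log_lines:
        match = RECEIVING.search(line)
        if match:
            files += int(match.group(1))
            total_bytes += int(match.group(2))
    return files, total_bytes


# The three "Prepare completed" lines from the log above, shortened.
sample = [
    "Prepare completed. Receiving 48 files(358160851 bytes), sending 0 files(0 bytes)",
    "Prepare completed. Receiving 35 files(132483825 bytes), sending 0 files(0 bytes)",
    "Prepare completed. Receiving 46 files(174538642 bytes), sending 0 files(0 bytes)",
]
print(expected_incoming(sample))  # (129, 665183318)
```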

The node switches to NORMAL

INFO  06:20:29 Node /xx.xxx.xx.xx state jump to NORMAL
INFO  06:20:29 Waiting for gossip to settle before accepting client requests...
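The log excerpts above follow a fixed order, so the furthest message seen so far tells you where a bootstrap currently stands. The sketch below is only an illustration in Python: the message fragments are copied from the logs in this article, while the phase labels and the function name are invented here, and other Cassandra versions may word the messages differently.

```python
# Log fragments from the walkthrough above, in the order they appear.
PHASES = [
    ("JOINING: waiting for ring information", "joining ring"),
    ("JOINING: waiting for schema information", "syncing schema"),
    ("Executing streaming plan for Bootstrap", "streaming data"),
    ("All sessions completed", "streaming finished"),
    ("state jump to NORMAL", "normal"),
]


def latest_phase(log_lines):
    """Return the label of the furthest bootstrap phase seen, or None."""
    reached = None
    for line in log_lines:
        for index, (fragment, _label) in enumerate(PHASES):
            if fragment in line and (reached is None or index > reached):
                reached = index
    return None if reached is None else PHASES[reached][1]
```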

Check the node's synchronization status

./bin/nodetool status

Node status while data is still streaming:

Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns (effective)  Host ID                               Rack
UN  xx.xxx.xx.xx  1.53 GB    256          100.0%            30ed942d-6827-469b-aab9-7fb649c6c3d7  rack1
UN  xx.xxx.xx.xx  1.38 GB    256          100.0%            96736106-e95d-4c54-aabf-41666071bc59  rack1
UN  xx.xxx.xx.xx  1.07 GB    256          100.0%            4351af17-2e68-4b46-a78f-fad900e44d13  rack1
UJ  <new node>    57.87 MB   256          ?                 f3f590ac-9835-47bb-b4d8-6e17ea2916ac  rack1

Status after streaming has finished:

Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns (effective)  Host ID                               Rack
UN  xx.xxx.xx.xx  1.53 GB    256          69.2%             30ed942d-6827-469b-aab9-7fb649c6c3d7  rack1
UN  xx.xxx.xx.xx  1.38 GB    256          79.3%             96736106-e95d-4c54-aabf-41666071bc59  rack1
UN  xx.xxx.xx.xx  1.07 GB    256          78.0%             4351af17-2e68-4b46-a78f-fad900e44d13  rack1
UN  xx.xxx.xx.xx  581.43 MB  256          73.5%             f3f590ac-9835-47bb-b4d8-6e17ea2916ac  rack1
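When this check is scripted, the two-letter code at the start of each row is the part that matters: the node has finished joining once its code changes from UJ to UN. A minimal Python sketch that scans the text printed by `nodetool status`, assuming the column layout shown above; the function name is made up for illustration.

```python
import re

# A node row starts with a two-letter code:
# Up/Down followed by Normal/Leaving/Joining/Moving, then the address.
NODE_ROW = re.compile(r"^([UD][NLJM])\s+(\S+)")


def nodes_not_up_normal(status_output):
    """Return (code, address) for every node whose state is not UN."""
    pending = []
    for line in status_output.splitlines():
        match = NODE_ROW.match(line)
        if match and match.group(1) != "UN":
            pending.append((match.group(1), match.group(2)))
    return pending
```

An empty result means every listed node is Up and Normal.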

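Run nodetool cleanup

nodetool <options> cleanup [keyspace_name [table_name ...]]

Once all new nodes have joined the cluster and streaming has finished, run nodetool cleanup on each of the previously existing nodes to delete the keys that no longer belong to them.

Run it on one node at a time: wait for cleanup to finish on one node before starting it on the next, and do not run it concurrently. Cleanup can safely be postponed to low-usage hours.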
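The one-node-at-a-time rule is easy to enforce in a script. The Python sketch below assumes that `nodetool` is on the PATH of the machine running the script and that each node accepts remote JMX connections through nodetool's `-h` option; the host list and the function name are hypothetical.

```python
import subprocess


def cleanup_one_at_a_time(hosts, run=subprocess.run):
    """Run `nodetool cleanup` against each host in turn, never concurrently."""
    for host in hosts:
        # subprocess.run blocks until the command exits, so the next host
        # is not started until this one has finished.
        # check=True raises CalledProcessError and stops the loop on failure.
        run(["nodetool", "-h", host, "cleanup"], check=True)
```

The `run` parameter exists only so the loop can be exercised without a real cluster; in normal use it is left at its default.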

Remove Nodes

Taking a node in UN state offline

Run nodetool decommission on the node that is being taken offline

nodetool <options> decommission

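This command hands the node's token ranges and requests over to the other nodes and streams its data to them.

Removing a node in DN state

Run nodetool removenode on any live node

This command removes the down node from the cluster. Because that node is unreachable, the data it owned is streamed from the remaining replicas to the nodes that take over its ranges.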

Check the node status:

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.2.101  112.82 KB  256     31.7%             420129fc-0d84-42b0-be41-ef7dd3a8ad06  RAC1
DN  192.168.2.103  91.11 KB   256     33.9%             d0844a21-3698-4883-ab66-9e2fd5150edd  RAC1
UN  192.168.2.102  124.42 KB  256     32.6%             8d5ed9f4-7764-4dbd-bad8-43fddce94b7c  RAC1

> nodetool <options> removenode -- <status> | <force> | <ID>

> nodetool removenode d0844a21-3698-4883-ab66-9e2fd5150edd
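removenode takes the Host ID rather than the IP address, so the ID has to be read from the DN row of the status output first. A Python sketch of that lookup, assuming the `nodetool status` layout shown above; the function name is made up, and the commands are only built here, not executed.

```python
import re

# Down + Normal row: capture the address and the Host ID (a UUID).
DOWN_ROW = re.compile(
    r"^DN\s+(\S+)\s.*?([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
)


def removenode_commands(status_output):
    """Build one `nodetool removenode <Host ID>` command per DN row."""
    commands = []
    for line in status_output.splitlines():
        match = DOWN_ROW.match(line)
        if match:
            commands.append(["nodetool", "removenode", match.group(2)])
    return commands
```

For the status output above this yields a single command for d0844a21-3698-4883-ab66-9e2fd5150edd. Review the result before running anything, since a node shown as DN may only be down temporarily.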

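When removing a node fails

Use nodetool assassinate to forcibly remove the node from the cluster. It does not re-replicate the node's data, so treat it as a last resort.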

nodetool [options] assassinate <ip_address>

nodetool -u cassandra -pw cassandra assassinate 192.168.100.2
