Pangpang's Cassandra Work Notes: Installation and Cluster Deployment

和通数据库 htsjk.Com, 2019-09-07 22:07

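Preface: My company's business involves Cassandra, and while learning, building, and testing at the same time I found very little material online about this new technology, and even less in Chinese. So I decided to write down what I went through and learn together with everyone.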

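I: Installation

A quick search turns up plenty of Cassandra installation guides (installation seems to be about the only topic you can find anything on), so here is just a brief record.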

1: Download the latest version of Cassandra from http://cassandra.apache.org/

2: Unpack it, for example to D:/cassandra

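3: Open the conf folder and edit log4j.properties so that the log file lives under the install path, as below (create the logs folder yourself). The value is the path of the log file itself, not just the folder:
log4j.appender.R.File=D:/cassandra/logs/system.log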

4: Edit storage-conf.xml in the conf directory (create the four folders yourself):

<CommitLogDirectory>D:/cassandra/commitlog</CommitLogDirectory>
<DataFileDirectories>
    <DataFileDirectory>D:/cassandra/data</DataFileDirectory>
</DataFileDirectories>
<CalloutLocation>D:/cassandra/callouts</CalloutLocation>
<StagingFileDirectory>D:/cassandra/staging</StagingFileDirectory>
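
Steps 3 and 4 both ask you to create folders by hand. As a rough sketch, the same thing can be scripted. The function name `create_cassandra_dirs` and the `SUBDIRS` list are my own, made up for illustration; the folder names are taken from the paths used in these notes.

```python
import os

# Folder names taken from the log4j.properties and storage-conf.xml paths above.
SUBDIRS = ["logs", "commitlog", "data", "callouts", "staging"]


def create_cassandra_dirs(root, subdirs=SUBDIRS):
    """Create each sub-folder under root and return the resulting paths."""
    created = []
    for name in subdirs:
        path = os.path.join(root, name)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        created.append(path)
    return created
```

Calling `create_cassandra_dirs("D:/cassandra")` would prepare the layout assumed in these notes; adjust the root if you unpacked Cassandra somewhere else.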

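5: Download the matching JRE from Sun's official site, install it, and note the install path, for example D:/jre

6: Set the environment variables: "My Computer" -> "Advanced" tab -> "Environment Variables" -> "System variables" -> "New". Set the values according to where you installed the JRE and Cassandra.
On my machine the configuration is:
Variable name: JAVA_HOME, value: D:/jre
Variable name: Cassandra_Home, value: D:/cassandra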

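7: Activate the environment variables without rebooting
In cmd, type "set Java_Home=<any characters>" and press Enter, then "set Cassandra_Home=<any characters>" and press Enter. After that the environment variables are active without a reboot. (Note that set only changes the value inside the current cmd window; a cmd window opened after saving the dialog in step 6 sees the new system variables.)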
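
Before starting Cassandra it can help to confirm that both variables are actually visible to new processes. A minimal sketch, where `missing_vars` is a hypothetical helper of mine and not part of Cassandra:

```python
import os


def missing_vars(env, required=("JAVA_HOME", "CASSANDRA_HOME")):
    """Return the required variable names that are unset or empty in env."""
    # Windows treats variable names case-insensitively, so compare upper-cased.
    upper = {key.upper(): value for key, value in env.items()}
    return [name for name in required if not upper.get(name.upper())]


# An empty list means both variables are visible to this process.
print("missing:", missing_vars(os.environ))
```

Run it from a freshly opened cmd window so that it sees the system variables from step 6.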

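II: Cluster configuration

The cluster layout that keeps Cassandra smooth (meaning a write reaches all replicas right away, with no noticeable delay) is:
one seeds server out of every four servers.

I have eight servers: 10.1.1.1, 10.1.1.2, 10.1.1.3, 10.1.1.4 ... 10.1.1.8. I plan to use 1 and 5 as seeds and build a test cluster.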
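
The "one seed per four servers" rule can be written down as a tiny sketch. `pick_seeds` is a made-up name, and taking the first server of each group is only my reading of the rule, chosen because it gives the 1 and 5 used here.

```python
def pick_seeds(nodes, group_size=4):
    """Return the first node of every group of group_size nodes."""
    return nodes[::group_size]


nodes = ["10.1.1.%d" % i for i in range(1, 9)]  # 10.1.1.1 .. 10.1.1.8
seeds = pick_seeds(nodes)
print(seeds)  # ['10.1.1.1', '10.1.1.5']
```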

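Edit storage-conf.xml on every server as follows.

1: On every server, change <ListenAddress> and <ThriftAddress> from the default localhost to that server's own IP

2: On every server, change the <Seeds> configuration from: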
<Seeds>
<Seed>127.0.0.1</Seed>
</Seeds>

to:
<Seeds>
<Seed>10.1.1.1</Seed>
<Seed>10.1.1.5</Seed>
</Seeds>
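
Editing eight copies of storage-conf.xml by hand is easy to get wrong, so here is a sketch of doing steps 1 and 2 with Python's standard xml.etree.ElementTree. The `SAMPLE` string is a trimmed, hypothetical stand-in for the real file (which has many more elements), and `configure_node` is my own helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical stand-in for storage-conf.xml.
SAMPLE = """<Storage>
  <ListenAddress>localhost</ListenAddress>
  <ThriftAddress>localhost</ThriftAddress>
  <Seeds>
    <Seed>127.0.0.1</Seed>
  </Seeds>
</Storage>"""


def configure_node(xml_text, node_ip, seed_ips):
    """Return xml_text with the node's addresses and the seed list replaced."""
    root = ET.fromstring(xml_text)
    # Step 1: both addresses become the server's own IP.
    for tag in ("ListenAddress", "ThriftAddress"):
        root.find(".//" + tag).text = node_ip
    # Step 2: drop the default seed entries and add the chosen ones.
    seeds = root.find(".//Seeds")
    for old in list(seeds):
        seeds.remove(old)
    for ip in seed_ips:
        ET.SubElement(seeds, "Seed").text = ip
    return ET.tostring(root, encoding="unicode")


print(configure_node(SAMPLE, "10.1.1.3", ["10.1.1.1", "10.1.1.5"]))
```

One caveat: ElementTree drops XML comments when parsing by default, so running this against the real file would lose the explanatory comments in it. Keep a backup of the original.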

3: On the non-seed servers, change <AutoBootstrap> from false to true. The wiki says the following about it:

auto_bootstrap

Set to 'true' to make new [non-seed] nodes automatically migrate the right data to themselves. (If no InitialToken is specified, they will pick one such that they will get half the range of the most-loaded node.) If a node starts up without bootstrapping, it will mark itself bootstrapped so that you can't subsequently accidently bootstrap a node with data on it. (You can reset this by wiping your data and commitlog directories.)

Default is: 'false', so that new clusters don't bootstrap immediately. You should turn this on when you start adding new nodes to a cluster that already has data on it.

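In other words, a node that starts up with this set to false marks itself as already bootstrapped and does not migrate existing data to itself. The way I read "(You can reset this by wiping your data and commitlog directories.)", simply changing false to true afterwards is not enough: the node still will not pull the earlier data, so its data and commitlog directories have to be wiped first.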
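
To keep track of which of the eight servers gets which value, here is a small sketch of the rule in step 3 (seeds stay false, everything else becomes true). `auto_bootstrap_for` is a hypothetical helper, and the rule is only what these notes describe, not something Cassandra computes for you.

```python
def auto_bootstrap_for(node_ip, seed_ips):
    """Return the <AutoBootstrap> value: 'false' for seeds, 'true' otherwise."""
    return "false" if node_ip in seed_ips else "true"


seeds = ["10.1.1.1", "10.1.1.5"]
for i in range(1, 9):
    ip = "10.1.1.%d" % i
    print(ip, "<AutoBootstrap>%s</AutoBootstrap>" % auto_bootstrap_for(ip, seeds))
```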

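That is all for today. I am still a beginner, so discussion and corrections are welcome! (Just don't throw rotten tomatoes ~O(∩_∩)O~)
Notes on performance, storage, and configuration will follow later. Benben, keep cheering Pangpang on!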