Role assignment
The Flume Agent and Collectors are distributed as shown in the table below:
| Name | Host | Role |
|---|---|---|
| Agent1 | node01 | Web Server |
| Collector1 | node02 | AgentMstr1 |
| Collector2 | node03 | AgentMstr2 |
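As the figure shows, data from Agent1 flows to both Collector1 and Collector2. Flume NG itself provides a failover mechanism that switches over and recovers automatically. In the figure above, three log-producing servers are located in different data centers, and all of their logs need to be collected into a single cluster for storage. Below, we configure the Flume NG cluster.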
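To make the failover idea concrete, here is a minimal shell sketch of the selection rule: among the sinks that are not currently marked as failed, the one with the highest priority value receives the events. The function `pick_sink` and its argument format are hypothetical and exist only for illustration; Flume implements this logic internally and also applies a back-off period to failed sinks, which this sketch does not model.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flume): pick the highest-priority sink
# that is not in the failed list.
# $1: space-separated "name:priority" pairs, e.g. "k1:10 k2:1"
# $2: space-separated names of sinks currently considered failed (may be empty)
pick_sink() {
    best_name=""
    best_prio=""
    for pair in $1; do
        name=${pair%%:*}
        prio=${pair##*:}
        # Skip sinks that appear in the failed list.
        case " $2 " in
            *" $name "*) continue ;;
        esac
        # Keep the sink with the largest priority value seen so far.
        if [ -z "$best_prio" ] || [ "$prio" -gt "$best_prio" ]; then
            best_name=$name
            best_prio=$prio
        fi
    done
    printf '%s\n' "$best_name"
}

pick_sink "k1:10 k2:1" ""      # both collectors healthy: prints k1
pick_sink "k1:10 k2:1" "k1"    # k1 marked failed: prints k2
```

If every sink is marked as failed, the function prints an empty line, which roughly corresponds to the situation where no collector is reachable and events stay in the channel.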
Install and configure Flume on node01 and copy the file scripts
Copy the Flume installation package from node03 to node01.
Run the following commands on node03:

    cd /export/servers
    scp -r apache-flume-1.6.0-cdh5.14.0-bin/ node01:$PWD

On node01, create the agent's configuration file:

    cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
    vim agent.conf


    # agent.conf
    # Components of agent1
    agent1.channels = c1
    agent1.sources = r1
    agent1.sinks = k1 k2

    # Sink group used for failover
    agent1.sinkgroups = g1

    # Memory channel
    agent1.channels.c1.type = memory
    agent1.channels.c1.capacity = 1000
    agent1.channels.c1.transactionCapacity = 100

    # Exec source: tail the access log
    agent1.sources.r1.channels = c1
    agent1.sources.r1.type = exec
    agent1.sources.r1.command = tail -F /export/servers/taillogs/access_log

    # Interceptors: add a static header Type=LOGIN and a timestamp header
    agent1.sources.r1.interceptors = i1 i2
    agent1.sources.r1.interceptors.i1.type = static
    agent1.sources.r1.interceptors.i1.key = Type
    agent1.sources.r1.interceptors.i1.value = LOGIN
    agent1.sources.r1.interceptors.i2.type = timestamp

    # Avro sink k1 -> node02
    agent1.sinks.k1.channel = c1
    agent1.sinks.k1.type = avro
    agent1.sinks.k1.hostname = node02
    agent1.sinks.k1.port = 52020

    # Avro sink k2 -> node03
    agent1.sinks.k2.channel = c1
    agent1.sinks.k2.type = avro
    agent1.sinks.k2.hostname = node03
    agent1.sinks.k2.port = 52020

    # Both sinks belong to sink group g1
    agent1.sinkgroups.g1.sinks = k1 k2

    # Failover processor: the sink with the larger priority value is preferred;
    # maxpenalty is the maximum back-off for a failed sink, in milliseconds
    agent1.sinkgroups.g1.processor.type = failover
    agent1.sinkgroups.g1.processor.priority.k1 = 10
    agent1.sinkgroups.g1.processor.priority.k2 = 1
    agent1.sinkgroups.g1.processor.maxpenalty = 10000

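A common mistake in a configuration like this is listing a sink in the sink group without declaring it in `agent1.sinks`. The following shell sketch shows one way to check for that before starting the agent. The function `check_sinkgroup` is a hypothetical helper written for this example, not a Flume tool, and it assumes one property per line in the `key = value` form used above.

```shell
#!/bin/sh
# Hypothetical check (not a Flume command): print every sink that is listed in
# a sink group but not declared in <agent>.sinks. Prints nothing if consistent.
# $1: path to the properties file, $2: agent name, $3: sink group name
check_sinkgroup() {
    # Value of "<agent>.sinks = ..."
    declared=$(sed -n "s/^$2\.sinks[[:space:]]*=[[:space:]]*//p" "$1")
    # Value of "<agent>.sinkgroups.<group>.sinks = ..."
    grouped=$(sed -n "s/^$2\.sinkgroups\.$3\.sinks[[:space:]]*=[[:space:]]*//p" "$1")
    for s in $grouped; do
        case " $declared " in
            *" $s "*) ;;                 # declared, nothing to report
            *) printf '%s\n' "$s" ;;     # in the group but not declared
        esac
    done
}

# Demo on a temporary file holding the relevant lines of agent.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
agent1.sinks = k1 k2
agent1.sinkgroups = g1
agent1.sinkgroups.g1.sinks = k1 k2
EOF
check_sinkgroup "$conf" agent1 g1   # prints nothing: k1 and k2 are both declared
rm -f "$conf"
```

Against the real file, the call would be `check_sinkgroup agent.conf agent1 g1`, run from the `conf` directory.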