Integrating Flume-0.9.4 with HBase-0.96
2021-03-17 01:25
2. Modify the two Java classes FlushingSequenceFileWriter.java and RawSequenceFileWriter.java under flume-core\src\main\java\org\apache\hadoop\io
In step 1 we replaced the old Hadoop with the new one, and the org.apache.hadoop.io.SequenceFile.Writer class in the new Hadoop differs somewhat from the old one. As a result, FlushingSequenceFileWriter.java and RawSequenceFileWriter.java now show a number of errors. The fix is as follows:
(1) In the Hadoop-2.2.0 source, edit hadoop-2.2.0-src\hadoop-common-project\hadoop-common\src\main\java\org\apache\hadoop\io\SequenceFile.java and add the following constructor to the Writer class:
Writer() {
  this.compress = CompressionType.NONE;
}
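Then rebuild the hadoop-common-project module and replace the existing hadoop-common-2.2.0.jar with the newly built hadoop-common-2.2.0.jar.
(2) Modify FlushingSequenceFileWriter.java and RawSequenceFileWriter.java
Both classes contain errors. Replace the old Hadoop API calls with the corresponding API of the new Hadoop version. I will not go into the details of each change here; if you need them, feel free to email me (wyphao.2007@163.com).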
(3) Modify the SequenceFileOutputFormat class in the com.cloudera.flume.handlers.seqfile package as follows. Change
this(SequenceFile.getCompressionType(FlumeConfiguration.get()),
    new DefaultCodec());
to
this(SequenceFile.getDefaultCompressionType(FlumeConfiguration.get()),
    new DefaultCodec());
and change
CompressionType compressionType = SequenceFile.getCompressionType(conf);
to
CompressionType compressionType = SequenceFile.getDefaultCompressionType(conf);
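After making the edits in (3), it can help to confirm that no call to the old method is left in the source tree. The following is only a minimal sketch: the function name find_old_compression_calls is hypothetical and not part of Flume, and it assumes a grep that supports recursive search (-r), as GNU and BSD grep do.

```shell
# Hypothetical helper (not part of Flume): list source files that still
# call the old SequenceFile.getCompressionType(...) method.
# $1: root of the source tree to scan, e.g. flume-core/src/main/java
find_old_compression_calls() {
  # -r: recurse into directories
  # -l: print only the names of matching files
  # -F: treat the pattern as a literal string, not a regular expression
  grep -rlF 'SequenceFile.getCompressionType(' "$1"
}
```

If the function prints nothing, no file under the given directory still uses the old call. Calls that were already changed to getDefaultCompressionType are not reported, because the literal pattern does not match them.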
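3. Rebuild the Flume source
Rebuild the Flume source (for how to build it, see this blog's post "Flume-0.9.4内核编译及一些编译错误解决方法"), then replace flume-core-0.9.4-cdh3u3.jar in ${FLUME_HOME}/lib with the newly built flume-core-0.9.4-cdh3u3.jar. Delete ${FLUME_HOME}/lib/hadoop-core-0.20.2-cdh3u3.jar and any other jars that belong to the old Hadoop version.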
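The jar swap in this step can be sketched as a small shell function. This is an illustration under stated assumptions, not a script shipped with Flume: the function name install_rebuilt_flume_core is hypothetical, and it removes only the one old Hadoop jar named above, so any other old Hadoop jars in your lib directory still have to be removed by hand.

```shell
# Hypothetical helper (not part of Flume): install the rebuilt flume-core jar
# and remove the old Hadoop core jar named in this step.
# $1: Flume home directory (the value of FLUME_HOME)
# $2: path to the newly built flume-core-0.9.4-cdh3u3.jar
install_rebuilt_flume_core() {
  flume_home="$1"
  new_jar="$2"

  # Overwrite the jar that ships with Flume with the rebuilt one.
  cp "$new_jar" "$flume_home/lib/flume-core-0.9.4-cdh3u3.jar" || return 1

  # Remove the old Hadoop core jar; -f keeps this quiet if it is already gone.
  rm -f "$flume_home/lib/hadoop-core-0.20.2-cdh3u3.jar"
}

# Example call (the second path is an assumption about where your build puts the jar):
# install_rebuilt_flume_core "$FLUME_HOME" ./build/flume-core-0.9.4-cdh3u3.jar
```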
4. Modify the ${FLUME_HOME}/bin/flume startup script
Read through the ${FLUME_HOME}/bin/flume script carefully and you will find the following code:
# put hadoop conf dir in classpath to include Hadoop
# core-site.xml/hdfs-site.xml
if [ -n "${HADOOP_CONF_DIR}" ]; then
  CLASSPATH="${CLASSPATH}:${HADOOP_CONF_DIR}"
elif [ -n "${HADOOP_HOME}" ] ; then
  CLASSPATH="${CLASSPATH}:${HADOOP_HOME}/conf"
elif [ -e "/usr/lib/hadoop/conf" ] ; then
  # if neither is present see if the CDH dir exists
  CLASSPATH="${CLASSPATH}:/usr/lib/hadoop/conf";
  HADOOP_HOME="/usr/lib/hadoop"
fi  # otherwise give up
# try to load the hadoop core jars
HADOOP_CORE_FOUND=false
while true; do
  if [ -n "$HADOOP_HOME" ]; then
    HADCOREJARS=`find ${HADOOP_HOME}/hadoop-core*.jar || \
      find ${HADOOP_HOME}/lib/hadoop-core*.jar || true`
    if [ -n "$HADCOREJARS" ]; then
      HADOOP_CORE_FOUND=true
      CLASSPATH="$CLASSPATH:${HADCOREJARS}"
      break;
    fi
  fi
  HADCOREJARS=`find ./lib/hadoop-core*.jar 2> /dev/null || true`
  if [ -n "$HADCOREJARS" ]; then
    # if this is the dev environment then hadoop jar will
    # get added as part of ./lib (below)
    break
  fi
  # core jars may be missing, we'll check for this below
  break
done
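As you can see, this is where Flume loads the dependency jars of the old Hadoop version. The new Hadoop version has no ${HADOOP_HOME}/conf directory and the like, so Flume fails to load its dependencies on the new Hadoop. The simplest way to pick up the new HBase and Hadoop dependencies is to add the following CLASSPATH entry to the ${FLUME_HOME}/bin/flume script:
CLASSPATH="/home/q/hbase/hbase-0.96.0-hadoop2/lib/*"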
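Note that the CLASSPATH line above assigns a fresh value. If you would rather append the HBase lib directory to whatever CLASSPATH the script has already built, a variant along the following lines may work. It is only a sketch: the function name append_hbase_libs is hypothetical, and the directory is the one used in this post, so adjust it to your installation. The trailing /* is the JVM's classpath wildcard, which is why it is kept inside quotes so the shell does not expand it.

```shell
# Hypothetical helper (not part of Flume): append "<dir>/*" to a classpath.
# $1: current classpath (may be empty)
# $2: directory holding the HBase/Hadoop jars
append_hbase_libs() {
  if [ -n "$1" ]; then
    # Non-empty classpath: join with ':' and add the wildcard entry.
    printf '%s:%s/*\n' "$1" "$2"
  else
    # Empty classpath: the wildcard entry is the whole classpath.
    printf '%s/*\n' "$2"
  fi
}

# Directory taken from this post; change it to match your HBase installation.
CLASSPATH="$(append_hbase_libs "${CLASSPATH:-}" "/home/q/hbase/hbase-0.96.0-hadoop2/lib")"
```

The empty-classpath branch avoids producing a leading ":", which the JVM would otherwise treat as an extra entry for the current directory.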
5. How to integrate with HBase-0.96
Add your own class under flume-src\plugins\flume-plugin-hbasesink\src\main\java (you can of course create a new Maven project of your own instead). To integrate with HBase, the class must extend EventSink.Base and override its methods (see flume-src\plugins\flume-plugin-hbasesink\src\main\java\com\cloudera\flume\hbase\Attr2HBaseEventSink.java for reference). When you are done, rebuild the classes under flume-src\plugins\flume-plugin-hbasesink and package them into a jar file. Then register your HBase sink with Flume; for how to register it, see this blog's post "Flume-0.9.4配置Hbase sink".
6. Conclusion
After the configuration steps above, your Flume-0.9.4 can work with HBase-0.96. Good luck.