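The painful story of building the latest Apache Impala from source. It took me about 10 days to get it working, and it nearly broke me. Everyone knows what internet access is like inside China: downloads from overseas repositories such as s3.amazonaws.com are painfully slow, and with the country in the middle of the epidemic, many circumvention tools are blocked. If you have a fast connection to the outside, you can ignore this post and simply follow the official documentation; on Google Cloud the build finishes quickly. Locally, however, pulling more than 10 GB at a few KB/s is basically unbearable. After trying all sorts of build methods found through Baidu without success, I went back to the official build procedure and, after many attempts, finally got it done. If, like me, you do not have fast access, this post may help you speed things up.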

Background:

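  • I have recently been experimenting with Hive's ACID support for ORC tables, but even the latest Impala shipped with CDH does not support Hive tables with the property [transactional=true].
  • After some digging I found that IMPALA-8813 adds this support. It is in Apache Impala 3.3.0, the latest release, and CDH does not yet ship a matching version.
  • Apache Impala apparently can only be installed by building from source. I could not find an rpm, a yum repository, or any other installation method; if one exists, I would appreciate a pointer.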

Steps:

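  1. Prepare the environment: one CentOS 7 virtual machine, preferably a fresh one, to avoid conflicts with your existing setup.
  2. Download the source: wget http://archive.apache.org/dist/impala/3.3.0/apache-impala-3.3.0.tar.gz
  3. After extracting, edit one file under the source directory, bin/bootstrap_system.sh: change the Ant version from 1.9.13 to 1.9.14 and comment out the sha512sum check. Ant 1.9.13 can no longer be downloaded, so the build fails without this change.
  4. Set the environment variables (on a fresh environment you can copy these as-is; they are all the default paths after installation)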
    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.242.b08-0.el7_7.x86_64
    export JAVA_BIN=$JAVA_HOME/bin
    export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    export ANT_HOME=/usr/local/apache-ant-1.9.14
    export MVN_HOME=/usr/local/apache-maven-3.5.4
    export PATH=$PATH:$JAVA_HOME/bin:$ANT_HOME/bin:$MVN_HOME/bin
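The Ant version change from step 3 can also be scripted. The following is only a minimal sketch: it assumes the version appears in the file as the literal string 1.9.13, which I have not verified against every copy of bootstrap_system.sh, and the function name patch_ant_version is made up for illustration. It keeps a backup and leaves the sha512sum check for you to comment out by hand, since the exact shape of that line is not shown in this post.

```shell
#!/usr/bin/env bash
# Sketch: replace the literal Ant version 1.9.13 with 1.9.14 in a script.
# Assumption: the version is written as the plain string "1.9.13".
# patch_ant_version is a hypothetical helper, not part of Impala.
patch_ant_version() {
  local file="$1"
  cp -p "$file" "$file.bak"               # keep a backup of the original
  sed -i 's/1\.9\.13/1.9.14/g' "$file"    # GNU sed in-place edit
}

# Usage (path from step 3), then locate the checksum line to comment out manually:
# patch_ant_version "$IMPALA_HOME/bin/bootstrap_system.sh"
# grep -n 'sha512sum' "$IMPALA_HOME/bin/bootstrap_system.sh"
```

If the result looks wrong, the .bak copy next to the file restores the original.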

     

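  5. [Optional] Download the Python packages the build needs in advance. They seem to total only 20 to 30 MB, so they can be downloaded directly; it is a little slow, but at least it succeeds.
  6. Download the native-toolchain files in advance.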
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/tpc-h/2.17.0-gcc-4.9.2/tpc-h-2.17.0-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/re2/20190301-gcc-4.9.2/re2-20190301-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/cdp_components/1352353/tarballs/ranger-1.2.0.7.1.0.0-33-admin.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/cdh_components/1173663/tarballs/kudu-1.10.0-cdh6.x-SNAPSHOT-redhat7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/cdh_components/1173663/tarballs/llama-minikdc-1.0.0.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/cdh_components/1173663/tarballs/hive-2.1.1-cdh6.x-SNAPSHOT.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/cdh_components/1173663/tarballs/hadoop-3.0.0-cdh6.x-SNAPSHOT.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/cdh_components/1173663/tarballs/sentry-2.1.0-cdh6.x-SNAPSHOT.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/cdh_components/1173663/tarballs/hbase-2.1.0-cdh6.x-SNAPSHOT.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/zstd/1.4.0-gcc-4.9.2/zstd-1.4.0-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/zlib/1.2.8-gcc-4.9.2/zlib-1.2.8-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/tpc-ds/2.1.0-gcc-4.9.2/tpc-ds-2.1.0-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/thrift/0.9.3-p7-gcc-4.9.2/thrift-0.9.3-p7-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/llvm/5.0.1-asserts-p1-gcc-4.9.2/llvm-5.0.1-asserts-p1-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/llvm/5.0.1-p1-gcc-4.9.2/llvm-5.0.1-p1-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/avro/1.7.4-p4-gcc-4.9.2/avro-1.7.4-p4-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/binutils/2.26.1-gcc-4.9.2/binutils-2.26.1-gcc-4.9.2-ec2-package-centos-7.tar.gz 
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/boost/1.57.0-p3-gcc-4.9.2/boost-1.57.0-p3-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/breakpad/97a98836768f8f0154f8f86e5e14c2bb7e74132e-p2-gcc-4.9.2/breakpad-97a98836768f8f0154f8f86e5e14c2bb7e74132e-p2-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/bzip2/1.0.6-p2-gcc-4.9.2/bzip2-1.0.6-p2-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/cctz/2.2-gcc-4.9.2/cctz-2.2-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/cmake/3.14.3-gcc-4.9.2/cmake-3.14.3-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/crcutil/440ba7babeff77ffad992df3a10c767f184e946e-p1-gcc-4.9.2/crcutil-440ba7babeff77ffad992df3a10c767f184e946e-p1-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/flatbuffers/1.6.0-gcc-4.9.2/flatbuffers-1.6.0-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/gcc/4.9.2-gcc-4.9.2/gcc-4.9.2-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/gdb/7.9.1-p1-gcc-4.9.2/gdb-7.9.1-p1-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/gflags/2.2.0-p2-gcc-4.9.2/gflags-2.2.0-p2-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/glog/0.3.4-p3-gcc-4.9.2/glog-0.3.4-p3-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/gperftools/2.5-gcc-4.9.2/gperftools-2.5-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/gtest/1.6.0-gcc-4.9.2/gtest-1.6.0-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/libev/4.20-gcc-4.9.2/libev-4.20-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/libunwind/1.3-rc1-p3-gcc-4.9.2/libunwind-1.3-rc1-p3-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/lz4/1.7.5-gcc-4.9.2/lz4-1.7.5-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/openldap/2.4.47-gcc-4.9.2/openldap-2.4.47-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/openssl/1.0.2l-gcc-4.9.2/openssl-1.0.2l-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/orc/1.5.5-p1-gcc-4.9.2/orc-1.5.5-p1-gcc-4.9.2-ec2-package-centos-7.tar.gz
    https://native-toolchain.s3.amazonaws.com/build/51-03506fd053/protobuf/3.5.1-gcc-4.9.2/protobuf-3.5.1-gcc-4.9.2-ec2-package-centos-7.tar.gz

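    Once the downloads finish, create a toolchain folder inside the source directory. Put the files whose URLs contain neither cdp_components nor cdh_components (the ones without color highlighting in the original post) directly in the toolchain root. For the highlighted ones, create two folders named after the URLs, cdp_components_1352353 and cdh_components_1173663, and put each file in the matching folder. During the build, a file is sometimes downloaded again even though it already exists; extracting it yourself beforehand makes the build skip it. (I have listed the main package URLs here because this is where I was stuck for ages waiting on a dead-slow connection. It is a challenge for anyone with poor bandwidth, so use whatever means you have; once I have tidied things up I will upload my local copies to a network drive.)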
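To make the layout above concrete, here is a rough sketch of how the download and sorting could be scripted. It assumes the URLs from step 6 have been saved one per line in a file I am calling urls.txt (a hypothetical name), and the helper names dest_dir_for_url and fetch_all are mine, not anything Impala provides. wget -c resumes a partial download, which matters on a connection that keeps dropping.

```shell
#!/usr/bin/env bash
# Sketch: sort toolchain tarballs into the folder layout described above.
# dest_dir_for_url and fetch_all are hypothetical helper names.

# Print the folder (relative to toolchain/) that a URL's file belongs in.
dest_dir_for_url() {
  local re='/build/(cd[hp]_components)/([0-9]+)/'
  if [[ "$1" =~ $re ]]; then
    # e.g. .../build/cdh_components/1173663/... -> cdh_components_1173663
    echo "${BASH_REMATCH[1]}_${BASH_REMATCH[2]}"
  else
    echo "."    # everything else goes in the toolchain root
  fi
}

# Download every URL in a list file into the right folder under a root dir.
fetch_all() {
  local list="$1" root="$2" url dir
  while read -r url; do
    [ -z "$url" ] && continue
    dir="$root/$(dest_dir_for_url "$url")"
    mkdir -p "$dir"
    wget -c -P "$dir" "$url"    # -c: resume, -P: target directory
  done < "$list"
}

# Usage (urls.txt is an assumed file holding the URLs from step 6):
# fetch_all urls.txt "$IMPALA_HOME/toolchain"
```

Running fetch_all again after an interruption simply continues where the downloads stopped.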

  7. Run the build commands

    cd ~/Impala
    export IMPALA_HOME=`pwd`
    $IMPALA_HOME/bin/bootstrap_system.sh
    source $IMPALA_HOME/bin/impala-config.sh
    $IMPALA_HOME/buildall.sh -noclean -notests
  8. If output like the screenshot appears (the image is borrowed; I did not take one at the time), the build is almost done.

 

Supplement: the resources I downloaded, shared here:

Link: https://pan.baidu.com/s/1TCGTW0QS00zJi8iGsn5bmQ
Extraction code: rzkx

 

Errors:

1. If the build fails and you need to run it again, first remove the following:

    cd /usr/local/bin
    rm -rf ant
    cd /var/lib/pgsql
    yum remove postgresql
    rm -rf data/

Then run the build commands again.

2. The compiler gets killed:

g++: internal compiler error: Killed (program cc1plus)

Run dmesg:
[ 4449.266432] Out of memory: Kill process 11351 (cc1plus) score 782 or sacrifice child

Give the virtual machine more memory; compiling with gcc needs a lot of it.
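Before restarting the build it may help to check how much memory the VM actually has. The sketch below reads MemTotal from /proc/meminfo; mem_total_mb is a hypothetical helper name. The commented-out swap file commands are not what I did (I added RAM to the VM); they are a common stopgap when more RAM is not available, and the 8 GB size is purely an assumption.

```shell
#!/usr/bin/env bash
# Sketch: report total RAM in MiB from a meminfo-style file.
# mem_total_mb is a hypothetical helper; it defaults to /proc/meminfo.
mem_total_mb() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Usage:
# echo "Total RAM: $(mem_total_mb) MiB"
# dmesg | grep -i 'out of memory' | tail -n 5   # recent OOM-killer messages

# Alternative to adding RAM (run as root; size is an assumption):
# dd if=/dev/zero of=/swapfile bs=1M count=8192
# chmod 600 /swapfile
# mkswap /swapfile
# swapon /swapfile
```

Swap only keeps cc1plus from being killed; the compile will be slower than with real memory.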

 

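3. Near the end, the build becomes very slow and appears stuck at a line like [run maven -B xxxxx]. At this point Maven is downloading its dependencies into the local repository, which is very slow and takes a long time. I will tidy up my local repository and upload it later; it is a download of roughly 500 MB.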
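Sharing a local Maven repository like this comes down to packing and unpacking one directory. A minimal sketch, assuming Maven uses its default local repository location (~/.m2/repository); the helper names pack_m2 and unpack_m2 and the tarball name are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: move a Maven local repository between machines as a tarball.
# pack_m2 / unpack_m2 are hypothetical helper names.

# pack_m2 <m2_dir> <tarball>: archive <m2_dir>/repository
pack_m2() {
  tar -czf "$2" -C "$1" repository
}

# unpack_m2 <tarball> <m2_dir>: restore it as <m2_dir>/repository
unpack_m2() {
  mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Usage, assuming the default location ~/.m2:
# pack_m2 "$HOME/.m2" m2-repo.tar.gz       # on the machine that finished the build
# unpack_m2 m2-repo.tar.gz "$HOME/.m2"     # on the new machine, before building
```

With the repository already in place, Maven only has to fetch whatever is missing from it.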