Installing Apache Zeppelin

Requirements

  • Java 7+
  • Maven
  • Node.js Package Manager
  • Git
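
Before cloning, it can help to confirm that the required tools are on the PATH. The following is a minimal sketch: the helper name missing_tools is hypothetical, and it assumes the tools are invoked as java, mvn, npm, and git.

```shell
# Hypothetical helper: print each named command that is not found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
  return 0
}

# Check the four requirements listed above; prints nothing when all are present.
missing_tools java mvn npm git
```

Any name it prints needs to be installed before the Maven build will work. It does not check versions, so the Java 7+ requirement still has to be verified by hand.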

Getting the source

$ git clone https://github.com/apache/incubator-zeppelin

Setting the PATH

export ZEPPELIN_HOME={Zeppelin Directory}
export PATH=$ZEPPELIN_HOME/bin:$PATH

Editing the Maven POMs

  • Edit spark/pom.xml

    $ vi $ZEPPELIN_HOME/spark/pom.xml
    
    <properties>
      <spark.version>1.5.2</spark.version>
      <scala.version>2.11.5</scala.version>
      <scala.binary.version>2.11</scala.binary.version>
      <hadoop.version>2.6.0</hadoop.version>
      <py4j.version>0.8.2.1</py4j.version>
    </properties>
    
    <dependency>
      <groupId>org.elasticsearch</groupId>
      <artifactId>elasticsearch-hadoop</artifactId>
      <version>2.2.0-beta1</version>
      <exclusions>
        <exclusion>
          <groupId>org.slf4j</groupId>
          <artifactId>log4j-over-slf4j</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.spark</groupId>
          <artifactId>spark-core_2.10</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.spark</groupId>
          <artifactId>spark-sql_2.10</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    
  • Edit spark-dependency/pom.xml

    $ vi $ZEPPELIN_HOME/spark-dependency/pom.xml
    
    <properties>
      <spark.version>1.5.2</spark.version>
      <scala.version>2.11.5</scala.version>
      <scala.binary.version>2.11</scala.binary.version>
    
      <hadoop.version>2.6.0</hadoop.version>
      <yarn.version>${hadoop.version}</yarn.version>
      <avro.version>1.7.7</avro.version>
      <avro.mapred.classifier></avro.mapred.classifier>
      <jets3t.version>0.7.1</jets3t.version>
      <protobuf.version>2.4.1</protobuf.version>
    
      <akka.group>org.spark-project.akka</akka.group>
      <akka.version>2.3.4-spark</akka.version>
    
      <spark.download.url>http://archive.apache.org/dist/spark/spark-${spark.version}/spark-${spark.version}.tgz</spark.download.url>
      <py4j.version>0.8.2.1</py4j.version>
    </properties>
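
Both POM files should end up with the same Spark and Scala versions. One way to check this after editing is sketched below. The helper name pom_property is hypothetical, the file paths are the ones named in this post, and the sed pattern assumes each property sits on a single line as shown above.

```shell
# Hypothetical helper: print the value of a single-line <name>value</name>
# property from a pom.xml. $1 = property name, $2 = path to the pom.xml.
pom_property() {
  sed -n "s:.*<$1>\(.*\)</$1>.*:\1:p" "$2" | head -n 1
}

# Print the Spark and Scala versions from each edited POM, if the file exists.
for pom in spark/pom.xml spark-dependency/pom.xml; do
  file="${ZEPPELIN_HOME:-.}/$pom"
  if [ -f "$file" ]; then
    echo "$pom: spark=$(pom_property spark.version "$file") scala=$(pom_property scala.binary.version "$file")"
  fi
done
```

If the two lines show different versions, one of the edits was missed.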
    

Maven build

  • Run the build from the Zeppelin directory

    $ cd $ZEPPELIN_HOME
    $ mvn clean package install -Pspark-1.5 -Dspark.version=1.5.2 -Phadoop-2.6 -Dhadoop.version=2.6.0 -Pyarn -Ppyspark -DskipTests
    

Changing the configuration

  • Copy the config files

    $ cp $ZEPPELIN_HOME/conf/zeppelin-env.sh.template $ZEPPELIN_HOME/conf/zeppelin-env.sh
    $ cp $ZEPPELIN_HOME/conf/zeppelin-site.xml.template $ZEPPELIN_HOME/conf/zeppelin-site.xml
    
  • Edit zeppelin-env.sh

    $ vi $ZEPPELIN_HOME/conf/zeppelin-env.sh
    
    export SPARK_HOME=/home/logvadmin/spark-1.5.2-bin-hadoop2.6
    
  • Edit the Zeppelin web server settings

    $ vi $ZEPPELIN_HOME/conf/zeppelin-site.xml
    
    <property>
      <name>zeppelin.server.addr</name>
      <value>192.168.10.251</value>
      <description>Server address</description>
    </property>
    <property>
      <name>zeppelin.server.port</name>
      <value>9090</value>
      <description>Server port.</description>
    </property>
    

    Change the server address and port to match your environment.
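
To confirm which address and port were actually saved, the values can be read back from zeppelin-site.xml. This is only a sketch: the helper name site_value is hypothetical, and the awk script assumes each <value> follows its <name> element inside the same <property> block, as in the snippet above.

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in zeppelin-site.xml. $1 = property name, $2 = path to the file.
site_value() {
  awk -v name="$1" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$2"
}

# Print the URL the web UI should listen on, if the config file exists.
site="${ZEPPELIN_HOME:-.}/conf/zeppelin-site.xml"
if [ -f "$site" ]; then
  echo "http://$(site_value zeppelin.server.addr "$site"):$(site_value zeppelin.server.port "$site")/"
fi
```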

Running Zeppelin
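
The original post gives no command for this step. Zeppelin is normally started with the bin/zeppelin-daemon.sh script that ships in the source tree, so a minimal sketch might look like the following. The wrapper name zeppelin_daemon is hypothetical, and it assumes ZEPPELIN_HOME is exported as described earlier.

```shell
# Hypothetical wrapper: run bin/zeppelin-daemon.sh from ZEPPELIN_HOME,
# failing with a message if the script is missing or not executable.
zeppelin_daemon() {
  daemon="${ZEPPELIN_HOME:-.}/bin/zeppelin-daemon.sh"
  if [ ! -x "$daemon" ]; then
    echo "zeppelin-daemon.sh not found under ${ZEPPELIN_HOME:-.}/bin" >&2
    return 1
  fi
  "$daemon" "$@"
}

# Typical session, run manually once the build has finished:
#   zeppelin_daemon start
#   zeppelin_daemon status
#   zeppelin_daemon stop
```

Once the daemon is running, the notebook UI should be reachable in a browser at the address and port set in zeppelin-site.xml (192.168.10.251:9090 in the example above).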


Posted by satis