In this short post I will show how you can run the Cloudera QuickStart using Docker. As you know from my previous post, I am a big fan of Docker and everything related to it. It’s a great tool and I use it in many situations, because it makes it very easy to set up and run a specific application, or to build a complex environment, in a few minutes. Additionally, it allows us to manage that application easily.
Introduction
What Is Cloudera Data Platform (CDP)?
The Cloudera Data Platform (CDP) is a data cloud designed for businesses. Businesses may use CDP to manage and secure the entire data lifecycle (gathering, enriching, analyzing, testing, and predicting with their data) in order to gain actionable insights and make data-driven decisions. The most valuable and disruptive business use cases require multi-stage analytic pipelines to process enterprise data volumes. CDP helps businesses succeed in the digital transformation era by letting them extract value from large-scale, complex, distributed, and quickly changing data.
Docker And Docker Images
Docker is a containerization platform that allows you to package and run applications in isolated environments called containers. Containers provide a convenient and lightweight way to package and deploy applications, making it easy to run the same code on different systems and environments.
Docker images are the templates that are used to create Docker containers. They contain all the necessary code, libraries, and dependencies needed to run an application. When you run a Docker container, you are running an instance of a Docker image.
To use Docker, you will need to install the Docker engine on your system. Once the engine is installed, you can use the docker command-line tool to manage Docker images and containers. Some common Docker commands include:
docker pull IMAGE_NAME: This command downloads a Docker image from a registry (such as Docker Hub) to your local system.
docker run IMAGE_NAME: This command runs a new Docker container based on the specified image.
docker ps: This command lists all the running Docker containers on your system.
docker stop CONTAINER_ID: This command stops the specified Docker container.
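The four commands above are usually used together. Here is a minimal sketch of that cycle as a shell function. The run helper, the DRY_RUN variable, and the function name docker_lifecycle are my own illustrative names, not part of Docker. With DRY_RUN=1 the commands are only printed, so you can check them before anything is executed.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sketch of a pull -> run -> ps -> stop cycle.
# $1 = image name, $2 = a container name of your choice.
docker_lifecycle() {
    image="$1"
    name="$2"
    run docker pull "$image"
    # -d starts the container in the background; --name lets us refer to it
    # later by name instead of by container ID.
    run docker run -d --name "$name" "$image"
    run docker ps
    run docker stop "$name"
}
```

For example, DRY_RUN=1 followed by docker_lifecycle IMAGE_NAME my-container prints the four docker commands in order without touching Docker at all.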
For more information on Docker and how to use it, you can consult the Docker documentation or visit the Docker website.
Cloudera Quickstart As Docker Image
Cloudera QuickStart is a fully functional Apache Hadoop distribution that is packaged as a Docker image. It allows you to quickly and easily set up a Cloudera Hadoop environment on your local machine for testing and development purposes.
To use Cloudera QuickStart with Docker, you will need to have Docker installed on your system. You can then use the docker pull command to download the Cloudera QuickStart Docker image from a registry such as Docker Hub.
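If you want to avoid pulling the image again when it is already on your machine, a small check like the one below may help. This is only a sketch: ensure_image is a hypothetical function name, and I assume the image is the cloudera/quickstart one used in the run command in this post.

```shell
#!/bin/sh
# Pull an image only when it is not already present locally.
# $1 = image name, for example cloudera/quickstart.
ensure_image() {
    image="$1"
    # "docker images -q" prints the IDs of matching local images,
    # or nothing at all when there is no match.
    if [ -z "$(docker images -q "$image")" ]; then
        docker pull "$image"
    else
        echo "$image is already available locally"
    fi
}
```

Usage: ensure_image cloudera/quickstart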
One Command -> Cloudera QuickStart Using Docker
To get the Cloudera QuickStart environment up and running, use the command below. Additionally, I mounted the “/src” directory into the container, but of course you don’t have to do this.
sudo docker run --hostname=quickstart.cloudera --privileged=true -t -i -v /src:/src --publish-all=true -p 8888 -p 7180 cloudera/quickstart /usr/bin/docker-quickstart
Now please wait a few seconds or minutes. If you don’t have the image locally, it will be downloaded first, and after that the container will be created.
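One detail of the command above: -p 8888 and -p 7180 name only the container port, so Docker picks a free host port for each of them. If you prefer fixed host ports, a variation like the following sketch may be handy. The function name start_quickstart and its two optional arguments are my own; I assume 8888 is the Hue port and 7180 the Cloudera Manager port, and I left out the optional /src mount.

```shell
#!/bin/sh
# Start Cloudera QuickStart with fixed host ports.
# $1 = host port for container port 8888 (assumed Hue), default 8888
# $2 = host port for container port 7180 (assumed Cloudera Manager), default 7180
start_quickstart() {
    hue_port="${1:-8888}"
    cm_port="${2:-7180}"
    # HOST:CONTAINER pins the host side of each mapping.
    docker run --hostname=quickstart.cloudera --privileged=true -t -i \
        -p "${hue_port}:8888" -p "${cm_port}:7180" \
        cloudera/quickstart /usr/bin/docker-quickstart
}
```

Usage: start_quickstart keeps the default ports, while start_quickstart 18888 17180 moves both services to other host ports if the defaults are already taken. You may need sudo, as in the original command.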
When the container (Cloudera QuickStart) is up and running, look for a log entry like this one: STARTUP_MSG: host = quickstart.cloudera/172.17.0.2. You can now use this IP address to connect to Hue, for instance.
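Instead of searching the log by eye, you can filter it. The sketch below reads log text on standard input and prints the IP address from the first STARTUP_MSG host entry it finds. extract_host_ip is a hypothetical name, and the pattern assumes the entry looks like the one quoted above.

```shell
#!/bin/sh
# Read log text on stdin and print the IP address from the first
# "STARTUP_MSG: host = NAME/IP" entry.
extract_host_ip() {
    sed -n 's|.*STARTUP_MSG: *host = [^/ ]*/\([0-9][0-9.]*\).*|\1|p' | head -n 1
}
```

Usage: docker logs CONTAINER_ID 2>&1 | extract_host_ip, where CONTAINER_ID is the ID shown by docker ps. If the log entry is not there yet, the function prints nothing, so just try again a bit later.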
Cloudera QuickStart -> Docker Logs
Here you can see the logs from the Cloudera QuickStart container.
Starting mysqld: [ OK ]
if [ "$1" == "start" ] ; then
    if [ "${EC2}" == 'true' ]; then
        FIRST_BOOT_FLAG=/var/lib/cloudera-quickstart/.ec2-key-installed
        if [ ! -f "${FIRST_BOOT_FLAG}" ]; then
            METADATA_API=http://169.254.169.254/latest/meta-data
            KEY_URL=${METADATA_API}/public-keys/0/openssh-key
            SSH_DIR=/home/cloudera/.ssh
            mkdir -p ${SSH_DIR}
            chown cloudera:cloudera ${SSH_DIR}
            curl ${KEY_URL} >> ${SSH_DIR}/authorized_keys
            touch ${FIRST_BOOT_FLAG}
        fi
    fi
    if [ "${DOCKER}" != 'true' ]; then
        if [ -f /sys/kernel/mm/redhat_transparent_hugepage/defrag ]; then
            echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
        fi
        cloudera-quickstart-ip
        HOSTNAME=quickstart.cloudera
        hostname ${HOSTNAME}
        sed -i -e "s/HOSTNAME=.*/HOSTNAME=${HOSTNAME}/" /etc/sysconfig/network
    fi
    ( cd /var/lib/cloudera-quickstart/tutorial; nohup python -m SimpleHTTPServer 80 & )
    # TODO: check for expired CM license and update config.js accordingly
fi
+ '[' start == start ']'
+ '[' '' == true ']'
+ '[' true '!=' true ']'
+ cd /var/lib/cloudera-quickstart/tutorial
+ nohup python -m SimpleHTTPServer 80
nohup: appending output to `nohup.out'
JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
Starting zookeeper ... STARTED
starting datanode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-datanode-quickstart.cloudera.out
Started Hadoop datanode (hadoop-hdfs-datanode): [ OK ]
starting journalnode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-journalnode-quickstart.cloudera.out
Started Hadoop journalnode: [ OK ]
starting namenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-namenode-quickstart.cloudera.out
Started Hadoop namenode: [ OK ]
starting secondarynamenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-secondarynamenode-quickstart.cloudera.out
Started Hadoop secondarynamenode: [ OK ]
Setting HTTPFS_HOME: /usr/lib/hadoop-httpfs
Using HTTPFS_CONFIG: /etc/hadoop-httpfs/conf
Sourcing: /etc/hadoop-httpfs/conf/httpfs-env.sh
Using HTTPFS_LOG: /var/log/hadoop-httpfs/
Using HTTPFS_TEMP: /var/run/hadoop-httpfs
Setting HTTPFS_HTTP_PORT: 14000
Setting HTTPFS_ADMIN_PORT: 14001
Setting HTTPFS_HTTP_HOSTNAME: quickstart.cloudera
Setting HTTPFS_SSL_ENABLED: false
Setting HTTPFS_SSL_KEYSTORE_FILE: /var/lib/hadoop-httpfs/.keystore
Setting HTTPFS_SSL_KEYSTORE_PASS: password
Using CATALINA_BASE: /var/lib/hadoop-httpfs/tomcat-deployment
Using HTTPFS_CATALINA_HOME: /usr/lib/bigtop-tomcat
Setting CATALINA_OUT: /var/log/hadoop-httpfs//httpfs-catalina.out
Using CATALINA_PID: /var/run/hadoop-httpfs/hadoop-httpfs-httpfs.pid
Using CATALINA_OPTS:
Adding to CATALINA_OPTS: -Dhttpfs.home.dir=/usr/lib/hadoop-httpfs -Dhttpfs.config.dir=/etc/hadoop-httpfs/conf -Dhttpfs.log.dir=/var/log/hadoop-httpfs/ -Dhttpfs.temp.dir=/var/run/hadoop-httpfs -Dhttpfs.admin.port=14001 -Dhttpfs.http.port=14000 -Dhttpfs.http.hostname=quickstart.cloudera
Using CATALINA_BASE: /var/lib/hadoop-httpfs/tomcat-deployment
Using CATALINA_HOME: /usr/lib/bigtop-tomcat
Using CATALINA_TMPDIR: /var/run/hadoop-httpfs
Using JRE_HOME: /usr/java/jdk1.7.0_67-cloudera
Using CLASSPATH: /usr/lib/bigtop-tomcat/bin/bootstrap.jar
Using CATALINA_PID: /var/run/hadoop-httpfs/hadoop-httpfs-httpfs.pid
Started Hadoop httpfs (hadoop-httpfs): [ OK ]
starting historyserver, logging to /var/log/hadoop-mapreduce/mapred-mapred-historyserver-quickstart.cloudera.out
19/12/20 09:19:27 INFO hs.JobHistoryServer: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobHistoryServer
STARTUP_MSG: host = quickstart.cloudera/172.17.0.2
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.6.0-cdh5.7.0
STARTUP_MSG: build = http://github.com/cloudera/hadoop -r c00978c67b0d3fe9f3b896b5030741bd40bf541a; compiled by 'jenkins' on 2016-03-23T18:36Z
STARTUP_MSG: java = 1.7.0_67
************************************************************/
Started Hadoop historyserver: [ OK ]
starting nodemanager, logging to /var/log/hadoop-yarn/yarn-yarn-nodemanager-quickstart.cloudera.out
Started Hadoop nodemanager: [ OK ]
starting resourcemanager, logging to /var/log/hadoop-yarn/yarn-yarn-resourcemanager-quickstart.cloudera.out
Started Hadoop resourcemanager: [ OK ]
starting master, logging to /var/log/hbase/hbase-hbase-master-quickstart.cloudera.out
Started HBase master daemon (hbase-master): [ OK ]
starting rest, logging to /var/log/hbase/hbase-hbase-rest-quickstart.cloudera.out
Started HBase rest daemon (hbase-rest): [ OK ]
starting thrift, logging to /var/log/hbase/hbase-hbase-thrift-quickstart.cloudera.out
Started HBase thrift daemon (hbase-thrift): [ OK ]
Starting Hive Metastore (hive-metastore): [ OK ]
Started Hive Server2 (hive-server2): [ OK ]
Starting Sqoop Server: [ OK ]
Sqoop home directory: /usr/lib/sqoop2
Setting SQOOP_HTTP_PORT: 12000
Setting SQOOP_ADMIN_PORT: 12001
Using CATALINA_OPTS: -Xmx1024m
Adding to CATALINA_OPTS: -Dsqoop.http.port=12000 -Dsqoop.admin.port=12001
Using CATALINA_BASE: /var/lib/sqoop2/tomcat-deployment
Using CATALINA_HOME: /usr/lib/bigtop-tomcat
Using CATALINA_TMPDIR: /var/tmp/sqoop2
Using JRE_HOME: /usr/java/jdk1.7.0_67-cloudera
Using CLASSPATH: /usr/lib/bigtop-tomcat/bin/bootstrap.jar
Using CATALINA_PID: /var/run/sqoop2/sqoop-server-sqoop2.pid
Starting Spark history-server (spark-history-server): [ OK ]
Starting Hadoop HBase regionserver daemon: starting regionserver, logging to /var/log/hbase/hbase-hbase-regionserver-quickstart.cloudera.out
hbase-regionserver.
Starting hue: [ OK ]
Started Impala State Store Server (statestored): [ OK ]
Setting OOZIE_HOME: /usr/lib/oozie
Sourcing: /usr/lib/oozie/bin/oozie-env.sh
setting JAVA_LIBRARY_PATH="$JAVA_LIBRARY_PATH:/usr/lib/hadoop/lib/native"
setting OOZIE_DATA=/var/lib/oozie
setting OOZIE_CATALINA_HOME=/usr/lib/bigtop-tomcat
setting CATALINA_TMPDIR=/var/lib/oozie
setting CATALINA_PID=/var/run/oozie/oozie.pid
setting CATALINA_BASE=/var/lib/oozie/tomcat-deployment
setting OOZIE_HTTPS_PORT=11443
setting OOZIE_HTTPS_KEYSTORE_PASS=password
setting CATALINA_OPTS="$CATALINA_OPTS -Doozie.https.port=${OOZIE_HTTPS_PORT}"
setting CATALINA_OPTS="$CATALINA_OPTS -Doozie.https.keystore.pass=${OOZIE_HTTPS_KEYSTORE_PASS}"
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
setting OOZIE_CONFIG=/etc/oozie/conf
setting OOZIE_LOG=/var/log/oozie
Using OOZIE_CONFIG: /etc/oozie/conf
Sourcing: /etc/oozie/conf/oozie-env.sh
setting JAVA_LIBRARY_PATH="$JAVA_LIBRARY_PATH:/usr/lib/hadoop/lib/native"
setting OOZIE_DATA=/var/lib/oozie
setting OOZIE_CATALINA_HOME=/usr/lib/bigtop-tomcat
setting CATALINA_TMPDIR=/var/lib/oozie
setting CATALINA_PID=/var/run/oozie/oozie.pid
setting CATALINA_BASE=/var/lib/oozie/tomcat-deployment
setting OOZIE_HTTPS_PORT=11443
setting OOZIE_HTTPS_KEYSTORE_PASS=password
setting CATALINA_OPTS="$CATALINA_OPTS -Doozie.https.port=${OOZIE_HTTPS_PORT}"
setting CATALINA_OPTS="$CATALINA_OPTS -Doozie.https.keystore.pass=${OOZIE_HTTPS_KEYSTORE_PASS}"
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
setting OOZIE_CONFIG=/etc/oozie/conf
setting OOZIE_LOG=/var/log/oozie
Setting OOZIE_CONFIG_FILE: oozie-site.xml
Using OOZIE_DATA: /var/lib/oozie
Using OOZIE_LOG: /var/log/oozie
Setting OOZIE_LOG4J_FILE: oozie-log4j.properties
Setting OOZIE_LOG4J_RELOAD: 10
Setting OOZIE_HTTP_HOSTNAME: quickstart.cloudera
Setting OOZIE_HTTP_PORT: 11000
Setting OOZIE_ADMIN_PORT: 11001
Using OOZIE_HTTPS_PORT: 11443
Setting OOZIE_BASE_URL: http://quickstart.cloudera:11000/oozie
Using CATALINA_BASE: /var/lib/oozie/tomcat-deployment
Setting OOZIE_HTTPS_KEYSTORE_FILE: /var/lib/oozie/.keystore
Using OOZIE_HTTPS_KEYSTORE_PASS: password
Setting OOZIE_INSTANCE_ID: quickstart.cloudera
Setting CATALINA_OUT: /var/log/oozie/catalina.out
Using CATALINA_PID: /var/run/oozie/oozie.pid
Using CATALINA_OPTS: -Doozie.https.port=11443 -Doozie.https.keystore.pass=password -Xmx1024m -Doozie.https.port=11443 -Doozie.https.keystore.pass=password -Xmx1024m -Dderby.stream.error.file=/var/log/oozie/derby.log
Adding to CATALINA_OPTS: -Doozie.home.dir=/usr/lib/oozie -Doozie.config.dir=/etc/oozie/conf -Doozie.log.dir=/var/log/oozie -Doozie.data.dir=/var/lib/oozie -Doozie.instance.id=quickstart.cloudera -Doozie.config.file=oozie-site.xml -Doozie.log4j.file=oozie-log4j.properties -Doozie.log4j.reload=10 -Doozie.http.hostname=quickstart.cloudera -Doozie.admin.port=11001 -Doozie.http.port=11000 -Doozie.https.port=11443 -Doozie.base.url=http://quickstart.cloudera:11000/oozie -Doozie.https.keystore.file=/var/lib/oozie/.keystore -Doozie.https.keystore.pass=password -Djava.library.path=:/usr/lib/hadoop/lib/native:/usr/lib/hadoop/lib/native
Using CATALINA_BASE: /var/lib/oozie/tomcat-deployment
Using CATALINA_HOME: /usr/lib/bigtop-tomcat
Using CATALINA_TMPDIR: /var/lib/oozie
Using JRE_HOME: /usr/java/jdk1.7.0_67-cloudera
Using CLASSPATH: /usr/lib/bigtop-tomcat/bin/bootstrap.jar
Using CATALINA_PID: /var/run/oozie/oozie.pid
Starting Solr server daemon: [ OK ]
Using CATALINA_BASE: /var/lib/solr/tomcat-deployment
Using CATALINA_HOME: /usr/lib/solr/../bigtop-tomcat
Using CATALINA_TMPDIR: /var/lib/solr/
Using JRE_HOME: /usr/java/jdk1.7.0_67-cloudera
Using CLASSPATH: /usr/lib/solr/../bigtop-tomcat/bin/bootstrap.jar
Using CATALINA_PID: /var/run/solr/solr.pid
Started Impala Catalog Server (catalogd) : [ OK ]
Started Impala Server (impalad): [ OK ]
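A log this long is hard to scan for problems. As a rough health check, you could count how many services reported OK. The function below is only a sketch with a hypothetical name, count_ok. It counts every OK marker on standard input, even when several appear on the same line, and it tolerates extra spaces inside the brackets.

```shell
#!/bin/sh
# Count the "[ OK ]" markers in log text read from stdin.
count_ok() {
    # grep -o prints each match on its own line, so wc -l counts matches
    # rather than lines; tr strips the padding some wc versions add.
    grep -o '\[ *OK *]' | wc -l | tr -d ' '
}
```

Usage: docker logs CONTAINER_ID 2>&1 | count_ok. Comparing the number between two startups is a quick way to notice that a service did not come up.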
Summary
For more information on Cloudera QuickStart and how to use it, you can consult the Cloudera documentation or visit the Cloudera website.