Connection Between Talend And Cloudera. Check My Secret! 3 Simple & Easy Steps!

In this post I will show you how to set up a connection between Talend and Cloudera so that you can connect to CDP. Once you have gone through this tutorial, you will be able to use a Cloudera cluster from Talend.

Cloudera (CDP) is one of the three major players in the Hadoop distribution market, alongside Hortonworks and MapR.

Create A New Talend Cloudera Connection

In the Metadata section, right-click on Hadoop Cluster and select Create Hadoop Cluster.

Connection configuration between Talend and Cloudera - 3 simple steps!
Connection between Talend and Cloudera

In the new window, enter the name of the connection (optionally, you can also add a purpose and a description) and click Next.

In the next window, “Hadoop Configuration Import Wizard”, set the following in order:

  • Distribution = Cloudera
  • Version = in my case, the highest available version was CDH5.12 with YARN. If you do not see your version, choose the one that is closest to yours.
  • Option = change to “Retrieve configuration from Ambari or Cloudera”.

Connect To Cloudera

Once we have everything selected, click the Next button.

Now enter the address of the server where Cloudera Manager is running. The standard port is 7180. You must also provide the user name and password for Cloudera Manager.
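If the wizard cannot connect, it can help to first confirm that the manager port is reachable from the machine running Talend. Below is a minimal sketch in Python, not part of Talend itself. The function name is_port_open and the host name localhost are placeholders of mine; only port 7180 comes from the step above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Placeholder host name: replace it with your own manager server.
    cm_host = "localhost"
    cm_port = 7180  # standard port mentioned in the step above
    print(f"{cm_host}:{cm_port} reachable: {is_port_open(cm_host, cm_port)}")
```

A result of False only tells you that the TCP connection failed (wrong host, firewall, or service down). It says nothing about whether the user name and password are correct.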

Once all the fields are completed:

  1. Click “Connect”. After a few seconds, the cluster should appear in the “Discovered clusters” section.
  2. Click the “Fetch” button.

Then click the “Finish” button.

In the next window, fill in the remaining connection details.

Very important: use host names instead of IP addresses!

The host names may not resolve to IP addresses on your machine. In that case, add them to your hosts file.

If you do not know how to do it, go to the post: Windows: How to add a server name and IP to the hosts file?
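To find out which host names need an entry in the hosts file, you can try to resolve each of them from the machine running Talend. This is a small Python sketch under my own assumptions: the function name unresolved_hosts is hypothetical, and localhost stands in for your real cluster host names.

```python
import socket


def unresolved_hosts(hostnames):
    """Return the host names from the input that do not resolve to an IP address."""
    failed = []
    for name in hostnames:
        try:
            # gethostbyname uses the system resolver, which also reads the hosts file.
            socket.gethostbyname(name)
        except OSError:
            # socket.gaierror (a subclass of OSError) is raised when resolution fails.
            failed.append(name)
    return failed


if __name__ == "__main__":
    # Placeholder list: replace it with the host names of your cluster nodes.
    cluster_hosts = ["localhost"]
    missing = unresolved_hosts(cluster_hosts)
    if missing:
        print("Add these to the hosts file:", ", ".join(missing))
    else:
        print("All host names resolve.")
```

Every name the script reports is a candidate for a new line in the hosts file.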

  • Namenode URI – starts with “hdfs://”. The port is optional; the default is 8020.
  • Resource Manager
  • Resource Manager Scheduler
  • Job History
  • Staging directory
  • User name – the user that will be used for reading and writing data in HDFS.
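The two rules for the Namenode URI (it starts with “hdfs” and uses a host name, not an IP address) can be checked before you paste the value into Talend. The sketch below is only an illustration. The function names, the example host nn1.example.com, and the default port 8020 (the usual NameNode port on CDH) are my assumptions, not something Talend provides.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumption: 8020 is the usual NameNode RPC port on CDH clusters.
DEFAULT_NAMENODE_PORT = 8020


def check_namenode_uri(uri: str) -> list:
    """Return a list of problems found in a NameNode URI (empty if none)."""
    problems = []
    parts = urlsplit(uri)
    if parts.scheme != "hdfs":
        problems.append("URI should start with hdfs://")
    host = parts.hostname
    if not host:
        problems.append("URI has no host name")
    else:
        try:
            # ip_address succeeds only for IP literals such as 10.0.0.5.
            ipaddress.ip_address(host)
            problems.append("use a host name instead of an IP address")
        except ValueError:
            pass  # not an IP literal, so it is a host name
    return problems


def effective_port(uri: str) -> int:
    """Return the port given in the URI, or the assumed default if none is given."""
    return urlsplit(uri).port or DEFAULT_NAMENODE_PORT


if __name__ == "__main__":
    # Hypothetical example value: replace it with your own NameNode URI.
    example = "hdfs://nn1.example.com"
    print(check_namenode_uri(example), effective_port(example))
```

An empty list means the URI passes both checks. It does not prove that the NameNode is actually reachable, so the “Check Services” button is still the real test.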

Now check your connection by clicking the “Check Services” button.

A new window opens in which TOS (Talend Open Studio) checks the connection to the cluster. If everything is OK, you will see a green bar next to each service.

Click “Finish”. The connection is now defined and can be used in subsequent jobs. That’s it! I hope this post helps you solve your problem and that you have learned something new that will be useful in the future. Enjoy!
