In the training so far, we have learned to build simple integration processes, both master and standalone, to work with context variables, and to use objects in the repository. Now it is time to look at best practices for building data flows in Talend Data Integration. They will make life easier for you and for anyone else who has to analyze an existing ETL process.
From left to right
In accordance with Talend best practices, all data flows within the process should be built from left to right and from top to bottom.
Name your components
A default name such as tDBInput_1 is fine for a beginner who has built a first process quickly and does not yet know where to change it. To make the process more readable and easier for others to analyze, give each component a name that describes its task but is still short.
Use context variables
If your process depends on external data, always store it as context variables. Keep file paths and file names as variables instead of hard-coding them. This keeps your processes flexible and easy to modify when something changes.
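To make the difference concrete, here is a minimal Java sketch of the idea. It is not code generated by Talend: the Context class is a hypothetical stand-in for the context object a job exposes, and the variable names inputDir and fileName are assumptions made up for this example.

```java
public class ContextPathSketch {

    // Hypothetical stand-in for the context object available inside a Talend job.
    // The variable names inputDir and fileName are assumptions for this example.
    static class Context {
        String inputDir;
        String fileName;

        Context(String inputDir, String fileName) {
            this.inputDir = inputDir;
            this.fileName = fileName;
        }
    }

    // Builds the file path from context values instead of a hard-coded literal.
    static String inputFile(Context context) {
        return context.inputDir + "/" + context.fileName;
    }

    public static void main(String[] args) {
        // Two example environments; only the context values differ.
        Context dev = new Context("/data/dev/in", "customers.csv");
        Context prod = new Context("/data/prod/in", "customers.csv");

        System.out.println(inputFile(dev));
        System.out.println(inputFile(prod));
    }
}
```

In the Studio, the same idea means typing an expression such as context.inputDir + "/" + context.fileName into a component's file name field rather than a literal path, so that moving to another environment only requires changing the context values.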
Order in the repository
If you use Talend DI only for yourself, you probably won’t notice much of a mess in the repository. However, imagine working on several projects with every process thrown directly under the Job Designs tab: finding a specific process would be neither fast nor easy. So remember to put processes in dedicated project folders. You can keep things even tidier by naming processes according to a consistent numbering scheme, e.g. 100 for processes loading the stage data layer, 200 for CDC, 600 for test processes, and 900 for master processes.
Don’t forget about the documentation
The last good practice I would like to share with you is documentation. After creating a data flow process, remember to add a brief annotation with the tNote component containing basic information, e.g. the author or the date the process was created. You can also describe individual components in the tab Component -> Documentation.
When you create a job, only its name is required, but filling in its purpose and description as well will make it easier to analyze later.