One of the many uses of triggers is described in this tutorial (Talend DI Tutorial Triggers and error handling): we do not need to connect to the database directly from Input and Output components, creating several separate connections. Instead, we can connect to the database with a Connection-type component, e.g. tMysqlConnection. If the connection succeeds, we run the main process; if it fails, we handle the error.

Introduction

Error handling

Error handling is the process of anticipating, detecting, and responding to errors that may occur during the execution of a program. It is an important aspect of software development because it helps ensure that a program can handle and recover from unexpected situations without crashing or behaving unexpectedly.

There are many types of errors that can occur in a program, including syntax errors, runtime errors, and logic errors. Syntax errors are mistakes in the structure of the program’s code and are detected when the program is compiled. Runtime errors are mistakes that occur when the program is executed, and they can be caused by a variety of factors, such as invalid input, resource availability, or network issues. Logic errors are mistakes in the program’s logic that can cause the program to produce incorrect results or behave unexpectedly.

To handle errors effectively, a program should anticipate and detect errors as early as possible, and it should provide a mechanism for responding to and recovering from errors. This can be done using various techniques, such as try-catch blocks, exception handling, and error logging.
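As a small illustration of these techniques, here is a minimal Java sketch that anticipates invalid input, catches the specific exception, and logs it instead of crashing. The class and method names (`SafeParser`, `parseQuantity`) are made up for this example.

```java
import java.util.Optional;
import java.util.logging.Level;
import java.util.logging.Logger;

class SafeParser {
    private static final Logger LOG = Logger.getLogger(SafeParser.class.getName());

    // Returns the parsed value, or an empty Optional when the input is missing or invalid.
    static Optional<Integer> parseQuantity(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            // Anticipated runtime error: log it and let the caller decide what to do next.
            LOG.log(Level.WARNING, "Invalid quantity: " + raw, e);
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseQuantity(" 42 ")); // prints Optional[42]
        System.out.println(parseQuantity("abc"));  // prints Optional.empty
    }
}
```

Returning an `Optional` is only one possible design; rethrowing a more descriptive exception would be equally valid when the caller cannot continue without the value.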

Error handling in ETL

Error handling is an important aspect of ETL (extract, transform, load) processes because it allows you to anticipate and respond to errors that may occur during the extraction, transformation, and loading of data.

There are several strategies that you can use to handle errors in ETL processes, depending on the specific requirements of your use case. Here are a few common approaches:

  1. Try-catch blocks: You can use try-catch blocks to enclose code that may throw an exception and specify a block of code to execute in the event of an exception. This allows you to handle exceptions gracefully and take appropriate action, such as logging the error or retrying the operation.
  2. Exception handling: You can use exception handling to catch and handle specific types of exceptions that may be thrown by the ETL code. This allows you to handle different types of errors differently and take specific actions based on the type of error.
  3. Error logging: You can use error logging to record errors that occur during the ETL process and store them in a log file or a database. This allows you to track errors and troubleshoot problems, and it can also help you identify patterns or trends in errors.
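The three approaches above are often combined. The following Java sketch retries a step when it fails with an `IOException` (treated here as a temporary problem), logs each failed attempt, and lets every other exception type propagate immediately. The `Retry` class and `withRetry` method are hypothetical helpers written for this example, not part of Talend.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

class Retry {
    // Runs the step up to maxAttempts times; only IOException is considered retryable.
    static <T> T withRetry(Callable<T> step, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return step.call();
            } catch (IOException e) {
                // Error logging: record the failure, then try again.
                lastFailure = e;
                System.err.println("Attempt " + attempt + " of " + maxAttempts
                        + " failed: " + e.getMessage());
            }
        }
        // All attempts failed: report the last error to the caller.
        throw lastFailure;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("source not reachable");
            }
            return "extracted";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts"); // prints "extracted after 3 attempts"
    }
}
```

A real process would usually wait between attempts; the delay is left out here to keep the sketch short.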

Talend DI Tutorial Triggers and error handling

Subprocesses are another use case for triggers. Imagine that after loading all the data into the table, you would like to get the last identifier from the table and store it in the configuration table. Nothing could be easier! All you have to do is build another flow below and start it with a trigger link.

Triggers – what’s that?

Simply put, a trigger is a task or process that is performed automatically in response to an event we define, e.g. if the process ends with an error, you would like to automatically receive an email with this information.

In other words, triggers in Talend are used to execute a job or a component in response to a specific event or condition. Triggers can be used to automate the execution of jobs and components, making it easier to build data pipelines and ETL (extract, transform, load) processes.
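Conceptually, a pair of success and error triggers works like the plain Java sketch below: a piece of work runs, and exactly one follow-up action fires depending on the outcome. This is only an analogy; `TriggerDemo` and `runWithTriggers` are invented names and do not reflect the code Talend actually generates.

```java
class TriggerDemo {
    // Runs the subjob, then fires exactly one of the two follow-up actions.
    static void runWithTriggers(Runnable subjob, Runnable onSubjobOk, Runnable onSubjobError) {
        try {
            subjob.run();
        } catch (RuntimeException e) {
            onSubjobError.run();
            return;
        }
        // Called outside the try block so that a failure in the follow-up
        // itself is not mistaken for a failure of the subjob.
        onSubjobOk.run();
    }

    public static void main(String[] args) {
        runWithTriggers(
                () -> { throw new IllegalStateException("load failed"); },
                () -> System.out.println("On Subjob Ok: continue with the next subjob"),
                () -> System.out.println("On Subjob Error: send a notification email"));
        // prints "On Subjob Error: send a notification email"
    }
}
```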

Triggers in Talend – Trigger Types

Talend Studio offers three types of triggers:

  • for subprocesses: On Subjob Ok, On Subjob Error
  • for components: On Component Ok, On Component Error
  • conditional: Run if

As the names imply, subjob triggers depend on the outcome of the whole subjob. So if you connect the tMysqlCommit component to tMysqlInput with an On Subjob Ok trigger, the data will be committed to the database only after the whole subjob starting with tMysqlInput ends successfully. If you used the On Component Ok trigger instead, the data would be committed as soon as the Input component alone completes its work successfully.

In that case you may get an unhandled error from an Output component or a transformation. Run if allows the execution of a component or subjob depending on a defined condition. It’s good to know that conditions can also use the global component variables listed in the Outline tab, e.g. tFileInputDelimited_2_NB_LINE.
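A Run if condition is a Java boolean expression, and global component variables are typically read from the job's `globalMap`. The sketch below imitates that with an ordinary `HashMap` so it can run outside Talend Studio. It assumes the variable is stored under the component's full name, `tFileInputDelimited_2_NB_LINE`, and holds an `Integer`; the `RunIfDemo` class is made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

class RunIfDemo {
    // Mirrors a Run if condition such as:
    //   ((Integer) globalMap.get("tFileInputDelimited_2_NB_LINE")) > 0
    // with an extra null check for the case where the component has not run yet.
    static boolean shouldRun(Map<String, Object> globalMap) {
        Integer nbLine = (Integer) globalMap.get("tFileInputDelimited_2_NB_LINE");
        return nbLine != null && nbLine > 0;
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap maintained by a running job (an assumption of this sketch).
        Map<String, Object> globalMap = new HashMap<>();
        System.out.println(shouldRun(globalMap)); // prints false: nothing was read yet

        globalMap.put("tFileInputDelimited_2_NB_LINE", 120);
        System.out.println(shouldRun(globalMap)); // prints true: the file contained rows
    }
}
```

With a condition like this, the linked subjob is skipped when the input file turns out to be empty.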

A bit of practice

As the main process I will use the flow prepared in the previous lesson. Let’s add three components to it: tMysqlConnection (above the main process), tMysqlCommit and tMysqlRollback (below the main process); they appear as tDBConnection, tDBCommit and tDBRollback in the steps below. This placement of subprocesses follows the best practices for building ETL processes: from left to right and top to bottom.

Right-click the tDBConnection component, select Trigger -> On Subjob Ok, and then drag the link that appears to the tFileInputDelimited component.

Then right-click the tFileInputDelimited component, select Trigger -> On Subjob Ok, and connect the link to the tDBCommit component. Connect the Input component to tDBRollback in the same way, but use the On Subjob Error trigger.

We still need to set up a database connection and associate it with the commit and rollback components. To do this, click on tDBConnection, set Property Type to Repository, and select the appropriate database.

Then, in the tDBCommit and tDBRollback components, select the name of the database connection component in the Component List field.

Ready! Try to start your process.

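Note that you can additionally protect your process against a failed database connection in the tDBConnection component: we used only the On Subjob Ok variant, so a connection error is not handled yet. An On Subjob Error trigger on tDBConnection would cover that case.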
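For readers who think in code, the job built in this section behaves roughly like the JDBC logic below: one shared connection, a commit when the main process succeeds, and a rollback when it fails. This is a hand-written sketch under that assumption, not the code Talend generates; `CommitOrRollback` and `MainProcess` are hypothetical names.

```java
import java.sql.Connection;
import java.sql.SQLException;

class CommitOrRollback {
    // The main process: any work that uses the shared connection.
    interface MainProcess {
        void run(Connection conn) throws SQLException;
    }

    // Returns true when the work was committed, false when it was rolled back.
    static boolean execute(Connection conn, MainProcess main) throws SQLException {
        // Changes stay pending until commit() or rollback() is called.
        conn.setAutoCommit(false);
        try {
            main.run(conn);
        } catch (SQLException | RuntimeException e) {
            // Corresponds to On Subjob Error -> tDBRollback.
            conn.rollback();
            return false;
        }
        // Corresponds to On Subjob Ok -> tDBCommit.
        conn.commit();
        return true;
    }
}
```

To try it, pass in a connection obtained from `DriverManager.getConnection(...)` for your own database, together with a lambda that performs the inserts.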

Summary

Overall, effective error handling is essential for ensuring the reliability and stability of ETL processes and for handling unexpected situations that may arise during the extraction, transformation, and loading of data.

Talend offers the following trigger types:

  1. On Subjob Ok: This trigger runs the linked subjob when the whole preceding subjob finishes successfully.
  2. On Subjob Error: This trigger runs the linked subjob when the preceding subjob fails.
  3. On Component Ok: This trigger runs the linked component when the preceding component finishes successfully.
  4. On Component Error: This trigger runs the linked component when the preceding component encounters an error.
  5. Run if: This trigger runs the linked component or subjob only when a defined condition is true.

You can use triggers in Talend to build complex data pipelines that execute tasks automatically based on specific events or conditions. This can help you automate and streamline your ETL processes and improve efficiency.

That’s all on the topic of Talend DI triggers and error handling!
