In the previous training, we were loading data into the MySQL database and because there were not many of them and the database configuration was correct, we did not receive any errors and the data finally were loaded to the database. But what if we load thousands or millions of records and get an error through the process? Should we rollback the records loaded up to that time or save them in the database? The Output type component allows you to set the Commit frequency in Advanced setting. So it would be possible to set it to a very large number and be sure that data recording will always appear at the end of the process. However, this brings performance problems and how big should the number be? Of course, you don’t have to worry about that – Triggers come with the help.

In this training I will:

  • tell you what triggers are in Talend Studio,
  • present their types,
  • show you an example of how you can use them.

Triggers – what’s that?

Simply, a trigger is a task or process that is performed automatically as a response to some event we define, e.g. if the process ends with error, you would like to automatically receive an email with this information.

One of the many possibilities of using triggers is presented in the description of this training – we do not need to connect to the database directly from Input and Output components, creating several separate connections. We can connect to the database with a component of the Connection type, e.g. tMysqlConnection, in case of get the connection – we perform the main process, and in case of failure – we handle the error. Subprocesses are another case to using the triggers. Imagine that after loading all the data into the table, you would like to get the last identifier from the table and store it in the configuration table. Nothing easier! All you have to do is build another flow below and run it on the trigger link.

Trigger types

Talend Studio offers three types of triggers:

  • for subprocesses: On Subjob Ok, On Subjob Error
  • for components: On Component Ok, On Component Error
  • conditional: Run if

As the name implies, triggers for subjobs depend on the outcome of the whole process. So if you connect the tMysqlCommit component with tMysqlInput trigger On Subjob Ok, the data will be saved in the database after the whole process starting with tMysqlInput will end wuth the success. If you used the On Component Ok trigger, data will be saved if only the Input type component completes its work successfully. So you may get an unhandled error for an Output type component or for a transformation. Run if allows the execution of a component or subjobs depending on the defined condition. It’s good to know that you can also use global component parameters as conditions, which are defined in the Outline tab, e.g. FileInputDelimited_2_NB_LINE.

A bit of practice

As the main process I will use the flow prepared in the previous lesson. Let’s add three components to it: tMysqlConnection (above the main process), tMysqlCommit and tMysqlRollback (below the main process). This placement of subprocesses is in line with the best practices of building ETL processes – from left to right and top to bottom.

Right click on the tDBConnection component and select Trigger -> On Subjob Ok and then drag the link that appeared to the tFileInputDelimited component.

Then right click on the tFileInputDelimited component and select Trigger -> On Subjob Ok and connect the link to the tDBCommit component. Connect the Input type component with tDBRollback in the same way – however, use the On Subjob Error trigger.

We still need to set up a database connection and associate it with commit and rollback components. To do this, click on tDBConnection, select Property Type as Repository and select the appropriate database.

Then, in the components tDBCommit and tDBRollback, select the name of the database connection component in the Component List field.

Ready! Try to start your process.

See that you can additionally secure your process if you do not connect to the database in the tDBConnection component (we used only the On Subjob Ok variant).

If you enjoyed this post please add the comment below or share this post on your Facebook, Twitter, LinkedIn or another social media webpage.
Thanks in advanced!

0 0 vote
Article Rating
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments