Matillion Orchestration

Now we will put together a simple ETL Orchestration with Matillion.  Basically we are creating a workflow of jobs. When doing this be sure to take into account job run times, load on the system, and dependencies (ie, what tables need to be loaded before the job can run).

Here we will just create one Master Sequence to run all the jobs.  Normally I would have more levels grouping together similar jobs into separate batches, and then an overall Master Batch to run everything.  You want to think about how you would rerun these jobs if they were to fail.  Since we a have a simple destructive load here that runs in a short period of time, if something goes wrong we can just rerun the master.  In real life it may not be that simple.

Let’s first create a new “Batch” folder.

Matillion Orchestration - batch folder

Matillion Orchestration - folder name.png

Add a new Orchestration Job Name ‘Master Batch’.

Matillion Orchestration - add job
Matillion Orchestration - job name.png

Click ‘Yes’ When prompted to switch to new job.

switch to new job

Grab the staging jobs and drag them to the canvas.  Connect from start to each job as depicted below.
Matillion Orchestration - add stage

Under components drag the ‘And’ stage onto the canvas.  Connect each staging job to this stage. Be sure to grab the GREEN connectors.  These are job flows for successfully completed jobs. The and stage waits until all connectors are complete before executing the next step.

Matillion Orchestration - and

Matillion Orchestration - and2

Now drag the dimensions and fact onto the canvas.  Connect them from the ‘And’ component.

Note:  Due to the way our jobs are designed, the fact table and dimension tables can be processed in parallel.  This depends on what data the ETL job is accessing.  In some systems the fact job will actually join to dimension tables to pull back the Surrogate ID.  This would mean that the dimension tables would need to be fully loaded before the fact table could run.

Matillion Orchestration - dims

Matillion Orchestration - fact

Right click start and select ‘Run from Component’.

orchestration - run.png

Check the log for a complete run.  You may also want to run some the validation SQL from earlier sections.

Check the tasks tab.  Click arrow on far right to see a more detailed view.

Matillion Orchestration - logMatillion Orchestration - log1

Check results of query from fact build section.

snowflake results