Ultima Blog

Writing an Azure Function to Programmatically Update Azure Data Factory

Written by Nigel Wardle | 04-Sep-2017 11:29:00

Let’s imagine that we create an Azure Data Factory (ADF) with a pipeline containing a Copy Activity that populates SQL Azure with data from an on-premises SQL Server database. If we schedule it at a short interval, say every 15 minutes over a three-month period, it will generate a large number of executions, or slices (around 9,000).


So many slices can be difficult to navigate and manage through the Azure Portal. Currently, the ADF portal has a very basic scheduler, and it is not possible to set a schedule that repeats regularly but only runs, for example, on Mondays and Fridays from 7am to 7pm.
 
Thankfully, ADF, along with many other Azure resources, can be updated programmatically. An Azure Function (AF) can be used to dynamically update ADF properties, including the pipeline/activity schedule. The AF itself can be triggered from its own scheduler, which uses the much more powerful and flexible CRON syntax.
 
Hence, in the above example, the ADF pipeline can instead be configured in the JSON template to repeat every 15 minutes, but with placeholder start and end dates that generate no slices. The AF can then be scheduled to run at midnight every Monday and Friday and update the ADF pipeline to start at 7am and end at 7pm on the current day. The advantage of this is that new slices are generated only for that day, each time the AF executes. Also, because you are not scheduling slices at times when data sources or sinks could be down for maintenance, you don’t flood your mailbox with failure alerts.
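The date arithmetic involved is small. As a rough sketch only, the helper below works out the 7am to 7pm window for the day on which the AF runs and counts the 15-minute slices it would contain. The class and method names are hypothetical and do not come from any Azure SDK, and the choice between local time and UTC is an assumption you would need to check against your own pipeline definition.

```csharp
using System;

// Hypothetical helper for illustration; not part of any Azure SDK.
public static class SliceWindow
{
    // 07:00 on the calendar day that contains `now`.
    public static DateTime StartFor(DateTime now)
    {
        return now.Date.AddHours(7);
    }

    // 19:00 on the same calendar day.
    public static DateTime EndFor(DateTime now)
    {
        return now.Date.AddHours(19);
    }

    // Number of whole 15-minute slices between start and end.
    public static int SliceCount(DateTime start, DateTime end)
    {
        return (int)((end - start).TotalMinutes / 15);
    }
}
```

For a run at midnight on Monday 4 September 2017, this gives a window of 07:00 to 19:00 on that day, which is 48 slices rather than the 96 a full day would produce.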

To do this, you first have to create an AF in the Azure portal and set its trigger to a CRON schedule. You can then paste the code that performs the update directly into the portal. What’s the catch, you say? You will need to register the AF as an app in Azure Active Directory (AD) so that it can authenticate itself before updating the ADF. An excellent article on how to do this can be found here.

Three files are sufficient to update the ADF start and end dates. Function.json contains the JSON for the AF trigger; in this case we are using a timerTrigger, with the schedule given in CRON syntax. Project.json provides references to NuGet packages that will be downloaded on demand. Run.csx is the actual C# code, which is compiled on the fly using Roslyn and executed. Note the slightly different syntax at the top of a .csx script, where assemblies are referenced with #r directives ahead of the using statements.
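As a minimal sketch of what a function.json for the schedule described earlier might look like (midnight every Monday and Friday), consider the fragment below. It is an illustration and not the author's file. The binding name myTimer is an assumption, and the schedule uses the six-field format that Azure Functions timer triggers expect: second, minute, hour, day, month, day of week, where 1 is Monday and 5 is Friday.

```json
{
  "bindings": [
    {
      "name": "myTimer",
      "type": "timerTrigger",
      "direction": "in",
      "schedule": "0 0 0 * * 1,5"
    }
  ],
  "disabled": false
}
```

The binding name has to match the timer parameter name in Run.csx, so if you rename one, rename the other.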

If you have any queries about Azure, or would like to discuss this in further detail, please don't hesitate to contact us.

- By Nigel Wardle (Application Architect)

* All links and URLs provided are at the discretion of the author