SQL Join based on a key value extracted from a text field using regular expressions in Alteryx

One of the issues I came across on a recent project was the fact that one of the keys on which I needed to join to another table, was embedded in a text field. RegEx to the rescue. Using the RegEx Parser tool, I was able to easily parse out the key values (in that case loan number) from the field and send that value down the pipeline in a separate field on which later I was able to do a join to another table.

While I can't go into much detail about the code in this project, you can see the RegEx tool (green hexagon) which I use to pluck out the loan number value from the text field (actually in this case I leave it in the same field and discard everything else but it could easily be sent to a separate field.)

Here's the regex config screenshot:

Very simple:

Parse the field named Line Description.

Look for instances of # followed by an optional space and 10 digit grouping

Replace the contents of the field with the second grouping (the 10 digits) (one could also use the Parse option to send the matched data to a different field)

You have the option of sending the data that did not have a match down the pipeline in a separate field (perhaps to go to an exceptions file??)


Check out Alteryx, it has its quirks and oddities, but it is at the same time a very powerful and handy tool that can hold its own with many different applications, including ETL. I can put together an ETL workflow in a fraction of the time it takes to do it in SSIS.

Visual programming definitely removes the possibility of silly syntax errors, like “FORM” instead of “FROM”. Logical use of the tool is the key.

Like
Reply

To view or add a comment, sign in

More articles by Paul Ulici

Others also viewed

Explore content categories