Using Regular Expressions(RegEx) in Power BI

Shivam Shukla
3 min readAug 30, 2020

--

Power BI does not have any native functions in the Power query to perform any RegEx operations so far. Hers is an Idea submitted to Microsoft which you can use to vote for highlighting this feature.

In this article, I will be sharing a few approaches to give you a fair idea about how you can use RegEx in Power BI.

What is RegEx?

A regular expression (RegEx)is a sequence of characters that define a search pattern. Usually, such patterns are used by string-searching algorithms for “find” or “find and replace” operations on strings, or for input validation. It is a technique developed in theoretical computer science and formal language theory. — Wikipedia

Regular Expression Pattern

Why needed?

It is widely used in ETL (Extract, Transform, Load) operations across industries. With the RegEx enabled system,You can perform various pattern-based filtering on the large volume of the dataset and filtered out the unwanted data to make it more meaningful.

There are a wide range of operations which we can be performed using Regular Expressions such as searching or replacing a String based on a specific pattern, say

If we want to filter out all the rows with numerical values from the alphanumeric type of data? or

If we want to check if the E-mail ID column contains the right formatted E-mail IDs? or

If we want to extract the numbers starting from 9 and ends with 0.

How to do that?

We can use ‘R’ or ‘Python’ scripts within the Power Query to perform the RegEx Operation on the dataset. Here are a few examples that can help you in understanding this concept.

Business Case: To scan all the alphanumerical column to find any numerical values or any month description present in any of the rows.

Solution using R:

Prerequisite: Make sure that you have ‘R’ installed on your machine, if not, please download it from here.

Step1- Connect to the data source and click on the Transform button in the Power Bi desktop, Go to Transform(from Home Button) -> Run R Script to open the R script editor.

R Script Editor in Power BI

Step 2- Run the following script which scans the column named Expression based on the defined Pattern and create another column named as “isValid” with the Boolean type of the value, True/False.

# ‘dataset’ holds the input data for this script

#define a Pattern as per your need

pattern <- c(“[0123456789]”,”jan”,”feb”,”mar”,”apr”,”may”,”jun”,”jul”,”aug”,”sep”,”oct”,”nov”,”dec”)

#created a function which reads the above pattern and add them all using a OR(|) separator and do the search ignoring the case sensitivity.

isFound <- function(x) {grepl(paste(pattern, collapse=”|”), as.character(x), ignore.case=TRUE)}

#Calling the isFound function and returning the output

output <- within(dataset,{isValid=isFound(dataset$Expression)})

R Script Output

Business Case: Checking if E-mail IDs are correctly formatted.

Solution Using R:

Step1- remains the same as mentioned above.

Step2- Run the following R script which scans the column named Emails based on the defined Pattern and create another column named as “isValidEmail” with the Boolean type of the value, True/False.

# ‘dataset’ holds the input data for this script
isValidEmail <- function(x) {
grepl(“\\<[A-Z0–9._%+-]+@[A-Z0–9.-]+\\.[A-Z]{2,}\\>”, as.character(x), ignore.case=TRUE)
}
output <- within(dataset,{ValidEmail=isValidEmail(dataset$Email)})

R Script output

I hope you got a fair idea about how to use RegEx in the Power BI. Please leave a comment for any feedback or any clarification needed.

Thank you!

Sign up to discover human stories that deepen your understanding of the world.

Free

Distraction-free reading. No ads.

Organize your knowledge with lists and highlights.

Tell your story. Find your audience.

Membership

Read member-only stories

Support writers you read most

Earn money for your writing

Listen to audio narrations

Read offline with the Medium app

--

--

Shivam Shukla
Shivam Shukla

Written by Shivam Shukla

Microsoft Certified Solution Associate(MCSA) | Azure Data Engineer | Data Analyst| LinkedIn https://www.linkedin.com/in/shivam-shukla-49993472/

Responses (3)

Write a response