How can SQL Data Generator help you build data for Regex validation?


A few weeks ago I was dealing with a requirement in a project at my office. It involved a specific validation process for a set of data that would be loaded into some tables. After trying out a few methods, I was ready to measure them and pick the best option, but I ran into an interesting challenge: how could I generate a special set of data on which to apply the treatment (filtering through Regular Expressions)? For this scenario, the people at Red-Gate give us a specialized tool for generating test data. That excellent tool is SQL Data Generator.

The first stage was the configuration of my testing table:

UserBrowser VARCHAR(100) ,


In SSMS, we locate the target table and right-click it



The SQL Data Generator screen opens immediately, like the following:


I would like to explain, step by step, the approach we can follow to get the expected result. My goal is to generate in the UserBrowser column a set of rows that fit a specific pattern, in order to validate the Regular Expression methods that I developed.

I have the following Regular Expression as part of my tests:


For testing regular expressions I usually use the webpage: www.reg. This screen is an example of such a test:


We click the column that we want to fill with data from a customized regular expression, in this case UserBrowser. SQL Data Generator immediately shows us the different options for populating the column (from SQL statements to Regex).

In the Generator option, we choose the category Generic



Once we have chosen Regex Generator, the tool shows us the following screen:


The most important step in this section is defining the Regular Expression that will generate our test data. In my case I had already defined them, but for some Regex it is necessary to change a few details to satisfy the tool's validator. I needed to test essentially four Regular Expressions, which are:


The symbol “|” lets us incorporate an OR condition, which in this scenario is a useful way to have SQL Data Generator produce a series of data that fits the tests. However, there is a special case related to the escape character: the previous example with HttpComponents\/1.1 cannot be used as is, because Regex Generator treats the symbol <<\>> as an unsupported escape, as the message shows:


I would like to point out an interesting aspect that the Red-Gate team explained to me: the flavor of regular expressions that SQL Data Generator uses is Python. For this reason the forward slash character (i.e. <</>>) is not a special character and does not need escaping. Basically, as long as your regular expression conforms to the Python specification (in my case I was working with .NET), you should be fine. For more information you can visit the following link:

The change required to avoid the previous Parse Error message consists of removing the escape character. Once we removed it, the next window showed a preview of the data to be generated.
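
The adjustment can also be checked outside the tool. The following is only a sketch: it assumes Python's standard `re` module as a stand-in for the generator's parser (SQL Data Generator itself is not invoked), and `strip_slash_escapes` is a hypothetical helper written for illustration, not part of any Red-Gate product. It rewrites `\/` as `/` and confirms that the adjusted pattern still matches the browser string.

```python
import re


def strip_slash_escapes(pattern: str) -> str:
    # Hypothetical helper: replace an escaped slash (backslash + "/") with a plain "/".
    # Naive on purpose: it assumes the pattern never contains an escaped
    # backslash that happens to be followed by a slash.
    return pattern.replace("\\/", "/")


# The pattern as it was written for .NET, with the slash escaped.
dotnet_pattern = r"HttpComponents\/1.1"

# The same pattern with the escape removed, as described in the article.
adjusted = strip_slash_escapes(dotnet_pattern)

print(adjusted)
print(bool(re.fullmatch(adjusted, "HttpComponents/1.1")))
```

One detail worth keeping in mind when reading the preview: in regular expressions an unescaped `.` matches any single character, so in Python this pattern also matches a value such as `HttpComponents/1x1`.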


Note: It is interesting to see how SQL Data Generator generates the different options for each Regex. For example, for the Regex (Web(\s|\+)Downloader) two distinct values are generated.
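
To see why two values appear: the group `(\s|\+)` accepts either a whitespace character or a literal plus sign. Here is a minimal sketch, assuming Python's standard `re` module and a handful of hypothetical sample values (they are not actual output from the tool). It also joins this expression with the HttpComponents one through a top-level “|”, which is how several expressions can share one generator.

```python
import re

# Pattern from the article: "Web", then whitespace OR a literal "+", then "Downloader".
pattern = re.compile(r"Web(\s|\+)Downloader")

# Hypothetical sample values standing in for rows produced by the generator.
samples = ["Web Downloader", "Web+Downloader", "WebDownloader", "Web-Downloader"]

# Keep only the values that match the whole pattern.
matching = [value for value in samples if pattern.fullmatch(value)]

for value in samples:
    verdict = "match" if value in matching else "no match"
    print(f"{value!r}: {verdict}")

# Two expressions joined with a top-level "|" (OR): a value may match either one.
combined = re.compile(r"Web(\s|\+)Downloader|HttpComponents/1.1")
combined_ok = [
    value
    for value in ["Web+Downloader", "HttpComponents/1.1", "Mozilla"]
    if combined.fullmatch(value)
]
print(combined_ok)
```

Only the two variants with a space or a plus sign pass the first check, which mirrors the two distinct values seen in the preview.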

After a brief review of the preview data, we can start the data generation process by clicking the following button



And the final result is ready for testing and verifying the effectiveness of my Regex solution with SQL CLR. I hope this brief article will be useful for creating your test data.
