SQL Database Seeder

Aque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo.

Category:Computer Science

Released :August 31, 2016

Title :SQL Database Seeder

Created By :Colin Christmas

Contributors:
Alexander Svistun
Colin Christmas
CSCI 27
Sierra Fall 2015

During the Fall 2015 semester in our SQL class we both realized that inputting basic data into tables was rather tedious and time consuming. The data was always just a filler to test joins and queries with no need to have valid data. This is where our solution comes in. By freeing up time lost due to four or five inserts we can do thousands with the utilization of our application in just a few short seconds. The hope would be one day to have a program creating enough random data that you could create and test multiple tables. Under our current solution it creates enough data for a user table and possibly enough for a contact table but not simultaneous queries.
When we first met to discuss the project we knew we wanted to have simple functionality that creates long term reliability as we would like to continue adding more options in the future. Our solution is simple, yet contains enough complexity to make it somewhat of a complex logic flow.

READ AND SORT DATA:

To build this we needed outside sources for the random data to be created and we did not want to hard code a lot of data into the solution. To do this we knew we wanted to read in 3 separate files for common first names, last names, cities and states. To handle this we created 4 read loops into array lists that will read each file in sequence if it is located in the same directory as the .exe file. Each loop is run in its own try/catch block with specific error handling for the individual text file. For instance, if you receive an error for the Last Name file it will specifically state that file in the error and move on to the next file read.

ARRAY LISTS:

We created multiple array lists for this project that are all global. The first 4 stated above are populated but they don’t have any specific randomization applied. When the buttons are selected the application begins to randomize the output of the array lists and start to prepare the local variables for each line in the listbox. The first name, last name, city and state are all concatenated on the left with a comma string and output with the same randomizer so the email can be used in conjunction with the names. This effect will allow the emails to have a relative visual relationship with the names. The four strings become part of a new individual variable writing to each line of the output. By preparing each output in a similar sequence it allows us to concatenate an email for the user “John” “Smith” to john.smith@”gmail”.com. The emails are prepared with the same randomizer using a much smaller list of only four popular email domains. It takes the first name, adds a period and an @ symbol as well as a .com to the end to create that.

When it comes to the random phone number generator, it is designed to not have a leading zero. This brings a visually pleasing option as most programmers would prefer to see some basic validity to their database content, even if it is not real data. The random phone number generator randomizes three sets of numbers in a sequence of, three concatenated with a “-” then three more the same way and the last four without any concatenation. This allows for each element in the US based numbering system to seem legitimate.

BUTTON HANDLING AND STRING INSERTION:

When it comes to the button “On”/ “Off” logic it is rather complex as there are thirteen buttons that toggle specific strings to either position. By choosing this method of handling our prepared output it allows us to change the output very easily. For instance if the desired output is CSV then by choosing this output it turns off the Parenthesis string as they are not used in a CSV output as well as turning off any active Quote or Apostrophe strings via their buttons. The application looks for buttons being enabled. If they are enabled (not pressed) it passes an empty string. If they are pressed it checks to see this by using button.enabled = false which shows the button is selected by the user and will pass the specific string associated with the button.

OUTPUT HANDLING:

The user has two choices on this query builder to output the newly created data. They can either, copy and paste whatever sections are needed from the listbox. Or alternatively they can choose to save the file as a text document to be later imported on their server. By writing a simple save tool it will write each line in the listbox and output it to a document of their choice. For instance, if the user prefers a csv, they can choose the output in the query builder as CSV then save the output. The Dialogue Box that prompts them to save the output will allow for the user to type in .csv or .txt based off of their intended output.
In closing, our solution is a very lightweight random query builder. For ease of use we have forgone any unnecessary extra bells and whistles to make it fast and reliable. Initial testing will run 25,000 different random rows in about 5 seconds with duplicates only due to limited data on text files. This means that with a larger data set of names and cities from the text files we could potentially create a million different rows of data in just under 4 minutes. To populate a large table with a large data set like this can be very useful even if used on the college level for teaching purposes. The future use of this could create simultaneous random generators for a multitude of different random data needed to populate large data sets.

Computer Science - SQL Database Seeder

Aque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo.

SQL Database Seeder