ISSN 1866-5705


free digital version

print version 8,00 €

printed in Germany

The Magazine for Professional Testers

Open Source Tools

December 2010

© diego cervo - Fotolia.com

© Stephan Koscheck - Fotolia.com

Populating your database with a few clicks
by José Carréra

A common obstacle for testing teams is the need to check several test scenarios that require very large amounts of data, which could take months to accumulate in a real production environment. Depending on the application domain, it may also be impossible to obtain data from the production environment because of security or privacy concerns. Quality assurance teams therefore need other ways to simulate these scenarios as closely as possible to real environments, usually under strict time constraints, which makes their work even harder. Test engineers have some solutions at their disposal, but these are not always easy to use or fast enough. A common approach is to write SQL scripts manually in a text editor, a tedious and time-consuming task. Another possibility is to use a dedicated tool that lets the user generate large amounts of data quickly and easily. However, it is not easy to find a suitable open-source tool that is flexible enough to adapt to any type of scenario and all data types.

A couple of months ago I was introduced to the Data Generator tool, which according to its website, www.generatedata.com, is “a free, open-source script written in JavaScript, PHP and MySQL that lets you quickly generate large volumes of custom data in a variety of formats for use in testing software, populating databases, and scoring with girls”. I still haven’t discovered how the tool may help you “score with girls”, but it certainly is a great tool for populating databases in support of software testing. In this article I will describe the main features of this application and point out ways to make better use of them, based on our experience using Data Generator in our projects.

First Look and Installation
The first great thing about the Data Generator tool is that you can try it out without having to install it. After accessing the application’s website, www.generatedata.com, and clicking on the Generator tab, you will be presented with the same web GUI that you will get if you later decide to install it locally on your machine. The only constraint is that with this “demonstration” version you can create a maximum of 200 records at a time, whereas the locally installed version can generate up to 5000 records at a time.

Image 1 - Website Generator version

Installing it locally takes only a few simple steps. The Download tab on the website lists the system requirements:
• MySQL 4+
• PHP 4+
• Any modern, JS-enabled browser

To fulfill these requirements we installed WampServer (www.wampserver.com), another free, open-source project, which allows us to run Data Generator and all its features. After installing WampServer, just a couple more steps need to be performed. To start running Data Generator locally on your machine, a simple five-step installation procedure is available via the Download tab of the application’s website. For Data Generator version 2.1, which is the version we used in our projects, the required steps are:
1. Download the zip-file at the top and unzip the contents locally on your computer.
2. In the zip-file, you’ll find a file in the /install folder named db_install.sql. This contains all the SQL to create the MySQL tables and raw data used by the Data Generator (names, cities, provinces, states, countries, etc). You will need to execute these statements on your database through any database access tool, such as phpMyAdmin.
3. Edit the global/library.php file. At the top, you’ll see a section where you need to enter your MySQL database settings. Note: if you choose to change the database prefix, make sure you rename the tables after running the SQL in #2!
4. Upload all files to your web server.
5. Load it in your web browser and get to work.
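Steps 1 and 2 above can also be scripted. The following is only a minimal sketch, not part of Data Generator itself: it assumes the `mysql` command-line client is available on your PATH (the article's suggested route is phpMyAdmin), and the function names, archive path, database name and user are hypothetical placeholders for illustration.

```python
import subprocess
import zipfile
from pathlib import Path


def unpack_release(zip_path, target_dir):
    """Step 1: unzip the downloaded archive and locate db_install.sql."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    # Search recursively, since the archive may or may not have a top-level folder.
    sql_file = next(Path(target_dir).rglob("db_install.sql"), None)
    if sql_file is None:
        raise FileNotFoundError("db_install.sql not found in the archive")
    return sql_file


def build_import_command(database, user):
    """Build the mysql client call; --password without a value prompts for it."""
    return ["mysql", f"--user={user}", "--password", database]


def import_schema(sql_file, database, user):
    """Step 2: feed the install script to the mysql client on standard input."""
    with open(sql_file, "rb") as script:
        subprocess.run(build_import_command(database, user), stdin=script, check=True)
```

A call such as `import_schema(unpack_release("generatedata.zip", "generatedata"), "datagen", "tester")` would then create the tables, assuming the database `datagen` already exists. Step 3, editing global/library.php, still has to be done by hand.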

After this short procedure, you can start using Data Generator. As described in the next section, it comes with a pre-loaded database that can help in several test scenarios.

Explaining its main features
First, I will describe how we can use the pre-loaded user database to generate data. Almost all applications have a user management feature that uses data like name, phone, e-mail, address, etc. It is also common that at some point during the project life cycle we need a larger amount of user data, in order to assess how the system behaves in test scenarios that focus on aspects such as response time, data integrity, the general user interface, and many others. On the Data Generator main screen, each row defines one column of the table that you want to populate: for each row, the user sets a column title and a data type along with its specific settings. Several pre-loaded data types are available, such as name, phone/fax, email, city, and others. These data types let us cover a variety of test scenarios.

Image 2 - Available data types

Image 3 - Dates / pre-defined formats




Among the pre-loaded data types, the “Date” option is one of the most useful, offering a relevant group of variations, as shown in image 3. The “Date” option solves a common problem faced when generating data, which is that each database management system uses a different date format. With this data type we can handle the issue in a few clicks, by selecting a date format and setting the desired date interval.

The pre-loaded data can be used in isolation or together with custom data types like alpha-numeric and custom lists, two data types in which the user can enter specific custom values. If you select custom list, you can choose among some pre-loaded lists, such as marital status, colors, titles, company names, etc., or you can define a customized list by entering values separated by a pipe | character. Before generation you may also define the number of values that you want to be included in each record.

You can also choose the alpha-numeric data type, where you can either define a specific static value, which will be repeated for each created record, or use its feature for generating random values. To do so, you need to follow the rules presented below in image 4. For instance, if you set the value field to LLLxxLLLxLL, records will be generated by replacing each ‘L’ with a random upper-case letter and each ‘x’ with a digit from 0 to 9.
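To make the alpha-numeric and custom-list rules concrete, here is a small sketch that mimics them. It is not Data Generator's own code: the function names are hypothetical, only the ‘L’ and ‘x’ placeholders mentioned in the text are handled, and two behaviours are assumptions on my part, namely that any other character is copied unchanged and that the values drawn from a custom list are distinct.

```python
import random
import string


def expand_pattern(pattern, rng=random):
    """Replace each 'L' with a random upper-case letter and each 'x' with a digit."""
    result = []
    for ch in pattern:
        if ch == "L":
            result.append(rng.choice(string.ascii_uppercase))
        elif ch == "x":
            result.append(rng.choice(string.digits))
        else:
            # Assumption: characters without a rule are kept as static text.
            result.append(ch)
    return "".join(result)


def pick_from_list(pipe_separated, count=1, rng=random):
    """Split a pipe-separated custom list and draw `count` distinct values."""
    values = [v.strip() for v in pipe_separated.split("|") if v.strip()]
    return rng.sample(values, count)
```

For example, `expand_pattern("LLLxxLLLxLL")` yields an 11-character code with letters and digits in the positions given by the pattern, and `pick_from_list("single|married|divorced", 2)` returns two of the three entries.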

Image 4 - Alpha-numeric data type rules

Another important and useful data type is auto-increment, which generates a unique number for each row by incrementing a configured initial value by whatever step you enter. This is very helpful for setting the numeric values commonly used as primary keys in database tables.

Finally, we present the settings that are available for generating records. First, the user can choose among different export formats, a feature that has proven very useful because the generated data may be used in different ways, for example imported via the application’s front end (if available) or loaded directly through the system’s DBMS (database management system). We can also choose between different settings of the pre-loaded database, selecting which country the records should be drawn from, which yields more precise information. Last but not least, we must define the number of results, or records, to be generated for the selected configuration (as mentioned before, the locally installed version can produce up to 5000 records at a time).

Image 5 - Results settings

Remember that I am not trying to describe a perfect tool that can solve all your problems. You may find drawbacks and limitations, and find yourself having to adapt the available data types to your needs. However, that’s the great thing about an open-source project: you can help make it better by improving its features, adding new ones, fixing bugs, or just by exchanging ideas with the developers and providing important feedback for future releases.

Technologies applied in software projects get more complex every day, which poses a greater challenge for quality assurance teams in helping the development team release products with higher quality standards that comply with user needs and fulfill expectations. To meet these demands, tools that assist the work of the quality assurance team are essential. By allowing engineers to deliver faster and more accurate results to their projects, Data Generator seems to be a great option, providing test teams with a suitable solution for populating databases. In this article, we presented the main features of Data Generator. Special thanks go to Benjamin Keen, who is responsible for the development and distribution of the tool. For further information regarding this tool or the challenges that it might help solve, please contact the author.

José Carréra, MSc, has been a test engineer at C.E.S.A.R. (Recife Center for Advanced Studies and Systems) since 2006 and Professor of Computer Science at FATEC (Faculdade de Tecnologia de Pernambuco), Brazil, since 2010. He obtained his master’s degree in software engineering in 2009, graduated in computer science in 2007, and has been a Certified Tester Foundation Level (CTFL) of the ISTQB® (International Software Testing Qualifications Board) since 2009. His main research interests include quality assurance, Agile methodologies, software engineering, performance testing, and exploratory testing.

