
Introduction

This article explains what Data Flow Transformations in SSIS are and lists the controls
they provide. It is followed by a series of articles covering each control and its
usage.

To follow my series of articles on SSIS packages, please check my profile.

Steps:

Follow steps 1 to 3 of my first article to open BIDS and create an Integration Services
project. Once the project is created, we will look at what exactly the data flow
transformations are, where to find their controls, and how they are used. After opening
the new project, move to the Data Flow tab in the designer window to see the list of
Data Flow Transformations, as shown in the image below.

Data flow transformations let you manipulate the data that is transferred and used in
the package.

There are 28 data flow transformation controls. They are listed below with a short
description of what each control is used for.

S No  Transformation             Description

1     Aggregate                  Aggregates and groups values
2     Audit                      Adds audit information
3     Character Map              Applies string operations to character data
4     Conditional Split          Evaluates and splits up rows
5     Copy Column                Copies a column
6     Data Conversion            Converts data to a different data type
7     Data Mining Query          Runs a data mining query
8     Derived Column             Calculates a new column from existing data
9     Export Column              Exports data from a column to a file
10    Fuzzy Grouping             Groups rows that contain similar values
11    Fuzzy Lookup               Looks up values using fuzzy matching
12    Import Column              Imports data from a file to a column
13    Lookup                     Looks up values in a dataset
14    Merge                      Merges two sorted datasets
15    Merge Join                 Merges data from two datasets by using a join
16    Multicast                  Creates copies of a dataset
17    OLE DB Command             Executes a SQL command on each row in a dataset
18    Percentage Sampling        Extracts a subset of rows from a dataset
19    Pivot                      Builds a pivot table from a dataset
20    Row Count                  Counts the rows of a dataset
21    Row Sampling               Extracts a sample of rows from a dataset
22    Script Component           Executes a custom script
23    Slowly Changing Dimension  Updates a slowly changing dimension table
24    Sort                       Sorts data
25    Term Extraction            Extracts terms from text in a column
26    Term Lookup                Looks up the frequency of a term in a column
27    Union All                  Merges multiple datasets
28    Unpivot                    Normalizes a pivot table
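To make a few of the descriptions above concrete, here is a minimal Python sketch of what Derived Column, Conditional Split, and Aggregate do to a stream of rows. It is only an illustration of the ideas, not SSIS code: the function names, the sample rows, and the 10% tax rate are all assumptions made up for this example.

```python
from collections import defaultdict

# Invented sample rows standing in for a data flow.
rows = [
    {"dept": "Sales", "amount": 120.0},
    {"dept": "Sales", "amount": 80.0},
    {"dept": "Tools", "amount": 45.5},
]

def derived_column(rows):
    """Derived Column: calculate a new column from existing data."""
    for row in rows:
        yield {**row, "amount_with_tax": round(row["amount"] * 1.1, 2)}

def conditional_split(rows, predicate):
    """Conditional Split: evaluate each row and route it to one of two outputs."""
    matched, unmatched = [], []
    for row in rows:
        (matched if predicate(row) else unmatched).append(row)
    return matched, unmatched

def aggregate(rows, key, value):
    """Aggregate: group rows by `key` and sum `value` within each group."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)

large, small = conditional_split(derived_column(rows), lambda r: r["amount"] >= 100)
totals = aggregate(rows, "dept", "amount")
```

Each function takes rows in and hands rows (or a summary) out, which is roughly how transformations chain together on the Data Flow tab.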

In our upcoming articles we are going to explore each of the major controls including the
purpose of each.

Conclusion

In this article we have seen what Data Flow Transformations are and the list of
controls available to perform them.

SQL Server 2008 Integration Services Tutorial


Downloads Required:

Exercise Files

Sample DB from CodePlex

In this tutorial:

The Import and Export Wizard

Creating a Package

Working with Connection Managers


Building Control Flows

Building Data Flows

Creating Event Handlers

Saving and Running Packages

Files needed:

ISProject1.zip

ISProject2.zip


Microsoft says that SQL Server Integration Services (SSIS) is a platform for building high
performance data integration solutions, including extraction, transformation, and load (ETL)
packages for data warehousing. A simpler way to think of SSIS is that it's the solution for
automating data movements. SSIS provides a way to build packages made up of tasks that
can move data around from place to place and alter it on the way. There are visual designers
(hosted within Business Intelligence Development Studio) to help you build these packages as
well as an API for programming SSIS objects from other applications.
In this chapter, you'll see how to build and use SSIS packages. First, though, we'll look at a
simpler facet of SSIS: The SQL Server Import and Export Wizard.

If you choose to use the supplied solution files rather than building your
own, you may need to edit the properties of the OLE DB Connection
Managers within the projects to point to your own test server. You'll learn
more about Connection Managers in the Working with Connection
Managers section later in this chapter.

SSIS 2008 Tutorial: The Import and Export Wizard


Though SSIS is almost infinitely customizable, Microsoft has produced a simple wizard to
handle some of the most common ETL tasks: importing data to or exporting data from a SQL
Server database. The Import and Export Wizard protects you from the complexity of SSIS
while allowing you to move data between any of these data sources:

SQL Server databases

Flat files
Microsoft Access databases

Microsoft Excel worksheets

Other OLE DB providers

You can launch the Import and Export wizard from the Tasks entry on the shortcut menu of
any database in the Object Explorer window of SQL Server Management Studio.
Try It!
To import some data using the Import and Export Wizard, follow these steps:

1. Launch SQL Server Management Studio and log in to your test server.

2. Open a new query window.

3. Select the master database from the Available Databases combo box on the toolbar.

4. Enter this text into the query window: CREATE DATABASE Chapter16

5. Click the Execute toolbar button to create a new database.

6. Expand the Databases node in Object Explorer.

7. Right-click on the Chapter16 database and select Tasks > Import Data.

8. Read the first page of the Import and Export Wizard and click Next.

9. Select SQL Native Client for the data source and provide login information for your
test server.

10. Select the AdventureWorks2008 database as the source of the data to import.

11. Click Next.

12. Because you're importing data, the next page of the wizard will default to connection
information for the Chapter16 database. Click Next.

13. Select Copy Data From One or More Tables or Views and click Next. Note that if you
only want to import part of a table you can use a query as the data source instead.

14. Select the HumanResources.Department, HumanResources.JobCandidate and
HumanResources.Shift tables, as shown in Figure 16-1. As you select tables, the wizard
will automatically assign names for the target tables.

Figure 16-1: Selecting tables to import

15. Select the HumanResources.Shift table and click on the Edit Mappings button.

16. The Column Mappings dialog box lets you change the name, data type, and other
properties of the destination table columns. You can also set other options here, such
as whether to overwrite or append data when importing data to an existing table.
Click Cancel when you're done inspecting the options.
17. Click Next.

18. Check Execute Immediately and click Next.

19. Click Finish to perform the import. SQL Server will display progress as it performs the
import, as shown in Figure 16-2.

Figure 16-2: Import Wizard results

20. Click Close to dismiss the report.

21. Expand the Tables node of the Chapter16 database to verify that the import
succeeded.
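Conceptually, the wizard run above copies the schema and rows of each selected table from a source database into a target database. The following Python sketch shows that idea with the standard `sqlite3` module and two in-memory databases. It is an assumption-laden stand-in, not what the wizard actually executes: SQLite replaces SQL Server, the `copy_tables` helper is hypothetical, and the sample rows are invented.

```python
import sqlite3

def copy_tables(source, target, table_names):
    """Copy each named table (schema and rows) from source to target."""
    copied = {}
    for name in table_names:
        # SQLite keeps the original CREATE TABLE statement in sqlite_master.
        (create_sql,) = source.execute(
            "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
            (name,),
        ).fetchone()
        target.execute(create_sql)
        rows = source.execute(f'SELECT * FROM "{name}"').fetchall()
        if rows:
            placeholders = ", ".join("?" * len(rows[0]))
            target.executemany(
                f'INSERT INTO "{name}" VALUES ({placeholders})', rows
            )
        copied[name] = len(rows)
    target.commit()
    return copied

# Invented source data, loosely modelled on the tables selected in step 14.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE Department (DepartmentID INTEGER PRIMARY KEY, Name TEXT)")
source.executemany("INSERT INTO Department VALUES (?, ?)",
                   [(1, "Engineering"), (2, "Sales")])
source.execute("CREATE TABLE Shift (ShiftID INTEGER PRIMARY KEY, Name TEXT)")
source.executemany("INSERT INTO Shift VALUES (?, ?)",
                   [(1, "Day"), (2, "Evening"), (3, "Night")])

target = sqlite3.connect(":memory:")
result = copy_tables(source, target, ["Department", "Shift"])
```

The returned dictionary maps each table name to the number of rows copied, similar in spirit to the per-table counts the wizard reports.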

In addition to executing its operations immediately, the Import and
Export Wizard can also save a package for later execution. You'll learn
more about packages in the remainder of this chapter.

SSIS 2008 Tutorial: Creating a Package


The Import and Export Wizard is easy to use, but it only taps a small part of the functionality
of SSIS. To really appreciate the full power of SSIS, you'll need to use BIDS to build an
SSIS package. A package is a collection of SSIS objects including:

Connections to data sources.


Data flows, which include the sources and destinations that extract and load data, the
transformations that modify and extend data, and the paths that link sources,
transformations, and destinations.

Control flows, which include tasks and containers that execute when the package
runs. You can organize tasks in sequences and in loops.

Event handlers, which are workflows that run in response to the events raised by a
package, task, or container.

You'll see how to build each of these components of a package in later sections of the chapter,
but first, let's fire up BIDS and create a new SSIS package.
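As a rough mental model of the list above, a package can be pictured as a small object that holds named connections, an ordered control flow, and event handlers. The Python sketch below is only that mental model: the `Task` and `Package` classes are hypothetical and bear no relation to the real SSIS object model or its API.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Task:
    name: str
    action: Callable[[], None]

@dataclass
class Package:
    name: str
    connections: dict = field(default_factory=dict)     # connection name -> location
    control_flow: list = field(default_factory=list)    # tasks in execution order
    event_handlers: dict = field(default_factory=dict)  # event name -> callable

    def execute(self):
        """Run each task in order, then fire OnPostExecute if a handler exists."""
        for task in self.control_flow:
            task.action()
            handler = self.event_handlers.get("OnPostExecute")
            if handler is not None:
                handler(task)

log = []
package = Package(
    name="ISProject1",
    connections={"DepartmentList": r"C:\Departments.txt"},
    control_flow=[
        Task("File System Task", lambda: log.append("copied file")),
        Task("Data Flow Task", lambda: log.append("moved rows")),
    ],
    event_handlers={"OnPostExecute": lambda task: log.append(f"finished {task.name}")},
)
package.execute()
```

In a real package the data flow lives inside a Data Flow Task; here it is reduced to a single callable to keep the sketch short.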
Try It!
To create a new SSIS package, follow these steps:

1. Launch Business Intelligence Development Studio.

2. Select File > New > Project.

3. Select the Business Intelligence Projects project type.

4. Select the Integration Services Project template.

5. Select a convenient location.

6. Name the new project ISProject1 and click OK.

Figure 16-3 shows the new, empty package.

Figure 16-3: Empty SSIS package


SSIS 2008 Tutorial: Working with Connection Managers
SSIS uses connection managers to integrate different data sources into packages. SSIS
includes a wide variety of different connection managers that allow you to move data around
from place to place. Table 16-1 lists the available connection managers.

Connection Manager  Handles

ADO                 Connecting to ADO objects such as a Recordset.
ADO.NET             Connecting to data sources through an ADO.NET provider.
CACHE               Connecting to a cache either in memory or in a file.
MSOLAP100           Connecting to an Analysis Services database or cube.
EXCEL               Connecting to an Excel worksheet.
FILE                Connecting to a file or folder.
FLATFILE            Connecting to delimited or fixed width flat files.
FTP                 Connecting to an FTP data source.
HTTP                Connecting to an HTTP data source.
MSMQ                Connecting to a Microsoft Message Queue.
MULTIFILE           Connecting to a set of files, such as all text files on a
                    particular hard drive.
MULTIFLATFILE       Connecting to a set of flat files.
ODBC                Connecting to an ODBC data source.
OLEDB               Connecting to an OLE DB data source.
SMOServer           Connecting to a server via SMO.
SMTP                Connecting to a Simple Mail Transfer Protocol server.
SQLMobile           Connecting to a SQL Server Mobile database.
WMI                 Connecting to Windows Management Instrumentation data.

Table 16-1: Available Connection Managers


To create a Connection Manager, you right-click anywhere in the Connection Managers area
of a package in BIDS and choose the appropriate shortcut from the shortcut menu. Each
Connection Manager has its own custom configuration dialog box with specific options that
you need to fill out.
Try It!
To add some connection managers to your package, follow these steps:
1. Right-click in the Connection Managers area of your new package and select New OLE
DB Connection.

2. Note that the configuration dialog box will show the data connections that you created
in Chapter 15; data connections are shared across Analysis Services and Integration
Services projects. Click New to create a new data connection.

3. In the Connection Manager dialog box, select the SQL Native Client provider.

4. Select your test server and provide login information.

5. Select the Chapter16 database.

6. Click OK.

7. In the Configure OLE DB Connection Manager dialog box, click OK.

8. Right-click in the Connection Managers area of your new package and select New Flat
File Connection.

9. Enter DepartmentList as the Connection Manager Name.

10. Enter C:\Departments.txt as the File Name.

11. Check the Column Names in the First Data Row checkbox. Figure 16-4 shows the
completed General page of the dialog box.
Figure 16-4: Defining a Flat File Connection Manager

12. Click the Advanced icon to move to the Advanced page of the dialog box.

13. Click the New button.

14. Change the Name of the new column to DepartmentName.

15. Click OK.

16. Right-click the DepartmentList Connection Manager and select Copy.


17. Right-click in the Connection Managers area and select Paste.

18. Click on the new DepartmentList 1 connection to select it.

19. Use the Properties Window to change properties of the new connection. Change the
Name property to DepartmentListBackup. Change the ConnectionString property to
C:\DepartmentsBackup.txt.

Figure 16-5 shows the SSIS package with the three Connection Managers defined.

Figure 16-5: An SSIS package with three Connection Managers
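The Column Names in the First Data Row option checked in step 11 tells the connection to treat the first line of the file as column headers rather than data. Python's `csv.DictReader` behaves the same way by default, so the short sketch below can illustrate the effect. The file contents are invented, and an in-memory string stands in for C:\Departments.txt.

```python
import csv
import io

# Invented contents: a header row followed by two data rows.
text = "DepartmentName\nENGINEERING\nSALES\n"

reader = csv.DictReader(io.StringIO(text))  # first row supplies the column names
names = [row["DepartmentName"] for row in reader]
columns = reader.fieldnames
```

Here `columns` holds the header taken from the first row, and `names` holds only the data rows that follow it.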


SSIS 2008 Tutorial: Building Control Flows
The Control Flow tab of the Package Designer is where you tell SSIS what the package will do.
You create your control flow by dragging and dropping items from the toolbox to the surface,
and then dragging and dropping connections between the objects. The objects you can drop
here break up into four different groups:

Tasks are things that SSIS can do, such as execute SQL statements or transfer objects
from one SQL Server to another. Table 16-2 lists the available tasks.

Maintenance Plan tasks are a special group of tasks that handle jobs such as checking
database integrity and rebuilding indexes. Table 16-3 lists the maintenance plan tasks.

The Data Flow Task is a general purpose task for ETL (extract, transform, and load)
operations on data. There's a separate design tab for building the details of a Data
Flow Task.

Containers are objects that can hold a group of tasks. Table 16-4 lists the available
containers.

Task                               Purpose

ActiveX Script                     Execute an ActiveX Script
Analysis Services Execute DDL      Execute DDL query statements against an Analysis
                                   Services server
Analysis Services Processing       Process an Analysis Services cube
Bulk Insert                        Insert data from a file into a database
Data Mining Query                  Execute a data mining query
Data Profiling Task                Generate a profile of sample data, determining
                                   distribution of values or percentage of NULLs, etc.
Execute DTS 2000 Package           Execute a Data Transformation Services Package (DTS
                                   was the SQL Server 2000 version of SSIS)
Execute Package                    Execute an SSIS package
Execute Process                    Shell out to a Windows application
Execute SQL                        Run a SQL query
File System                        Perform file system operations such as copy or delete
FTP                                Perform FTP operations
Message Queue                      Send or receive messages via MSMQ
Script                             Execute a custom task
Send Mail                          Send e-mail
Transfer Database                  Transfer an entire database between two SQL Servers
Transfer Error Messages            Transfer custom error messages between two SQL Servers
Transfer Jobs                      Transfer jobs between two SQL Servers
Transfer Logins                    Transfer logins between two SQL Servers
Transfer Master Stored Procedures  Transfer stored procedures from the master database
                                   on one SQL Server to the master database on another
                                   SQL Server
Transfer SQL Server Objects        Transfer objects between two SQL Servers
Web Service                        Execute a SOAP Web method
WMI Data Reader                    Read data via WMI
WMI Event Watcher                  Wait for a WMI event
XML                                Perform operations on XML data

Table 16-2: SSIS control flow tasks


Task                          Purpose

Back Up Database              Back up an entire database to file or tape
Check Database Integrity      Perform database consistency checks
Execute SQL Server Agent Job  Run a job
Execute T-SQL Statement       Run any T-SQL script
History Cleanup               Clean out history tables for other maintenance tasks
Maintenance Cleanup           Clean up files left by other maintenance tasks
Notify Operator               Send e-mail to SQL Server operators
Rebuild Index                 Rebuild a SQL Server index
Reorganize Index              Compact and defragment an index
Shrink Database               Shrink a database
Update Statistics             Update statistics used to calculate query plans

Table 16-3: SSIS maintenance plan tasks

Container     Purpose

For Loop      Repeat a task a fixed number of times
Foreach Loop  Repeat a task by enumerating over a group of objects
Sequence      Group multiple tasks into a single unit for easier management

Table 16-4: SSIS containers


Try It!
To add control flows to the package you've been building, follow these steps:

1. If the Toolbox isn't visible already, hover your mouse over the Toolbox tab until it
slides out from the side of the BIDS window. Use the pushpin button in the Toolbox
title bar to keep the Toolbox visible.

2. Make sure the Control Flow tab is selected in the Package Designer.

3. Drag a File System Task from the Toolbox and drop it on the Package Designer.

4. Drag a Data Flow Task from the Toolbox and drop it on the Package Designer,
somewhere below the File System task.

5. Click on the File System Task on the Package Designer to select it.
6. Drag the green arrow from the bottom of the File System Task and drop it on top of
the Data Flow Task. This tells SSIS the order of tasks when the File System Task
succeeds.

7. Double-click the connection between the two tasks to open the Precedence Constraint
Editor.

8. Change the Value from Success to Completion, because you want the Data Flow Task
to execute whether the File System Task succeeds or not.

9. Click OK.

10. Select the File System task in the designer. Use the Properties Window to set
properties of the File System Task. Set the Source property to DepartmentList. Set the
Destination property to DepartmentListBackup. Set the OverwriteDestinationFile
property to True then click OK.

Figure 16-6 shows the completed set of control flows.

Figure 16-6: Adding control flows


As it stands, this package uses the file system task to copy the file specified by the
DepartmentList connection to the file specified by the DepartmentListBackup connection,
overwriting any target file that already exists. It then executes the data flow task. In the next
section, you'll see how to configure the data flow task.
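The precedence constraint set in step 8 is the part that is easy to miss, so here is a small Python sketch of the difference between Success and Completion. It assumes nothing about SSIS internals: `run_with_constraint` is a hypothetical helper, `shutil.copyfile` stands in for the File System Task, and the file names live in a temporary directory rather than on C:\.

```python
import shutil
import tempfile
from pathlib import Path

def run_with_constraint(first, second, constraint="Success"):
    """Run `first`, then run `second` only if the constraint allows it.

    constraint is "Success", "Failure", or "Completion".
    Returns (first_succeeded, second_was_run).
    """
    try:
        first()
        succeeded = True
    except Exception:
        succeeded = False
    allowed = (
        constraint == "Completion"
        or (constraint == "Success" and succeeded)
        or (constraint == "Failure" and not succeeded)
    )
    if allowed:
        second()
    return succeeded, allowed

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "Departments.txt"       # deliberately not created
    backup = Path(tmp) / "DepartmentsBackup.txt"
    ran = []

    # The copy fails because the source file does not exist yet.
    on_completion = run_with_constraint(
        lambda: shutil.copyfile(source, backup),
        lambda: ran.append("data flow (Completion)"),
        constraint="Completion",
    )
    on_success = run_with_constraint(
        lambda: shutil.copyfile(source, backup),
        lambda: ran.append("data flow (Success)"),
        constraint="Success",
    )
```

With Completion the second task still runs after the failed copy, which matters on the first run of this package, when C:\Departments.txt may not exist yet. With Success it would be skipped.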
SSIS 2008 Tutorial: Building Data Flows
The Data Flow tab of the Package Designer is where you specify the details of any Data Flow
tasks that you've added on the Control Flow tab. Data Flows are made up of various objects
that you drag and drop from the Toolbox:

Data Flow Sources are ways that data gets into the system. Table 16-5 lists the
available data flow sources.

Data Flow Transformations let you alter and manipulate the data in various ways.
Table 16-6 lists the available data flow transformations.

Data Flow Destinations are the places that you can send the transformed data. Table
16-7 lists the available data flow destinations.

Source     Use

ADO NET    Extracts data from a database using a .NET data provider
Excel      Extracts data from an Excel workbook
Flat File  Extracts data from a flat file
OLE DB     Extracts data from a database using an OLE DB provider
Raw File   Extracts data from a raw file (proprietary Microsoft format)
XML        Extracts data from an XML file

Table 16-5: Data flow sources

Transformation             Effect

Aggregate                  Aggregates and groups values in a dataset
Audit                      Adds audit information to a dataset
Cache Transform            Populates a CACHE connection manager
Character Map              Applies string operations to character data
Conditional Split          Evaluates and splits up rows in a dataset
Copy Column                Copies a column of data
Data Conversion            Converts data to a different datatype
Data Mining Query          Runs a data mining query
Derived Column             Calculates a new column from existing data
Export Column              Exports data from a column to a file
Fuzzy Grouping             Groups rows that contain similar values
Fuzzy Lookup               Looks up values using fuzzy matching
Import Column              Imports data from a file to a column
Lookup                     Looks up values in a reference dataset
Merge                      Merges two sorted datasets
Merge Join                 Merges data from two datasets by using a join
Multicast                  Creates copies of a dataset
OLE DB Command             Executes a SQL command on each row in a dataset
Percentage Sampling        Extracts a subset of rows from a dataset
Pivot                      Builds a pivot table from a dataset
Row Count                  Counts the rows of a dataset
Row Sampling               Extracts a sample of rows from a dataset
Script Component           Executes a custom script
Slowly Changing Dimension  Updates a slowly changing dimension table
Sort                       Sorts data
Term Extraction            Extracts terms from text in a column
Term Lookup                Looks up the frequency of a term in a column
Union All                  Merges multiple datasets
Unpivot                    Normalizes a pivot table

Table 16-6: Data Flow Transformations

Destination                 Use

ADO NET                     Sends data to a .NET data provider
Data Mining Model Training  Sends data to an Analysis Services data mining model
DataReader                  Sends data to an in-memory ADO.NET DataReader
Dimension Processing        Processes a cube dimension
Excel                       Sends data to an Excel worksheet
Flat File                   Sends data to a flat file
OLE DB                      Sends data to an OLE DB database
Partition Processing        Processes an Analysis Services partition
Raw File                    Sends data to a raw file
Recordset                   Sends data to an in-memory ADO Recordset
SQL Server Compact          Sends data to a SQL Server CE database
SQL Server                  Sends data to a SQL Server database

Table 16-7: Data Flow Destinations

If you are running SQL Server Integration Services on a 64-bit machine,
the Excel source and destination will throw an exception. During
development, you can select Project > Project_name Properties, select the
Debugging page and change the Run64BitRuntime property to false.
When deploying the package, you'll need to shell out to the 32-bit SSIS
runtime when scheduling the package.

Try It!
To customize the data flow task in the package you're building, follow these steps:

1. Select the Data Flow tab in the Package Designer. The single Data Flow Task in the
package will automatically be selected in the combo box.

2. Drag an OLE DB Source from the Toolbox and drop it on the Package Designer.

3. Drag a Character Map Transformation from the Toolbox and drop it on the Package
Designer.

4. Drag a Flat File Destination from the Toolbox and drop it on the Package Designer.

5. Click on the OLE DB Source on the Package Designer to select it.

6. Drag the green arrow from the bottom of the OLE DB Source and drop it on top of the
Character Map Transformation.

7. Click on the Character Map Transformation on the Package Designer to select it.

8. Drag the green arrow from the bottom of the Character Map Transformation and drop
it on top of the Flat File Destination.

9. Double-click the OLE DB Source to open the OLE DB Source Editor. Notice that it uses
the Chapter16 OLE DB connection manager by default.

10. Select the HumanResources.Department table. Figure 16-7 shows the completed OLE
DB Source Editor.
Figure 16-7: Setting up the OLE DB Source

11. Click OK.

12. Double-click the Character Map Transformation.

13. Check the Name column.

14. Select In-Place Change in the Destination column.

15. Select the Uppercase operation. Figure 16-8 shows the completed Character Map
Transformation Editor.
Figure 16-8: Setting up the Character Map Transformation

16. Click OK.

17. Double-click the Flat File Destination.

18. Select the DepartmentList Flat File Connection Manager.

19. Select the Mappings page of the dialog box.

20. Drag the Name column from the Available Input Columns list and drop it on top of the
DepartmentName column in the Available Destination Columns list. Figure 16-9 shows
the completed Mappings page.
Figure 16-9: Configuring the Flat File Destination

21. Click OK.

Figure 16-10 shows the completed set of data flows.


Figure 16-10: Adding data flows
The data flows in this package take a table from the Chapter16 database, transform one of the
columns in that table to all uppercase characters, and then write that transformed column
out to a flat file.
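The same source, transformation, destination chain can be sketched in a few lines of Python, which may help when reading the designer surface in Figure 16-10. Everything here is an illustrative assumption: the three functions are hypothetical stand-ins for the OLE DB Source, Character Map Transformation, and Flat File Destination, the two department rows are invented, and an in-memory buffer replaces C:\Departments.txt.

```python
import csv
import io

def ole_db_source():
    """Stand-in for the OLE DB Source: yields table rows as dictionaries."""
    yield {"DepartmentID": 1, "Name": "Engineering"}
    yield {"DepartmentID": 2, "Name": "Tool Design"}

def character_map_uppercase(rows, column):
    """Stand-in for Character Map with the Uppercase operation, changed in place."""
    for row in rows:
        yield {**row, column: row[column].upper()}

def flat_file_destination(rows, stream, mapping):
    """Write the mapped columns to a delimited file with a header row.

    mapping: input column name -> destination column name.
    """
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(mapping.values())
    for row in rows:
        writer.writerow(row[src] for src in mapping)

out = io.StringIO()
flat_file_destination(
    character_map_uppercase(ole_db_source(), "Name"),
    out,
    {"Name": "DepartmentName"},  # the mapping made in step 20
)
flat_file_text = out.getvalue()
```

Because each stage is a generator, rows flow through one at a time, loosely mirroring how a data flow streams rows between components instead of materializing the whole table at each step.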
SSIS 2008 Tutorial: Creating Event Handlers
SSIS packages also support a complete event system. You can attach event handlers to a
variety of events for the package itself or for the individual tasks within a package. Events
within a package "bubble up." That is, suppose an error occurs within a task inside of a
package. If you've defined an OnError event handler for the task, then that event handler is
called. Otherwise, an OnError event handler for the package itself is called. If no event
handler is defined for the package either, the event is ignored.
Event handlers are defined on the Event Handlers tab of the Package Designer. When you
create an event handler, you handle the event by building an entire secondary SSIS package,
and you have access to the full complement of data flows, control flows, and event handlers
to deal with the original event.

By adding event handlers to the OnError event that call the Send Mail
task, you can notify operators by e-mail if anything goes wrong in the
course of running an SSIS package.
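The bubbling rule described above (task handler first, then the package handler, otherwise the event is ignored) can be sketched in Python as a walk up a parent chain. The `Executable` class and its methods are hypothetical names invented for this illustration, and the sketch follows the simplified description in this section, not the full SSIS runtime behavior.

```python
class Executable:
    """A package, container, or task that may define event handlers."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.handlers = {}

    def on(self, event, handler):
        self.handlers[event] = handler

    def raise_event(self, event, detail):
        """Call the nearest handler for `event`, searching upward.

        Returns the name of the executable whose handler ran, or None if
        no handler was found and the event was ignored.
        """
        node = self
        while node is not None:
            handler = node.handlers.get(event)
            if handler is not None:
                handler(self, detail)
                return node.name
            node = node.parent
        return None

notifications = []
package = Executable("Package")
task = Executable("Data Flow Task", parent=package)

# Only the package defines OnError, so the task's error bubbles up to it.
package.on("OnError", lambda source, detail: notifications.append(
    f"mail operator: {source.name} failed: {detail}"))
handled_by = task.raise_event("OnError", "file not found")

# An event nobody handles is simply ignored.
ignored = task.raise_event("OnWarning", "unused column")
```

The handler receives the executable that originally raised the event, which is what lets a single package-level OnError handler report which task failed.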

Try It!
To add an event handler to the package we've been building, follow these steps:
1. Open SQL Server Management Studio and connect to your test server.

2. Create a new query and select the Chapter16 database in the available databases list
on the toolbar.

3. Enter this text into a query window:

CREATE TABLE DepartmentExports(
    ExportID int IDENTITY(1,1) NOT NULL,
    ExportTime datetime NOT NULL
        CONSTRAINT DF_DepartmentExports_ExportTime DEFAULT (GETDATE()),
    CONSTRAINT PK_DepartmentExports PRIMARY KEY CLUSTERED
    (
        ExportID ASC
    )
)

4. Click the Execute toolbar button to create the table.

5. Switch back to the Package Designer in BIDS.

6. Select the Event Handlers tab.

7. In the Executable drop-down list, expand the Package node and then the Executables
node.

8. Select the Data Flow Task in the Executable drop-down list and click OK.

9. Select the OnPostExecute event handler.

10. Click the hyperlink on the design surface to create the event handler.

11. Drag an Execute SQL task from the Toolbox and drop it on the Package Designer.

12. Double-click the Execute SQL task to open the Execute SQL Task Editor.

13. Select the Chapter16 OLE DB connection manager as the task's connection.

14. Set the SQL Statement property to the following query:

INSERT INTO DepartmentExports (ExportTime)
VALUES (GETDATE())

15. Click OK to create the event handler.

This event handler will be called when the Data Flow Task finishes executing, and will insert
one new row into the tracking table when it is called.
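The pattern built in these steps, running some work and then recording that it finished, can be sketched in Python with `sqlite3`. This is a stand-in under stated assumptions: SQLite replaces SQL Server (so `AUTOINCREMENT` and `CURRENT_TIMESTAMP` replace `IDENTITY` and `GETDATE()`), and `with_post_execute` is a hypothetical helper, not an SSIS API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DepartmentExports ("
    " ExportID INTEGER PRIMARY KEY AUTOINCREMENT,"
    " ExportTime TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
)

def with_post_execute(task, handler):
    """Wrap `task` so that `handler` runs whenever the task finishes."""
    def wrapped():
        try:
            return task()
        finally:
            handler()  # runs whether the task succeeded or raised
    return wrapped

def log_export():
    """The handler body: insert one tracking row per execution."""
    conn.execute(
        "INSERT INTO DepartmentExports (ExportTime) VALUES (CURRENT_TIMESTAMP)"
    )
    conn.commit()

data_flow = with_post_execute(lambda: "rows written", log_export)
data_flow()
data_flow()
count = conn.execute("SELECT COUNT(*) FROM DepartmentExports").fetchone()[0]
```

Running the wrapped task twice leaves two rows in the tracking table, one per execution.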
SSIS 2008 Tutorial: Saving and Running Packages
Now that you've created an entire SSIS package, you're probably ready to run it and see what
it does. But first, let's look at the options for saving SSIS packages. When you work in BIDS,
your SSIS package is saved as an XML file (with the extension dtsx) directly in the normal
Windows file system. But that's not the only option. Packages can also be saved in the msdb
database in SQL Server itself, or in a special area of the file system called the Package Store.
Storing SSIS packages in the Package Store or the msdb database makes it easier to access and
manage them from SQL Server's administrative and command-line tools without needing to
have any knowledge of the physical layout of the server's hard drive.
Saving Packages to Alternate Locations
To save a package to the msdb database or the Package Store, you use the File > Save Copy
of Package.dtsx As menu item within BIDS.
Try It!
To store copies of the package you've developed, follow these steps.

1. Select File > Save Copy of Package.dtsx As from the BIDS menus.

2. Select SSIS Package Store as the Package Location.

3. Select the name of your test server.

4. Enter /File System/ExportDepartments as the package path.

5. Click OK.

6. Select File > Save Copy of Package.dtsx As from the BIDS menus.

7. Select SQL Server as the Package Location.

8. Select the name of your test server and fill in your authentication information.

9. Enter ExportDepartments as the package path.

10. Click OK.

Running a Package
You can run the final package from either BIDS or SQL Server Management Studio. When
you're developing a package, it's convenient to run it directly from BIDS. When the package
has been deployed to a production server (and saved to the msdb database or the Package
Store) you'll probably want to run it from SQL Server Management Studio.

SQL Server also includes a command-line utility, dtexec, that lets you run
packages from batch files.
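As a hedged illustration of driving dtexec from a script, the Python sketch below builds a command line for each of the three storage locations discussed earlier. It assumes the `/File`, `/SQL`, `/DTS`, and `/Server` switches of dtexec. The `dtexec_args` helper, the package paths, and the server name `MYSERVER` are hypothetical. The sketch only builds the argument list; passing it to `subprocess.run` would require a machine with SSIS installed.

```python
def dtexec_args(location, package, server=None):
    """Build a dtexec argument list.

    location: "file"  -> a .dtsx file on disk        (/File)
              "sql"   -> a package stored in msdb    (/SQL)
              "store" -> a package in the SSIS Package Store (/DTS)
    """
    switches = {"file": "/File", "sql": "/SQL", "store": "/DTS"}
    args = ["dtexec", switches[location], package]
    if server and location != "file":
        args += ["/Server", server]
    return args

# Hypothetical paths and server name, for illustration only.
from_file = dtexec_args("file", r"C:\Packages\Package.dtsx")
from_msdb = dtexec_args("sql", "ExportDepartments", server="MYSERVER")
from_store = dtexec_args("store", r"\File System\ExportDepartments", server="MYSERVER")
```

Building the command as a list rather than a single string avoids quoting problems when a path contains spaces, as the Package Store path here does.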

Running a Package from BIDS


With the package open in BIDS, you can run it using the standard Visual Studio tools for
running a project. Choose any of these options:
Right-click the package in Solution Explorer and select Execute Package.

Click the Start Debugging toolbar button.

Press F5.

Try It!
To run the package that you have loaded in BIDS, follow these steps:

1. Click the Start Debugging toolbar button. SSIS will execute the package, highlighting
the steps in the package as they are completed. You can select any tab to watch what's
going on. For example, if you select the Control Flow tab, you'll see tasks highlighted,
as shown in Figure 16-11.

Figure 16-11: Executing a package in the debugger

2. When the package finishes executing, click the hyperlink underneath the Connection
Managers pane to stop the debugger.
3. Click the Execution Results tab to see detailed information on the package, as shown
in Figure 16-12.

Figure 16-12: Information on package execution


All of the events you see in the Execution Results pane are things that you
can create event handlers to react to within the package. As you can see,
SSIS issues quite a number of events, from progress events to warnings
about extra columns of data that we retrieved but never used.

Running a Package from SQL Server Management Studio


To run a package from SQL Server Management Studio, you need to connect Object Explorer
to SSIS.
Try It!

1. In SQL Server Management Studio, click the Connect button at the top of the Object
Explorer window.

2. Select Integration Services.

3. Choose the server with Integration Services installed and click Connect. This will add
an Integration Services node at the bottom of Object Explorer.
4. Expand the Stored Packages node. You'll see that you can drill down into the File
System node to find packages in the Package Store, or the MSDB node to find packages
stored in the msdb database.

5. Expand the File System node.

6. Right-click on the ExportDepartments package and select Run Package. This will open
the Execute Package utility, shown in Figure 16-13.

Figure 16-13: Executing a package from SQL Server Management Studio

7. Click Execute.

8. Click Close twice to dismiss the progress dialog box and the Execute Package Utility.
9. Enter this text into a query window with the Chapter16 database selected:

SELECT * FROM DepartmentExports

10. Click the Execute toolbar button to verify that the package was run. You should see
one entry for when the package was run from BIDS and one from when you ran it
from SQL Server Management Studio.

SSIS 2008 Tutorial: Exercises


One common use of SSIS is in data warehousing - collecting data from a variety of different
sources into a single database that can be used for unified reporting. In this exercise you'll
use SSIS to perform a simple data warehousing task.
Use SSIS to create a text file, c:\EmployeeDept.txt, containing the last names, department
names, start and end dates of the AdventureWorks2008 employees. Retrieve the last names
from the Person.Person table and the department start and end dates from the
HumanResources.EmployeeDepartmentHistory table in the AdventureWorks2008 database,
and the department names from the Chapter16 database.
You can use the Merge Join data flow transformation to join data from two sources. One tip:
the inputs to this transformation need to be sorted on the joining column.
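To see why the inputs must be sorted, it helps to look at how a sort-merge join works: it walks both inputs in step, which only finds every match if both are ordered on the join key. The Python sketch below is a generic inner merge join, not the SSIS implementation. The `merge_join` function and the sample employee and department rows are invented for illustration.

```python
def merge_join(left, right, key):
    """Inner-join two lists of dicts that are ALREADY sorted on `key`."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][key], right[j][key]
        if left_key < right_key:
            i += 1          # left row has no partner yet; advance left
        elif left_key > right_key:
            j += 1          # right row has no partner; advance right
        else:
            # Find the run of right rows that share this key...
            j_end = j
            while j_end < len(right) and right[j_end][key] == left_key:
                j_end += 1
            # ...and pair it with every left row that shares the key.
            while i < len(left) and left[i][key] == left_key:
                for right_row in right[j:j_end]:
                    result.append({**left[i], **right_row})
                i += 1
            j = j_end
    return result

# Invented sample data standing in for the two OLE DB sources.
history = [
    {"LastName": "Walters", "DepartmentID": 2},
    {"LastName": "Duffy", "DepartmentID": 1},
    {"LastName": "Tamburello", "DepartmentID": 1},
]
departments = [
    {"DepartmentID": 2, "Name": "Tool Design"},
    {"DepartmentID": 1, "Name": "Engineering"},
]

def by_department(row):
    return row["DepartmentID"]

# The two sorted() calls play the role of the two Sort Transformations.
joined = merge_join(
    sorted(history, key=by_department),
    sorted(departments, key=by_department),
    "DepartmentID",
)
```

If either input were left unsorted, the walk would advance past rows that still had partners further along, and matches would be silently dropped. That is the reason for placing a Sort Transformation ahead of each Merge Join input.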
Solutions to Exercises

1. Launch Business Intelligence Development Studio.

2. Select File > New > Project.

3. Select the Business Intelligence Projects project type.

4. Select the Integration Services Project template.

5. Select a convenient location.

6. Name the new project ISProject2 and click OK.

7. Right-click in the Connection Managers area of your new package and select New OLE
DB Connection.

8. Click New to create a new data connection.

9. In the Connection Manager dialog box, select the SQL Native Client provider.

10. Select your test server and provide login information.

11. Select the AdventureWorks2008 database.

12. Click OK.

13. Right-click in the Connection Managers area of your new package and select New OLE
DB Connection.
14. Select the existing connection to the Chapter16 database and click OK.

15. Right-click in the Connection Managers area of your new package and select New Flat
File Connection.

16. Enter EmployeeList as the Connection Manager Name.

17. Enter C:\EmployeeDept.txt as the File Name.

18. Check the Column Names in the First Data Row checkbox.

19. Click the Advanced icon to move to the Advanced page of the dialog box.

20. Click the New button.

21. Change the Name of the new column to LastName.

22. Click the New button.

23. Change the Name of the new column to Department.

24. Click the New button.

25. Change the Name of the new column to StartDate and the datatype to Date.

26. Click the New button.

27. Change the Name of the new column to EndDate and the datatype to Date.

28. Click OK.

29. Select the Control Flow tab in the Package Designer.

30. Drag a Data Flow Task from the Toolbox and drop it on the Package Designer.

31. Select the Data Flow tab in the Package Designer. The single Data Flow Task in the
package will automatically be selected in the combo box.

32. Drag an OLE DB Source from the Toolbox and drop it on the Package Designer.

33. Drag a second OLE DB Source from the Toolbox and drop it on the Package Designer.

34. Drag a Sort Transformation from the Toolbox and drop it on the Package Designer.

35. Drag a second Sort Transformation from the Toolbox and drop it on the Package
Designer.

36. Drag a Merge Join Transformation from the Toolbox and drop it on the Package
Designer.
37. Drag a Flat File Destination from the Toolbox and drop it on the Package Designer.

38. Click on the first OLE DB Source on the Package Designer to select it.

39. Drag the green arrow from the bottom of the first OLE DB Source and drop it on top of
the first Sort Transformation.

40. Click on the second OLE DB Source on the Package Designer to select it.

41. Drag the green arrow from the bottom of the second OLE DB Source and drop it on top
of the second Sort Transformation.

42. Click on the first Sort Transformation on the Package Designer to select it.

43. Drag the green arrow from the bottom of the first Sort Transformation and drop it on
top of the Merge Join Transformation.

44. In the Input Output Selection dialog box, select Merge Join Left Input.

45. Click OK.

46. Click on the second Sort Transformation on the Package Designer to select it.

47. Drag the green arrow from the bottom of the second Sort Transformation and drop it
on top of the Merge Join Transformation.

48. Click on the Merge Join Transformation on the Package Designer to select it.

49. Drag the green arrow from the bottom of the Merge Join Transformation and drop it
on top of the Flat File Destination. Figure 16-14 shows the Data Flow tab with the
connections between tasks.
Figure 16-14: Data flows to merge two sources

50. Double-click the first OLE DB Source to open the OLE DB Source Editor.

51. Select the connection to the AdventureWorks2008 database.

52. For the Data Access Mode, select SQL Command.

53. Enter the following query:

SELECT p.LastName, dh.DepartmentID, dh.StartDate, dh.EndDate
FROM Person.Person p
INNER JOIN HumanResources.EmployeeDepartmentHistory dh
ON p.BusinessEntityID = dh.BusinessEntityID

54. Click OK.

55. Double-click the second OLE DB Source to open the OLE DB Source Editor.

56. Select the connection to the Chapter16 database.

57. Select the HumanResources.Department table.

58. Click OK.

59. Double-click the first Sort Transformation.

60. Check the DepartmentID column.

61. Click OK.

62. Double-click the second Sort Transformation.

63. Check the DepartmentID column.

64. Click OK.

65. Double-click the Merge Join Transformation.

66. Check the Join Key checkbox for the DepartmentID column in both tables, if not
already checked.

67. Check the selection checkbox for the LastName, StartDate and EndDate columns in the
left-hand table and the Name column in the right-hand table; alias the Name column
as DepartmentName. Figure 16-15 shows the completed Merge Join Transformation
Editor.
Figure 16-15: Setting up a Merge Join

68. Click OK.

69. Double-click the Flat File Destination.

70. Select the EmployeeList Flat File Connection Manager.

71. Select the Mappings page of the dialog box.

72. The LastName, StartDate and EndDate columns will be automatically mapped. Drag
the DepartmentName column from the Available Input Columns list and drop it on top
of the Department column in the Available Destination Columns list.

73. Click OK.


74. Right-click the package in Solution Explorer and select Execute Package.

75. Stop debugging when the package is finished executing.

76. Open the c:\EmployeeDept.txt file to inspect the results.

SSIS Package Control Flow Components


The control flow in an Integration Services package is built from three types of
elements: containers, which provide structure in packages and services to tasks; tasks,
which provide functionality in packages; and precedence constraints, which connect
containers and tasks into an ordered control flow. The most heavily used control flow
components are described below.

1.) Sequence Container: Use this container to group related control flow tasks into a
logical unit.

Example: Suppose a package must perform one set of operations or another depending on a
condition. Place the tasks for each outcome in their own Sequence Container, and then
direct the control flow of the package to the appropriate container.

2.) For Loop Container: Use this container to execute a set of tasks repeatedly for as
long as a condition evaluates to true.

Example: Suppose we want to transfer data from source to destination for 5 files stored
in the database. With the loop condition set to 5 iterations, the control flow tasks run
5 times.
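
The For Loop Container is driven by three expressions: an initialization expression, an evaluation expression, and an assignment expression. The following Python sketch mimics that pattern for the 5-iteration example; the function `run_for_loop` and its parameter names are illustrative assumptions, not SSIS APIs.

```python
from typing import Callable, List


def run_for_loop(
    init: Callable[[], int],
    evaluate: Callable[[int], bool],
    assign: Callable[[int], int],
    body: Callable[[int], None],
) -> int:
    """Run `body` while `evaluate` holds, like a For Loop Container would."""
    counter = init()                 # initialization expression, run once
    iterations = 0
    while evaluate(counter):         # evaluation expression, checked before each pass
        body(counter)                # the tasks inside the container
        counter = assign(counter)    # assignment expression, run after each pass
        iterations += 1
    return iterations


processed: List[int] = []
count = run_for_loop(
    init=lambda: 0,
    evaluate=lambda c: c < 5,
    assign=lambda c: c + 1,
    body=processed.append,           # stand-in for "transfer one file"
)
```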

3.) Foreach Loop Container: Use this container to execute a set of tasks once for each
member of a collection that is usually dynamic in nature, such as the files in a folder
or the rows of a dataset. Unlike the For Loop Container, the number of iterations is
determined by the collection rather than by a counter condition.

Example: Suppose we have files stored in a folder, and for each file we want to fetch
its data and transfer it to the destination database.
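
The folder scenario can be sketched in Python as a loop over whatever files happen to match a pattern at run time. The helper `for_each_file`, the file names, and the "task" (which here only returns the file name) are hypothetical stand-ins for the tasks inside the container.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def for_each_file(folder: str, pattern: str, task: Callable[[Path], str]) -> List[str]:
    """Apply `task` once per matching file, in name order."""
    results = []
    for path in sorted(Path(folder).glob(pattern)):
        results.append(task(path))   # one iteration of the container per file
    return results


with tempfile.TemporaryDirectory() as folder:
    # Create a few sample files; only the .txt files should be picked up.
    for name in ("a.txt", "b.txt", "notes.log"):
        Path(folder, name).write_text(name)
    loaded = for_each_file(folder, "*.txt", lambda p: p.name)
```

The number of iterations is not fixed in advance: adding a third `.txt` file to the folder would simply produce a third pass.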

4.) Analysis Services Execute DDL Task: Use this task to run data definition language
(DDL) statements that create, alter, or drop SSAS objects such as cubes and dimensions.

Example: Suppose we want to create a replica of a cube. This task can run the statement
that creates the replica of the SSAS cube.

5.) Analysis Services Processing Task: Use this task to process an SSAS cube in full or
partial mode.

Example: Suppose an SSIS job populates a data warehouse that contains fact and dimension
tables, and we want to process the cube after the data has been loaded into those
tables.

6.) Bulk Insert Task: This task loads data into a table by using the BULK INSERT SQL
command.

Example: Suppose we have large flat files in a folder and want to transfer their data to
SQL Server. Loading very large files through a Data Flow Task can slow the package down.
The Bulk Insert Task loads the data faster, but it cannot transform the data along the
way as a Data Flow Task can.

7.) Data Flow Task: Use this task to perform ETL operations, that is, to extract data
from a source, apply transformations to it, and load it into a destination database.

8.) Execute Package Task: Use this task to execute other SSIS packages from within a
package.

Example: In a data warehouse, a master package can execute the dimension packages and
fact packages and so control the order in which they run.

9.) Execute Process Task: Use this task to run an application or batch file as part of a
SQL Server Integration Services package workflow.

Example: The Execute Process Task can expand a compressed text file. The package can
then use the text file as a data source for its data flow.
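
As a rough illustration of the "expand, then read" sequence, the sketch below uses Python's `gzip` module in place of the external utility that the Execute Process Task would launch. The file names and contents are made up for the example.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def expand_gzip(archive: Path, target: Path) -> Path:
    """Decompress `archive` into `target` and return the target path."""
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


with tempfile.TemporaryDirectory() as work:
    # Build a small compressed file to stand in for the incoming archive.
    archive = Path(work, "sales.txt.gz")
    with gzip.open(archive, "wt", encoding="utf-8") as f:
        f.write("id,amount\n1,10\n2,20\n")

    # Step 1: expand the archive (the Execute Process Task's job).
    text_file = expand_gzip(archive, Path(work, "sales.txt"))

    # Step 2: read the expanded file as a data source (the data flow's job).
    lines = text_file.read_text(encoding="utf-8").splitlines()
```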

10.) Execute SQL Task: Use this task to execute SQL statements or SQL objects such as
functions and stored procedures on a particular DBMS.

11.) File System Task: Use this task to perform file operations such as creating
directories and files, copying, moving, or deleting directories and files, renaming
files, and setting attributes.
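
The same kinds of operations look like this in a short Python sketch that works inside a temporary folder. The directory and file names are hypothetical; the sketch only shows the sequence of operations, not how the File System Task is configured.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root_name:
    root = Path(root_name)
    inbox = root / "inbox"
    archive = root / "archive"

    # Create directories and a file.
    inbox.mkdir()
    archive.mkdir()
    source = inbox / "data.txt"
    source.write_text("sample")

    # Copy, move, rename, and delete.
    copied = Path(shutil.copy(source, archive / "data_copy.txt"))
    moved = Path(shutil.move(str(source), str(archive / "data.txt")))
    renamed = moved.rename(archive / "data_2008.txt")
    copied.unlink()

    remaining = sorted(p.name for p in archive.iterdir())
    inbox_empty = not any(inbox.iterdir())
```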

12.) FTP Task: Use this task to download and upload data files and to manage
directories on servers.

Example: A package can download data files from a remote server or an Internet location
as part of an Integration Services package workflow. You can use the FTP task for the
following purposes:

Copying directories and data files from one directory to another, before or after moving
data, and applying transformations to the data.

Logging in to a source FTP location and copying files or packages to a destination
directory.

Downloading files from an FTP location and applying transformations to column data
before loading the data into a database.

13.) Message Queue Task: The Message Queue task allows you to use Message Queuing
(also known as MSMQ) to send and receive messages between SQL Server Integration
Services packages, or to send messages to an application queue that is processed by a
custom application. These messages can take the form of simple text, files, or variables and
their values.

By using the Message Queue task, you can coordinate operations throughout your
enterprise. Messages can be queued and delivered later if the destination is unavailable or
busy.

Example: The output from a restaurant cash register can be sent in a data file message to
the corporate payroll system, where data about each waiter's tips is extracted.

14.) Script Task: Use this task when you need code to perform functions that are not
available in the built-in tasks and transformations that SQL Server Integration Services
provides. This code can be written in C# or VB.NET.

Example: A script can use Active Directory Service Interfaces (ADSI) to access and extract
user names from Active Directory.

15.) Send Mail Task: Use this task to send email notifications. You can configure the
subject and body text along with details such as From, To, CC, BCC, and attachments.

Example: The Send Mail Task can send notifications that report the successful execution
or the failure of an SSIS package.
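
To make the configurable parts concrete, here is a small Python sketch that assembles such a notification with the standard `email.message.EmailMessage` class. The function `build_notification`, the package name, and the addresses are assumptions made for the example; the message is only built, not sent.

```python
from email.message import EmailMessage
from typing import Sequence


def build_notification(
    package: str,
    succeeded: bool,
    sender: str,
    to: Sequence[str],
    cc: Sequence[str] = (),
) -> EmailMessage:
    """Build a success/failure notification for a package run."""
    status = "succeeded" if succeeded else "failed"
    msg = EmailMessage()
    msg["Subject"] = f"SSIS package {package} {status}"
    msg["From"] = sender
    msg["To"] = ", ".join(to)
    if cc:
        msg["Cc"] = ", ".join(cc)
    msg.set_content(f"Package {package} {status}.")
    return msg


# Hypothetical package name and addresses.
message = build_notification(
    "LoadWarehouse", False, "etl@example.com", ["dba@example.com"]
)
```

Actually delivering the message would need an SMTP server, for example through `smtplib.SMTP(...).send_message(message)`, which is left out here.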

16.) Transfer Database Task: Use this task to transfer a SQL Server database between
two instances of SQL Server. The Transfer Database task supports SQL Server 2000 and
SQL Server. It can transfer a database between instances of SQL Server 2000, between
instances of SQL Server, and from an instance of SQL Server 2000 to an instance of
SQL Server.

17.) Transfer Error Messages Task: Use this task to transfer one or more SQL Server
user-defined error messages between instances of SQL Server. User-defined messages are
messages with an identifier that is equal to or greater than 50000. Messages with an
identifier less than 50000 are system error messages and cannot be transferred by using
the Transfer Error Messages task. The task supports a source and destination that is
SQL Server 2000 or SQL Server. There are no restrictions on which version to use as a
source or destination.

18.) Transfer Jobs Task: Use this task to transfer one or more SQL Server Agent jobs
between instances of SQL Server. There are no restrictions on which of the two versions
to use as a source or destination.

19.) Transfer SQL Server Objects Task: Use this task to transfer one or more types of
objects in a SQL Server database between instances of SQL Server. For example, the task
can copy tables and stored procedures. The task supports a source and destination that
is SQL Server 2000 or SQL Server. There are no restrictions on which version to use as a
source or destination.

20.) Web Service Task: Use this task to execute a Web service method and use its
result. The value returned by the Web service method can be assigned to an SSIS package
variable or written to a flat or XML file.

Example: A package could obtain the highest temperature of the day from a Web service
method, and then use that value to update a variable that is used in an expression that
sets a column value.

21.) Maintenance Plan Tasks: SQL Server Integration Services includes a set of tasks
that perform database maintenance functions. These tasks are commonly used in database
maintenance plans, but they can also be included in SSIS packages. The maintenance
tasks can be used with SQL Server 2000 and SQL Server databases and database objects.
The following table lists the maintenance tasks and their usage:

Task                               Description

Back Up Database Task              Performs different types of SQL Server database backups.

Check Database Integrity Task      Checks the allocation and structural integrity of database objects and indexes.

Execute SQL Server Agent Job Task  Runs SQL Server Agent jobs.

Execute T-SQL Statement Task       Runs Transact-SQL statements.

History Cleanup Task               Deletes entries in the history tables in the SQL Server msdb database.

Maintenance Cleanup Task           Removes files related to maintenance plans, including reports created by maintenance plans and database backup files.

Notify Operator Task               Sends notification messages to SQL Server Agent operators.

Rebuild Index Task                 Rebuilds indexes in SQL Server database tables and views.

Reorganize Index Task              Reorganizes indexes in SQL Server database tables and views.

Shrink Database Task               Reduces the size of SQL Server database data and log files.

Update Statistics Task             Updates information about the distribution of key values for one or more sets of statistics on the specified table or view.

SSIS Lesson 1 - Introduction to SSIS

SSIS (SQL Server Integration Services) is one of the technologies currently in high
demand in the IT industry.

Organisations of every size (small, medium, or large) keep their data in different
stores such as spreadsheets, flat files, and RDBMS systems.
Sometimes we need to access data from several of these sources and modify it according
to business needs. This need to read from multiple data sources, perform a broad range
of data migration tasks, and transform the data to suit the business is what makes such
tools so much in demand.

DTS (Data Transformation Services), which shipped with SQL Server 7.0 and SQL Server
2000, was Microsoft's earlier tool for transforming data from a source to a destination.

In 2005, Microsoft replaced DTS with a more powerful product named SSIS. SSIS is more
powerful than DTS, it is more user friendly, and it adds a wide range of new features.
SSIS packages are developed in SQL Server Business Intelligence Development Studio
(BIDS).

The following services are developed in SQL Server Business Intelligence Development
Studio:

SSIS (SQL Server Integration Services)
SSRS (SQL Server Reporting Services)
SSAS (SQL Server Analysis Services)

SSIS is one of the most powerful features introduced in SQL Server 2005. It is an ETL
(extract, transform, and load) tool.

SSIS Definition:

SSIS can be described in several ways. SSIS is:

An ETL (extract, transform, and load) tool
A control flow engine
An application platform
A high-performance data transformation pipeline
A data import/export wizard

As already mentioned, SSIS is the successor of DTS. However, DTS and SSIS have very
little in common: the DTS code was completely rewritten to design and build SSIS.

Many new features were added in SSIS. Some of them are listed below.

New features in SSIS:

Foreach Loop containers
Package configurations
Property expressions
Better ActiveX controllers
Script tasks
Separate control flow and data flow, which gives a better design and makes the
project/code more readable and understandable
Event handlers
Precedence constraints improved to handle conditional checks and conditional
expressions
Pivot and Unpivot

Building Blocks of SSIS:

Control Flow: The workflow or process flow in SSIS is known as the control flow. The
control flow consists of one or more tasks that are executed when the SSIS package
runs.

Data Flow: The data flow in SSIS defines how data should flow from a data source to a
destination. It holds information about the data source, the data destination, and the
data transformations.

Event Handlers: Event handlers are tasks that are executed when some event occurs
during SSIS package execution. Ex: If an error occurs during SSIS package execution, an
event handler can be programmed to run so that the package ignores that error and
continues to the next step.

Package Explorer: Package Explorer contains complete information about the package:
Variables, Precedence Constraints, Event Handlers, Connection Managers, Log Providers,
and Executables.

Precedence Constraints: Precedence constraints link the tasks in SSIS. In simple words,
they are the arrows that connect two different tasks. Tasks are executed in order based
on the direction of the precedence constraints. In other words, precedence constraints
are needed for ordering and organising the control flow in SSIS.

Connection Managers: Connection managers contain the information needed to connect to
the various data sources and data destinations.

Toolbox: The Toolbox is a collection of Control Flow Items, Maintenance Plan Tasks,
Data Flow Sources, Data Flow Destinations, and Data Flow Transformations. In simple
words, the Toolbox in SSIS contains the different tasks and containers. A task is a unit
of work that performs a certain job or action.

Variables: Variables store information that can be used by containers and tasks during
SSIS package execution.
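
The ordering role of precedence constraints can be pictured as a dependency graph: each task lists the tasks that must finish before it starts. The Python sketch below uses the standard `graphlib.TopologicalSorter` (Python 3.9 or later) to derive an execution order; the task names are hypothetical, and the sketch ignores conditions such as success, failure, or expressions on the constraint.

```python
from graphlib import TopologicalSorter

# Each key is a task; its value is the set of tasks that must complete first.
# In SSIS terms, every (predecessor -> task) pair is one precedence constraint.
constraints = {
    "Load Dimensions": {"Truncate Staging"},
    "Load Facts": {"Load Dimensions"},
    "Process Cube": {"Load Facts"},
    "Send Mail": {"Process Cube"},
}

# static_order() yields the tasks so that every predecessor comes first.
order = list(TopologicalSorter(constraints).static_order())
```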

Advantages of SSIS:

SSIS has a very user-friendly graphical user interface (GUI) through which even
difficult tasks can be done with ease.

We can perform different types of tasks (such as loading data, extracting data, renaming
files, sending mail, sending files through FTP, data mining, and a lot more) in a single
SSIS package without any manual intervention.

An SSIS package can be scheduled to run at a given time as per the business needs.

SSIS can connect to different data sources (such as flat files, MS Access, Excel, SQL
Server, Sybase, MySQL, and Oracle). In simple words, using SSIS we can connect to almost
all external data sources.

Deploying an SSIS package is very easy.

SSIS Lesson 2 - SSIS Data Import/Export Wizard to Load Data

In this article, I will explain how to create a first SSIS package using the SSIS Import
and Export Wizard.
Let's create an SSIS package to load data from an Excel sheet into a table in SQL
Server. Follow the steps below to create the SSIS package.

Go to Start --> All Programs --> Microsoft SQL Server 2005 --> SQL Server
Business Intelligence Development Studio.

Click on SQL Server Business Intelligence Development Studio. The following screen will
be displayed.

Click on File --> New --> Project.

Select Business Intelligence Projects from the list on the left and select Integration
Services Project. Then provide a name for your SSIS package. Here I gave the name
Load Data_SSIS_1. Then click on OK.

The SSIS package will be created.

Go to Project --> SSIS Import and Export Wizard.

When you click on SSIS Import and Export Wizard, the wizard screen will appear:

Select Data Source --> Microsoft Excel.
Select the Excel file path and click on Next.
Select Destination --> Microsoft OLEDB Provider for SQL Server.
Select the server name and database. Then click on Next.
Select the spreadsheet name from the source and the table from the destination.
Click on Edit Mappings. Map the source and destination fields. Then click on OK.
Then click on Next --> Finish.


Click on Close. The SSIS package will be created.

Now the SSIS package is ready. It can be executed by clicking on the Start Debugging
icon or by pressing F5.

Before executing the package, let me show you the data in the STUDENTS table in the
database (SQL Server). The STUDENTS table does not contain any rows.

Now execute the SSIS package by pressing F5. All executables in the Control Flow of the
SSIS package turn green, which indicates that the SSIS package executed successfully.

Now let's check the data in the STUDENTS table after the SSIS package execution.

The data from the Excel file is loaded into the STUDENTS table.

Using SSIS, data from any supported source can be loaded or extracted into any supported
destination.
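
For readers who prefer to see the load expressed as code, the following Python sketch performs the same kind of operation on a much smaller scale. It is an analogy only: a CSV text stream stands in for the Excel sheet, an in-memory SQLite database stands in for SQL Server, and the STUDENTS columns (`ID`, `Name`) and sample rows are assumptions, since the article does not list them.

```python
import csv
import io
import sqlite3
from typing import TextIO


def load_rows(conn: sqlite3.Connection, source_file: TextIO) -> int:
    """Read rows from a delimited source and insert them into STUDENTS."""
    reader = csv.DictReader(source_file)          # first row holds column names
    rows = [(int(r["ID"]), r["Name"]) for r in reader]
    conn.executemany("INSERT INTO STUDENTS (ID, Name) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENTS (ID INTEGER PRIMARY KEY, Name TEXT)")

# Hypothetical source data standing in for the Excel sheet.
source = io.StringIO("ID,Name\n1,Asha\n2,Ben\n")
loaded = load_rows(conn, source)
count = conn.execute("SELECT COUNT(*) FROM STUDENTS").fetchone()[0]
```

As in the lesson, the table is empty before the load and contains the source rows afterwards.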


SSIS Lesson 3 - SSIS Import and Export Wizard to Extract Data


In the earlier article, I explained how to use the SSIS Import and Export Wizard to load
data from a source into a destination, with the example of loading data from an Excel
file into a table in SQL Server.

In this chapter, we will see how to use the SSIS Import and Export Wizard to extract
data from a data source into a destination.

Let's see how the wizard performs a data extract operation, using the example of
extracting data from a table in SQL Server into a flat file.

Go to Start --> All Programs --> Microsoft SQL Server 2005 --> SQL Server Business
Intelligence Development Studio.

Click on SQL Server Business Intelligence Development Studio. The following screen will
be displayed.

Click on File --> New --> Project.

Select Business Intelligence Projects from the list on the left and select Integration
Services Project. Then provide a name for your SSIS package. Here I gave the name
Extract_Data_SSIS. Then click on OK.

The SSIS package will be created.

Go to Project --> SSIS Import and Export Wizard.

The Welcome screen of the SSIS Import and Export Wizard will appear. Click on Next.

Choose the appropriate data source from the drop-down menu. In this example we want to
extract data from a SQL Server table, so choose Microsoft OLEDB Provider for SQL Server
as the data source. Connect to the required server and database. Then click on Next.

Choose the appropriate destination. In this example we want to extract data from a SQL
Server table into a flat file, so choose Flat File Destination as the destination and
provide the destination path. Select the Column names in the first data row check box
if you want the column names to appear in the extract file. Then click on Next.

On the next screen, click on Next.

Select the source table name. I am using the STUDENTS table in this example, so I have
selected the STUDENTS table from the drop-down list of tables.

Also specify the row delimiter and column delimiter to be used in the extract file. For
this example, I am using New Line {CR}{LF} as the row delimiter and Comma (,) as the
column delimiter.

By clicking on the Preview button, we can see a preview of the data.

Click on Next once you are done choosing the table name and the delimiters for the file.
Click on Finish to complete the SSIS Import and Export Wizard setup.

Once the wizard setup has completed successfully, a confirmation screen will appear.
Click on Close to go to the SSIS report.

The SSIS package has now been created successfully by the SSIS Import and Export Wizard,
with one data flow task in the control flow panel.

The SSIS package can now be executed by clicking on the Start Debugging button or by
pressing F5.

Before running the SSIS package, let's check the data in the flat file and the data in
the SQL Server table STUDENTS.

Data in the flat file: the flat file is empty.

Data in the STUDENTS table: the table contains the rows to be extracted.

Now execute the SSIS package by clicking on the Start Debugging button or by pressing
F5.

The SSIS package executes successfully. Let's check the data in the flat file now.

The data is extracted into the flat file. Columns are separated by the comma delimiter
and rows are separated by a new line.

Similarly, we can extract data from any data source into any required destination.
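
The effect of the delimiter and header choices can be illustrated with a short Python sketch. It is only an analogy for what the wizard-generated package does: an in-memory SQLite table stands in for the SQL Server STUDENTS table, and the columns (`ID`, `Name`), sample rows, and the helper `extract_table` are assumptions made for the example.

```python
import csv
import io
import sqlite3
from typing import TextIO


def extract_table(conn: sqlite3.Connection, table: str, out: TextIO,
                  header: bool = True) -> None:
    """Write every row of `table` to `out` as comma-delimited text."""
    # The table name is interpolated directly, so it must come from trusted code.
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY 1")
    # Comma as column delimiter, {CR}{LF} as row delimiter, as in the lesson.
    writer = csv.writer(out, delimiter=",", lineterminator="\r\n")
    if header:
        # Equivalent of "Column names in the first data row".
        writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENTS (ID INTEGER, Name TEXT)")
conn.executemany("INSERT INTO STUDENTS VALUES (?, ?)", [(1, "Asha"), (2, "Ben")])

buffer = io.StringIO()          # stands in for the destination flat file
extract_table(conn, "STUDENTS", buffer)
flat_file = buffer.getvalue()
```

With the header option switched off, the first line of the output would be the first data row instead of the column names.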