Sunday, September 16, 2012

Different ways to Execute SSIS Packages

We have the following ways to execute SSIS packages:
  • DTExec Command Line Utility
  • DTExecUI Windows Application
  • SQL Server Agent

DTExec Command Line Utility
SQL Server provides the command line tool DTExec.exe which can be used to execute an SSIS package. DTExec can be run from a Command Prompt or from a batch (.BAT) file.
To begin, open a Command Prompt and navigate to the project folder as shown below (I am taking an example of SalesForcastInput package from local directory "D:\SSIS\Hari\Sample\SSIS-Sample1\SalesForcastInput.dtsx"):

Now type the following command to execute the SalesForecastInput.dtsx package:

DTEXEC /FILE SalesForecastInput.dtsx

To see the complete list of command line options for DTEXEC, type following:
DTEXEC   /? 

It is not necessary to navigate command prompt to package directory before executing DTExec command; You can give the full path of your package as shown below:
DTExec /f "D:\SSIS\Hari\Sample\SSIS-Sample1\SalesForcastInput.dtsx" /conf "D:\SSIS\Hari\Sample\SSIS-Sample1\SalesForcastInput.dtsConfig" /M -1

Here, /f (/File) parameter used to load the package that is saved in the file system in your system. Likewise, /conf (/ConfigFile) parameter used to load the configuration file that is saved in the file system in your system. /M (/MaxConcurrent) specifies the number of executable files that the package can run concurrently.

Click here for more information about DTExec utility and it's parameters.

DTExecUI Windows Application

SQL Server includes the Windows application DTExecUI.exe which can be used to execute an SSIS package. DTExecUI provides a graphical user interface that can be used to specify the various options to be set when executing an SSIS package. You can launch DTEXECUI by double-clicking on an SSIS package file (.dtsx).
To begin, navigate to the your project folder. Double-click on the package (For instance, SalesForecastInput.dtsx in my example) and you will see the following multi-page dialog displayed:

As you can see there are many settings available when you use this utility. As a general rule you can simply click the Execute button to run your package. You can also fine tune your execution by clicking through the various screens and entering your own settings. After changing the settings click on Command Line which will show you the DTExec command line based on the settings you have chosen.
Note: If you have a configuration file which is not configured at package level, then do not forget to add the configuration file through Configurations setting.

Scheduling SSIS Package through SQL Server Agent

SQL Server Agent includes the SQL Server Integration Services Package job step type which allows you to execute an SSIS package in a SQL Server Agent job step. This can be especially handy as it allows you to schedule the execution of SSIS package so that it runs without any user interaction.
To begin, open SQL Server Management Studio (SSMS), connect to the Database Engine, and drill down to the SQL Server Agent node in the Object Explorer. Right click on the Jobs node and select New Job from the popup menu. Go to the Steps page, click New, and fill in the dialog as shown below:

In the above example, the SSIS package to be executed is deployed to SQL Server (i.e. the MSDB database). You can also execute packages deployed to the file system or the SSIS package store. Note that the Run as setting is the SQL Agent Service Account. This is the default setting although from a security standpoint it may not be what you want. You can setup a Proxy that allows you to give a particular credential permission to execute an SSIS package from a SQL Server Agent job step.



Saturday, September 1, 2012

Columnstore Index

Columnstore index was introduced in the SQL Server 2012 to significantly speed-up the processing time of common data warehousing queries.

Generally data warehousing workloads involve summarizing large volume of data. As you might know, we use different techniques in data-warehousing (DW) and decision support systems (DSS) to improve performance like pre-computed summary tables, indexed views, OLAP cubes etc. Although, these techniques greatly speed up query the data processing but they have some side effects. These techniques can be inflexible, hard to maintain, and must be designed specifically for specific query problem.

Data warehousing processing overhead can be significantly reduced by using SQL Server Columnstore index. Not only that, columnstore indexes reduces the overhead of the other solutions. Columnstore indexes also enable queries to compute the results quickly so that pre-computation is not required.

Characterisitcs of SQL Server Columnstore Indexes:

·        Columnar data format – unlike the traditional row based organization of data (called rowstore format), columnstore indexes group and store data in one column at a time. SQL Server query processing can take advantage of this new data layout and significantly improve query execution time.

·        Faster query results – Columnstore indexes can produce faster results for the following reasons:

o    Only the columns needed must be read. Therefore, less data is read from disk to memory and then from memory to processor cache.
o    Columns are heavily compressed. This reduces the number of bytes that must be read and moved and hence less I/O operations.
o    Most queries do not touch all columns of the table. Therefore, many columns will never be brought into memory. This, combined with excellent compression, improves buffer pool usage, which reduces total I/O.
o    Advanced query execution technology processes chunks of columns called batches in a streamlined manner, reducing CPU usage.
·        Key columns – There is no concept of key columns in a columnstore index.

·        Clustered index key – all the columns of clustered index must be present in the nonclustered columnstore index. If a column in the clustering key is not listed in the create index statement, it will be added to the columnstore index automatically.

·        Partitioning – Columnstore indexes works with table partitioning. No change to the table partitioning syntax is required. Albeit, nonclustered columnstore index can only be created on a partitioned table if the partitioning column is part of the columnstore index.

·        Record Size – There is no index key record size limitation on columnstore indexes.

·        Query processing – Along with the columnstore index, SQL Server introduces batch processing to take advantage of the columnar orientation of the data. The columnstore structure and batch processing both contribute to performance gains.

·        Table cannot be updated – For SQL Server 2012, a table with a columnstore index cannot be updated.

How to create Columnstore Index:

Columnstore index can be create using SQL Server Management Studio as well as using T-SQL.

Creating Columnstore index by using SQL Server Management Studio (SSMS): 

1.       Open Management Studio and use Object Explorer to connect to the SQL Server Database Engine.
2.       In Object Explorer, expand the instance of SQL Server, expand Databases, expand a database, expand a table, right-click on required table, point to New Index, and then click Non-Clustered Columnstore Index.
3.       In the Index name dialog box, under the General tab, type a name for the new index, and then click Add.
4.       In the Select Columns dialog box, select the columns to participate in the columnstore index, and then click OK two times to create the index.

Creating Columnstore index by using T-SQL:

Create columnstore index on Employee table, name it CSI_Employee, and include all the columns!

-- Check whether index CSI_Employee exists on Employee
-- and delete if exists.
IF EXISTS(  SELECT  name FROM sys.indexes
            WHERE   object_id = OBJECT_ID('dbo.Employee')
            AND     name = 'CSI_Employee')
DROP INDEX CSI_Employee ON dbo.Employee
-- Create nonclustered columnstore index CSI_Employee on Employee.
ON dbo.Employee

Data Types

Columnstore index support some data-types and doesn't support others. You need to be aware of this while creating columnstore index to avoid any run time errors.

The following data types can be included in a columnstore index:
·        char and varchar
·        nchar and nvarchar (except varchar(max) and nvarchar(max))
·        decimal (and numeric) (Except with precision greater than 18 digits.)
·        int, bigint, smallint, and tinyint
·        float (and real)
·        bit
·        money and smallmoney
·        All date and time data types (except datetimeoffsetwith scale greater than 2)

The following data types cannot be included in a columnstore index:
·        binary and varbinary
·        ntext, text, and image
·        varchar(max) and nvarchar(max)
·     uniqueidentifier
·         timestamp
·         sql_variant
·         decimal and numeric with precision greater than 18 digits
·         datetimeoffset with scale greater than 2
·         CLR types (hierarchyid and spatial types)
·         xml

There are certain restrictions and limitations on columnstore indexes which are listed below:

·         Columnstor index cannot have more than 1024 columns.
·         Columnstor index annot be clustered or unique. Only nonclustered columnstore indexes are available.
·         Columnstor index cannot be created on a view or indexed view.
·         Cannot include columns of restricted data-types as mentioned above.
·         Cannot act as a primary key or a foreign key.
·         Cannot be changed using the ALTER INDEX statement. Drop and re-create the columnstore index instead.
·         Cannot be created with the INCLUDE keyword.
·         Cannot include the ASC or DESC keywords for sorting the index. Columnstore indexes are ordered according to the compression algorithms.
·         Does not use or keep statistics in the manner of a traditional index.

Note: You can compare the performance optimization be executing heavy volume queries before and after creating columnstore index. You might get 50% to 90% performance gain!