Menu Items

Tuesday, 20 March 2012

SSIS various tasks description

A task can best be described as an individual unit of work. We can also develop our  own tasks, but here are the current ETL Tasks available us to out-of-the-box:

ActiveX Script Task: Executes an ActiveX script in SSIS package. This task is only to facilitate conversion of legacy DTS packages that use this deprecated scripting method.

Analysis Services Execute DDL Task: Executes a DDL Task in Analysis Services. For example, this can create, drop, or alter a cube (Enterprise and Developer Editions only).

Analysis Services Processing Task: This task processes a SQL Server Analysis Services cube, dimension, or mining model.

Bulk Insert Task: Loads data into a table by using the BULK INSERT SQL command.

Data Flow Task: This very specialized task loads and transforms data into an OLE DB, and now, optionally, an ADO.NET Destination.

Data Mining Query Task: Allows us to run predictive queries against Analysis Services data - mining models.

Data Profiling Task: This exciting new task allows for the examination of data to replace our ad - hoc data profiling techniques.

Execute DTS 2000 Package Task: Exposes legacy SQL Server 2000 DTS packages to our SSIS package.

Execute Package Task: Allows us to execute a package from within a package, making our SSIS packages modular.

Execute Process Task: Executes a program external to our package, such as one to split our extract file into many files before processing the individual files.

Execute SQL Task: Executes a SQL statement or stored procedure.

File System Task: This task can handle directory operations such as creating, renaming, or deleting a directory. It can also manage file operations such as moving, copying, or deleting files.

FTP Task: Sends or receives files from an FTP site.

Message Queue Task: Sends or receives messages from a Microsoft Message Queue (MSMQ).

Script Task: This task allows us to perform more .NET - based scripting in the Visual Studio Tools for Applications programming environment.

Send Mail Task: Sends a mail message through SMTP.

Web Service Task: Executes a method on a Web service.

WMI Data Reader Task: This task can run WQL queries against the Windows Management Instrumentation. This allows us to read the event log, get a list of applications that are installed, or determine hardware that is installed, to name a few examples.

WMI Event Watcher Task: This task empowers SSIS to wait for and respond to certain WMI events that occur in the operating system.

XML Task: Parses or processes an XML file. It can merge, split, or reformat an XML file.

If your database user login is assigned to the db_ssisoperator role, which has only read access and does not have write access to a package stored in msdb, what are you able to do with the package?

Ans.  Because you are assigned the db_ssisoperator role, you can execute the package inside msdb, but you cannot delete the package from the msdb store or import packages into the store

What information stored in the package definition would cause a security risk if it were found?

Ans.  Although a package fi le does not contain data, it does contain the schema details about input sources and destinations. Even if these sources and destinations cannot be accessed, this information can be a security risk because it exposes table  and column names.

Can you configure the security of a package so that you can store a password in clear text in the underlying .dtsx fi le in the file system?

Ans. no, sensitive information such as connection passwords cannot be stored as clear text in the package fi le in the fi le system. The only way to store the connection password in the fi le is to encrypt the sensitive information by setting the ProtectionLevel to EncryptSensitiveWithPassword or EncryptSensitiveWithuser- Key. A better choice is to set the ProtectionLevel to DontSaveSensitive and use an SSIS confi guration to store the connection password.

Friday, 16 March 2012

To run a stored procedure within a SQL Server database, what task would you choose?

The Execute SQL Task can run a stored procedure within SQL Server or any relational database for which you have an installed data provider. The syntax of the statement entered in the Execute SQL   ask will be in the native language of the underlying database.

Thursday, 15 March 2012

Microsoft SQL-Server 2012 New Features

Microsoft SQL-Server 2012 New Features

Sequences have been requested by the SQL Server community for years, and it's included in this release. Sequence is a user defined object that generates a sequence of a number. Here is an example using Sequence.
/****** Create Sequence Object ******/
/****** Create Temp Table ******/
Fname varchar(100) NOT NULL
/****** Insert Some Data ******/
INSERT @ Emp (ID, Fname)
VALUES (NEXT VALUE FOR Seqexample, ‘Satya Abc’),
(NEXT VALUE FOR Seqexample, ‘Satya Mno’),
(NEXT VALUE FOR Seqexample, ‘Satya Xyz’);
/****** Show the Data ******/
The results would look like this:
ID       Fname
1         Satya Abc
2         Satya Mno
3         Satya Xyz

Ad-Hoc Query Paging:
 Paging results in SQL Server has been discussed for years. The Order By option in the SQL SELECT statement has been enhanced in SQL Server 2012. Using a combination of OFFSET and FETCH along with ORDER BY gives you control of paging through a result set. Using this technique can really help performance by bring back only the results you want to show to your users when they are needed. The following TSQL code runs against the Person table in the AdventureWorks sample database (available from Microsoft). In the sample query below, SQL Server would return 10 records beginning with record 11. The OFFSET command provides a starting point for the SELECT statement in terms of paging, and the FETCH command provides how many records to return at a time.
SELECT BusinessEntityID, FirstName, LastName
FROM Person.Person
ORDER BY BusinessEntityID
Availability Enhancements including AlwaysOn SQL Server Failover Cluster Instances, AlwaysOn Availability Groups and enhancements to Online Operations for online index builds (where index columns contain LOB data);

Scalability and Performance Enhancements including a new data warehouse query acceleration feature based on a new type of index called the column store;

Full Text Search: The Full Text Search in SQL Server 2012 has been enhanced by allowing you to search and index data stored in extended properties or metadata. Consider a PDF document that has "properties" filled in like Name, Type, Folder path, Size, Date Created, etc. In the newest release of SQL Server, this data could be indexes and searched along with the data in the document itself. The data does have to be exposed to work, but it's possible now.

SQL Server Express LocalDB — a new lightweight edition of Express that has all its programmability features

FileTables — extends FILESTREAM technology and allows you to now store files and documents in special tables in SQL Server, but access them from Windows applications as if they were stored in the file system, without making any changes to the Windows applications. File and directory data is exposed through a Windows share.

Statistical Semantic Search — provides deep insight into unstructured documents stored in SQL Server databases by extracting statistically relevant key phrases (allowing you to query the meaning of a document);

New built-in T-SQL functions: Conversion functions now include PARSE, TRY_PARSE, TRY_CONVERT; Date and time functions now include DATEFROMPARTS, DATETIME2FROMPARTS, DATETIMEFROMPARTS, DATETIMEOFFSETFROMPARTS, EOMONTH, SMALLDATETIMEFROMPARTS, TIMEFROMPARTS; Logical functions now include CHOOSE (gets item at the specified index from a list of values), IIF (a shorthand if-then construct already common in other languages); and String functions now include CONCAT and FORMAT;

ORDER BY clause improvements using OFFSET and FETCH (limit result set rows to a range) — this feature should come in very handy for all sorts of ad-hoc queries and/or web-applications that implement "paging" through data! Finally, less custom-code for this!

THROW statement can be used to raise an exception and transfer execution to a CATCH block of a TRY…CATCH construct;

OVER clause has been extended to support window functions which will reduce the need for custom code to compute things like moving averages or cumulative totals;

Additional analytic functions like: FIRST_VALUE, LAST_VALUE, , LAG (previous rows access without self-JOIN), LEAD (subsequent rows access without self-JOIN), CUME_DIST, PERCENTILE_CONT, PERCENTILE_DISC, PERCENT_RANK

EXECUTE statement can now specify the metadata returned from the statement by using the WITH RESULT SETS argument;

New and Enhanced Query Optimizer Hints: FORCESEEK includes further options and a new FORCESCAN hint;

Native XML Web Services (SOAP/HTTP endpoints) is Removed;
Finally! ANSI JOIN syntax is required! No more "*= and =*" join syntax, which hopefully you migrated away from years ago;

FASTFIRSTROW (query hint) is replaced with OPTION (FAST n);


SET ROWCOUNT for INSERT, UPDATE, and DELETE statements are deprecated and will not be supported in the next version of SQL-Server. Use the TOP keyword instead;

Wednesday, 14 March 2012

Hardware and Software Requirements to Install SQL Server 2012 (Denali)

·         Microsoft recommends using NTFS file format instead of FAT32. FAT32 will work but you should probably not use it.
·         You can't install SQL Server 2012 (code-named Denali) on mapped drives or compressed drives.
·         You have to have the "no-reboot" package installed prior to installing SQL Server 2012 (code-named Denali). This is included in Windows 7 SP1 and Windows Server 2008 R2. Otherwise, you can download the no-reboot package from Microsoft.
·         SQL Server 2012 (code-named Denali) requires the .NET Framework 4.0.
·         Virtualization is supported using Microsoft's Hyper-V technology.
·         You will need at least 3.6 GB of free disk space.
·         Microsoft recommends that you do not install SQL Server 2012 (code-named Denali) on a domain controller.

Recommended Processors & RAM
·         64-bit version: AMD Opteron, AMD Athlin 64, Intel Xeon with Intel EM64T Support or Intel Pentium IV with EM64T support running 2.0 GHz or faster. Recommended RAM is maximum that operating system supports or at least 2 GB.
·         32-bit version: Pentium III or compatible running at 2.0 GHz of faster. Recommended RAM is maximum that operating system supports or at least 2 GB.
·         Windows PowerShell 2.0 is a pre-requisite for installing SQL Server 2012 (code-named Denali). You can get this software from the Windows Management Framework page.

New features are available in SSIS Denali(2012) over 2008 R2

       Getting started window
       Package visualization
       SSIS toolbox
       Data flow source/destination wizard
       Sort packages by name
       Grouping in data flow

       Incremental load loads all rows that have changed since the last load
       How do we know what has changed?
o   Compare every source row with every destination row
o   Last modified date and a trigger to maintain this
o   Change tracking
o   Change data capture!
       SQL Server Enterprise edition, 2008 or higher
       Asynchronous process
       Captures all changes
       Maintains time window
       CDC data access via table valued functions Books online, change data capture

       CDC needs to keep track of which changes have already been processed
       CDC task does this by storing LSNs in a tracking table
       CDC Source component reads from the CDC table function, based on the LSN it got from the CDC task
       CDC transformation splits records into new rows, updated rows and deleted rows
       No documentation yet in RC0, check Matt Masson’s blog
       Based on Attunity CDC components

       When modifying a data flow, column remapping is sometimes needed
       SSIS 2012 maps columns on name instead of id
       It also has an improved remapping dialog

       ODBC was not natively supported in 2008
       SSIS 2012 has ODBC Source & Destination
o   Handy for connecting to SQL Azure
o   Essential if SQL Server stops supporting OleDb
       SSIS 2008 could access ODBC via ADO.Net:
o   Has create table option, which ODBC lacks
o   No control on batch inserts nr of rows ODBC ADO.Net % Diff
o   Low performance

       Script task and script component now support .Net 4.0
       Breakpoints are supported in script component
       When developing custom components, there is better backpressure support:
o   SupportsBackPressure property, IsInputReady and GetDependantInputs method

       The script task can be used to modify variable values… but it’s overkill• Expression task provides a simple task to change variable values

       DQS is a new service to clean domain data
       Domain knowledge base needs to be build
o   Based on rules, positive and negative examples
o   Potentially using external data from Azure Marketplace or other providers

       Cleaning and standardizing data before it is loaded in the data warehouse is essential
       DQS Cleansing task labels data in 4 categories:
o   Correct: a value accepted by the knowledge base
o   Corrected: a value on which DQS is confident it can correct to a valid domain value
o   Suggested: a value on which DQS is less confident, but can still suggest a domain value
o   New: DQS has no suggestions for this

       SSIS 2012 can work in the new project mode (default) or in old package mode (backwards compatibility)
       In project mode, many things change:
o   Project becomes the level of deployment
o   Deployment to SQL Server becomes obligatory
o   Packages not stored in msdb, but in dedicated user database:
·         The package catalog, named SSISDB
o   Logging happens automatically and is done in the package catalog
·         Custom logging still supported
       Projects can be converted from one deployment type to another
       Manage via SSMS: Relational engine• Fixed database name: SSISDB
       Stores projects, versions, logs, 5 reports, 25 views, 42 stored procedures, …
       This makes it possible to run, monitor and manage SSIS projects and packages via T-SQL!

       Just two scopes:
o   Package
o   Project!
o   Value is set when scope starts and cannot be changed
o   Can be set from SQL Server Data Tools configurations
       Often used together with environments
       Does not replace variables
o    a package configuration replacement
       Using the visual studio (SSDT) configurations we can configure default values for testing

       Shared connection manager is defined at project level and is automatically available in every package
o   Not copied as in SSIS 2008
       Shared connection managers can be parameterized as well
       When converting shared connection managers back to regular (package) connection managers, they disappear in all other packages
       Shared cache connection managers are supported as well
o   This allows to cache data in memory in one package and reuse it in multiple other packages

       Environments replace package configurations
       They can control parameter values and connection strings
       Environments are created in the package catalog
o   They are not deployed to the server, but created on the server
o   Don’t forget to reference the environment at the project level
o   Script them while creating, this eases creating multiple environments
       A server might have multiple environments
o   When we execute a package, we can choose which environment we’ll use

       Imagine a data viewer
o   Which can be added on the runtime server
o   Without modifying the package, but using T-SQL
o   Which writes the data to disk instead on visualizing it…
       Voila, you are now thinking about the data tap

More Enhancements

       .Net API and Powershell
       Pivot and row count transformation get a user interface
       Flat file supports
o   Embedded qualifiers
o   Variable number of columns (but still fixed meta-data)
       Raw file improvements
o   Generate empty raw file
o   Stores sort info
       DTSX files are becoming more readable and ‘mergeable’
       Merge and merge join improve backpressure handling
       4000 char expression length lifted
       New expression language keywords
o   LEFT as syntactic sugar for SUBSTRING(,1,)
o   TOKEN and TOKENCOUNT for shredding strings

New features are available in SSAS Denali(2012) over 2008 R2

·         The new release is based on the vision to reach a broader user base
·         SSAS brings together the existing  multi-dimensional model with the new tabular model pioneered by PowerPivot under a single unified platform – the BI Semantic Model
·         Data Mining Component remains the same
·         The BI Semantic Model promises one model for all end user experiences
o   End Users/Personal BI – PowerPivot for Excel
o   Team BI – PowerPivot for SharePoint
o   Coporate BI – Analysis Services
·         What is The BI Semantic Model? It is just another term for a Data Model hosted within Analysis Services
·         There are three approaches in developing a BI Semantic Model
o   PowerPivot Workbook
§  Design Type: Tabular
§  Tool: Excel 2010
§  Biz Logic: DAX
§  Data Access-Cache Passthrough: In Memory
§  Deployment: SharePoint Library/Analysis Services/PowerPivot
o   Tabular Project
§  Think of this as PowerPivot for the IT Pro. End user hands off the PowerPivot workbook to IT department.
§  Design Type: Tabular
§  Tool: SQL Server Data Tool
§  Biz Logic: DAX
§  Data Access-Cache Passthrough: In Memory Direct Query
§  Deployment: Analysis Services Tabular Mode
o   Multidimensional Project
§  Design Type: Multidimensional
§  Deployment: SQL Server Data Tools
§  Biz Logic: MDX
§  Data Access-Cache Passthrough: MOLAP/ROLAP
§  Deployment: Analysis Services Multidimenional
o   Front End Clients
§  PowerView (uses DAX to query the BI Semantic Model)
§  3rd party apps
§  Reporting Services
§  Excel
§  PowerPivot
§  SharePoint Insights (PerformancePoint Services)
·         Note that Tabular Models understand both DAX and MDX
·         DAX is new SQL Server 2012 and only works against tabular models
·         The advantage of a tabular model is that we can now work with:
o   Files
o   OData Feeds
o   Cloud Services
o   Relational Databases etc
·         BI Sematic Model supports Fine-grained security row and cell level
·         The following templates are available through SQL Server Data tools:
o   Analysis Services Multidimensional and Data Mining Project
o   Import from Server (Multidimensional and Data Mining)
o   Analysis Services Tabular Project
o   Import from PowerPivot
o   Import from Server (Tabular)
·         Multidimensional Model New Features
o   300+ improvements
o   New Developer tool: SQL Server Data tools (Visual Studio 2010)
o   Removal of 4G string limit for attributes
o   New events for monitoring locks and attributes
o   Support for XEvents for low overhead eventing support
o   PowerShell support
·         Installation and Configuration
o    Services instances are installed with SQL Server Installer
§  Server modes: Multidimensional and Tabular – both can be installed on the same server
o   PowerPivot Configuration tool is used to configure/repair a PowerPivot instance
·         New PowerShell cmdlets for AMO for command-line connectivity, navigation and discovery of SSAS databases
·         Existing apps – every  UDM  becomes (can be upgraded to) a BI Semantic Model

Tuesday, 13 March 2012

New features are available in SSRS Denali(2012) over 2008 R2

There's not much new in Denali for SSRS. I vaguely recall reading that the SSRS team got most of their big enhancements into 2008 R2 instead of needing to delay to Denali.
1. That said, BIDS will be Visual Studio 2010. That's worth a lot on its own.
2.  Word and Excel export will finally support Office OpenXML file formats (.xlsx and .docx) which will remove some limitations from the previous formats. The biggest of these are the row and column counts for Excel exports.
3. Then Project Crescent that you mentioned: self service Silverlight "Report" story authoring. I'm not sure this will really be part of SSRS proper; I'm not sure where it will fit.
4. I expect minor enhancements of the existing report objects, but I don't think there will be much new for SSRS in Denali.

What is the difference between a control flow task and a control flow container?

Control flow tasks perform operations, whereas containers coordinate and group tasks.

Example: a foreach Loop Container can look through the fi les in a system folder, and a file System Task embedded within the container can then move the files to a new folder location.

SSIS New Features in SQL Server 2008

Scripting Task
SQL Server 2008 scripting is built on the improved VSTA (Visual Studio Tools for Applications) rather than the VSA (Visual Studio for Applications) scripting engine. VSTA is  allows us to build scripts in both C# and VB.NET.

Data Profiling Task 
The Data Profiling Task is new in 2008 and provides a way to analyze data for discovering data issues, constraints, patterns, distributions, and other data profile characteristics

ADO.NET Support
Integration Services 2008 introduces an ADO.NET source and destination adapters. ADO.NET adapters  are full-featured adapters and provide the full custom UI experience. Because there are ADO.NET ODBC adapters, by nature of supporting ADO.NET, Integration Services now also supports connectivity through ODBC.

Integration Services 2008 introduces a new wizard called the Integration Services Connections Project Wizard that allows us to create connections, data flow, and a package as we create the project. This is a faster way to create a package

Improved Data Flow Lookup Component
The new Lookup Component is perhaps the most improved of all features in 2008. It now allows us to source the reference data from just about anywhere including another data flow task. The Lookup Component also allows us to store the reference data in a cache thus drastically reducing the load times.

Improved Data Flow Task
The Data Flow Task core processing has been improved to better utilize threads for more efficient and scalable execution trees. In certain data flow packages, the 2005 Data Flow Task could get into blocking situations or simply not utilize multiple processors as efficiently as expected.