Change Data Capture – Part I

May 31, 2009

With the arrival of SQL Server 2008 we gained access to a new feature called Change Data Capture (CDC). CDC enables you to constantly stream change data to a consuming process, most notably Extract, Transform, and Load (ETL) applications. In the SQL Server environment the ETL application used is of course SQL Server Integration Services (SSIS). 

the first obvious constraint in regard to SQL Server 2008’s CDC feature is that it is only available to SSIS applications that source their data from an OLTP system built on SQL Server 2008. In the field of implementing Microsoft BI there are many relational database systems we have to source data from that are not SQL Server based (and if they are SQL Server based there is a good chance it is not using the 2008 version). All major relational database systems use some form of a transaction log to record the history of its actions in case of rollback or hardware failure. I would imagine that the actual contents of each database vendor’s log is different, however they have the same basic requirement to temporarily persist database actions.

So how to enable Change Data Capture on a database, and on a table, and how to keep track of Data Definition Language changes on a table.

Step 1

Let’s create a database named SqlHero as shown below.

USE [master]
GO

/*** Object:  Database [SqlHero]   ***/
IF  EXISTS (SELECT name FROM sys.databases WHERE name = N'SqlHero')
DROP DATABASE [SqlHero]
GO
USE [master]
GO

/*** Object:  Database [SqlHero]   ***/
CREATE DATABASE [SqlHero]
GO

Step 2

Now let’s create a table named MyTable on the SqlHero database, as shown Below.

USE [SqlHero]
GO

/*** Object:  Table [dbo].[MyTable]   ***/
IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[MyTable]')
	AND type in (N'U'))
DROP TABLE [dbo].[MyTable]
GO
USE [MyDataBase]
GO

/*** Object:  Table [dbo].[MyTable]   ***/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

SET ANSI_PADDING ON
GO

CREATE TABLE [dbo].[MyTable](
	[ID] [int] NOT NULL,
	[Name] [varchar](100) NULL,
 CONSTRAINT [MyTable_PK] PRIMARY KEY CLUSTERED
(
	[ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF,
	IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON,
	ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

SET ANSI_PADDING OFF
GO

Step 3

In order to track changes on the table, we need to enable the Change Data Capture feature on the database. We can enable the Change Data Capture feature using the following Transact SQL command, as shown below.

Until this point, the only schema that exists on the database is dbo. Once we enable Change Data Capture, a new schema with a bunch of objects will be created.

USE [SqlHero]
GO
EXEC sys.sp_cdc_enable_db_change_data_capture
GO

The following CDC tables are created under the CDC schema, as shown below

cdc.captured_columns
cdc.change_tables
cdc.ddl_history
cdc.index_columns
cdc.lsn_time_mapping

When you query these tables, you will see only zero number of rows.

select * from cdc.captured_columns
select * from cdc.change_tables
select * from cdc.ddl_history
select * from cdc.index_columns
select * from cdc.lsn_time_mapping

Result

(0 row(s) affected)
 (0 row(s) affected)
(0 row(s) affected)
 (0 row(s) affected)
 (0 row(s) affected)

 

Step 3

In order to track changes on the table, we need to enable the Change Data Capture feature on the table as well. Let’s enable the Change Data Capture feature using the following Transact SQL command, as shown below.

USE [SqlHero]
GO
EXEC sys.sp_cdc_enable_table_change_data_capture
@source_schema = 'dbo',
@source_name = 'MyTable',
@role_name = 'cdc_MyTable'
GO

Result

Job 'cdc.MyDataBase_capture' started successfully.
Job 'cdc.MyDataBase_cleanup' started successfully.

 

By reading the result, we can easily understand that SQL Server Agent is a must to do the capture and cleanup. We can see these jobs are actually created as SQL Server Scheduled jobs. [Refer Fig 1.6, 1.7]

Now let’s execute the following Transact SQL to see if any data has been inserted on any of the CDC tables.

select * from cdc.captured_columns
select * from cdc.change_tables
select * from cdc.index_columns

Result

object_id, column_name, column_id, column_type, column_ordinal, is_computed
389576426, ID, 1, int, 1, 0
389576426, Name, 2, varchar, 2, 0

(2 row(s) affected)

object_id,  version,  source_object_id, capture_instance, start_lsn, end_lsn, supports_net_changes, has_drop_pending,
	role_name, index_name, filegroup_name, create_date
389576426, 0, 53575229, dbo_MyTable, NULL, NULL, 0, NULL, cdc_MyTable, MyTable_PK, NULL,
(1 row(s) affected)

object_id, column_name, index_ordinal, column_id
389576426, ID, 1, 1

(1 row(s) affected)

You can see that the CDC schema stores metadata information about which tables and columns are being tracked by Change Data Capture. It also stores information about what Index the table that has been tracked has.

Step 4

We can check to see if Change Data Capture is enabled on a database by using the following transact SQL statement.

SELECT is_cdc_enabled FROM sys.databases WHERE name = 'SqlHero'

Result

is_cdc_enabled
--------------
1

(1 row(s) affected)

Step 5

We can check to see if Change Data Capture is enabled on a table, by using the following transact SQL statement.

SELECT is_tracked_by_cdc FROM sys.tables WHERE name = 'MyTable'

Result

is_tracked_by_cdc
-----------------
1

(1 row(s) affected)

Step 6

Now let’s make some changes to table structure and see if Change Data Capture captures the changes. Execute the following query as shown below.

USE [SqlHero]

GO

Alter Table MyTable add Address varchar(500)

GO

Alter Table MyTable add Salary money

GO

Alter Table MyTable add Bonus money

GO

 

Query the cdc table ddl_history as shown below.

select * from cdc.ddl_history

Result

source_object_id, object_id, required_column_update, ddl_command, ddl_lsn, ddl_time
53575229, 389576426, 0, Alter Table MyTable add Address varchar(500)
, 0x0000001C000001350001,
53575229, 389576426, 0, Alter Table MyTable add Salary money
, 0x0000001C000001370018, 53575229, 389576426, 0, Alter Table MyTable add Bonus money
, 0x0000001C0000013D0018,
(3 row(s) affected)

Advertisements

Virtual Labs

May 28, 2009

you can now test SQL Server 2008 new features through virtual labs on line in Business intelligence, Data warehousing, OLTP and Application Deployments

believe me really useful

http://www.microsoft.com/sqlserver/2008/en/us/virtual-labs.aspx