SQL Server 2012 Distributed Replay

 

Distributed Replay is a new feature introduced in SQL Server 2012. This blog post documents the process of setting up Distributed Replay and gives a quick demo of how to use it.

 

The first step is to install the Distributed Replay Controller and the Distributed Replay Client. The controller and the clients don’t have to be on the same server. SQL Server Enterprise supports up to 16 Distributed Replay Clients. Using 16 clients to replay a workload doesn’t mean SQL Server will receive 16 times the original workload; it means the original workload will be distributed across the 16 clients, and the database server will still receive the same amount of workload.

 

 

After the controller and client are installed, there will be 2 Windows services created accordingly. I used a service account to run these 2 services:

 

Now we need to set up the permissions so the clients will be able to access the controller. To do that, go to Component Services and look for DReplay Controller under DCOM Config. The required permissions include “Local Access” and “Remote Access” under “Access Permissions”, and “Local Activation” and “Remote Activation” under “Launch and Activation Permissions”:

 

 

Also, the service account needs to be added to the “Distributed COM Users” group:

 

After the permissions are set up, we will be able to start the controller and the client:

 

Note that a firewall exception might be needed for a successful connection.

 

Once the connection between the client and the controller has been successfully established, we will see a “Registered with controller” message in the client log:

 

Now we need to prepare a workload using SQL Server Profiler. We will use the “TSQL_Replay” template to capture the workload that needs to be replayed:

 

Save the captured trace file on local disk:

 

Use DReplay to preprocess the trace file for replaying:
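The exact command depends on your environment; the controller name, trace path and working directory below are only placeholders. A typical preprocess call looks something like this:

dreplay preprocess -m DRController01 -i "C:\Traces\workload.trc" -d "C:\DReplay\WorkingDir"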

 

Once preprocessing is done, we can start replaying the trace:
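Again, the machine names and directory below are placeholders; a typical replay call (with -o to capture the replay activity on the clients) looks like this:

dreplay replay -m DRController01 -d "C:\DReplay\WorkingDir" -s TargetSQLServer -w DRClient01,DRClient02 -o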

 

That’s it!

 

There are configuration files for both the controller (DReplayController.config) and the client (DReplayClient.config) that can be used for different test scenarios; refer to BOL for more details.

Step-By-Step: Extracting data from SQL Server using PowerShell

A script of mine was published on SQL Server Central: http://www.sqlservercentral.com/scripts/powershell/108915/

Reader swebster wanted more detailed instructions on how to use this script, so here you go:

 

Step 1: Copy the script from SqlServerCentral: http://www.sqlservercentral.com/scripts/powershell/108915/

Copy the complete script (notice there is a scrollbar on the right side); make sure the first line is $ErrorActionPreference="Stop" and the last line is a single }

 

 

Step 2: In your folder, create a file called csv.txt, then rename it to csv.ps1. If you’re not familiar with how to rename a file extension, read this.

Step 3: Open csv.ps1 with Notepad or any other text editor. Copy the script, paste it into the csv.ps1 file, and save it. It should look like this:

 

Step 4: Save your SQL Server script in the same folder and make sure you have access to your database using Windows Authentication.

Step 5: Open Windows PowerShell, go to your folder:

 

Step 6: Run the following command (note: you need to use .\ before the filename, just as you would on Linux):

.\csv.ps1 .\SQL2012 master master.sql
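For what it’s worth, my reading of the three positional parameters (check the script on SQLServerCentral for the authoritative list) is:

.\csv.ps1 <SQL Server instance> <database> <.sql file to run>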

Step 7: Now you can view the results using your favorite tool:

 

 

Enjoy!

In-Memory Is New #2: New Transaction Log

In SQL Server 2014, the transaction log has also been enhanced to provide better performance for memory-optimized tables. Let’s look at some real cases.

 

The first thing to do is create a test database. If you read my previous blog post, this is the same script I used:

USE MASTER;
GO
CREATE DATABASE DB1;
GO
ALTER DATABASE DB1 ADD FILEGROUP IMOLTP CONTAINS MEMORY_OPTIMIZED_DATA;
GO
ALTER DATABASE DB1 ADD FILE (NAME='IMOLTP', FILENAME='C:\DB\IMOLTP') TO FILEGROUP IMOLTP;
GO
USE DB1;
GO

CREATE TABLE T1 (ID INT PRIMARY KEY, MSG CHAR(8000));
GO
CREATE TABLE T1_OLTP (ID INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000), MSG CHAR(8000)) WITH (MEMORY_OPTIMIZED = ON)
GO

 

Now DB1 is ready. Before we start testing, let’s insert one record into each table to initialize a new page (memory-optimized tables don’t use pages, but let’s do it anyway for the sake of fairness).

INSERT INTO T1 VALUES (0,REPLICATE('A',8000));
GO
INSERT INTO T1_OLTP VALUES (0,REPLICATE('A',8000));
GO

 

The test is to insert 5 records into both tables and see how the transaction log behaves.

 

Let’s start with disk-based table T1.

Before we insert, let’s note how many log records we have so we can compute a delta afterwards:

SELECT * FROM sys.fn_dblog(NULL,NULL)

There are currently 703 log records. We will use this number later.
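If you only want the number, a COUNT(*) over the same function returns it directly (just a convenience, not required for the test):

SELECT COUNT(*) AS log_record_count FROM sys.fn_dblog(NULL, NULL);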

 

Now, let’s insert:

BEGIN TRAN
DECLARE @N INT = 1;
WHILE (@N < 6)
BEGIN
    INSERT INTO T1 VALUES (@N,REPLICATE('A',8000))
    SET @N = @N + 1
END
COMMIT TRAN
GO

 

This transaction completes successfully. Let’s check the transaction log again.

SELECT * FROM sys.fn_dblog(NULL,NULL)

There are 803 log records now; 100 more records were generated by inserting 5 rows into the T1 table.
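If you are curious where those records come from, you can group the log records by operation; a quick breakdown like the following (using the Operation column exposed by sys.fn_dblog) shows the row inserts and page allocations for the disk-based table:

SELECT [Operation], COUNT(*) AS record_count
FROM sys.fn_dblog(NULL, NULL)
GROUP BY [Operation]
ORDER BY record_count DESC;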

 

What happens to the memory-optimized table?

Let’s insert 5 records into the T1_OLTP table:

BEGIN TRAN
DECLARE @N INT = 1;
WHILE (@N < 6)
BEGIN
    -- WITH (SNAPSHOT) is needed when a memory-optimized table is accessed
    -- inside an explicit transaction under the default READ COMMITTED level
    INSERT INTO T1_OLTP WITH (SNAPSHOT) VALUES (@N,REPLICATE('A',8000))
    SET @N = @N + 1
END
COMMIT TRAN
GO

 

How many transaction log records do we have now?

SELECT * FROM sys.fn_dblog(NULL,NULL)

813 records, which means only 10 records were generated for the memory-optimized table.

 

In this test, the memory-optimized table generated only 10% of the log records the disk-based table did, which significantly reduces the number of IO operations and improves overall performance.

At the same time, if you look at the “Log Record Length” of the memory-optimized table’s log records, they are also greater than the disk-based table’s:

This is because SQL Server combines the log records into bigger chunks in order to reduce the total number of IO writes.
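To see this yourself, you can sort the log records by the [Log Record Length] column exposed by sys.fn_dblog; something along these lines lists the largest records, which for the memory-optimized inserts are the combined chunks:

SELECT TOP (10) [Current LSN], [Operation], [Log Record Length]
FROM sys.fn_dblog(NULL, NULL)
ORDER BY [Log Record Length] DESC;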

 

Furthermore, memory-optimized tables don’t generate log records for rolled-back transactions, while disk-based tables still do.

Again, let’s test:

CHECKPOINT;
GO
DECLARE @cnt INT, @cnt2 INT;
SELECT @cnt = COUNT(*) FROM sys.fn_dblog(NULL,NULL);

BEGIN TRAN
INSERT INTO dbo.T1 VALUES (101,REPLICATE('a',8000)),
(102,REPLICATE('b',8000)),
(103,REPLICATE('c',8000)),
(104,REPLICATE('d',8000)),
(105,REPLICATE('e',8000))
ROLLBACK TRAN

SELECT @cnt2 = COUNT(*) FROM sys.fn_dblog(NULL,NULL);
SELECT 'Disk-based Table: # of logs generated for rollback', @cnt2-@cnt AS ROW_CNT
GO

 

DECLARE @cnt INT, @cnt2 INT;
SELECT @cnt = COUNT(*) FROM sys.fn_dblog(NULL,NULL);

BEGIN TRAN
-- WITH (SNAPSHOT) is needed when a memory-optimized table is accessed
-- inside an explicit transaction under the default READ COMMITTED level
INSERT INTO dbo.T1_OLTP WITH (SNAPSHOT) VALUES (101,REPLICATE('a',8000)),
(102,REPLICATE('b',8000)),
(103,REPLICATE('c',8000)),
(104,REPLICATE('d',8000)),
(105,REPLICATE('e',8000))
ROLLBACK TRAN

SELECT @cnt2 = COUNT(*) FROM sys.fn_dblog(NULL,NULL);
SELECT 'Memory-Optimized Table: # of logs generated for rollback', @cnt2-@cnt AS ROW_CNT
GO

 

This test inserts 5 records into both the disk-based table and the memory-optimized table and rolls them back. It calculates the number of transaction log records generated in each case. Here are the returned results:

 

As we can see from this screenshot, the memory-optimized table generated no log records during the transaction rollback, while the disk-based table generated 108 log records. This is another improvement memory-optimized tables have over disk-based tables!

Summary

Memory-optimized tables use newly designed, more efficient transaction logging. This new transaction log allows memory-optimized tables to reduce physical IO significantly and improve performance. The transaction log in In-Memory is new!

In-Memory Is New #1 : New Transaction Isolation Level

For memory-optimized tables, SQL Server 2014 supports 4 transaction isolation levels:

  • READ COMMITTED
  • SNAPSHOT
  • REPEATABLE READ
  • SERIALIZABLE

For memory-optimized tables, the READ COMMITTED transaction isolation level is only supported in auto-commit transactions and cannot be used within an explicit transaction. On the surface, the other 3 transaction isolation levels look exactly the same as the disk-based transaction isolation levels with the same names. Really?
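Just to illustrate the READ COMMITTED restriction, here is a quick sketch (not from the original post; it assumes a database that already has a MEMORY_OPTIMIZED_DATA filegroup and a memory-optimized table named dbo.mo_demo with columns id and msg): the same insert succeeds as an auto-commit statement under the default READ COMMITTED level, but fails inside an explicit transaction unless a supported isolation level is requested, for example with a WITH (SNAPSHOT) table hint.

-- auto-commit: allowed under the default READ COMMITTED level
insert into dbo.mo_demo values (1, 'auto-commit');
go

-- explicit transaction: the same insert is rejected without a hint
begin tran
insert into dbo.mo_demo values (2, 'explicit');    -- raises an isolation level error
rollback tran
go

-- explicit transaction with a table hint: works
begin tran
insert into dbo.mo_demo with (snapshot) values (3, 'explicit');
commit tran
go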

 

Let’s take a look at how memory-optimized tables behave under snapshot isolation.

 

Disk-Based Tables

First, create a test database (db1) and a test table (t1):

CREATE DATABASE db1;
go
ALTER DATABASE db1 SET ALLOW_SNAPSHOT_ISOLATION ON
go
use db1;
go
CREATE TABLE t1 (id INT, msg varchar(20));
go
INSERT t1 VALUES (1,'abc');
go

 

Step 1: start transaction TX1 in session 1 and update table t1 (id = 1)

SET TRANSACTION ISOLATION LEVEL SNAPSHOT
go
BEGIN TRAN TX1
UPDATE t1 SET msg = 'session 1' WHERE id = 1

 

 

Step 2: start transaction TX2 in session 2 and update table t1 (id =1)

SET TRANSACTION ISOLATION LEVEL SNAPSHOT
go
BEGIN TRAN
UPDATE t1 SET msg = 'session 2' WHERE id = 1

 

TX2 is now being blocked by TX1

 

Step 3: commit transaction TX1 in session 1

COMMIT TRAN TX1

 

With disk-based tables, TX2 fails with an update conflict error (3960) and is rolled back immediately after TX1 commits. Note that TX2 could have continued if TX1 had rolled back instead of committing.

 

The following picture shows how TX1 and TX2 behave in an update conflict scenario:

 

 

Memory-Optimized Tables

Now, let’s take a look at how memory-optimized tables behave under the snapshot transaction isolation level.

First things first (again): alter the database and create a memory-optimized table:

alter database db1 add filegroup IMOLTP contains memory_optimized_data;
go
alter database db1 add file (name='imoltp', filename='c:\db\imoltp') to filegroup IMOLTP;
go
create table t1_oltp
(id int not null primary key nonclustered hash with (bucket_count = 1000000),
msg char(20)) with (memory_optimized = on)
go
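One note: for the updates in the next steps to find a row, t1_oltp needs a row with id = 1. The original post doesn’t show this insert, so presumably something like the following was run first (my assumption):

insert into t1_oltp values (1, 'abc');   -- assumed seed row, not shown in the original post
go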

 

Step 1: begin transaction TX1 and update row (id = 1)

begin tran TX1
update t1_oltp with (snapshot) set msg = 'TX1' where id = 1

 

Step 2: begin transaction TX2 and update same row (id=1)

begin tran TX2
update t1_oltp with (snapshot) set msg = 'TX2' where id = 1

 

SQL Server detects a write-write conflict and rolls back TX2 immediately (error 41302), rather than blocking it:

Memory-optimized tables in snapshot transaction isolation:

Summary

Memory-optimized tables use a newly designed, lock-free optimistic transaction isolation mechanism which works differently from the traditional disk-based transaction isolation levels. Transaction isolation in In-Memory is new!

Memory Optimized Table – Insert Test …Again

Last time, my tests showed no major difference in simple inserts between disk-based tables and memory-optimized tables. On the other hand, there is a Microsoft-published script and test process intended to show how much performance gain we can get using memory-optimized tables. This test does show positive performance gains. Below is the test result I got from my VM:

 

 

This test was run under different workloads (5K, 10K and 15K) and different numbers of parallel connections (10 threads, 20 threads and 30 threads). The results show that the memory-optimized table is actually faster than the disk-based table, and much more scalable.

 

Where did the difference come from? In Microsoft’s testing script, all the inserts are done through stored procedures, either natively compiled procedures or traditional stored procedures. The performance gain we saw in this test actually comes from the natively compiled procedures. It also demonstrates that traditional stored procedures are less friendly to memory-optimized tables!
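For context, this is roughly what a natively compiled insert procedure looks like in SQL Server 2014. It is only a sketch: the procedure and table names are illustrative (the Microsoft sample uses its own schema), and it assumes a memory-optimized table dbo.T1_OLTP with columns ID and MSG:

CREATE PROCEDURE dbo.usp_InsertRow
    @id INT,
    @msg CHAR(8000)
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    -- the body is compiled to native code and can only reference memory-optimized tables
    INSERT INTO dbo.T1_OLTP (ID, MSG) VALUES (@id, @msg);
END
GO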

 

Microsoft’s testing script can be downloaded here: http://msftdbprodsamples.codeplex.com/releases/view/114491

 

Enjoy!

Memory-Optimized Tables – Insert Test (continued)

Memory-optimized tables didn’t show much performance gain during the test last time, so I have decided to do more tests, insert-only tests this time, and see how well memory-optimized tables perform.

 

Last time my VM had 2GB of RAM and the memory-optimized table could only hold 150K rows. To be able to test more cases, I added another 6GB, making it 8GB this time. I also added 3 more logical CPU cores.

 

The tests I’m going to do this time compare the insert performance of memory-optimized tables and disk-based tables under different workloads: 50K, 100K, 150K, 200K, 250K and 300K row inserts. I’m going to run these tests 3 times:

 

First run:

Second run:

 

Third run:

 

The results seem to be consistent: the memory-optimized table showed a performance gain only under the 200K to 300K workloads. For lighter workloads (<200K), the disk-based table performs just as well as the memory-optimized table.

 

From this test, the memory-optimized table seems to work better under heavy workloads. To prove this, I once again added 4 GB to the testing VM and expanded the test row counts up to 450K. Again, 3 runs:

 

First run:

 

Second run:

Third run:

 

The results are close: the disk-based table performs just as well as the memory-optimized table under the given workloads. Unfortunately, I don’t have a more powerful server handy for further tests. For now, I will consider that memory-optimized tables show no apparent performance gain over traditional disk-based tables for simple inserts.