Does the SQLCLR Replace T-SQL?

I remember the first time I heard that SQL Server would be able to run .NET code. My first thought was that this was great for developers because they wouldn't need to know Transact-SQL (T-SQL) and could use the same language for every tier of a large application. Well, I was wrong.

As it turns out, T-SQL is alive and well, and it remains the best-performing language for executing set-based queries. As a general rule, T-SQL is still the best choice for set-oriented tasks and for any query that can be written without resorting to a SQL cursor.
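For example, raising the credit limit of every active customer (using the TblCustomer table that appears in the examples later in this section) takes a single set-based statement, where a cursor-based approach would have to fetch and update the same rows one at a time:

-- One set-based statement updates every qualifying row at once;
-- a cursor would have to loop over them row by row.
UPDATE TblCustomer
SET CreditLimit = CreditLimit * 1.10
WHERE ActiveCustomer = 1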

The SQLCLR, which is available in all SQL Server 2005 editions, including SQL Server Express, gives you access to the supported .NET Framework Class Library (FCL) as well as to third-party libraries and any libraries you create. The SQLCLR is usually the best choice for implementing computation-intensive routines.
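One caveat: CLR integration is turned off by default in SQL Server 2005, so an administrator must enable it before any SQLCLR code will run:

-- Enable CLR integration (disabled by default in SQL Server 2005).
EXEC sp_configure 'clr enabled', 1
RECONFIGURE
GO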

For example, you might have encountered situations in which you need to move a large amount of data across the network to perform a computation-intensive function or a function that simply cannot be done using T-SQL. Moving this data across the network almost always hurts performance.

I had a recent project that used large, complex objects, about 4 MB each, that were stored in relational form within SQL Server and spanned 125 tables. The objects were large because each contained many child objects, around 100,000 of them. The application had a complex, multipurpose cloning framework because users rarely created these objects from scratch; they usually created a new object by copying and then modifying an existing one. To clone an object, the application had to first load it to the user's machine, clone it, and then save the new object to the database. To make matters worse, even though the server computers were connected with a gigabit bus, the network link between each user's machine and the network switch was capped at 10 megabits/second to ensure that no user could hog bandwidth.

Let's do some quick calculations. First, there are 8 bits to a byte, so when you factor in network overhead, a 10 megabits/second link moves roughly 1 MB per second. This means it should (and did) take about 4 seconds to move the relational data from SQL Server to the application, which then constructs the object in about 2 seconds.
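In round numbers:

10 megabits/second / 8 bits per byte = 1.25 MB/second, or roughly 1 MB/second after network overhead
4 MB of object data / 1 MB/second = about 4 seconds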

You might think it would take 4 seconds to move the data back to SQL Server, but that's where the problem is. The data was in a DataSet, and the DataAdapter objects moved the data to SQL Server by essentially creating 100,000 INSERT statements. Just to jog your memory, here's what a typical SQL INSERT statement looks like:

INSERT TblCustomer(CustomerId, CustomerName, CreditLimit, ActiveCustomer) VALUES('69454446-24FE-48BE-A553-5AB44986C36A', 'Northwind', 5000.00, 1)

This statement is more than 150 characters long, but the actual data accounts for only about 50 of them. Obviously, every table is different, but you can see that even a simple SQL INSERT statement such as this one sends roughly three times as many bytes to SQL Server as the client received when it read the same data. The particular application I'm describing did not send INSERT statements to SQL Server; it executed stored procedures for the inserts. The typical stored procedure call looked like this:

exec uspTblCustomerInsert @CustomerId='32F9AEC2-0E0C-4D29-BDFD-9885CC25E7D9', @CustomerName='Northwind', @CreditLimit=10000.00, @ActiveCustomer=1

Not much better, huh? The actual application transferred about 8 times more bytes back to SQL Server than the raw data required, because many of the tables and columns had verbose (better stated as "meaningful") names. It took 35 seconds to send the newly created object back to SQL Server, and about 3 seconds of that time was not related to network bandwidth; it was time SQL Server needed to store the data. The math lines up: the remaining 32 seconds at roughly 1 MB per second works out to about 32 MB on the wire, 8 times the size of the 4-MB object.

The total time to create the cloned object and store it in the database was approximately 41 seconds. You can see that network bandwidth was the primary bottleneck, consuming 36 of the 41 seconds (4 seconds to load the original plus 32 seconds to send the clone back), so eliminating the network from the equation would result in a major performance gain.

Implementing the same complex cloning framework with T-SQL would be a nightmare, and because the application's C# cloning framework is used for many types of cloning, the framework could not be eliminated at the client even if a T-SQL implementation were possible. In other words, two cloning frameworks would need to be maintained.

This is where the SQLCLR can come to the rescue. At the time of the project I've just described, SQL Server 2005 had not yet been released; today, the C# cloning framework libraries could be implemented in the SQLCLR to increase performance.
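As a minimal sketch of what that migration looks like (the assembly name, file path, class, and method names here are all hypothetical), the compiled cloning library is loaded into the database and its entry point is exposed as an ordinary stored procedure:

-- Load the compiled .NET cloning library into the database
-- (assembly name and path are hypothetical).
CREATE ASSEMBLY CloneFramework
FROM 'C:\Deploy\CloneFramework.dll'
WITH PERMISSION_SET = SAFE
GO

-- Expose the CLR method as a regular stored procedure; the clone
-- now runs entirely inside SQL Server, so nothing crosses the wire.
CREATE PROCEDURE uspCloneObject
    @SourceObjectId uniqueidentifier
AS EXTERNAL NAME CloneFramework.[StoredProcedures].CloneObject
GO

With the cloning work done in-process, the 32-MB round trip over the 10-megabit link disappears, leaving little more than the few seconds SQL Server needs to read and write the rows.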

 
