Solid Quality Learning, Microsoft Corporation Staff - Microsoft SQL Server 2005 Database Essentials Step by Step
| ||
| ||
|
You may choose not to write stored procedures and UDFs because the T-SQL language does not possess all of the bells and whistles of more developed languages, such as C# or Visual Basic.NET. T-SQL is constrained by the following limitations:
-
Limited ability to represent complex datatypes such as classes, arrays, and enumerations, or compositions of them
-
Lack of an extensive class library, such as the .NET Framework, which contains routines for tasks such as low-level disk access, TCP/IP and serial communication, and encryption
-
Limited performance due to its essentially interpreted nature compared with the compiled nature of other languages
-
A development environment that is not as powerful as other languages with capabilities such as automatic builds, unit testing, and powerful debugging
T-SQL works well as a language for data manipulation, but not for calculation or logic- intensive routines such as:
-
Heavy numeric (mathematical) calculations
-
Custom encryption
-
Heavy text handling and formatting
-
Complex financial calculations
The SQL Server functions and procedures written with a .NET Framework language, such as C# or Visual Basic.NET, are called common language runtime (CLR), which is the low-level layer of the .NET Framework.
Note | A CLR UDF is not as restricted as a T-SQL UDF. One of its primary relaxed restrictions is that it does not need to be deterministic. |
Working with Statistical Calculations
Assume that you have an application in which you must perform numerous statistical calculations over large batches of data. The factorial function is a building block for many probability calculations. A factorial (designated by the ! notation) of integer N is defined as the product of the first N integers. For instance:
-
4! = 1 * 2 * 3 * 4 = 24
-
6! = 1 * 2 * 3 * 4 * 5 * 6 = 720
It is possible to write a factorial function using T-SQL, as shown below. You can access this sample code from the sample files as \Ch09\Samples06.sql.
CREATE FUNCTION CalcFact ( @N int ) RETURNS float AS BEGIN DECLARE @R float SET @R = 1 DECLARE @I int SET @I = 1 WHILE @I <= @N BEGIN SET @R = @R * @I SET@I=@I+1 END RETURN @R END
Obviously, the factorial function is calculation intensive. To obtain a ballpark idea of how expensive the algorithm is, you can write a procedure to time it. I tested the procedure by using an Athlon XP 3500 processor; although the results are not very scientific, they are sufficient for a ballpark figure.
-- Call Factorial of @Val a total of @N times and clock the execution CREATE PROCEDURE TimeCalc (@Val int, @N int) AS BEGIN DECLARE @T0 datetime DECLARE @T1 datetime SET @T0 = GETDATE() DECLARE @I int SET @I = 1 WHILE @I <= @N BEGIN DECLARE @F float SET @F = dbo.CalcFact( @Val ) SET @I = @I + 1 END SET @T1 = GETDATE() PRINT DATEDIFF(millisecond, @T0, @T1) END
I calculated the factorial of 150 100,000 times and received this result:
TimeCalc 150, 100000 18616
Each calculation takes approximately 186 microseconds. Although this may seem fast, it may in fact be somewhat slow and become a bottleneck in your application if you must call this function millions of times in repetition.
All data that passes between SQL Server 2005 and CLR methods should be passed as some special type defined in the namespace System.Data.SqlTypes . Some of these special types are found in the middle column of Table 9-2.
SQL Server Datatype | CLR/SQL Function Type | CLR Datatype |
---|---|---|
varbinary, binary | SqlBytes, SqlBinary | Byte[] |
nvarchar | SqlChars, SqlString | String, Char[] |
uniqueidentifier | SqlGuid | Guid |
bit | SqlBoolean | Boolean |
tinyint | SqlByte | Byte |
smallint | SqlInt16 | Int16 |
int | SqlInt32 | Int32 |
bigint | SqlInt64 | Int64 |
smallmoney, money | SqlMoney | Decimal |
numeric, decimal | SqlDecimal | Decimal |
real | SqlSingle | Single |
float | SqlDouble | Double |
smalldatetime, datetime | SqlDateTime | DateTime |
Tip | SQLChars is a better match for data transfer and access, and SQLString is a better match for performing string operations. |
All of the types in the middle column possess a read-only Value property that is used to retrieve its CLR data value. You can assign CLR datatypes to them directly.
Note | Refer to SQL Server Data Types and Their .NET Framework Equivalents in SQL Server Books Online for a full list of types. |
Another option exists, however. You can write the factorial function by using a .NET language, such as Visual Basic.NET.
Writing a Factorial Function Using Visual Basic.NET
-
Use any text editor, such as Notepad, to write the following code. You can cut and paste this code from \Ch09\Probability.vb in the sample files.
Imports Microsoft.SqlServer.Server Imports System.Data.SqlTypes Public Class Prob <SqlFunction(DataAccess:=DataAccessKind.Read, IsDeterministic:=True)> _ Public Shared Function Fact(ByVal N As SqlInt32) As SqlDouble Dim R As Double = 1 Dim I As Integer For I = 2 To N.Value R = R * I Next Return R End Function End Class
Note Notice the use of the CLR SQL datatypes SqlInt32 and SqlDouble.
-
Create a new folder in the root of your C drive and name the folder MyFunc . Save the code as
Probability.vb under the new folder. -
Compile the code using the command prompt. The command prompt you use should have its PATH statement set to point to the .NET directory. You can simply use Start All Programs Microsoft Visual Studio 2005 Visual Studio Tools Visual Studio 2005 Command Prompt.
-
Type cd c:\MyFunc to change to the directory in which you stored the
Probability.vb file, then enter the following command at the c:\MyFunc> prompt: C:\MyFunc>vbc /target:library Probability.vb
The term vbc is the Visual Basic compiler, and /target:library means that a .DLL instead of an .EXE file is being generated.
Caution A CLR function may be deterministic or non-deterministic . It is your responsibility to supply the correct information in the IsDeterministic attribute. If you state that a deterministic function is not deterministic, you are restricting the situations in which it might be used, but no harm is actually done. On the other hand, if you state that a non-deterministic function is deterministic, you may end up with serious logical errors in your application.
-
After the compilation, load the assembly and make it callable from within SQL Server 2005. You can access this code from the sample files as \Ch09\Samples07.sql.
-- Load the assembly (DLL) into the database CREATE ASSEMBLY Probability FROM 'C:\MyFunc\Probability.dll' go -- Create a SQL Server function that calls the CLR function CREATE FUNCTION CalcFactCLR( @N int ) RETURNS float AS EXTERNAL NAME Probability.Prob.Fact GO
-
If this is the first time you are calling CLR functions, you must enable CLR code in the database.
-- Adjust SQL Server security to allow CLR functions EXEC sp_configure 'clr enabled', '1'; GO -- Required after calling sp_configure RECONFIGURE GO -- Make the assemblies in the database callable without signing the code ALTER DATABASE AdventureWorks SET TRUSTWORTHY ON GO
You now have a new SQL Server UDF that will call the CLR function. You can easily call it by doing the following:
PRINT dbo.CalcFactCLR(6) 720
-
Create a timing procedure.
CREATE PROCEDURE TimeCalcCLR (@Val int, @N int) AS BEGIN DECLARE @T0 datetime DECLARE @T1 datetime SET @T0 = GETDATE() DECLARE @I int SET @I = 1 WHILE @I <= @N BEGIN DECLARE @F float SET @F = dbo.CalcFactCLR( @Val ) SET @I = @I + 1 END SET @T1 = GETDATE() PRINT DATEDIFF(millisecond, @T0, @T1) END
-
Complete the timing by calling the function 100,000 times.
EXEC TimeCalcCLR 150, 100000 1436
My test indicates approximately fourteen microseconds per calculation, which is about thirteen times faster than using T-SQL.
Note This does not mean that CLR is thirteen times faster than T-SQL, for this timing is only valid for the current example. If a function is more database intensive, it is less advantageous to write a CLR function. Conversely, if a function is more calculation intensive, it is more advantageous to use a CLR function.
It is possible to write a CLR UDF that returns a table. Since the interest rate example is essentially a calculation-intensive function, it presents a good opportunity to rewrite it as a CLR function.
Using Table-Valued UDFs
A table-valued UDF is more complicated than a scalar UDF, and it will be defined as a static method in the CLR code. For the sake of the explanation below, it will be called the root method. This method should return either an IEnumerator or IEnumerable interface. Several CLR classes that deal with collections already implement the IEnumerable interface, so it is usually easy to use these classes. The IEnumerable interface has a single method called Get Enumerator that returns an IEnumerator interface. If you want to write your own class from scratch, you must implement the IEnumerator interface itself instead of implementing an IEnumerable interface, which returns the IEnumerator interface. To implement an IEnumerator interface, you must implement one property and two methods, as listed below.
-
Current property Returns the current row of the result set as an object
-
MoveNext method Moves the current position in the result set to the next row. By definition, the initial position of the cursor in the result set comes before the beginning record; therefore, the consumer (such as SQL Server 2005) will always call MoveNext before retrieving any data.
-
Reset method Resets the current position to fall before the first row
The root method should contain a SqlFunctionAttribute that indicates what the FillRow method is (typically, another static method in the same class as the root method). The FillRow method receives an object returned by a previous call to IEnumerator.Current and breaks it into multiple-column values using output parameters. Note that the first parameter is always of the type Object . The other parameters are ByRef , and their types correspond to a SQL Server type that is compatible with the TableDefinition attribute parameter.
Creating a Table-Valued UDF
-
Use any text editor, such as Notepad, to write the following code. You can cut and paste this code from \Ch09\Rates.vb in the sample files.
Imports Microsoft.SqlServer.Server Imports System.Collections Imports System.Data.SqlTypes ' Represents a table row in memory Public Class TableRow Public N As Integer Public Rate As Double Sub New(ByVal N As Integer, ByVal Rate As Double) Me.N = N Me.Rate = Rate End Sub End Class ' Main class that implements an IEnumerator Public Class TblLoader Implements IEnumerator ' Current values of the IEnumerator Private CurrentIndex As Integer = 0 Private CurrentFactor As Double = 1 ' Stored constructor arguments Private NumPeriods As Integer Private PercentInterest As Double ' Constructor with the table's arguments Sub New(ByVal NumPeriods As Integer, ByVal PercentInterest As Double) ' Store as class variables Me.NumPeriods = NumPeriods
Me.PercentInterest = PercentInterest End Sub ' Returns the current row Public ReadOnly Property Current() As Object _ Implements IEnumerator.Current Get Return New TableRow(CurrentIndex, CurrentFactor) End Get End Property ' Get the next element in the enumeration Public Function MoveNext() As Boolean _ Implements IEnumerator.MoveNext If CurrentIndex < NumPeriods Then ' Calculate next value CurrentIndex += 1 CurrentFactor *= (1 + PercentInterest / 100) Return True Else Return False End If End Function ' Reset the enumeration Public Overridable Sub Reset() _ Implements IEnumerator.Reset CurrentIndex = 0 CurrentFactor = 1 End Sub End Class Public Class Interest ' Main function <SqlFunction(FillRowMethodName:="FillRow", _ TableDefinition:="N int not null, Rate float not null")> _ Public Shared Function AppreciationTbl(ByVal NumPeriods As Integer, _ ByVal PercentInterest As Double) As IEnumerator Return New TblLoader(NumPeriods, PercentInterest) End Function ' Function called to retrieve each row Public Shared Sub FillRow(ByVal Obj As Object, ByRef N As SqlInt32, _ ByRef Rate As SqlDouble) ' Check if the object is valid If Not (Obj Is Nothing) Then ' Retrieve the values Dim R As TableRow = CType(Obj, TableRow) N = R.N Rate = R.Rate Else ' Invalid object N = 0 Rate = 1 End If
End Sub End Class
-
Save the file as
Rates.vb under the c:\MyFunc folder. -
Compile the code, as you did earlier with the Probability example, by using the Visual Studio 2005 command prompt.
C:\MyFunc>vbc /target:library Rates.vb
-
Load the assembly within SQL Server 2005. You can access this sample code from the sample files as \Ch09\Samples08.sql.
CREATE ASSEMBLY Rates FROM 'C:\MyFunc\rates.dll' GO CREATE FUNCTION IRate(@NumPeriods int, @PercentInterest float) RETURNS TABLE(N int, Rate float) AS EXTERNAL NAME Rates.Interest.AppreciationTbl GO
-
Call the table function:
SELECT * FROM IRate(12, 1)
Note | The difference between the results returned by the GetInterest function and the IRate function is due to different levels of precision in the variable used to calculate the result. GetInterest uses a money datatype, whereas IRate uses a double datatype. |
Working with CLR Stored Procedures
CLR stored procedures can be created in a similar fashion as UDFs. The main difference between the two is that a stored procedure more closely resembles a basic subroutine instead of a function because it cannot return a value and must be called without parentheses. Since a CLR UDF does not need to be deterministic, there is nothing that a CLR stored procedure can do that a CLR UDF cannot also do.
Performing File Operations
You can write a stored procedure to perform simple file operations such as copying, moving, and deleting. This procedure can have several uses, such as managing an exported file created with a utility (e.g., BCP utility).
Note | Stored procedures cannot be used to access opened files, such as the database files themselves . |
Creating a CLR Stored Procedure
-
Write the following Visual Basic code. This code is included in the sample files as
FileUtil.vb . Imports System.IO Imports System.Data.SqlTypes Public Class FileUtil Public Shared Sub CopyFile(ByVal SourceFileName As SqlString, ByVal DestFileName A s SqlString) File.Copy(SourceFileName, DestFileName) End Sub Public Shared Sub DeleteFile(ByVal FileName As SqlString) File.Delete(FileName) End Sub Public Shared Sub MoveFile(ByVal SourceFileName As SqlString, ByVal DestFileName A s SqlString) File.Move(SourceFileName, DestFileName) End Sub End Class
-
Save the code as
FileUtil.vb in the c:\MyFunc folder. -
Compile the code using the Visual Studio 2005 command prompt, as you did earlier with the Probability and Rates examples.
C:\MyFunc>vbc /target:library FileUtil.vb
-
Load the assembly inside SQL Server 2005. You can access this sample code from the sample files as \Ch09\Samples09.sql.
-- Load the assembly CREATE ASSEMBLY FileUtil FROM 'C:\MyFunc\FileUtil.dll' WITH permission_set = external_access GO -- Create a SQL Server function that calls the CLR function CREATE PROCEDURE CopyFile( @SourceFileName nvarchar(256), @DestFileName nvarchar(256) ) AS EXTERNAL NAME FileUtil.FileUtil.CopyFile GO -- Create a SQL Server function that calls the CLR function CREATE PROCEDURE MoveFile( @SourceFileName nvarchar(256), @DestFileName nvarchar(256) ) AS EXTERNAL NAME FileUtil.FileUtil.MoveFile GO -- Create a SQL Server function that calls the CLR function CREATE PROCEDURE DeleteFile( @FileName nvarchar(256) ) AS EXTERNAL NAME FileUtil.FileUtil.DeleteFile GO
Note Notice that with permission_set = external_access is being used in the above code. This is necessary because the CLR assembly will access the external file system, which is beyond the control of SQL Server 2005. Even then, the account under which the methods will be called must contain enough privileges to access the files, meaning both ACL permission and CLR Code Access Security permission. These permissions may be necessary for both UDFs and stored procedures.
-
Call the procedure, as in the following example. (Make sure you have a file named Export.txt in the c:\Data folder and that you have a c:\Backup folder.)
EXEC CopyFile 'c:\Data\Export.txt', 'c:\Backup\Export.txt'
| ||
| ||
|