Inside Microsoft SQL Server 7.0 (Mps)

The SQL Server engine is designed to support a variety of demanding applications, such as online transaction processing (OLTP) and decision-support applications. At the core of its decision-support capabilities is Transact -SQL, Microsoft's version of Structured Query Language. Beneath this query language are the components that support transaction processing and recoverability.

Transact-SQL

Industrywide, SQL is a well-known and widely used data access language. Every mainstream database management system (DBMS) product implements SQL in some way. Transact-SQL (often referred to as T-SQL ) is a powerful and unique superset of the SQL standard.

The SQL SELECT statement provides tremendous power and flexibility for retrieving information. Data from multiple tables can be easily projected and the results returned in tabular format with information chosen and correctly combined from the multiple tables. Check out the following two tables from the pubs sample database. (The pubs database, used for many examples in this book, is installed when SQL Server is installed. For brevity, an abbreviated amount of the data will sometimes be used, as in this example.)

publishers Table pub_id pub_name city state 0736 New Moon Books Boston MA 0877 Binnet & Hardley Washington DC 1389 Algodata Infosystems Berkeley CA titles Table title_id title pub_id BU1032 The Busy Executive's Database Guide 1389 BU1111 Cooking with Computers: Surreptitious 1389 Balance Sheets BU2075 You Can Combat Computer Stress! 0736 BU7832 Straight Talk About Computers 1389 MC2222 Silicon Valley Gastronomic Treats 0877 MC3021 The Gourmet Microwave 0877 MC3026 The Psychology of Computer Cooking 0877

The following simple SELECT statement logically joins the titles and publishers tables to project the names of the book titles with the names of the companies publishing each title:

SELECT title, pub_name, city, state FROM titles JOIN publishers ON titles.pub_id = publishers.pub_id

The following result appears.

title pub_name city state ----- -------- ---- ----- The Busy Executive's Database Algodata Infosystems Berkeley CA Guide Cooking with Computers: Algodata Infosystems Berkeley CA Surreptitious Balance Sheets You Can Combat Computer Stress! New Moon Books Boston MA Straight Talk About Computers Algodata Infosystems Berkeley CA Silicon Valley Gastronomic Binnet & Hardley Washington DC Treats The Gourmet Microwave Binnet & Hardley Washington DC The Psychology of Computer Binnet & Hardley Washington DC Cooking

This query, a simple SQL statement, shows that standard SQL provides a powerful way to query and manipulate data. (In Chapters 7 and 10, we'll explore SQL queries in much greater depth.)

The National Institute of Standards and Technology (NIST) has certified SQL Server as being compliant with the American National Standards Institute (ANSI) SQL-92 standard. However, considerably more power is available in Transact-SQL because of its unique extensions to the standard.

Standards and Testing

Although the ANSI standard is commonly referred to as SQL-92 , the official standard is ANSI X3.135-1992 and is entitled American National Standards Institute Database Language-SQL . X3H2 is the designator for the ANSI SQL committee. NIST, a division of the United States Department of Commerce, conducts a suite of tests (which vendors pay the costs of running) to certify compliance with the standard. You can find a summary of products currently certified as compliant at ftp://speckle.ncsl.nist.gov/sql-testing/VPLs.

Transact-SQL Extensions

Transact-SQL provides a number of capabilities that extend beyond typical implementations of SQL. Queries that are difficult to write in standard SQL can be easily and efficiently written using these capabilities. Some favorites include the ability to embed additional SELECT statements in the SELECT list and the ability to drill into a result set by further selecting data directly from a SELECT statement, a feature known as a derived table . Transact-SQL provides many system functions for dealing with strings (for finding substrings and so on), for converting datatypes, and for manipulating and formatting date information.

Transact-SQL also provides mathematical operations such as square root. In addition, special operators, such as CUBE and ROLLUP, allow multidimensional analysis to be efficiently projected at the database server, where the analysis can be optimized as part of the execution plan of a query. The CASE expression allows for complex conditional substitutions to be made easily in the SELECT statement. Multidimensional (sometimes referred to as OLAP, or online analytical processing) operators, such as CUBE, and conditional expressions, such as CASE, are especially useful in implementing data warehousing solutions with SQL Server.

The Query Optimizer

In Transact-SQL, a cost-based query optimizer determines the likely best way to access data. This automatic optimization allows you to concentrate on defining your query criteria rather than defining how the query should be executed. For example, this nonprocedural approach eliminates the need for you to know which indexes exist and which, if any, should be used. Would it be more efficient to incur additional I/Os to read index pages in addition to data pages, or would it be better just to scan the data and then sort it? The optimizer automatically, invisibly , and efficiently resolves these types of important questions for you.

The SQL Server optimizer maintains statistics about the volume and dispersion of data, which it then uses to estimate the plan most likely to work best for the operation requested . Because a cost-based optimizer is by definition probability-based, an application might want to override the optimizer in some specialized cases. In your application, you can specify optimizer hints that will direct the execution plan chosen. In addition, you can use one of SQL Server's SHOWPLAN options, which explains the execution plan chosen, provides insight into why it was chosen and even allows for tuning of the application and database design.

The Programmable Server

Transact-SQL provides programming constructs ”such as variables , conditional operations (IF-THEN-ELSE), and looping ”that can dramatically simplify application development by allowing you to use a simple SQL script rather than a third-generation programming language (3GL). These branching and looping constructs can dramatically improve performance in a client/server environment by eliminating the need for network conversations. Minimizing network latency is an important aspect of maximizing client/server application performance. For example, instead of returning a value to the calling application, which requires that the application evaluate and subsequently issue another request, you can build conditional logic directly into the SQL batch file so that the routine is completely evaluated and executed at the server.

You can use Transact-SQL to write complex batches of SQL statements. (A batch of SQL statements in a complex application can potentially be hundreds, or even thousands, of lines long.) An important capability of SQL Server is the SQL Debugging Interface, which allows debuggers such as those available with Microsoft Visual Studio to fully debug Transact-SQL routines, including stepping through the statements, setting breakpoints, and setting watchpoints on Transact-SQL variables.

Stored Procedures

Simply put, stored procedures are collections of SQL statements stored within a SQL Server database. You can code complex queries and transactions into stored procedures and then invoke them directly from the front-end application. Whenever a dynamic SQL command is sent to a database server for processing, the server must parse the command, check its syntax for sense, determine whether the requester has the permissions necessary to execute the command, and formulate an optimal execution plan to process the request. Stored procedures execute faster than batches of dynamic SQL statements, sometimes dramatically faster, because they eliminate the need for reparsing and reoptimizing the requests each time they're executed. SQL Server supports stored procedures that let developers store groups of compiled SQL statements on the server for later recall, to limit the overhead when the procedures are subsequently executed.

Stored procedures differ from ordinary SQL statements and from batches of SQL statements in that they're checked for syntax and compiled only the first time they're executed. SQL Server stores this compiled version in its memory cache and then uses the cached, compiled version to process subsequent calls, resulting in faster execution times. Stored procedures can also accept parameters, so a single procedure can be used by multiple applications using different input data.

Even if stored procedures provided no performance advantage (which, of course, they do), there would still be a compelling reason to use them: they provide an important layer of insulation from changes in business practices. Suppose, for example, that an application is used to maintain a mailing list for a retailer's catalog distribution. Subsequent to the application being deployed, a change in criteria and logic (that is, the business rules) occurs, thus affecting which customers should automatically receive new catalogs. If the business rules had been programmed directly into the company's applications, every application would need to be modified, likely an expensive and time-consuming operation. Furthermore, if multiple developers worked on the applications, the rules might not have been programmed with the exact same semantics by every programmer.

A stored procedure, on the other hand, can be modified once, in seconds, at the server. The applications don't need to be changed or even recompiled. The next time each application executes the stored procedure, the new rules would be in place automatically. In addition to providing a performance advantage, stored procedures can provide an important security function. By granting users access to a stored procedure but not to the underlying tables, you can allow them to access or manipulate data only in the way prescribed by the stored procedure.

Extended Stored Procedures

A unique capability of SQL Server, extended stored procedures allow developers to extend the programming capabilities provided by Transact-SQL and to access resources outside of SQL Server. Messaging integration, security integration, the ability to write Hypertext Markup Language (HTML) files (files formatted for use on the Internet), and much of the power of SQL Server Enterprise Manager (discussed later in this chapter) are all implemented using extended stored procedures. You can create extended stored procedures as external dynamic link libraries (DLLs). (DLLs are typically written in C and C++, although implementation in other languages is also possible.)

For example, you could write a DLL to establish a modem connection, dial the ACME Credit Service, and return a status indicating credit approval or rejection . (The C language more readily lends itself to particular tasks because of such language constructs as arrays, structures, and pointers.) For example, writing a financial function that uses recursion in C (for example, the internal rate of return, or IRR) might be more efficient than writing it as a Transact-SQL stored procedure. Microsoft Open Data Services (ODS) is an API that lets you build extended stored procedures that can return self-describing result sets to the calling client applications, just as a normal procedure would.

Extended stored procedures allow even Microsoft to extend SQL Server. Good engineering practices dictate that where code doesn't benefit from being shared or isn't in common, it should be segregated and isolated. With this principle in mind, Microsoft added integration with messaging via MAPI as a set of extended stored procedures ( xp_sendmail , xp_readmail , and so on) instead of directly modifying the SQL Server engine. Extended stored procedures allow powerful features to be added without any chance of disrupting the core server engine. More features can then be added quickly, with less risk of destabilizing the server. And because the code is loaded dynamically, the DLL is loaded only if a routine is implemented as an extended stored procedure, so the memory footprint of SQL Server doesn't grow for services that aren't being used.

Категории