Microsoft SQL Server 2000 High Availability
Whether you are just upgrading from another version of SQL Server, consolidating your database environment, or migrating to SQL Server from another platform, you need to be concerned with some basics before you even begin the process. This section focuses primarily on the technical aspects, but it is crucial to understand the business reasons or drivers for undertaking such a project to ensure that the technical effort meets the business need, and vice versa. Follow the guidelines in Chapter 1 and Chapter 2 for setting up a steering committee and so on. Any planning fits squarely within the Microsoft Solutions Framework (MSF) and Microsoft Operations Framework (MOF) models.
Your business sponsors should be able to answer the following questions. The information will greatly help you in your technical planning.
- What business value do you believe will be gained from your upgrade, consolidation, or migration effort?
- If you are consolidating, what is the desired consolidation ratio: 2:1, 5:1, 10:1, or something else? Is it realistic?
- What are the drivers for your effort? Why are you upgrading, consolidating, or migrating?
- Have the IT and business stakeholders bought into and signed off on the effort? Is there adequate funding?
- Is chargeback for resource usage a need or a consideration? Will the tools currently in place to measure chargeback still work?
- How many users are expected to be supported concurrently by this solution? In the short term? In the long term?
- What is the life expectancy of this solution (that is, how long is this solution, along with its systems, supposed to stay in production)?
- How much will the data grow over time? What is projected versus actual growth (if known)?
- What is acceptable performance from end-user, administrative, and management perspectives? Keep in mind that performance is defined differently by each audience; it could mean throughput, response time, and so on.
- What is the current availability goal for each individual system? Each solution? Will the availability goal for the component or solution meet the new requirements?
- How is maintenance going to be performed on this system? Will maintenance change due to consolidation?
- How are backups going to be performed on the system and the data? How will this work affect performance?
- What are the security requirements for both the application and the systems for the solution? Are they in line with corporate policies?
- Are there security conflicts between systems and objects that might be consolidated together?
- What is the cost of developing, implementing, and supporting the solution? In the short term? In the long term?
- What is the actual cost of downtime for each individual system or database installation? For the entire solution? What is the current cost of administering and running the server?
- What are the dependencies of the components in the solution? For example, are there external data feeds into SQL Server that might fail as the result of an availability problem or a move of a system?
- What technologies are currently used in the application, solution, or system, and what is desired? How will a change affect each component of your solution individually and as a whole?
- What is the budget allocation for hardware? What is the available budget for the entire effort?
- If this is an existing SQL Server, is the system already at or exceeding capacity, whether it is processor, memory, or disk?
- What service level agreements (SLAs) are in place, and how will they be affected?
- Where are the servers located? Are they currently in the same place? If not, how will changing location or geographic placement affect the entire group of users, performance, and the SLA?
- What roles exist within the IT division? Do you have a person who is dedicated solely to capacity management?
- Do you have effective change control processes in place with the existing systems, or do you need new ones?
- Do you have proper development, test, and staging environments?
- Do you have a testing or quality assurance team? Are there dedicated DBAs? If so, how many?
- What is the current ratio of SQL Servers and databases to DBAs?
- Do you plan on eliminating or downsizing personnel as a result of this effort?
- What are the different types of systems considered for upgrading, consolidating, or migrating: production, test, or development?
- Will you employ new technologies, such as clustering for high availability? How will that affect your employees from a skills, cost, and training perspective?
- Do you have specific SLAs to maintain with each application even after consolidating them onto a single system in a server consolidation effort?
- Are there multiple systems with conflicting SLAs currently on, or being planned for, the same machine?
- Are certain systems required to be separated from others due to things like performance or SLAs?
- What is the desired end date for the project? Are there any constraints that might come into play, such as those posed by hardware or service vendors?
- Do you know if the systems are mission critical? If you are looking to combine certain target servers, how will that affect the other systems?
- How much business, in terms of both actual business transactions (not database transactions) and revenue, does each system generate?
- List any reference projects done to date that have influenced your interest.
- State any particular issues or information that must be taken into account, such as the following:
  - Are there any corporate policies that might affect the effort?
  - Some things, such as sole-source providers, are illegal for some government agencies, so that might restrict consolidation contracts in some way.
  - Corporate security policies might inhibit or change the nature of an effort, such as server consolidation, by requiring physical isolation of systems and data from different departments, as in the case of Limited Liability Partnerships.
-
Phase 1: Envisioning
While the business side of the house is figuring out the nontechnical drivers for your effort, the administrators (DBAs and non-DBAs alike) must start to gather information for the plan on the technical side. This involves a great deal of information that you should have been gathering all along to establish baselines and make judgments. Ask the following sample questions and document the answers, as each one will influence your planning. Add questions if they are relevant to your environment.
- What technical advantages do you believe you will gain from this effort?
- How did the original deployments occur? What can be learned (good and bad) from those deployments?
- How many servers are going to be part of the effort? Have the servers even been identified? Who is your vendor, and what class will the target servers be?
- How many databases are on the targeted servers?
- Are the servers to be consolidated currently separated geographically?
- What are the usage patterns of the targeted servers, including use of the system databases? Do you have a performance baseline for each server? Is this baseline broken down to business units and their respective systems?
- Do you know the resource utilization of each server? Can you accurately assess whether the targeted servers are overutilized or underutilized?
- Are these servers already employing some form of high availability, such as failover clustering or log shipping?
- Is replication configured for any of the databases?
- How do you manage database schemas, including stored procedures and functions? Do you keep them in some form of version control?
- Are security policies and procedures documented for each server (physical hardware, operating system, SQL Server, and the applications)? Are security policies in line with your current corporate policies?
- Do you know all the information about each server (such as service pack levels, disk space usage, SQL Server configuration settings, and so on)?
- Do you know the types of workload on each system (that is, OLTP versus OLAP/decision support)?
- Do you use administrative tools such as alerts and SQL Mail? Do you have standardization across all servers?
- Do you have any custom extended stored procedures that might affect other databases?
- Do you have conflicting objects with the same name?
- Do you have access to all source code for the applications accessing SQL Server as well as stored procedures (especially if you use stored procedure encryption)?
- Do you know how many users, on average, connect to SQL Server? What types of clients connect to SQL Server (such as handheld devices, fat clients, and so on)? How do they connect?
- Do you know the requirements of the third-party applications that might use SQL Server and be consolidated?
- What current backup technologies are used?
- What current high availability technologies are used?
- Have the DBAs in your organization been formally trained on SQL Server in all aspects, including administration, high availability, and performance?
- Are the current servers in different domains?
- What versions of SQL Server are these servers running?
- What versions of the operating system are used on these servers?
- How much will the data grow over time? What is projected versus actual growth (if known)?
- What technologies are currently used in the solution?
Profile Target Systems
The most important component of any upgrade, consolidation, or migration is having adequate technical information about each system. Without this, you cannot proceed successfully with any of your efforts; if you do not know whether workloads can be combined, it is impossible to make sound decisions.
Each system that is being considered must be profiled, and not for just one day or, sometimes, even just one week. The profile is the result of monitoring over time to gather trends. These trends might be daily or weekly jobs, as well as daily, weekly, and even monthly trends in workload (for example, end-of-month sales reports). Profiling includes much more than just the system name: every important performance aspect of the server must be captured. When figuring out if workloads and systems can be combined, you need to determine if incompatibilities will result from, say, a configuration setting or an application requirement.
Besides capturing general items such as operating system version, service pack levels, IP addresses, and disk configurations through the worksheets, the specific settings for each application also need to be captured. SQL Server provides the following methods to assist in the documentation of your systems:
- Run the system stored procedure sp_configure, which captures all of the configuration settings for SQL Server (see the sketch after this list).
- Run the system stored procedure sp_helpstartup on Microsoft SQL Server 6.5 or sp_procoption on Microsoft SQL Server 7.0 and SQL Server 2000. These help determine the startup procedures and settings used by SQL Server.
- Use the command-line Sqldiag.exe utility to capture SQL Server system information. This utility is usually used for diagnostic purposes, but this is one of its side benefits. This is the syntax to use:

  SQLDIAG.EXE -X -U user_name -P password -I instance_name -O output_file

  -X skips the error logs; you can also use -C to get cluster information; -I is needed only if it is not a default instance.
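For example, here is a minimal sketch of capturing the full configuration. By default, sp_configure lists only the standard options, so enable show advanced options first; the server and output file names in the osql line are placeholders, and the example assumes Windows Authentication:

  -- Show all settings, including advanced ones.
  EXEC sp_configure 'show advanced options', 1
  RECONFIGURE WITH OVERRIDE
  EXEC sp_configure

  REM Capture the output to a file from a command prompt with osql:
  osql -E -S myserver\myinstance -Q "EXEC sp_configure" -o config_myserver.txt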
You should also document all of the objects associated with each database. This includes stored procedures, jobs, and logins at the system level as well as their corresponding database users, maintenance plans, and so on, even if they reside outside of the database. You will use this information later to determine conflicts, such as duplicate stored procedure names and logins or differences in naming standards. If you want, run the same DBCC commands on each system. Document anything and everything, even if you think it might not be relevant. Include linked servers, modified system databases, and so on. A minimal inventory sketch follows.
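A simple starting point, run in each user database, is to inventory objects from sysobjects and logins from syslogins. This is only a sketch and does not replace fully scripting out each object:

  -- Inventory user tables, views, procedures, triggers, and functions.
  SELECT name, xtype, crdate
  FROM   sysobjects
  WHERE  xtype IN ('U', 'V', 'P', 'TR', 'FN', 'IF', 'TF')
  ORDER  BY xtype, name

  -- Inventory logins at the server level, with default database and language.
  SELECT name, dbname, language
  FROM   master..syslogins
  ORDER  BY name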
More Info: For more information on gathering information about your systems, consult Chapter 4, Disk Techniques for High Availability, Chapter 12, and Chapter 15, Monitoring for High Availability. The run book described in Chapter 12 is a great resource for information. Chapter 4 and Chapter 15 cover monitoring and capacity management from a system performance perspective.
Gathering System Performance Metrics
Gathering performance metrics from your server is more than a one-time event. To truly understand what is going on with your system, performance metrics must be gathered over a period of time and then analyzed. At a minimum, you should gather statistics about your systems for one business week. During that time, record numbers when there is little to no load, medium load, and heavy load. Gathering numbers only at times that show one extreme or the other does not paint an accurate picture of system use.
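Besides System Monitor counters, SQL Server exposes its own counters through the sysperfinfo system table, which you can snapshot on a schedule. Here is a minimal sketch, assuming a holding table named PerfBaseline (the table name is illustrative):

  -- First run: create the baseline table from a snapshot of the counters.
  SELECT GETDATE() AS sample_time, object_name, counter_name,
         instance_name, cntr_value
  INTO   PerfBaseline
  FROM   master.dbo.sysperfinfo

  -- Subsequent runs (for example, a SQL Server Agent job every 15 minutes):
  INSERT INTO PerfBaseline
  SELECT GETDATE(), object_name, counter_name, instance_name, cntr_value
  FROM   master.dbo.sysperfinfo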
Other SQL Server Information to Gather
This section describes the most common data to collate, which should help you plan for the issues that are most commonly encountered when consolidating. Refer to the section on technical considerations under the planning section of this chapter for more information and details on most of these points.
Consolidating System Databases
Be sure to take care of the following:
- Identify any objects in system tables that are not part of a standard installation (see the sketch after this list).
- Identify any objects in system tables that have been modified since a base installation.
- Look for duplicate names.
- Review each object for relevance in the new environment.
- Review each object to determine if duplication exists at a more granular level.
- Review each object for explicit references, such as path names, server names, and task names.
- Do not address tasks found in msdb; these are addressed in the next section.
- Search only for nondefault objects found in tempdb. Use of tempdb is dealt with later.
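One rough way to spot nonstandard additions is by creation date, because objects from the original installation share the build date. Here is a sketch against msdb; the cutoff date is an assumption that you must replace with your own installation's build date:

  -- Objects created after the installation date are candidates for
  -- nonstandard additions to msdb.
  SELECT name, xtype, crdate
  FROM   msdb..sysobjects
  WHERE  crdate > '20001201'
  ORDER  BY crdate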
Collation and Sort Order
Check out the following:
- What is the sort order of the source server? (See the sketch after this list.)
- Identify any objects that deviate from the collation setting of the server.
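On SQL Server 2000, a few queries gather this information; a minimal sketch:

  -- Server-wide collation and sort order details.
  SELECT SERVERPROPERTY('Collation') AS server_collation
  EXEC sp_helpsort

  -- Collation of every database on the server.
  SELECT name, DATABASEPROPERTYEX(name, 'Collation') AS database_collation
  FROM   master..sysdatabases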
Security (Logins, Security, and Permissions)
Be sure you have information about the following items:
- Collect and identify duplicate logins.
- Determine whether trusts are required across domains.
- Determine if guest is an active user.
- Collect data on logins with administrative rights (see the sketch after this list).
- Collect security settings and the permission structure for the public group.
- Determine if you will need specific registry permissions.
- Determine what permissions are required for extended stored procedures, particularly xp_cmdshell.
- What accounts are the SQL Server services running under?
- Understand the security model that the client is implementing.
- Compare security models of source servers and target servers.
- Collect database options to ensure that read-only, single-user, and dbo-only settings are maintained during migration.
- Collect the encrypted passwords to transfer to the new server.
- Compile a list of all Windows Authentication and SQL Server logins that have administrative access.
- Pay special attention to the scheduled jobs on the system. Not only do these jobs have to be migrated, but they also identify specific workloads that kick off at specific times. They also tell you which user accounts are commonly performing the maintenance for each instance.
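A minimal sketch of collecting administrative logins on each source server:

  -- Members of the sysadmin fixed server role.
  EXEC sp_helpsrvrolemember 'sysadmin'

  -- The same information from the syslogins view.
  SELECT name, isntname
  FROM   master..syslogins
  WHERE  sysadmin = 1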
Serverwide Configuration Settings
Here are some factors to check out:
- Determine if any SQL Server configurations, such as affinity mask, Address Windowing Extensions (AWE), and so on, have been modified since installation; that is, do they deviate from a standard installation?
- Monitor context switches after migration, mainly to decide if fiber mode is necessary.
- Determine if there are requirements for XML stored procedures and OLE Automation objects.
- Establish which statistics are inaccurate when running in fiber mode.
- Monitor the worker threads to see if they are exceeding the maximum value allocated.
- Collect error messages that have been added to sysmessages and sysservermessages (see the sketch after this list).
- Establish if any system-supplied error messages have been modified.
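User-defined messages start at error number 50001, which gives a quick way to find additions; a minimal sketch:

  -- Custom error messages added to the server.
  SELECT error, severity, description
  FROM   master..sysmessages
  WHERE  error >= 50001
  ORDER  BY error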
Determining the Existing Environment (Workstations, Servers, Applications, Users, and so on)
Here is some more information you must gather:
- Verify that clients with older versions of Microsoft Data Access Components (MDAC), such as MDAC 2.5, can connect to a default or named instance of SQL Server 2000, depending on which you are implementing. Default instances should be fine, but named instance support was introduced with MDAC 2.6.
- Collect the host names that connect to named instances of SQL Server.
- Identify applications where the connection string is hard coded. This also includes connection objects, such as COM objects and data source names (DSNs).
- Identify any connections that use a network library other than TCP/IP. Exclude internal connections that use named pipes.
- Identify all applications that are known to have hard-coded parameters.
- Identify all applications for which vendors are no longer accessible.
- Collect information that describes the type of workload the application generates.
- Determine the domain that the central server will reside in.
- Identify the domains of the source servers.
- Identify the domains of the connection sources.
- Collect the trust relationships between the central server and connection sources.
Health Checks
Take care of these, too:
- Execute DBCC statements and scan logs for errors.
- Review the SQL Server error logs and the Windows event logs.
- Determine if any procedures must be added to the cache before opening the environment up to your user community.
Multiple Versions
Take care of this:
- Collect all versions of servers and binaries, and the compatibility levels of the databases.
Excessive Use of Tempdb
Take the following actions:
- Determine the maximum space used by tempdb on the source servers (see the sketch after this list).
- Establish the amount of memory used for procedures.
- Decide if each SQL Server instance should have its own physical place for tempdb or if they should all share a single space.
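A minimal sketch of checking the current size of tempdb's files; the size column in sysfiles is stored in 8-KB pages:

  SELECT name, filename, size * 8 / 1024 AS size_mb
  FROM   tempdb..sysfiles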
Database Options
Answer these questions, too:
- How much space has been assigned to devices on source servers?
- How much space is actually used?
- What is the maximum space used for the transaction logs?
- What has the growth rate been like over the last week, month, quarter, six months, and year?
- Is the current backup strategy at the source servers in line with the strategy to be used at the target server?
- Are there requirements for connection settings to be other than the default ANSI settings and connection-related properties?
- Collect the current database options (see the sketch after this list).
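A minimal sketch of collecting space usage and option settings; the database name Sales is a placeholder:

  -- Size, owner, creation date, and status (including options) per database.
  EXEC sp_helpdb

  -- Options currently set for one database.
  EXEC sp_dboption 'Sales'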
Phase 2: Technical Considerations for Planning
Once you have gathered the necessary information, you can perform the analysis needed to put the proper plans in place.
Single Instance Versus Multiple Instances
Before you can consider any other technical aspect, you must decide if you will be using one instance with many databases or multiple instances, each with its own set of databases. Instances were first introduced in SQL Server 2000. One instance of SQL Server 2000 equates to one installation of SQL Server on your machine. Prior to SQL Server 2000, you could have only one installation of SQL Server per physical server. There are two types of instances: default and named. A default instance is analogous to the functionality of previous versions of SQL Server. A named instance of SQL Server is exactly what it sounds like: you give your SQL Server installation a unique name. SQL Server 2000 supports up to 16 instances on one physical server or in a single Windows server cluster, regardless of the number of physical nodes comprising the cluster. In other words, you can have 16 named instances, or one default and 15 named instances. Consult SQL Server Books Online for any changes in the number of instances supported.
Note: You can have only one default instance, so older applications that cannot handle named instances might have problems. Test carefully if you are implementing named instances.
Each instance gets its own set of core binaries, but the instances share underlying components such as MDAC, Microsoft Distributed Transaction Coordinator (MS DTC), and one Microsoft Search service. If there is more than one instance of SQL Server, each has the ability to stay at a certain service pack level, with the exception of any shared components. This is a benefit for a consolidation effort, because you can have instances at the specific levels at which you need them; however, if, for example, you need to install a SQL Server 2000 service pack and it for some reason requires a physical reboot, you will affect all other services running on that server. In general, once you have upgraded one instance to that service pack level, you will not have to reboot to upgrade the other instances because the shared files have already been replaced. Regardless, your availability and SLAs must be taken into account when designing consolidated SQL Servers.
From a performance perspective, each instance will get its own memory, including cache for stored procedures. Is it better for your endeavor to put hundreds of databases under one instance or to spread them out over more than one? The Memory and Processor section later in this chapter delves into this topic in more depth.
In most cases, a single instance means less administrative overhead: with only one instance to manage, the complexity of what goes on in SQL Server can be easily managed and localized. It also ensures that automatic settings, such as using dynamic memory, are easier to consider if there are no other processes to contend with. If you have limited hardware, this might be your best option.
Multiple instances allow for each instance to cater to a different set of rules, SLAs, and so on. The biggest advantage is that you can stop the services for each instance individually. Multiple instances also mean more complexity when it comes to system design and management on all levels, but the rewards of dedicated memory and separate processes (with the exception of shared components) might be worth the trade-off.
When thinking about single versus multiple instances, remember that the key considerations, as stated in the introductory paragraph of this section, are I/O, processor, and memory.
Application Compatibility with Instances
Because the architecture of SQL Server 2000 is different with instances, in particular named instances, you should test older applications to ensure that they still work properly. Named instances technically require MDAC 2.6 to connect properly; a default instance should be fine. If an application cannot connect to a named instance, there are a few potential ways to solve the problem:
- Install MDAC 2.6 or higher on the machine hosting the application, or on each client machine if the application is run directly by the client. This is the version of MDAC that ships with SQL Server 2000, and it contains support for named instances. Installing MDAC 2.6 does not guarantee that the application will work with named instances, however. Testing on one or a small number of machines is recommended before rolling this out to large numbers of users.
- If it is not possible to update the client utilities to the SQL Server 2000 versions, do not use the name; instead, connect directly to the IP address and port number, which you configure in the Client Network Utility (see the sketch after this list).
- Update the client tools to SQL Server 2000. Then use the SQL Server 2000 version of the Client Network Utility to create an alias to the named instance.
- Update application code to support named instances. If the application already hard-codes server names, database names, and paths, there are more significant issues than named instances to address.
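For example, an OLE DB connection string that targets an instance by IP address and port rather than by name might look like the following sketch; the address, port, and database name are placeholders:

  Provider=SQLOLEDB;Data Source=192.168.1.50,1433;Initial Catalog=Sales;Integrated Security=SSPI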
Disk Subsystem
The first stop for any SQL Server system should be the architecture of its disk subsystem. If you are looking to combine workloads or to upgrade, it is imperative to know not only how much disk space you will need but, if you are consolidating, what workloads will work well together from an I/O perspective. That is a serious challenge faced by customers both big and small. Read-only reporting databases have different I/O requirements than heavily used OLTP systems with many writes and a smaller number of reads. Do not take this configuration lightly; you might initially save cost by consolidating, but if you have underperforming applications, the consolidated environment will be a failure.
More Info: Disks were covered in great depth in Chapter 4, Disk Techniques for High Availability. Refer to that chapter for any further discussions relating to disk issues.
Memory and Processor
There are a few important considerations when determining how many processors the new servers will contain. As with disk and memory, prior benchmarking data and application and system profiles need to exist to help you make a more accurate decision. Here are some of the main issues:
- Based on your memory or processor needs, you need to choose the appropriate operating system. Do not assume you will always be able to utilize the same hardware; budget appropriately.

More Info: See the topic Memory Used by SQL Server Objects Specifications in SQL Server Books Online for more information on how much memory certain types of objects consume.
- When it comes to memory and SQL Server usage, the biggest consideration is your procedure cache, which also dictates whether you can use one instance or multiple instances for your consolidation. With 32 bits, you can access at most 2.6 GB of procedure cache (this is also dependent on what memory options you use). If you are consolidating multiple databases, each with many stored procedures, you might run out of procedure cache. Each stored procedure has a plan in cache so that it runs faster. If you have many databases, and each has the same stored procedures, they are still different to SQL Server because the statistics for each database are potentially different, and therefore the plans are different as well. On top of that, if your server contains more than one processor, you can have two plans for every stored procedure, because the SQL Server engine decides at run time how many processors will be used to run a given stored procedure at that moment. Thus you could potentially be trying to cache the number of stored procedures in each database, times the number of databases, times two. This will definitely affect performance and can push you into multiple instances. You might not know this requirement up front, and you might discover it in your testing phase.
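To gauge how the procedure cache is being used on a candidate server, you can query the syscacheobjects system table; a minimal sketch:

  -- Summary of cached plans by object and cache type.
  SELECT cacheobjtype, objtype, COUNT(*) AS cached_plans,
         SUM(usecounts) AS total_uses
  FROM   master.dbo.syscacheobjects
  GROUP  BY cacheobjtype, objtype
  ORDER  BY cached_plans DESC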
- If you need more than 2 GB of memory for your SQL Server 2000 instance, you will need to use a form of advanced memory, which must be configured properly for SQL Server to recognize it.

More Info: See the section Memory Management for SQL Server 2000 in Chapter 14, Administrative Tasks to Increase Availability, for a full description of how SQL Server uses memory, recommendations on tuning memory, and how to configure memory for use with SQL Server 2000.
- Processor and connection affinity have two effects. Processor affinity limits the SQL Server instance to using only certain processors instead of all the processors on the system (which is the default). Connection affinity ties the user scheduler to each processor.

To restrict processor usage for a particular instance, use the affinity mask option of SQL Server. The ability to limit which processors are used would be useful if you had an eight-processor system and wished to run two instances such that neither ever affected the processor workload of the other. You would simply configure the first instance to use processors 0, 1, 2, and 3 and the second instance to use processors 4, 5, 6, and 7. This sounds great, but in practice it is often less than optimal. For example, when instance 1 is very busy, it cannot use any of the processor time from instance 2, even when instance 2 is idle. A better approach would be to configure instance 1 with processors 0, 1, 2, 3, 4, and 5 and instance 2 with processors 2, 3, 4, 5, 6, and 7. Thus each instance has two dedicated processors that it can use and four processors that are shared.
Tip: Windows and other applications can still use all of the processors. This setting affects only SQL Server instances unless you restrict the operating system or other applications as well. For more information, see the topic Managing SQL Server Resources with Other Tools in Chapter 14.
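A minimal sketch of setting the overlapping masks from the example above; the decimal values are the bitmasks for the CPU sets named, and a restart of the instance may be required for the change to take effect:

  EXEC sp_configure 'show advanced options', 1
  RECONFIGURE WITH OVERRIDE

  -- Instance 1: CPUs 0 through 5 (binary 00111111 = decimal 63).
  EXEC sp_configure 'affinity mask', 63
  RECONFIGURE WITH OVERRIDE

  -- On instance 2, set CPUs 2 through 7 instead
  -- (binary 11111100 = decimal 252).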
Tying user connections, or connection affinity, is not as straightforward. This allows an instance to always direct a set of client connections to a specific SQL Server Scheduler, which in turn manages the dispatching of threads on the available processors. There is a limited set of conditions under which connection affinity provides a noticeable performance benefit. One of its advantages is that it can increase the number of hits against a Level 2 hardware cache, but it also prevents a task from going to another processor if the processor it is tied to is busy. The disadvantages of using connection affinity are similar to those discussed for processor affinity.
A good example of a disadvantage specific to connection affinity would be a Windows-level memory check that comes up and is assigned to a particular processor. Your Transact-SQL query can no longer jump to another processor for completion and must wait until this check completes. Processors multitask, but they also have a priority component; on occasion, Windows priorities can and need to outrank SQL Server thread priorities.
If misused, connection affinity can prevent you from fully utilizing all available resources. Keep this in mind when evaluating the use of this feature.
More Info: For more information on the affinity mask option, see the topic Affinity Mask Option in SQL Server Books Online.
- If you are moving an application from a system of one type (for example, one with one or two processors) to one that has six or eight processors to handle the increased load, each query might perform differently because a new execution plan might exist due to the increased number of processors available. If this is the case, you might need to alter the max degree of parallelism configuration setting, and possibly affinity mask as well. The problem with combining multiple workloads is that tuning these two settings affects everything running under that instance.
Important: Do not use any tool other than SQL Server to manage your processor resources, such as Process Control under Windows 2000 Datacenter Edition. These other tools might not produce the desired effect. You should do all processor-related tweaking with the SQL Server parameters. Consult updated versions of SQL Server Books Online for information on other tools as they become available.
Networking
You might now encounter more network traffic flowing to one SQL Server. As with any other planning consideration, do you currently know how many concurrent users are usually connected to the SQL Server at various times, especially under maximum load? From a memory perspective, each connection consumes 12 KB plus three times the network packet size:

Connection memory (KB) = 12 + (3 * network packet size)

For example, with the default network packet size of 4 KB, each connection consumes 12 + (3 * 4) = 24 KB, so 1,000 concurrent connections account for roughly 24 MB of memory.
Therefore, in addition to the memory required for the normal SQL Server usage for things like procedure cache and queries, you need to take into account network connections. This factors into whether you decide to consolidate on one instance or multiple instances.
Once you know how much memory user connections take up for each database, you need to worry about network bandwidth. Can what is coming in and going out be handled by the proposed networking on the server? For example, if you have one 10-megabit network card but your overall required throughput is 20 megabits, do you add another network card and IP address (both of which add a bit of system overhead) or get a higher bandwidth network card and ensure that the network itself can handle the load?
Other networking issues include the following:
- What port is SQL Server using? On installation, SQL Server 2000 grabs a dynamic port, not necessarily 1433. This is partially due to instance support, as multiple instances cannot really share the same port. Generally, the first instance (whether named or default) is assigned to 1433. Use the Server Network Utility to assign a known, static port to each instance (see the sketch after this list).
- Does the SQL Server require domain connectivity (as a failover cluster does) or have processes that use other SQL Servers (such as log shipping)? Are they all in the same domain? Mixed domains can cause problems in consolidation.
- Do you have a mixture of SQL Servers that use domain accounts for the service accounts? Are they all in the same domain? Do you also have SQL Servers set to use the local service account? This might seem easy to solve when you consolidate because you can standardize; however, when it comes to object ownership and the login that was configured to run SQL Server Agent jobs, it is a different story. You must open the properties of each SQL Server Agent job and make sure that the owner is set to the proper SQL Server login.
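To verify which port an instance is actually configured to listen on, one approach is to read it from the registry. This is only a sketch: xp_regread is undocumented, and the path shown assumes a default instance (named instances store their settings under a different key):

  EXEC master..xp_regread
      'HKEY_LOCAL_MACHINE',
      'SOFTWARE\Microsoft\MSSQLServer\MSSQLServer\SuperSocketNetLib\Tcp',
      'TcpPort'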
Security and Logins
Security is an obvious concern. Are there different standards for all of the applications, databases, and servers targeted? If there are, will they be in conflict with one another if they are combined? For example, if something like IPSec is required but you are consolidating to a cluster, where IPSec is not designed for failover, the application will need to be changed, or possibly not consolidated and instead put on its own instance or server.
Logins and how applications and users actually access SQL Server are the biggest areas of concern for security in a consolidation effort. User migration could consume a great deal of planning time. The Networking section earlier in this chapter already mentioned domain connectivity and domain accounts, but here the focus is on the actual logins. Do you have servers that have a mix of both Windows Authentication and mixed mode? Will that introduce risk when workloads are combined?
Also, security has changed a bit if you are only familiar with SQL Server 6.5 and earlier. SQL Server 7.0 introduced the concept of roles, so you might want to rethink how security is done for a particular application. This might become scope creep, so do not consider or implement it if it will impede the consolidation effort.
Here are some issues to consider with logins:
- Plan sufficient time for your login and user migration planning, testing, and implementation.
- You have a login named JoeJ at the SQL Server level who has sysadmin privileges on one database server, is a generic user on another database server, and is dbo on one specific database. JoeJ maps back to one person. The problem in consolidation is that you do not want to accidentally give one user more or fewer rights than he or she needs, so take that into account when combining workloads.
- You have three different users named JoeJ across different database servers and, until now, this was not a problem. It will take good communication to resolve these types of issues (see the sketch after this list for one way to inventory logins).
- Even worse is the same login for the same person with different passwords across different servers that are now being consolidated. You can either assign a new password, choose one of those that exist, or ask the user to select which password to use.
- Do you need the BUILTIN\Administrators login? Do you currently remove it from all installations? If you do not remove it but are considering doing so, what will the impact be? Does a third-party software application require it?
- Because consolidation means that you will more than likely have more users connected to the server, what permissions will you grant to public and guest users?
- Not all logins that exist are in use (for example, old employees, active users versus configured users, and so on). Migrate only logins that are used. Remember that adding and dropping logins impacts the transaction log.
- Do the logins, if Windows Authentication is used, come from different domains? Will you need to set up two-way trusts at the Windows level?
- Does the database or, more specifically, the application using it currently use or require sa access? If so, that should be evaluated.
- Have you locked down Group Policy on your Windows servers since the server was rolled out? How will that impact the new consolidated environment (for example, have you locked down the policies needed for clustering if that is the desired end goal)?
- Did you plan your port assignments for your SQL Server instances, or are you letting SQL Server choose them dynamically? It is best to use static, known ports.
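A minimal sketch for inventorying logins and their server roles on each source server, so the lists can be compared for duplicates and conflicting rights before migration:

  SELECT name, isntname, sysadmin, securityadmin,
         serveradmin, dbcreator
  FROM   master.dbo.syslogins
  ORDER  BY name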
High Availability
Each instance of SQL Server 2000 needs to be made highly available to meet your SLAs. Because this book is all about high availability, putting solutions together, planning, and so on, this section is pretty self-evident.
Replication
If replication is configured on any, or many, of the databases being considered, planning will certainly be impacted. Each database or publisher needs a distributor, and the distributor can live on the publisher or a subscriber, or be hosted on a separate system. Do you plan on having a 1:1 ratio of publishers to distributors? That is probably unrealistic in a consolidated environment, which means that you will need to do some up-front capacity planning to ensure that the distributor can handle multiple publishers. Disk I/O will be impacted the most, followed by processor and memory, depending especially on the type of replication implemented. This also means that the distributor will have to exist in a completely separate instance or, most likely, on a different server to ensure good performance and high availability. You might consider a remote distributor as well. Remote distributors for various publishers can be consolidated under one SQL Server instance, but the distribution databases should be kept distinct per publisher. One instance with a publisher and distributor might be acceptable in a nonconsolidated environment, but it will not be acceptable in a consolidated environment. You also have to consider the base snapshot directory and its location per publisher: will you share one per SQL Server instance or use a separate one for each publisher?
Also consider the subscribers, as they will be impacted, too, if you move publishers and distributors. This is important if you have to disable replication during the implementation process.
Migrating Objects
One of the hardest tasks for any upgrade, consolidation, or migration effort will be migrating objects, such as alerts, SQL Server Agent jobs, and Data Transformation Services (DTS) packages. For alerts, jobs, and operators, you can easily script them; however, before applying them to the consolidated SQL Server, ensure that there are no naming conflicts and that when you migrate to the new server, the proper user is associated with SQL Server Agent jobs. For DTS packages, you have a few options to migrate the packages to other servers, but once you move them, you might need to make modifications to them. Do not assume that they will run as is. Also ensure that the package is compatible. As of SQL Server 7.0 Service Pack 2, Microsoft changed the format from SQL Server 7.0 RTM and Service Pack 1. In that case, you would have to completely re-create the package under SQL Server 2000. Remember to account for time in your testing phase for testing and possibly fixing these types of issues.
Tip: A utility called the Copy Database Wizard is included that can help move databases and their corresponding users, jobs, and other elements. An interesting option with this utility is to select the Run Later option and allow it to create five separate DTS tasks to move these pieces. You can then run all five or any combination of them that you need to transfer your database information.
Administration
Prior to migration, think about the following considerations that impact the eventual administration of each consolidated SQL Server instance:
- Serverwide settings definitely need to be reconciled before bringing the new server online. Each individual SQL Server might have its own, and one setting could impact a database in many ways, so coming up with the final set of global server settings is crucial.
- Similarly, if you use any startup options, those also have to be reconciled.
- Do you have history for configured objects or technologies, such as replication or SQL Server Agent jobs, existing in msdb that is still valid and needs to be migrated?
- How will your maintenance change as a result of adding multiple databases or instances? For example, how will your entire enterprise backup strategy change? Will one backup now conflict with another and affect performance? Is your maintenance window for the server now smaller because of increased load?

Tip: Be especially careful if you plan to back up to disk, and then from disk to tape. This speeds the backup process but triples the workload on the disk spindles (the first operation is the read from the database, the second is the write to disk, and the third is the read from disk to write to tape).

- Watch msdb growth, because with more databases it might increase in size due to more statuses and histories being written (depending on which functionality of SQL Server you are using).
Chargeback
Chargeback is the process of assessing the cost of hardware utilization so business units or customers can be charged appropriately. If chargeback is a consideration, you have a few options. One is to use SQL Server itself; third-party tools such as ARMTech from Aurema can also assist you in your cost accounting for system resource usage.
System Databases
System databases are an important consideration, especially in a consolidation effort. The system databases contain data and structures that are relevant to the entire server installation. Unlike user databases that can exist independently of each other, system databases are closely associated with each other as well as to the user databases from the source server.
How will master, model, and msdb be configured? Because you are now combining multiple servers, each with its own copy of the databases, you cannot just restore multiple copies, as the last restore will overwrite the previous copy. Remember that msdb includes all of your jobs, alerts, and operators (as well as history for some features such as log shipping), so plan on a size of at least 45 MB, and then add the additional amount per instance as well as about 10 percent overage.
Besides these space requirements, you need to analyze master, model, tempdb, and msdb to detect duplicate database objects and document them. This extends from logins, stored procedures, user-defined data types, and tasks through to the specifics of the objects themselves. For instance, an object of the same name might represent or perform actions particular to the source server, or an object of a different name might represent or perform actions that accept modifications at the server level. This becomes particularly difficult when modifications have been made to these databases beyond logins. Remember to include any and all startup stored procedures.
You must identify any elements that are not part of a standard SQL Server build or installation, whether they are modifications or new objects. Review each object for relevance in a consolidated environment. Review each object to determine if duplication exists at a more granular level. Review each object for explicit references, such as path names, server names, and task names. Do not address tasks found in msdb, as those are handled separately and must be dealt with on their own. Search only for nondefault objects found in tempdb. Also consider in advance how you will respond to application needs for previous compatibility levels. In general, these tips can save you much trouble in having to update applications, but remember that being fully on the current version usually minimizes support difficulty and costs.
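A minimal sketch of hunting for duplicate nonstandard objects in the master databases of two source servers, assuming a linked server named SourceServer2 has been defined (the name is a placeholder):

  -- Nonstandard objects with the same name in both servers' master databases.
  SELECT a.name, a.xtype
  FROM   master..sysobjects a
  JOIN   SourceServer2.master.dbo.sysobjects b
         ON a.name = b.name
  WHERE  OBJECTPROPERTY(a.id, 'IsMSShipped') = 0
  ORDER  BY a.name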
Collations and Sort Orders
You must also take into account the collations and sort orders between servers. It is more likely that databases of varying collations will coexist in a consolidated environment than in numerous stand-alone servers. You can encounter unexpected results after migrating to a consolidated environment, and you must realize that certain applications depend on the collation and sort order setting of the database. If they are not set properly, the applications will not work after the switch to the consolidated environment.
Temporary tables are created in tempdb with the collation settings of tempdb if collation syntax is not included in their definition. Passwords on a case-insensitive SQL Server are converted to uppercase before being stored or used; passwords on a case-sensitive SQL Server are not. Because of this difference, passwords originally encrypted on a case-sensitive server and later transferred to a case-insensitive server cannot be used unless all alphabetic characters in the password are uppercase.
Similarly, understand how you need NULL to behave in your queries. If you turn the database option ANSI_NULLS on, all comparisons to a null value evaluate to NULL, which equates to unknown. If this option is off, comparisons of non-Unicode data evaluate as true if both values are null (that is, NULL = NULL is true). This is set to Off by default, so you should see the latter behavior. Also look at any other ANSI-related setting that might affect your environment.
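The following sketch demonstrates the difference at the session level; each SET is placed in its own batch to make sure it takes effect for the statements that follow:

  CREATE TABLE #t (c int NULL)
  INSERT #t VALUES (NULL)
  GO
  SET ANSI_NULLS OFF
  GO
  SELECT COUNT(*) FROM #t WHERE c = NULL   -- Returns 1: NULL = NULL is true.
  GO
  SET ANSI_NULLS ON
  GO
  SELECT COUNT(*) FROM #t WHERE c = NULL   -- Returns 0: use IS NULL instead.
  GO
  DROP TABLE #t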
More Info: For more information about collations, see the topic Effect on Passwords of Changing Sort Orders in the SQL Server 6.5 or 7.0 Books Online and the topic Selecting Collations in SQL Server 2000 Books Online.
Other Technical Considerations
Here are other technical considerations to think about prior to migration:
- Are you currently using linked servers to enable connectivity and queries between servers that might or might not exist any longer? Are you changing domains in a way that might also impact linked servers? If you consolidate, how will it impact the application? Will queries need to be rewritten? How will you migrate your linked server settings?
- As mentioned a few times, certain resources, such as MS DTC and the underlying Microsoft Search service that powers full-text search, are shared among all of the instances and even among other applications. That will be a concern in a consolidated environment as you add more databases and applications that rely on those shared components.
- General application and database upgrade and migration rules apply in a consolidation effort. Remember to take into account running processes like DBCC checks, reviewing logs, updating statistics, and so on, and fix any issues prior to moving the database. Also remember that, depending on the physical layout (that is, where you put the data and log files for your consolidated SQL Server), what you do to one database might affect others.
- Have you modified any of the system tables? If so, those modifications might not work with the consolidated SQL Servers. Microsoft does not recommend that you modify the system tables or add objects to databases like msdb or master.
- Are there collation and sort order conflicts between the current SQL Servers and the proposed consolidated SQL Server environment? This is an easy item to miss if the planners, implementers, and DBAs are not in touch. Resolve any conflicts prior to consolidating.
- Does the database to be consolidated use extended stored procedures? If so, how will that impact other databases and, potentially, other instances?
- Service packs are a major consideration for a consolidated environment. If you have one SQL Server instance at one level and another at a different level, is that acceptable? (With SQL Server 2000, you can have this situation.)
- Remember that if you previously implemented SQL Server 7.0 clustering and upgraded to SQL Server 2000, you cannot go back on the same hardware without reinstalling the operating system, due to the changes in MDAC that SQL Server 2000 installs. Plan for this.
- XML stored procedures are not supported in fiber mode, so use thread mode or a separate instance to avoid error 6604. Thread mode is excellent for systems performing MS DTC tasks, remote queries, linked server operations, and extended stored procedures.
Phase 3: Consolidation Planning (The Server Design and the Test Process)
Once the envisioning phase is done and guiding principles are agreed on and documented, work on planning the consolidation can begin. There are two main parts to planning: designing what the consolidated servers will look like and testing the implementation process.
Designing the New Production Servers
When you start to design the new environment, you need to take into account every aspect from the ground up: administration and operations (including monitoring), performance, backup and recovery, chargeback (if necessary), disaster recovery, high availability, security, and so forth. Do not assume that you can just use what you have in place for your current system, as the new environment might be different. Also, with more databases and SQL Server instances, you will have different rules and options that need to be reconciled. At this stage, if you identify new tools that need to be either built or acquired, document and plan for those, too.
Migration of Applications, Users, and Data
One of the important factors for the DBA is how the applications, users, and data will be migrated to the new consolidated environment. At this stage, any potential SQL Server 6.5 and SQL Server 7.0 migration issues should be completely isolated and identified, from Transact-SQL incompatibilities to system settings and everything in between. You might also need an intermediate staging server. Determine the order in which you will migrate the applications, users, and data. Resolve any conflicts in object names and locations. Consider the end users' experience and how you will make the effort as transparent as possible for them. Also determine how you will notify your end users about the move. How will business processes be impacted?
Avoid scope creep. You are consolidating, not adding new functionality or enhancements to applications or data schemas. Problems might be identified during migration planning that, as long as they will not break anything in the consolidated environment, should be left alone. Once you have consolidated, you might want to revisit these problems, but trying to solve them at this stage only causes the consolidation to take much longer than necessary.
Finally, do not waste time adding an application to a consolidation until the application is reasonably stable. Stability is crucial; many of the troubleshooting tools that might be required to diagnose a flaky application (for example, extra Performance Monitor counters, debug builds, and so on) could adversely affect the performance of the entire system.
Test the Process
Prior to actually performing the planned upgrade or consolidation in a production environment, you must build and test all of the new procedures and tools, and the entire migration. The capacity and capability of the environment used for testing should be the same as, or at least close to, the final production environment; however, you may be constrained by budget, resources, and other limitations. The importance of this stage is to isolate, identify, document, and fix any defects, errors, or problems that you encounter so a production effort will, one hopes, be flawless. Devising the proper test plans will ensure a successful production rollout. The plans should identify the following:
- Do applications and utilities still function the same way they did before the change, and how will that be tested and measured? Remember, it is not just the changed component, but also the components that have dependencies on it. This can be a large matrix that must be modeled and then tested.
- Does the system (and subsequently SQL Server and all related applications) still provide the required performance as before the change, and how can that be tested and measured? This is as difficult to test as functionality. Do you have playback scripts or dummy test loads that you can use to reliably and accurately assess and compare the performance impact of a change?
Building a test platform is not an easy task. In some cases, it does not need to be an identical system or set of systems from the standpoint of number of processors, amount of memory, or total storage accessible by the cluster nodes, but in all other ways the test platform and the production environment should be identical. The ability of the system or systems to be partitioned affects this as well. Partitioned systems can be set up so that a small partition has exactly the same BIOS, chipset, devices (including firmware and BIOS versions), driver versions, kernel components (for example, antivirus, storage management, security monitor, system management utilities, firewall, disk mirroring, and so on), and user mode applications. All versions of all components should match the production environment and configuration; otherwise, the confidence gained from testing an update or upgrade is lowered. Systems that do not support partitioning must be mirrored separately, but it is still important to ensure that all the components are identical between the test system and the production system. The same logic applies to clusters.
The most critical factor affecting whether the update or upgrade will be successful (and not cause any unplanned downtime) is the interaction between different versions of hardware and software components, not the total size of the system as measured by number of processors, amount of memory, total storage, number of network or storage adapters, and so on. Thus, a test system or partition need not be exactly the same capacity as the production system, but it should be representative. For example, the production environment might contain 12, 16, or even more processors, but to be representative, the test partition might only need to contain 4 CPUs. Similarly, the production system might include 32 GB, 64 GB, or even more RAM, but to be representative, the test partition might only need to contain 8 GB of RAM.
Continuing with the same logic, it might not be necessary to fully reproduce the storage capacity of the production system, but only to provide the same behavior and characteristics, scaled to match the smaller test configuration. Some storage arrays support multiple systems and provide internal partitioning. The test system could be attached to the same partitionable storage resource as the production system but scaled down, providing the same storage behavior and characteristics for the test system as the production system. Finally, if you are using a server cluster, it is important to model the test cluster to match the production cluster configuration as closely as possible. Doing this ensures that planned and unplanned failovers allow the application or entire solution to function correctly, guaranteeing the highest possible availability. You can accomplish this testing with just two cluster nodes even if you have a four-node cluster in production, but if you do not match capabilities as closely as possible, your testing may be somewhat invalid.
The reasons that versions are the key factor to test, rather than size or capacity, include the following:
- No single vendor can test against all possible permutations that customers might deploy. They number in the millions.
- Quick fix engineering updates (QFEs) are typically less risky than version changes because the architecture of the component (that is, how it functions internally as well as how it interacts with the rest of the hardware components, operating system, the various applications and utilities, and so on) does not change.
- Version changes, even point version changes such as the difference between version 1.0 and 1.1, or service packs from hardware and software vendors, might have many small changes compiled together or might include architectural changes. The aggregation of many small changes or the change in architecture might impact other components in the complete configuration or solution and thus needs to be more extensively tested.
- Obviously, a full version change of any type, be it for hardware, the operating system, applications, or utilities, requires similar or even larger amounts of testing than that done for service packs.
- From a Windows Datacenter Edition standpoint, certain conditions must be met for a configuration to be considered tested well enough to mitigate the risks inherent in updates and upgrades. These include the support of an original equipment manufacturer (OEM) as part of the Datacenter program, where the customer and the OEM have mutually agreed on configuration management and change control processes and policies as well as the pretesting required for an SLA.
A scaled-down test environment is usually workable; the most notable exception is when the production system runs close to its maximum capacity, as measured by processor utilization, memory utilization, paging activity, network bandwidth, storage queues, and so on. In that case, even small changes in the behavior of components might cause the production system to run noticeably more slowly and become too unresponsive for production use. This is unavoidable; it is simply how any system responds when the work demanded of it exceeds its ability to complete that work. In the worst case, the system might be effectively unusable. Customers should therefore always leave headroom in the production system to absorb an unexpected performance impact; the same headroom also covers spikes in demand. As a rule of thumb, many customers limit the average utilization of system resources to between 65 and 70 percent, and they begin planning to expand or otherwise upgrade system capacity when those levels are consistently exceeded. Remember the difference between peak usage and overall average usage, and note that this guideline might not apply to your systems at all.
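As a rough illustration of checking that rule of thumb, the following T-SQL sketch (SQL Server 2000 compatible) approximates how busy SQL Server's CPU time has been since startup using the built-in @@CPU_BUSY and @@IDLE counters. It is a minimal sketch, not a monitoring solution, and the 70 percent threshold is just the guideline from above, not a fixed limit.

```sql
-- A minimal sketch: approximate CPU busy percentage since SQL Server
-- startup, derived from built-in counters (not a full monitoring tool).
DECLARE @busy BIGINT, @idle BIGINT, @pct DECIMAL(5,2)
SELECT @busy = @@CPU_BUSY, @idle = @@IDLE
SELECT @pct = CAST(100.0 * @busy / NULLIF(@busy + @idle, 0) AS DECIMAL(5,2))
SELECT @pct AS approx_cpu_busy_pct
IF @pct > 70.00  -- the 65 to 70 percent headroom guideline discussed above
    PRINT 'Average CPU use exceeds the headroom guideline; plan added capacity.'
```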
Once the test systems are built and configured to match the production systems, baseline testing should occur to validate functionality and to assess performance characteristics and how they scale in comparison to the production systems.
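For instance, a simple starting baseline on the test system might be captured with standard SQL Server 2000 commands such as those below. This is only a sketch of the idea, not a full baseline methodology.

```sql
-- A minimal baseline sketch: record server activity and log usage so
-- later test runs can be compared against the same starting point.
EXEC sp_monitor          -- CPU, I/O, and network counters since the last call
DBCC SQLPERF(LOGSPACE)   -- transaction log size and percent used per database
```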
There is some expense involved for the test system or systems: adapters, fabrics, networks, storage, and possibly clusters and other hardware components, as well as the various software components, such as the applications and utilities that run on the system. However, weigh these expenses against the cost of unplanned downtime that results from a failure, or from an update or upgrade that goes wrong and causes hours (or more) of downtime. For all of these components, both hardware and software, ask your vendors about costs and licensing, vendor policy on product instances used strictly for testing purposes, and other concerns.
Finally, it might be necessary to set up some clients and a small network so that synthetic or dummy loads, supplied by vendors or built by the customer, can be applied to the system or its components.
Important | Pay attention to time in your testing. Too often, testing occurs in the weeks before a migration, when there is plenty of time. Your staff might kick off jobs and go home; the next morning, the jobs have completed successfully and everybody is happy. But if you then migrate the production applications in the 8-hour downtime window you arranged with users and discover the migration takes 12 hours, you have a problem. Every test task should measure not only the migration function but also the time it takes to complete. You might find that your available downtime window limits how much of the task you can accomplish in a night or a weekend. |
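One way to build timing into every test task is sketched below in T-SQL. The commented RESTORE statement, the database name, the path, and the 480-minute (8-hour) window are illustrative assumptions, not part of any real plan.

```sql
-- A minimal sketch: time a migration task and compare the elapsed time
-- with the agreed downtime window (480 minutes = a hypothetical 8 hours).
DECLARE @start DATETIME, @elapsed_min INT
SET @start = GETDATE()

-- ...the actual migration task runs here, for example a hypothetical:
-- RESTORE DATABASE SalesDB FROM DISK = 'D:\Backups\SalesDB.bak'

SET @elapsed_min = DATEDIFF(minute, @start, GETDATE())
PRINT 'Task took ' + CAST(@elapsed_min AS VARCHAR(10)) + ' minute(s)'
IF @elapsed_min > 480
    PRINT 'WARNING: this task alone exceeds the available downtime window.'
```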
Determining Risk
It is important to determine the risks (whether based on people, technology, or cost) against which the effort will be measured. The success of a plan depends on knowing where failures can occur so you can mitigate them and judge whether the risks are worth taking. The best way to determine these risks is to test the entire process. Also discuss the risks with anyone who has migrated these applications in the past; they might save you a lot of trouble by helping you avoid problems they encountered.
Phase 4: Developing
The developing stage is when the plans are realized and implemented for the first time in a testing or staging environment. This requires hardware completely separate from the eventual production servers, as it is crucial to work out any potential problems before going live. Ideally, the servers used in this stage have the same capacity and horsepower as their production counterparts, but that might be unrealistic due to budgetary constraints. The bottom line is that the more closely your testing or staging environment mirrors your production environment, the more successful you will be when the actual production migration occurs.
There are at least three steps during this phase: technology validation, proof of concept, and piloting.
Once you decide on and plan for the technology that will be used during this phase, it is time to see whether the technology performs as expected or is not appropriate at all. This is one of your last chances to modify the consolidation effort's physical design. Once the technology has been proven, roll out a proof of concept to show that the final environment will work as expected. The proof of concept should mirror the production environment exactly, just on a smaller scale. Finally, choose one of the servers that will be consolidated and make it a pilot. Even if the technology has been evaluated and the proof of concept was successful, you still need a proper pilot to prove that, in a production capacity (unlike the proof of concept, which runs on a much smaller scale), the consolidation will benefit the entire business. If the pilot fails or is scrapped for other reasons, this is the time to work things out.
Follow the configuration plans and migration plans as documented. If problems arise, document them; if there is a fix, document the fix and revise the plans accordingly. After migrating applications, data, and users to the new consolidated SQL Server, put it through its paces: use it as it would be used on a daily basis and test the servers under load. It is crucial to ensure that the databases now consolidated under one SQL Server will work well together when all applications are experiencing high utilization; otherwise, there will be performance issues in production.
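While the consolidated server is under load, even a simple query against the SQL Server 2000 sysprocesses system table can show how the databases are competing for resources. Treat this as a rough sketch, not a substitute for proper load-testing tools.

```sql
-- A minimal sketch: current connections and cumulative CPU and I/O per
-- database on the consolidated server while load tests are running.
SELECT DB_NAME(dbid)    AS database_name,
       COUNT(*)         AS connections,
       SUM(cpu)         AS cumulative_cpu,
       SUM(physical_io) AS cumulative_io
FROM   master..sysprocesses
WHERE  dbid > 0
GROUP BY DB_NAME(dbid)
ORDER BY cumulative_cpu DESC
```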
If any coexistence issues arise during testing, document them, and possibly rethink the strategy for that particular application's databases. Mitigating risk at this stage is crucial. Remember to test not only the migration but also any administration plans, scripts, procedures, tools, and so on that will be used in the eventual production environment. Testing just the migration process is not enough.
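For example, a rehearsal of the backup procedures might include a verification step like the sketch below; the database name and backup path are hypothetical.

```sql
-- A minimal sketch: back up a (hypothetical) database and verify that the
-- backup set is readable, as part of rehearsing administration procedures.
BACKUP DATABASE SalesDB
    TO DISK = 'D:\Backups\SalesDB_test.bak' WITH INIT
RESTORE VERIFYONLY
    FROM DISK = 'D:\Backups\SalesDB_test.bak'
```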
Phase 5: Deploying and Stabilizing
There are two main steps in this phase: back-end deployment and application deployment.
Back-end deployment is when the backbone of the system (hardware, operating system, networking, SQL Server, and so on) has been completely configured and tested. Application deployment is just that: configuring the databases and rolling out the applications in the new consolidated environment. At some point, you reach the go/no-go point, beyond which you cannot easily go back to the old environment. Stabilization continues through and beyond this stage.
Once you are confident that the proper hardware designs have been crafted and the migration plans are well tested, it is time to build and deploy the production consolidated SQL Servers. Even if your plans are well tested, do not perform the entire consolidation at once. Take a phased approach: deploy one consolidated, migrated, or upgraded server; thoroughly test it; compare it with the original environment; and then retire the old environment. Only at that point should you consider consolidating another SQL Server. If you do not completely verify that one effort went as expected and problems then surface during another migration, you can obscure the cause and compound the problems; if an effort was unsuccessful for whatever reason, update your plans accordingly. Keep the old environment available until you have completed several or all of the upgrades, just in case.