Windows Server 2003 on Proliants. Deployment Techniques and Management Tools for System Administrators
The location of file, print, and application servers depends on a variety of factors, including user population, required accessibility, and more. Deployment of file and print servers is covered later in this chapter. The inclusion of application partitions in Windows 2003, described in Chapter 1, shows the potential of allowing applications to take advantage of Lightweight Directory Access Protocol (LDAP) services and to replicate application data via Active Directory (AD) replication. Another solution is the Microsoft Active Directory Application Mode (ADAM) product. ADAM could well be named "Active Directory Lite" because it provides pure LDAP Directory Services (DS) and replication for applications. With ADAM, Microsoft has decoupled DS from the Network Operating System (NOS) directory that AD represents. ADAM requires Windows 2003. This is similar to how application partitions work, except ADAM does not require the intervention of DCs or AD. Thus, applications can be installed on member servers. Another advantage is that each ADAM instance has its own schema, so application owners can update their individual schemas without affecting other applications. Multiple instances of ADAM can run on a single server, making very powerful and flexible application servers. The downside of ADAM is that it can't use Domain Name System (DNS) SRV records to locate servers. For additional information about ADAM, download the whitepaper from Microsoft at http://www.microsoft.com/windowsserver2003/techinfo/overview/adam.mspx. The point here is that if you plan to use ADAM, you need to plan for server deployment for ADAM-related applications in addition to the criteria noted in the "File and Print Services" section of this chapter.
DNS Placement
Although the Windows community has more than four years of Windows 2000 experience, there is still considerable difference of opinion regarding the placement of DNS servers. A large number of deployments I've seen specify every DC to be a DNS server. I suppose it's a natural progression to make all DCs DNS servers as well, but this isn't necessary. Typically, DNS servers (at least caching-only servers) are placed in remote sites or sites across slow links to provide DNS name resolution on a more reliable basis than connecting to a remote DNS server, which is a good practice. However, blindly making every DC a DNS server is not good design practice. My experience troubleshooting DNS problems over the past four-plus years has taught me three things about DNS and AD:
This list would indicate that making every DC a DNS server is not a good idea. If you analyze each situation and it happens that every DC should be a DNS server, that's fine. For instance, if all of your remote sites have a single DC, it makes sense that they should also be DNS servers if the links are slow or unreliable. HP has three domains worldwide and employs three Active Directory Integrated (ADI) DNS servers per domain. Thus, there are only three DNS servers for North, South, and Central America; three more for Europe, the Middle East, and Africa; and three more for Asia-Pacific. In the Qtest environment at HP, we have a similar configuration, and place those DNS servers in sites on, or one hop from, the corporate backbone. Note that ADI DNS stores the DNS records in the AD. Thus, even if all the DNS servers in a domain become unavailable, you can simply install DNS on another DC, and the zone will be populated. Of course, to complete the transformation you would need to do the following:
The first option is usually the easiest, because it avoids having to change the DNS resolver on all the clients. This built-in redundancy caused one company I know of to move from UNIX BIND for its corporate DNS structure to Windows 2000. Consider these points when determining how many DNS servers to deploy and where to deploy them.
Site Affinity
Windows 2000 introduced the Site Affinity feature to allow clients who request services from a DC or a GC to contact a DC or a GC in their local site. This could be for authentication, access to Distributed File System (DFS) shares, GC searches, Exchange GC access, and many other applications. Although DCs have to be associated with a site, a site doesn't necessarily have to have a DC in it. DC-less sites often cannot justify a DC, but want to have Site Affinity defined for the benefit of applications such as DFS. When designing sites without DCs, it's important to note that AD employs auto site coverage. Site coverage is an algorithm that defines a DC in one site to provide services to clients in another site that has no DC. This DC responds to a request as if it were indeed in the same site as those clients. This means that clients in the DC-less site authenticate to the DC that is "covering" that site. Because DCs have domain boundaries, this principle also applies to clients that are members of a domain for which there are no DCs in the client's site. Thus, if the clients in the Atlanta site were members of the B.A.com domain, but there was only a DC for the A.com domain in Atlanta, the clients would authenticate to a DC in another site that is a member of B.A.com and is covering Atlanta for that domain. In terms of design, Site Affinity follows the least-cost path rule. For instance, in Figure 6.1, four sites (Atlanta, San Francisco, Charlotte, and San Jose) are all in a single domain. Only Atlanta and San Francisco have DCs. The costing has been constructed so that the DC in Atlanta covers Charlotte, and the DC in San Francisco covers San Jose. This is true because the cost from Charlotte to Atlanta is 50 and from Charlotte to San Francisco is 70, and likewise the cost from San Jose to San Francisco is 50 and to Atlanta is 70.
Figure 6.1. A well-designed cost structure allows Site Affinity to be forced to preferred DCs for DC-less sites.
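The least-cost rule in the Figure 6.1 example can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how AD itself computes coverage; the cost table simply restates the figures quoted above, and the function name `covering_site` is made up for this sketch.

```python
# Site-link costs from each DC-less site to each site that has a DC,
# restating the numbers quoted for Figure 6.1.
COSTS = {
    "Charlotte": {"Atlanta": 50, "San Francisco": 70},
    "San Jose": {"San Francisco": 50, "Atlanta": 70},
}


def covering_site(dcless_site, costs=COSTS):
    """Return the DC site with the lowest cost to the given DC-less site."""
    candidates = costs[dcless_site]
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    for site in COSTS:
        print(f"{site} is covered by {covering_site(site)}")
```

Because the costs differ (50 versus 70), no tie-breaking is needed here, which is exactly what a deliberate cost design is meant to achieve.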
One customer I worked with had been the victim of bad information, and it nearly caused the company to implement a poor design. With more than 200 physical locations, the company had DCs in only about 15 sites. However, the company wanted to implement Site Affinity for clients in all sites. The "Automatic Site Coverage" section of the Distributed Systems Guide (in the Windows 2000 Resource Kit) describes the site-coverage process. It specifies that sites that have no DCs for a particular domain are "covered" by a DC for that domain in another site. The "closest" DC is defined as a DC in a site that has the least-cost path to the DC-less site. For example, if a user from Domain A logs into the Boston site, and there are no DCs for Domain A in Boston (or none respond), the Knowledge Consistency Checker (KCC) determines the closest site that has a DC for Domain A by evaluating the site cost to that site from Boston. Now, if the KCC determines that two sites have Domain A DCs, and they both have the same cost from Boston, then a couple of tie-breaker rules apply. The first rule is that the site with the most DCs covers the DC-less site; if that fails to break the tie, the site whose name is highest in alphabetical order provides coverage. This customer decided to put all sites in a single site link, DefaultIPSiteLink, and let the KCC use these tie-breaker rules to determine site coverage. The company also wanted all the sites in the United States to be affiliated with the New York site, all the sites in Europe to be affiliated with the London site, and all the sites in Asia to be affiliated with Singapore or Tokyo. This could never work reliably. The company's own testing determined that the tie-breaker rules didn't always work as expected (remember, these are tie-breakers, not design points). Even if the rules did work, it's highly unlikely the company could ever get lucky enough to have the Site Affinity to New York, London, Singapore, or Tokyo as desired.
The problem, as noted in the "Replication Topology" section in Chapter 5, was that the company was giving the KCC too much freedom to decide. Throwing all sites in one basket, letting the KCC sort them out, and having it all work out in a certain way is virtually impossible, unless you happen to get lucky once or twice. The solution was simply to create a multi-tier topology with site links, as shown in Figure 6.2. Note that New York is the first tier; London, Singapore, Tokyo, and Amsterdam are in the second tier; and DC-less sites are in the third tier. Site links were assigned costs according to their tier level, forcing replication up the tree to the core sites. This forced a DC-less site in Berlin to be covered by the Amsterdam DC, which replicates with the central hub in New York. Under the original design using the tie-breaker rules, all site link costs were equal, and because there are more DCs in New York than in Amsterdam, Berlin would have had automatic site coverage provided by the DCs in New York.
Figure 6.2. Site link costing is used to define site coverage for DC-less sites as well as for replication for sites with DCs.
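To make the Berlin example concrete, the following Python sketch compares the flat design (every site in one equal-cost link) with a tiered design. It is a simplified model under stated assumptions: the link costs (100 and 50) and DC counts (4 and 2) are hypothetical, only three of the sites are modeled, and the final name-based fallback exists only to make the result deterministic; it is not a claim about the real alphabetical tie-breaker.

```python
import heapq


def path_costs(links, start):
    """Dijkstra: cheapest cumulative site-link cost from start to every site."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    best = {start: 0}
    queue = [(0, start)]
    while queue:
        cost, site = heapq.heappop(queue)
        if cost > best.get(site, float("inf")):
            continue  # stale queue entry
        for neighbor, link_cost in graph.get(site, []):
            new_cost = cost + link_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return best


def covering_site(links, dc_counts, dcless_site):
    """Pick the covering site: lowest cost first, then most DCs on a tie."""
    costs = path_costs(links, dcless_site)
    reachable = [site for site in dc_counts if site in costs]
    # Third key (site name) is an assumed fallback for determinism only.
    return min(reachable, key=lambda s: (costs[s], -dc_counts[s], s))


# Hypothetical DC counts: New York has more DCs than Amsterdam.
DC_COUNTS = {"New York": 4, "Amsterdam": 2}

# Flat design: one site link containing everything, so every pair costs the same.
FLAT_LINKS = {
    ("Berlin", "New York"): 100,
    ("Berlin", "Amsterdam"): 100,
    ("Amsterdam", "New York"): 100,
}

# Tiered design: Berlin links only to Amsterdam, which links to New York.
TIERED_LINKS = {
    ("New York", "Amsterdam"): 50,
    ("Amsterdam", "Berlin"): 100,
}

if __name__ == "__main__":
    print("flat:  ", covering_site(FLAT_LINKS, DC_COUNTS, "Berlin"))
    print("tiered:", covering_site(TIERED_LINKS, DC_COUNTS, "Berlin"))
```

In the flat case the costs tie and the DC count decides, so New York wins; in the tiered case the cumulative cost to New York is higher than the cost to Amsterdam, so no tie-breaker is consulted at all.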
Note that designing Site Affinity for DC-less sites follows the same rules as if they were sites with DCs. Use explicit site links and costing to force replication (and site coverage) in the way you want it to go. Designing Site Affinity really isn't that hard. It took this company a couple of weeks of testing and it still didn't have the answer. It took me about an hour and a half to do it the right way.
DC Placement
DC placement requires some forethought, especially for environments with a lot of remote sites over slow links or remote sites with few users. Implied with DC placement is the decision of whether to make a physical location an AD site. The decision generally depends on several factors:
Reed Elsevier, a case study referred to in Appendix A, "Case Studies," and in other chapters in this book, developed an ingenious flowchart to determine whether a physical location should be designated an AD site, and if so, how many DCs should be deployed there. Reed Elsevier has graciously agreed to allow us to publish the flowchart, shown in Figure 6.3. The chart uses a weighted system where each "Yes" answer to questions in the decision tree is worth a weight of 1 or more. The cumulative score at the end determines whether it will be an AD site and how many DCs will be deployed there.
Figure 6.3. Reed Elsevier's flowchart for determining viability of an AD site and the number of DCs to be deployed there.
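A weighted decision chart of this kind is easy to prototype. The sketch below is not Reed Elsevier's flowchart: the questions, weights, and thresholds are all hypothetical placeholders (only the executive-residence question and the two hard gates, physical security and funding, are mentioned in this chapter). It shows the general shape: gates first, then a cumulative weighted score mapped to a DC count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    key: str
    text: str
    weight: int


# Hypothetical questions and weights; replace with your own criteria.
QUESTIONS = [
    Question("many_users", "Does the location have a large user population?", 2),
    Question("slow_link", "Is the WAN link slow or unreliable?", 2),
    Question("local_apps", "Are AD-dependent applications hosted locally?", 1),
    Question("executives", "Are company executives resident at the site?", 1),
]


def evaluate_location(answers, physically_secure, funded,
                      site_threshold=3, two_dc_threshold=5):
    """Return the score and recommendation for one physical location.

    The thresholds are assumptions made for this sketch.
    """
    # Hard gates: failing either one ends the path and denies the request.
    if not (physically_secure and funded):
        return {"ad_site": False, "dcs": 0, "score": 0}

    # Each "Yes" answer adds its weight to the cumulative score.
    score = sum(q.weight for q in QUESTIONS if answers.get(q.key, False))

    if score >= two_dc_threshold:
        dcs = 2
    elif score >= site_threshold:
        dcs = 1
    else:
        dcs = 0
    return {"ad_site": dcs > 0, "dcs": dcs, "score": score}


if __name__ == "__main__":
    branch = {"many_users": True, "slow_link": True}
    print(evaluate_location(branch, physically_secure=True, funded=True))
```

Keeping the questions in a data table rather than in nested `if` statements makes it simple to adjust weights as your organization's criteria change.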
Note that if the site is unable to physically secure the DC/GC, or if that site does not have the finances to obtain the hardware and software, the path ends and the request is denied. Obviously, the financial requirement depends on the company's business structure, but the security issue is important. A number of cases have been reported where thieves have stolen disk drives from DCs or GCs and were able to extract user information such as account names, personal information, and financial data. In the design of security, don't forget that a lack of physical security can negate all the software security you put in place. Of course, Reed Elsevier's logic might not include the criteria, or the weight on certain criteria, that your company might have, so this should be used as a guide, not a definitive solution. For instance, Reed Elsevier attaches importance to the residence of company executives at the site, but you might not want to add this into your equation.
GC Placement
Windows 2003 made some significant enhancements to universal group functions. Chapter 1 describes the new Universal Group Membership Caching feature. In a Windows 2000 multiple-domain forest with native-mode domains, users must contact a GC at logon to retrieve their universal security group memberships. This forces the system architect to either put a GC at a remote site or incur the additional traffic required for users to connect to a GC over the WAN. In the four years I've worked with customers in designing AD installations and troubleshooting problems, where GCs should be placed is one of the most frequently debated issues. Consider using the following checklist to determine GC placement:
tip Outlook 2003 has a feature called Cached Exchange Mode. Outlook XP and 2000 showed significant performance degradation, including a significant increase in TCP/IP traffic. Cached Exchange Mode, enabled offline, caches the user's mailbox locally on the computer, in much the same way the Offline Files feature does now. The user is thus reading cached mail, which significantly improves perceived performance, especially when viewing messages with large attachments. Utilizing this feature in sites that don't have GCs could improve the user's experience and not require a local GC.
You might consider using a table like that in Table 6.1 to identify GC placement.
Table 6.1. GC Placement Schedule
Note that this table identifies domains that are hosted at each site, implying DCs for those domains existing in those sites. Exchange servers are noted as existing in the site or the Exchange server used is noted. Also, the GC that serves the site is noted. If a GC name is noted in the GC column for the site, the GC exists at that site. Note that GC Caching is identified for the small sites of Boise and Lyon.
Flexible Single Master Operations (FSMO) Placement
The placement of FSMO role holders, sometimes referred to as operations masters, is an important design consideration. Although FSMO placement is important to ensure high accessibility by clients, DCs, and GCs for such things as password changes, new account creations, Group Policy modifications, and Time Services, recovery of FSMO role holders is equally critical. The recovery of failed FSMO role holders is described in detail in Chapter 11, "Disaster Recovery." A brief review of FSMO roles will help determine where the DCs hosting those roles should be placed. The sections following describe the potential impact to the environment at the loss of each FSMO role holder. This helps determine the importance of availability of that DC, and what kind of support effort needs to be expended to keep it online in case of a failure (that is, does it require immediate attention, or can it be handled at a lower severity level?). This also helps determine whether another DC should seize the role, and if so, when that should be done.
DCPromo
When demoting a DC that is an FSMO role holder, the roles should be transferred to another DC first. However, the demotion process identifies any FSMO roles held by the DC and transfers them to another DC. This usually works successfully, but the DC that DCPromo chooses to receive the roles might not be the one you want. Make sure to transfer the roles before demoting a DC if possible. If the DC has to be forcefully demoted, or if the DC can never be brought back online, then seizure is the solution to making the roles available again.
Forest-Wide Roles
These roles relate to operations in the schema and configuration naming contexts (NCs), and apply to every DC in the forest. As such, any DC in any domain in the forest can hold these roles. The two forest-wide roles are schema master and domain naming master.
Schema Master
Availability of the schema master is required only when modifications to the schema take place. This could include execution of Exchange's ForestPrep utility, which adds classes and attributes to the schema in preparation for installing Exchange; execution of the ADPrep utility to prepare a Windows 2000 forest to be upgraded to 2003; execution of Windows 2003's Domain Rename operation; and installation of third-party applications that modify the schema. However, on a day-to-day operational level, loss of this role holder will not affect the user population. If the original role holder can come back online before any schema modifications must take place, don't move the role to another DC. In most cases, the schema master can be left offline until the original is restored.
Domain Naming Master
Contact with the domain naming master is required to create or delete domains (during DCPromo) and for the Domain Rename process, as well as other operations that require modifications of the domain structure. Loss of the domain naming master role holder does not have an immediate impact on the forest and usually does not require seizing the role. If the role must be moved, place it on another DC that has good network connectivity to the other DCs as well as the server resources to handle the extra load. The domain naming master role must be held by a GC server.
Note: If you have a mixed Windows 2000 and 2003 domain in the forest (that is, Windows 2000 and Windows Server 2003 DCs are in the domain), you must put the domain naming master on a Windows Server 2003 DC so it can support application partitions.
Domain-Wide Roles
The Relative IDentifier (RID) master, Primary Domain Controller (PDC) Emulator, and infrastructure master are the three operation master roles whose scope is the domain. Thus, each domain contains DCs with these roles.
RID Master
Each security principal (user, computer, or group) is identified in AD by a unique Security IDentifier (SID). The SID consists of two parts: a domain SID (a SID that is unique for the domain) and an RID. All security principals in the domain share the same domain SID, and each has a unique RID; together these form the object's unique SID. RIDs are assigned by a DC in the domain in which the security principal resides. Because RIDs must be unique, a single DC holds the FSMO "RID master" role. The RID master is the single source for generating RIDs and handing them out to the DCs, thus ensuring that two DCs don't give out the same RID to an object. The RID master allocates blocks of RIDs to DCs to allow them to create new accounts (user and computer) and groups, and assign a unique SID to each account. The RID master allocates blocks of 500 RIDs at a time to each DC, and when that block is 50% depleted, another block is allocated to the DC. Thus, the DC has a considerable buffer to allow it to create accounts even if connectivity to the RID master is broken. Loss of this role holder is not critical unless a large number of accounts are created, such as during a migration, or an application is run that creates large numbers of accounts. The size of the RID pool allocated to each DC can be modified in the Registry at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\RID Values\ by setting the RID Block Size (REG_DWORD) to a value greater than 500. Setting it to less than 500 leaves the default setting of 500 in place. Microsoft recommends leaving this at the default, but if you do modify it, note that setting it abnormally high can have "adverse effects on the domain longevity," although the reasoning for these recommendations is not given (see Microsoft KB article 316201, "RID Pool Allocation and Sizing Changes in Windows 2000 SP4").
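The buffer that the 50% threshold provides can be illustrated with a small Python model. This is a deliberately simplified sketch, not the actual AD implementation: the class names, the starting RID of 1000, and the idea of holding the next block as a "standby" list are assumptions made for illustration. Only the block size of 500 and the 50% refresh threshold come from the text.

```python
BLOCK_SIZE = 500          # default RID block size described in the text
REFRESH_THRESHOLD = 0.5   # ask for another block once 50% of the block is used


class RidMaster:
    """Single source of RID blocks for the domain (simplified model)."""

    def __init__(self, first_rid=1000):  # starting value is arbitrary here
        self.next_rid = first_rid
        self.online = True

    def allocate_block(self):
        if not self.online:
            raise ConnectionError("RID master unreachable")
        start = self.next_rid
        self.next_rid += BLOCK_SIZE
        return list(range(start, start + BLOCK_SIZE))


class DomainController:
    """A DC that consumes RIDs from its pool when accounts are created."""

    def __init__(self, master):
        self.master = master
        self.current = master.allocate_block()
        self.standby = []   # next block, fetched ahead of time
        self.used = 0       # RIDs consumed from the current block

    def _request_block(self):
        try:
            self.standby = self.master.allocate_block()
        except ConnectionError:
            pass  # keep working from what is left; retry on the next call

    def new_rid(self):
        if not self.current:
            if not self.standby:
                self._request_block()  # last-chance attempt
            if not self.standby:
                raise RuntimeError("RID pool exhausted")
            self.current, self.standby, self.used = self.standby, [], 0
        rid = self.current.pop(0)
        self.used += 1
        if self.used >= BLOCK_SIZE * REFRESH_THRESHOLD and not self.standby:
            self._request_block()
        return rid


if __name__ == "__main__":
    master = RidMaster()
    dc = DomainController(master)
    master.online = False  # simulate losing the RID master
    created = 0
    try:
        while True:
            dc.new_rid()
            created += 1
    except RuntimeError:
        print(f"Created {created} accounts before the pool ran dry")
```

In this model a DC that loses contact with the RID master right after receiving a fresh block can still create a full block's worth of accounts, which is why the loss matters mainly during bulk operations such as migrations.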
Note: Windows 2000 pre-SP4 reallocated the RID pool to a DC when it was 80% depleted, or contained about 100 RIDs. When performing migrations or other automated tasks in which large numbers of users, groups, or computer accounts were created, this could deplete the RID pool quickly. Lowering the threshold for a refresh of the pool is intended to minimize the probability of exhausting the RID pool, which would prevent accounts from being created on a DC if the RID master cannot be contacted or the RID pool cannot be refreshed in time. The threshold for Windows 2000 SP4 and Windows Server 2003 RID masters to refresh the DCs' RID pool is now 50%.
One issue that surfaced in Windows 2000 was the case of an RID master being brought back online after the role had been seized. This sometimes caused duplicate SIDs to be assigned because the old RID master had not replicated to find out that it wasn't the RID master anymore. Windows 2000 post-SP2 and Windows 2003 changed this behavior by requiring the RID master to do one full synch with its NC (domain) before advertising itself as an RID master and handing out RIDs.
PDC Emulator
The PDC Emulator has a number of critical roles, many of which affect users. These functions include
The PDC Emulator plays a number of roles, and the list is growing. You will probably note additional functions given to the PDC Emulator as Microsoft develops the operating system (OS) and finds additional need for a single source for functions. Thus, a PDC Emulator failure is immediately visible to users in a mixed-mode domain or in a domain supporting downlevel clients because it has security implications, can cause browsing failures, and could cause time sync failures, possibly resulting in authentication failures and security breakdowns. The PDC Emulator is perhaps the most critical of all role holders and should be brought back online via transfer or seizure if it will be offline for an extended period of time. Each organization must define this period.
Infrastructure Master
The infrastructure master is responsible for resolving interdomain lookups. If a user from the North America domain is added to a group in the Europe domain, the infrastructure master compares the user-group references that it knows about for objects in its domain with what a GC knows about those objects. If the GC has different information, the infrastructure master updates its data. Loss of the infrastructure master is not serious and won't affect users. For example, suppose an account was created for Abigail Witbeck in the North America domain, was added to the LondonUsers security group in the Europe domain, and replicated to other DCs in the domain as well as GCs in the forest. Abigail sends a request that her username be changed to Shanna Witbeck (as she prefers using her middle name), so you change the account to Shanna Witbeck. If the infrastructure master is unavailable in the Europe domain, the LondonUsers group still contains the object "Abigail Witbeck." This poses no security risk, and would cause confusion only if an Administrator was observing the group membership before the infrastructure master came back online to make the change. Note that the infrastructure master in the domain that the group lives in (the Europe domain in this example) is responsible for updating the name change in the group membership. The infrastructure master should not be on a GC. This would cause the infrastructure master to fail to update other DCs, because it updates only data that differs from the GC. If the infrastructure master is a GC, there is no difference and the domain DCs don't get the update. A good description of this is contained on page 210 of my book, Windows 2000: Active Directory Design and Deployment (New Riders, 2000). In two instances, the infrastructure master role is irrelevant:
With the FSMO roles defined, let's examine how to determine where to place the FSMO role holder DCs to make sure they are able to efficiently serve their purposes.
Placement of FSMO Role Holders
In general, the placement of DCs holding FSMO roles should
Transfer and Seizure of Roles
To move roles from one DC to another, they should be "transferred." This is accomplished via the Active Directory Users and Computers snap-in for domain-wide roles, the Active Directory Domains and Trusts snap-in for the domain naming master role, or the Schema Manager snap-in for the schema master role. The bad thing about using these snap-ins is that there are three different snap-ins to change the five roles. The NTDSUtil.exe tool, available in the Windows 2003 support tools, is my personal favorite because it displays all current role holders and permits easier recognition of role holders and transfer or seizure to new DCs. Another tool, Replication Monitor, available in the Windows 2000 and 2003 support tools, not only allows you to see who the role holders are for all five roles and transfer the roles, but it also allows you to see whether the current role holders can be contacted. In the Replication Monitor application, you can add a server using the Add Monitored Server menu option. Once added, right-click on the server icon, go to Properties, and then select the FSMO tab. The role holders are listed along with a Query button. Clicking this button causes that server to query the FSMO role holder to see whether it can be contacted.
Tip: The fastest way to find who the role holders are for the five FSMO roles is using the Netdom command:
    C:\>netdom /query fsmo
    Schema owner                qtest-dc22.Qtest.cpqcorp.net
    Domain role owner           qtest-dc22.Qtest.cpqcorp.net
    PDC role                    qtest-dc22.Qtest.cpqcorp.net
    RID pool manager            qtest-dc22.Qtest.cpqcorp.net
    Infrastructure owner        qtest-dc5.Qtest.cpqcorp.net
    The command completed successfully.
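If you collect this output from several domains, a short script can turn it into something easier to audit. The Python sketch below parses text in the layout shown in the tip; it assumes each role line begins with one of the five labels from that sample and that the holder's name follows on the same line. The function names are invented for this example, and the script does not run Netdom itself.

```python
# Sample text copied from the tip above.
SAMPLE_OUTPUT = """\
Schema owner                qtest-dc22.Qtest.cpqcorp.net
Domain role owner           qtest-dc22.Qtest.cpqcorp.net
PDC role                    qtest-dc22.Qtest.cpqcorp.net
RID pool manager            qtest-dc22.Qtest.cpqcorp.net
Infrastructure owner        qtest-dc5.Qtest.cpqcorp.net
The command completed successfully.
"""

# Labels as they appear in the sample output.
ROLE_LABELS = (
    "Schema owner",
    "Domain role owner",
    "PDC role",
    "RID pool manager",
    "Infrastructure owner",
)


def parse_fsmo_holders(text):
    """Map each role label to the DC name that follows it on the same line."""
    holders = {}
    for line in text.splitlines():
        line = line.strip()
        for label in ROLE_LABELS:
            if line.startswith(label):
                holders[label] = line[len(label):].strip()
    return holders


def roles_by_dc(holders):
    """Group roles by the DC that holds them."""
    grouped = {}
    for role, dc in holders.items():
        grouped.setdefault(dc, []).append(role)
    return grouped


if __name__ == "__main__":
    for dc, roles in roles_by_dc(parse_fsmo_holders(SAMPLE_OUTPUT)).items():
        print(f"{dc}: {', '.join(roles)}")
```

Grouping by DC makes it obvious when one machine holds most of the roles, as qtest-dc22 does in the sample, which is useful input when planning recovery priorities.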
Transferring roles requires that the existing role holder be online and accessible during the transfer process. The role is moved to another DC and the original DC relinquishes the role. However, in the case of a DC that is unavailable because of hardware failure, network failure, and so forth, transferring the role is not possible. Seizure of roles can be executed via the snap-in or NTDSUtil, which simply assigns a particular DC to be the new role holder and advertises that fact to the other DCs in the domain or forest as needed. The danger, of course, is when the original comes back online and it doesn't know of the role change. This scenario has been modified somewhat in Windows 2003 to reduce problems that occurred in Windows 2000, such as duplicate RIDs being assigned. (This is described in detail in Chapter 11.) Whenever a seizure is attempted via the snap-in or NTDSUtil, a transfer is always attempted first. If the transfer fails, the seizure proceeds. A seizure should be used only when it's critical that the role become available again without waiting for the original role holder to return.
Note: Since the early days of Windows 2000, Microsoft has recommended that FSMO role holders never come back online after their role has been seized. Although Windows 2003 and Windows 2000 SP3+ have made the AD more tolerant of this situation, it's best to be safe and just wipe and reload the machine, cleaning the objects out of the AD. (See Microsoft KB article 216498, "How to Remove Active Directory objects after an unsuccessful DC demotion," for more information.) Thus, in determining whether a role should be seized, weigh the impact on the environment of going without that role against the cost of wiping and reloading a DC or GC and cleaning up the AD. The policy for FSMO role seizure should be defined in an SLA. This definition will vary from environment to environment.