Upgrading and Repairing Servers

Nearly one out of every two large IT projects ends in failure. Failure takes many forms. A project can fail because it doesn't do what it was supposed to do for its users. It can fail because it runs out of money or out of support from project champions. It can fail because technology moves on past it. It can fail because the necessary personnel aren't found to implement or staff the project. The field is full of some amazing horror stories, including stories about IT projects that sank entire large companies because they affected primary business systems.

The bigger the project and the longer a deployment takes, the harder it is to make that project succeed. There are so many ways a project can fail that sometimes it's a wonder that projects succeed at all, given everything that can go wrong.

However, many large server deployments do succeed, and a few of them even come in close to budget. A few common themes seem to run through all successful deployments. The projects that are successful are carefully planned and implemented in stages, using an iterative approach. That's true for the smallest data room up to huge SAP or PeopleSoft systems that integrate several enterprise database applications running on hundreds of servers.

A project has the highest chance for success when it has the following separate phases:

1. Project specification phase

2. Budget phase and time lines

3. Testing phase

4. Implementation phase

5. Review phase

The following sections take a little closer look at what these different phases entail.

The Project Specification Phase

During the project specification phase, you collect the requirements for the system. When a good database consultant sets out to design a custom database for a client, he or she goes through a project specification phase that involves interviewing all the appropriate knowledge experts, collecting every possible form or input screen, and talking with different classes of potential users. The consultant is on a fishing expedition to find all the information possible to use as input into the project.

For server deployment, the project specification phase should consist of many of the same common elements. During this phase, you need to set the requirements of the system, based on the expectations of management and users. This is a critical step, and one that often isn't handled very well, even by people who have already done a few deployment projects. First of all, the interested parties involved often either have no idea what they need or expect, or have an unrealistic idea. So it is up to you to set the expectations in consulting with these users so that what's possible intersects with what's desired.

The second glitch that's commonly encountered in specifying a server deployment involves the hubris of the project developer, or (worse yet) that of management. To make a server deployment work, you want as many of the people who will use the system to sign onto the project as possible. If you take the attitude that you know best, or that it is "your" money because you control the budget, you will lose valuable, maybe even critical, input from knowledge experts. Worse still, people may not use the expensive systems you deploy.

However you come by the information, you need to develop a working hypothesis as to what the system(s) specifications should be. To meet those expectations, you should consider the characteristics of similar working systems. Usually there are two approaches to a first cut at the system configuration:

  • One is based on models developed by vendors and their previous clients.

  • The second is based on historical data that you collect in-house.

With this information, it should be possible to draw up an initial project specification document that offers a reasonable road map from your wish list to a final implementation and installation of system(s) to be deployed. An initial project specification should not be carved in stone but be approached as a hypothesis that is meant to be tested. All deliverables should be spelled out in this document.

The project specification phase is probably the most important part of any deployment. Every dollar you spend planning your deployment will result in many more dollars in savings and often a much faster deployment time. If you are hiring outside consultants to deploy a system, it is a good idea to insist on a project specification that results in both a budget and a time line. Any consulting firm that doesn't do a project specification is either going to seriously pad the budget to protect itself against overruns or won't be in business long enough to complete your project.

The person or people who are going to deploy your system(s) aren't necessarily the ones who need to do the project specification. The project specification can be done in-house by staff and, when it's completed, the specification can be offered to all interested parties in order to request quotes. You may also want to call for an RFP (request for proposal), a process that matches your specification to what each person thinks he or she can achieve with the new system. The RFP wrestles with the budgetary aspects of the project and tries to match your requirements to a budget that can do the job. The more complete a set of requirements you can provide at this stage, the more precise the RFP should be.

Many consulting firms want to charge for the RFP process because they expend significant resources to create an RFP. If you think you can get a fully developed project specification from an RFP you pay a consultant to create, it may be more economical and more successful to go this route than to do it yourself. If you decide to give someone a commission, you should make no commitments to implement the project with the firm preparing your specification, and you should make sure that you have free rights to use the plan the firm produces as you see fit.

Wise consulting firms recognize that a project specification is a unique selling tool that gives them a leg up on the competition when it comes time to bid on the work. Indeed, Dave Rothfield from Creative Sales + Management (www.csm4tqs.com), a marketing consulting firm, has likened creating a project specification to docking a large ocean liner. When an ocean liner gets close to the dock, hands onboard throw down small ropes with large knots on the end, called monkey paws. Those small ropes are attached to the larger ropes that are then pulled in and fastened to the dock to secure the liner. Similarly, a small project specification can often lead to larger things.

The Budget Phase and Time Lines

A project specification should lead naturally to the second phase, which is where the budget is set for the project. You might find that the budget constrains your project so that some resources you hoped to deploy can't be deployed, or you might find that your budget can support more of the resources that you first specified, resulting in a larger project. Budgets are often negotiated. You might find that an initial budget results in rewriting the project specification to match a new reality, and that is all to the good. The more iterative the process, the more accurate and smooth the results.

A project specification should end up with the budget and time line appended to the back of the proposal and with signature sign-offs by all the significant parties concerned. The purpose of having a complete project specification sign-off on the budget and the time line is that it limits the project so that it doesn't run wild with what is often called "scope creep" or "feature creep," where additional functions are created, resources are deployed, and contractor time is spent beyond what the budget calls for. Feature creep is the single most common reason software projects run well over budget or fail.

For large deployment projects, you should use project management software such as Microsoft Project. When multiple resources are required to complete a deployment, and where one phase of the project is dependent on other related parts of the project, there is too much activity going on for progress against a time line and budget to be manually tracked.

Note

Microsoft Project is probably the best-known project management software, but it is far from being the only product. Many other commercial products are available; some of them are web based, and others are shareware or open source products. For Yahoo's Project Management Software jump page, go to http://dir.yahoo.com/business_and_economy/business_to_business/computers/software/business_applications/Project_Management/.

Good project management software tracks costs versus budgets, tracks activities, and measures progress against project milestones. The most valuable feature of project management software is the identification of the critical path of the project. The critical path is the set of tasks that determines the overall progress of the project and affects its overall completion time. Identification of the critical path is essential in keeping many large projects on budget. If other people and project resources are inactive and waiting for critical tasks to be completed, your deployment will undoubtedly accrue additional costs for which you didn't plan.
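The idea of a critical path can be illustrated with a minimal sketch. The task names, durations, and dependencies below are hypothetical and are not taken from any particular project management product; the function simply finds the longest chain of dependent tasks, assuming the dependencies contain no cycles.

```python
def critical_path(tasks):
    """Find one critical path through a set of dependent tasks.

    tasks maps a task name to (duration_in_days, [prerequisite names]).
    Returns (total_days, [task names on the critical path, in order]).
    Assumes the dependency graph has no cycles.
    """
    finish = {}  # earliest finish time of each task
    via = {}     # the prerequisite that determines when each task can start

    def earliest_finish(name):
        if name in finish:
            return finish[name]
        duration, prerequisites = tasks[name]
        start = 0
        via[name] = None
        for prerequisite in prerequisites:
            done = earliest_finish(prerequisite)
            if done > start:
                start = done
                via[name] = prerequisite
        finish[name] = start + duration
        return finish[name]

    for name in tasks:
        earliest_finish(name)

    # Walk back from the task that finishes last.
    node = max(finish, key=finish.get)
    total = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = via[node]
    return total, path[::-1]


# Hypothetical deployment plan (durations in days).
tasks = {
    "order servers": (21, []),
    "prepare racks": (5, []),
    "install servers": (3, ["order servers", "prepare racks"]),
    "install applications": (4, ["install servers"]),
    "train staff": (2, []),
    "go live": (1, ["install applications", "train staff"]),
}

total_days, path = critical_path(tasks)
print(total_days, path)
```

In this made-up plan, preparing the racks and training the staff have slack, so a delay there need not move the completion date, whereas any delay in server delivery pushes out every task that follows it.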

It's important to realize that any deployment has two separate budgets. The first budget is the actual deployment, and the second budget (which is often ignored) is the annual operating costs. Many people simply don't take into consideration the annual costs, which can lead to some very nasty surprises later on.

A good project specification not only identifies the primary budget for deployment but projects the annual costs two or three years out. To create an operational budget, you have to make some assumptions about the following:

  • Reliability: Server hardware is pretty reliable, but components do fail, and if you don't have a warranty covering the full replacement, you need to pay for replacements or warranties.

  • Cost of services being used: As best you can, you should break down the budget so that each item in the project is budgeted individually. It's good project management to phase your project so that payments are made based on project milestones.

  • Electricity and other utilities: These costs may include space rental if you are deploying servers in leased space, such as a cage at an ISP.

  • Support staff: Of all of the aforementioned factors, IT support staff can be the most expensive component of a deployment. It is critical that staffing be accurately gauged and accounted for.

The electricity you use, the portion of floor space, and the amount of staff time needed to manage the system are knowable factors, even if you don't know what the exact cost of these things will be one, two, or three years from now. So be sure to include them in your specification.
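As a rough illustration of projecting an operating budget a few years out, the following sketch compounds each cost category by an assumed annual growth rate. The category names echo the list above, but every dollar figure and growth rate here is a made-up assumption; you would substitute your own estimates.

```python
def project_operating_costs(first_year_costs, annual_growth, years=3):
    """Project total operating cost for each of the next `years` years.

    first_year_costs maps a cost category to its first-year amount.
    annual_growth maps a category to an assumed yearly growth rate
    (0.05 means 5%); categories not listed are assumed to stay flat.
    """
    totals = []
    for year in range(years):
        total = sum(
            cost * (1 + annual_growth.get(category, 0.0)) ** year
            for category, cost in first_year_costs.items()
        )
        totals.append(round(total, 2))
    return totals


# Hypothetical first-year figures and growth assumptions.
first_year_costs = {
    "warranties and replacement parts": 10_000,
    "services": 20_000,
    "electricity and space": 15_000,
    "support staff": 100_000,
}
annual_growth = {
    "electricity and space": 0.10,
    "support staff": 0.05,
}

projection = project_operating_costs(first_year_costs, annual_growth)
for year, total in enumerate(projection, start=1):
    print(f"Year {year}: {total:,.2f}")
```

Even with invented numbers, a table like this makes it obvious which category dominates the operating budget and how quickly the total drifts upward.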

The Testing Phase

A multiserver deployment should proceed to a testing phase in order to establish whether it is possible to meet the project specifications by using the hardware and software technologies selected during the project specification phase. The test bed you deploy should be a realistic microcosm of the larger system. The testing phase serves as a reality check.

If it's possible to stage your deployment so that your test bed is a module in a set of replicated modules, then there are fewer surprises when the deployment is implemented. However, that's not always possible.

Consider an example in which you are establishing a website with your deployment. The site will eventually have 50 blade servers in a couple of racks. As a first pass, you decide to deploy two blade servers to test your design assumptions. Those webservers are loaded with traffic from a client load simulator or from actual users, and you derive your parameters from that experiment. You want to know how many simultaneous connections you can use, how much traffic the two blades support, the impact of adding more memory or changing disk configurations, and so forth.

Describing performance testing in detail would require an entire book. For now, you need to know that a load simulator will allow you to run your two servers at their full load but may not really be representative of the traffic that your servers will actually encounter when they are deployed. Therefore, a better test might be to select an appropriate subset of users who you think will be representative of the actual loading in practice.

What you may come to realize in your testing is that your design requires that the webservers be fronted by an IP director that routes webserver requests to the least active webserver. To fully utilize the IP director you chose, you need to have it front 5 blade servers; however, when you create this design, you find that your 5 blade servers are now 20% more efficient than your original specification assumed they would be. So this is an iteration of your project specification that needs to be accounted for: You need to change the design to add 8 IP directors and reduce your blade server count by 10, to 40.

Now not only does your project specification change, but your test bed setup changes as well. In order to get the correct performance characteristics, you need to have a test bed setup that has one IP director fronting five blade servers. That test bed represents a base module that you can expand when you are confident that the system works.
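The resizing arithmetic in this example can be written down as a small sketch. It reads "20% more efficient" as a 20% reduction in the number of blades needed, which is how the example arrives at 40 blades; that reading, and the function and parameter names, are assumptions made for illustration.

```python
import math


def size_modules(original_blades, blade_reduction_percent, blades_per_director):
    """Resize a deployment built from identical modules.

    Each module is one IP director fronting `blades_per_director` blades.
    Returns (number_of_modules, total_blades), rounding up to whole modules.
    """
    blades_needed = math.ceil(
        original_blades * (100 - blade_reduction_percent) / 100
    )
    modules = math.ceil(blades_needed / blades_per_director)
    return modules, modules * blades_per_director


# Figures from the example: 50 blades originally, 20% fewer needed,
# 5 blades behind each IP director.
ip_directors, blades = size_modules(50, 20, 5)
print(ip_directors, blades)
```

Because the function rounds up to whole modules, changing any of the three inputs after another round of testing immediately shows how many modules the revised specification and budget must cover.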

The point is that your test bed phase should be a positive feedback loop. You alter your test bed to get the best results you can; then you change your budget accordingly. As you learn more about the system, you can go through another round to more precisely specify the system.

The most important criterion for running a test bed and getting the results you need is to be as accurate as you can. Creating a test bed with a representative sample of your actual users is preferred over running a client load simulator and trying to guess what loading your users will actually present to your system. If you can locate your test systems in the domain in which they are going to be positioned, you can determine the effects that the domain's security features have on your website's potential performance. Little things mean a lot, so if you can discover that the router in front of your IP directors is underpowered for the job in the test phase, you can save yourself that much extra grief later on.

The Implementation Phase

If you are deploying a single server or a small number of servers, chances are that your test phase and implementation are one and the same thing. However, for larger systems, implementation is the phase of deployment in which what was once small is made large. You know that the theory of your system is correct because you've verified it with a test setup. However, practice is not theory, and anything that involves money and people can still go very wrong.

Implementation is where your resource planning pays off. If your project planning is well done, then the equipment should be available when you need it, the people necessary to install and manage the equipment should be there when you need them, and the money to pay the bills should be available when it is needed. In this dream scenario, the bills don't appear until after the money is available. In the real world, it is never possible to choreograph all the dependent resources perfectly. Things just tend to go wrong, and you have to make the best of your situation.

The key to making an implementation work well is to recognize early on just which part of the deployment is going to be troublesome and plan around it. Not all troubles affect your deployment's schedule or budget. If you have a contractor who is slow but whose work is acceptable, and the person's work doesn't affect anyone else's work, then the contractor's speed is more an issue for his or her bottom line and not yours. Your real troubles appear when a task on the critical path for your deployment goes wrong. Because most people don't bother to analyze what their critical path is, you are already ahead of the game if you've thought this through.

The arrival of the servers you ordered is often one of the steps on any project's critical path. You can't set up your racks or install your applications until the servers are in place. So if you anticipate a problem with the delivery of the servers, you could tie a certain date into your contract with the OEM so that if the servers don't arrive because they take a new kind of high-speed memory that isn't readily available, for example, you can replace them with a similar model from another vendor. Alternatively, you could switch to a similar model in the original vendor's line that doesn't use that kind of memory. Or, you could take delivery of the originally specified model, using an older memory type, and then swap it out when the new memory becomes available. The point is that whatever you decide to do, you need to build into your project a set of contingencies that allow you to work around the problem.

One aspect of implementation involves making decisions about single-, dual-, or multisourcing. Single-sourcing is when you use only one vendor to supply a system, part, service, or even the entire project. A vendor that is your single-source partner for server deployments is essentially a master contractor, and you transfer the issues of selection to that vendor. Single-sourcing works well when the vendor has both the resources and track record to give you a high degree of certainty of the project's success.

Not all companies are comfortable with single-sourced projects, and for good reason. Even the very best contractor can run into trouble, for all the same reasons that you can. Most people want more control over their business and destiny. Therefore, having a second vendor as a backup or even as a minor participant, ready to take over in case of a problem, is prudent. Your chances of success are greater when you have a good backup plan. Many companies insist on having a second vendor for all their parts, simply because they don't want to build a product and have their business be dependent on a single partner.

Finally, many deployments are multivendor affairs that can be managed in-house by your IT staff, by a consultant, or by a firm that specializes in this kind of work. Multivendor deployments are more complex and difficult by nature to manage, but they offer the advantage of allowing you to pick the best vendor for the task, with best defined by some combination of quality, cost, and speed.

The Review Phase

The final phase in any planned project should be a review of what was accomplished. Often organizations are too busy with the work at hand to spend the time necessary to frame the lessons learned. However, if your server deployment is something that will be reproduced or extended later on and you fail to create a final review, many of the lessons learned will be lost.

Even if you are only deploying a single server, it is valuable to record the original assumptions you made in selecting the equipment and software and how those assumptions were borne out in practice. Chances are that at some point in the future, you will be called on to upgrade the software or the server, or possibly replicate the application in another location.

After deployment, someone familiar with the entire project should be called upon to record the results obtained. A good place to start a review cycle is to begin circulating the project specification among selected IT staff and users. Many people scout around for document review software, and indeed there is a large category of that kind of software. However, you can use office productivity software such as Microsoft Word to generate a review, post a discussion in SharePoint, or even start an email chain letter to get the results you need. The point is that you need to be inclusive in obtaining feedback on the project's results from many different points of view.

You should ask people to comment on how the system compares with what was projected by annotating the document and by giving you comments to a number of pertinent questions, such as these:

  • Did the deployment meet its functionality and performance objectives?

  • What areas of the system(s) require improvement?

  • In hindsight, if you could change aspects of the project, what would those aspects be?

  • Are there any improvements to the system that should be considered moving forward?

  • How manageable is the system for IT staff?

  • How easy is it to use the system for users?

  • Do you have any general comments?

Your list of questions will undoubtedly be different from this, and it may involve evaluating specific contractors or employees, as well as delving into politics and other aspects of the project that you feel are important to document for projects to come. You might need to publish one form of the document while reserving sensitive matters for people who may have a need to know.
