Building N1 Grid Solutions: Preparing, Architecting, and Implementing Service-Centric Data Centers
Utility computing (UC) is often considered one of the greatest promises of the next decade of computing. Millions of dollars are spent every year on this new era of computing. UC is, however, difficult to define, as it seems to mean something different to everyone asked, and to every system vendor. UC can be narrowly defined as simply aligning pricing models to data center resources, or pay-as-you-go models. This definition focuses on the hardware: CPUs, storage systems, and sometimes the network. Almost everyone agrees on the benefits of UC:
Technically, UC requires the following items:
This model enables IT managers to move toward pool-based management in which systems and other resources are used as they are needed, rather than dedicated for the life cycle of the hardware. IT managers can use the N1 Grid software to reprovision and repurpose systems accordingly, without impact on the business unit. The pay-as-you-go model for hardware is usually a first step. This step focuses on reducing per-project costs, driving higher utilization, and providing a better cost model. UC impacts the following three areas:
Utility is really about aligning as many IT costs with the business as possible, including not only system resources, but transactions, service levels, and most importantly, the users. The ability to create a model that supports strategic flexibility represents the most interesting side effect, enabling the business to move toward a service-based utility.
Service Utility
After physical resources are better aligned with the business, the services, applications, and service levels can also be addressed. This is complicated, requiring better application instrumentation and monitoring, but it enables premium services and billing and higher degrees of flexibility.
UC is often compared to another famous utility: the power company. The power company has a standard transaction (the kilowatt-hour) that enables a user to "pay for the use" of power. In the popular model, the power company charges you each time you need capacity.
Just as service has several different meanings in the context of IT, so does the customer. For example, operations provides hardware and operating environment services for application developers. Application developers and architects can provide a standard J2EE foundation service for other developers to use. The customer for application development is the business unit that can be charged on a transactional basis.
The service utility moves toward billing for transactional and application usage, closely linking IT systems and their cost with the business. For example, cell phone users have become accustomed to purchasing ring tones over the air for their cellular phones. A user pays a small fee each time. This is an example of a service utility. The service utility enables companies to find the true cost of a business transaction, rather than simply being billed based on server utilization or storage utilization. For example, a business unit might be charged based on the number of HTTP transactions it handles through its service.
This broader definition requires methods of data center and business operations that differ in many ways from those of the past. Most companies would need to achieve operational maturity level 4 to benefit from the service utility.
Utility Transformation
Why focus on billing and how it relates to the business? The business, and its relationship to the IT environment, has changed over the last decade. Companies now make purchasing decisions when they have to, and only when they have to, rather than overbuilding, provisioning, and disregarding the actual cost and maintenance of IT systems. IT decisions today are still made on a per-project basis. Servers and storage are over-provisioned because they are hard to change. Companies rarely shrink the resource footprint after they realize that systems are underutilized. It is too complicated, error prone, and costly.
The billing and pricing models are important because they provide the incentive for the business to change how it does things. Aligning the IT organization costs to the transaction and to the business enables the business to see how much the services are actually costing. It enables the IT organization to charge the business based on operational cost, and most importantly, it enables the IT organization to make changes in its domain. If the IT organization wants to move servers around to achieve better cost efficiency or to change applications based on their desired service levels, then the physical IT resources cannot be owned by the business unit. The business unit should simply be charged for what it needs, according to its quality-of-service demands. This requires changes in the process of IT, in how systems are provisioned and managed, and in how projects use system resources.
The N1 Grid software and Sun's managed and utility offerings can help companies move towards this model. The N1 Grid software offers the IT organization better ways to manage its existing and new systems, including infrastructure, application, and data center optimization, as discussed earlier in this book. These technologies can be used, along with others, to provide UC and the service utility.
Utility Computing Enabling
The success of UC requires changes in the business and in the IT organization. The business must be able to change the way it has procured hardware and other system resources over the years, moving back to a well-known model that is almost ancient by today's standards: the mainframe. Companies did not generally buy a mainframe for each project. They worked with their IT co-workers to ensure that the current system had enough capacity. Applications had specific requirements for processors, operating system versions, storage, and other dependencies. If these were met, the system programmers installed, configured, and brought up the application.
In some ways, UC really is not that much different. The IT organization needs to work with the business unit to understand its needs, including the number of CPU or processor units, the amount of storage, or whatever else is required for the service. The business unit needs to move away from the expectation that it owns some hardware in the data center. Rather, it must understand that it is only purchasing a service. The business needs to use cost incentives to help drive business units into accepting this type of model. The reality is that everyone will save money, and the business will achieve better results with a flexible utility model that enables growth and rightsizing of services.
Business Changes for Utility Computing
Building the incentive for the business to change can be difficult, but real cost savings can be demonstrated with the utility model. For example, a business unit might purchase one million dollars' worth of servers, and over the course of a few years, use only 20 to 30 percent of their capacity, based on industry averages. This method of purchasing wastes $700,000 to $800,000 on hardware costs alone, not counting the ongoing management costs of the hardware. (Using industry estimates from Gartner, acquisition costs make up only 20 percent of the ongoing TCO.) If the IT organization is given control over these resources through a UC model, several business units can use services from the same hardware, reducing TCO by reducing the number of systems to manage.
The previous chapters focused on the technological concepts around creating operational efficiency in management of systems and their applications. Operational efficiency enables a different way of thinking and managing systems, called strategic flexibility. The business aspects of UC also help to enable strategic flexibility, a cornerstone of the ultimate N1 Grid vision. By changing the acquisition model and the resulting effects on the IT organization, strategic flexibility can become a reality. The IT organization would gain the ability to rightsize servers, applications, and their services to gain both internal and external competitive advantages. Systems would no longer be static. They would become dynamic, expanding and shrinking as the business requires. The technology to support the dynamic data center is a core part of the N1 Grid software.
Utility Technologies and the N1 Grid Software
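The hardware purchase example in the discussion of business changes can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the purchase cost and utilization range are taken from the text, and the implied TCO figure simply assumes that the 20 percent acquisition share applies to this particular purchase.

```python
def unused_hardware_spend(purchase_cost, utilization):
    """Return the part of the purchase cost that pays for capacity never used."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return purchase_cost * (1.0 - utilization)


def implied_tco(acquisition_cost, acquisition_share=0.20):
    """Estimate TCO, assuming acquisition is a fixed share of it (an assumption)."""
    if not 0.0 < acquisition_share <= 1.0:
        raise ValueError("acquisition_share must be in (0, 1]")
    return acquisition_cost / acquisition_share


purchase = 1_000_000  # dollars spent on servers, from the example in the text

# 20 to 30 percent utilization leaves 70 to 80 percent of the spend unused.
low = unused_hardware_spend(purchase, 0.30)
high = unused_hardware_spend(purchase, 0.20)
tco = implied_tco(purchase)

print(f"Unused hardware spend: ${low:,.0f} to ${high:,.0f}")
print(f"Implied TCO at a 20 percent acquisition share: ${tco:,.0f}")
```

Under these assumptions the wasted hardware spend is only part of the picture, because the ongoing cost of managing underutilized systems is several times the purchase price.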
UC and the N1 Grid software require many of the same systems and tools described earlier in this book. UC implementations follow the basic data center best practices and use data center tools to help automate common functions such as infrastructure and application provisioning. The N1 Grid software uses pools of resources and allocates systems based on service needs. This can be performed manually by using the N1 Grid software today, but it will be automated as additional features are added to the N1 Grid software. FIGURE 11-1 shows the N1 Grid software in relation to other Sun technologies, including Sun remote services. Here, the core system running Sun Java Enterprise System (Java ES) is managed by the N1 Grid software and integrated with Sun's remote management systems.
Figure 11-1. UC Architecture Diagram Showing Remote Services, N1 Grid PS, and N1 Grid SPS
The N1 Grid PS can be used to load common operating system images, based on service requirements. It can also load common management agents, including UC billing products, such as Sun's remote services or other third-party products. Systems can be brought into the resource pool by either acquiring new hardware, such as through Sun's utility program, or through decommissioning existing servers. As systems enter the pool, they can be refreshed and updated with the latest operating system loads and tools, so they are ready for application and service provisioning. The N1 Grid SPS software can then be used to add applications based on business needs. It can perform other end-to-end service provisioning tasks, such as updating an external DNS system. Adding applications and enabling external systems enables the N1 Grid SPS to bring up an entire service or just components of the service, as required.
Analysis with Utility Computing
UC requires the instrumentation, telemetry, and management systems described in the previous sections, along with a strong emphasis on tools to facilitate chargeback and billing models. Tools are also necessary to rightsize resources based on usage and to provide input into the acquisition of new resources. In "The IT Utility Model Part II" (Sun BluePrints Online, August 2003), Emily Pagden discusses some of the tools used to enable the chargeback model. The article also discusses aspects of the Sun Remote Services technology, which enables companies to purchase systems from Sun in a utility supplier model and enables remote management capabilities. Today, UC resources are usually billed back to the user or business unit based on the resources allocated, not on those actually utilized. This is important, as users might want the ability to purchase services at a service level that enables them to grow into their allocated resources.
Service Level Management with Utility Computing
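To make the distinction between allocated and utilized resources concrete, the following minimal sketch bills a business unit on its allocation and reports how much headroom it still has to grow into. The rates, the class, and the function names are hypothetical and are not part of any Sun or N1 Grid product; a real chargeback tool would take its rates from the IT organization's cost model.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    business_unit: str
    cpus_allocated: int
    cpus_used_avg: float       # average CPUs actually busy over the period
    storage_gb_allocated: int


# Hypothetical monthly rates, chosen only for illustration.
CPU_RATE = 150.0       # dollars per allocated CPU per month
STORAGE_RATE = 0.50    # dollars per allocated GB per month


def monthly_charge(a: Allocation) -> float:
    """Bill on allocated resources, not on measured utilization."""
    return a.cpus_allocated * CPU_RATE + a.storage_gb_allocated * STORAGE_RATE


def cpu_headroom(a: Allocation) -> float:
    """Fraction of the allocated CPU capacity the unit can still grow into."""
    if a.cpus_allocated <= 0:
        raise ValueError("cpus_allocated must be positive")
    return 1.0 - a.cpus_used_avg / a.cpus_allocated


unit = Allocation("claims", cpus_allocated=8, cpus_used_avg=2.0,
                  storage_gb_allocated=500)
print(f"{unit.business_unit}: charge ${monthly_charge(unit):,.2f}, "
      f"CPU headroom {cpu_headroom(unit):.0%}")
```

The charge stays the same whether the unit uses two CPUs or all eight, which is what lets a business unit buy room to grow; the headroom figure is the kind of input a rightsizing tool could use.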
As businesses move to UC, the IT organization must provide services at an adequate service level for UC to be successful. This requires both clear and concise service level agreements and tools to measure service availability. Most server operators measure availability by platform, but the industry is moving toward end-to-end service levels as the most important measure. This takes into account not only platform availability, but also application availability (including external systems and database servers). It might also include performance requirements, which are very common in financial services.
Some end-to-end service level management tools include Micromuse ISM and Mercury Topaz. They perform application-based checks similar to how a user or other system might actually use the service. This enables the tools to develop ongoing performance and availability measurements of the entire service. However, individual tools to monitor and analyze the components of the service are still important. They are necessary in decomposing the issues brought to the surface by the service level management platform, and they are important in providing additional information for troubleshooting.
Consolidation with Utility Computing
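The difference between platform availability and end-to-end availability in the service level discussion can be sketched as follows. The sketch assumes that the service needs every component to be up and that component failures are independent, which real services only approximate. The component figures and function names are hypothetical, and the list of check results stands in for what a service level management tool would collect.

```python
from math import prod


def end_to_end_availability(component_availabilities):
    """Availability of a service that needs every component up at once.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return prod(component_availabilities)


def measured_availability(check_results):
    """Fraction of application-level checks that succeeded."""
    if not check_results:
        raise ValueError("at least one check result is required")
    return sum(1 for ok in check_results if ok) / len(check_results)


# Hypothetical per-component availabilities for one service.
components = {"platform": 0.999, "application": 0.995, "database": 0.998}
e2e = end_to_end_availability(components.values())

# Hypothetical results of four synthetic transactions against the service.
checks = [True, True, False, True]

print(f"Best single component: {max(components.values()):.3%}")
print(f"End-to-end estimate:   {e2e:.3%}")
print(f"Measured by checks:    {measured_availability(checks):.0%}")
```

Under these assumptions the end-to-end figure is always lower than the weakest component's, which is why a platform-only measurement tends to overstate what the user of the service experiences.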
Another important aspect of enabling UC is looking at systems and applications for consolidation. Server consolidation enables companies to look at the allocated resources and their applications and to analyze those for better density and increased utilization. Consolidation also enables a reduction in TCO by potentially reducing the number of managed systems. Service and application profiling enable architects to look at the various components that make up a service, list their requirements, and determine how they could be better combined to improve TCO. This process can be used to determine the viability of moving existing services into the UC model. As services get updated, they could be deployed using UC or a plan could be put forth to migrate existing services, depending on the complexity and potential ROI. Appendix B contains references to documents and white papers that discuss server consolidation and service level management. Many of the application profiling activities that are necessary for UC and N1 Grid software implementations are also common with server consolidation.
Standards for the Next Decade
The opportunity and cost savings in reducing ongoing IT costs are quite real. Standards and interoperability between system vendors are still under development. Unfortunately, standards work is time consuming, so companies should not wait to implement UC-based services. The N1 Grid system software enables companies to gain some of the business and technical benefits of UC. Although UC focuses on the business aspects of TCO and the technical aspects of dealing with service growth and contraction, grid computing concentrates on some of these same aspects, while enabling a very distributed architecture. Appendix B contains resources that discuss the implementation of utility computing, the tools, the technologies, and the financial models that can be used to build a service utility.