Beowulf Cluster Computing With Linux 2003
16.4 Steering Workload and Improving Quality of Information
A good scheduler can improve the use of a cluster significantly, but its effectiveness is limited by the scheduling environment in which it must operate and the quality of information it receives. Often, a cluster is underutilized because users overestimate a job's resource requirements. Other times, inefficiencies crop up when users request job constraints in terms of job duration or processors required that are not easily packed onto the cluster. Maui provides tools to allow fine tuning of job resource requirement information and steering of cluster workload so as to allow maximum utilization of the system.
One such tool is the feedback interface, which allows a site to report detailed job usage statistics to users. This interface provides information about the resources requested and those actually used. With the FEEDBACKPROGRAM parameter, local scripts can be executed that use this information to help users improve resource requirement estimates. For example, a site with nodes of various memory configurations may choose to create a script that automatically mails a notice such as the following at job completion:
Job 1371 completed successfully. Note that it requested nodes with 512 MBytes of RAM yet used only 112 MBytes. Had the job provided a more accurate estimate, it would have, on average, started 02:27:16 earlier.
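A minimal sketch of the message-building step of such a script follows. It is an illustration only: the source does not show which arguments Maui passes to the program named by FEEDBACKPROGRAM, so the function below assumes the job ID, requested memory, and used memory have already been parsed, and that the estimated start-time improvement is supplied by the caller. The names `format_hms` and `build_notice` are hypothetical.

```python
def format_hms(seconds: int) -> str:
    """Format a duration in seconds as HH:MM:SS."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def build_notice(job_id: int, requested_mb: int, used_mb: int,
                 est_delay_seconds: int) -> str:
    """Build the text of a completion notice (hypothetical helper).

    est_delay_seconds is an assumed input: the average amount by which
    the job would have started earlier with an accurate request.
    """
    notice = f"Job {job_id} completed successfully."
    if used_mb < requested_mb:
        # Only mention memory when the request exceeded actual usage.
        notice += (f" Note that it requested nodes with {requested_mb} MBytes"
                   f" of RAM yet used only {used_mb} MBytes.")
        if est_delay_seconds > 0:
            notice += (" Had the job provided a more accurate estimate, it"
                       " would have, on average, started"
                       f" {format_hms(est_delay_seconds)} earlier.")
    return notice


if __name__ == "__main__":
    # Values taken from the sample notice above; 02:27:16 is 8836 seconds.
    print(build_notice(1371, 512, 112, 8836))
```

A real script would also need to look up the user's mail address and hand the text to the site's mail system; both steps are site specific and are omitted here.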
While such notices can be used to improve memory, disk, processor, and wall-time estimates, they may be freely ignored by the end user. A more forceful approach is to use the allocation manager charge policy so as to charge users for requested resources rather than used resources. This approach quickly motivates end users to evaluate their true job needs and adjust their job requests accordingly.
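The effect of charging for requested rather than used resources can be illustrated with simple arithmetic. The sketch below assumes a charge measured in processor-hours at a flat rate; the actual units and rates depend on the allocation manager and site policy, and the `charge` function is hypothetical.

```python
def charge(processors: int, wall_hours: float,
           rate_per_proc_hour: float = 1.0) -> float:
    """Charge for a job as processors x wall-clock hours x rate.

    Processor-hours at a flat rate are an assumption made for this
    illustration, not a description of any particular allocation manager.
    """
    return processors * wall_hours * rate_per_proc_hour


if __name__ == "__main__":
    # Hypothetical job: 16 processors, 10 hours requested, 4 hours used.
    by_request = charge(16, 10.0)
    by_usage = charge(16, 4.0)
    print(f"charged on request: {by_request} processor-hours")
    print(f"charged on usage:   {by_usage} processor-hours")
    print(f"cost of overestimate: {by_request - by_usage} processor-hours")
```

Under a request-based policy the user in this example pays for 96 processor-hours that were never used, which is the incentive the policy is meant to create.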
Another realm of feedback involves steering jobs to use currently available resources. The showbf command is designed to help users tailor jobs to request resources that are free for immediate use. This command allows users to incorporate specific information about what they need and who needs it, allowing all scheduling policies and resource availability information to be integrated into the response. Users may specify details about the prospective job including user, group, queue, and memory requirements, and the command returns information regarding the quantity of available nodes and the duration of their availability.
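How a user might act on this kind of availability information can be sketched as a simple filter. The code below does not call showbf or parse its output; it assumes the reported availability has already been reduced to a list of (node count, duration) windows, and the `Window` type and `fitting_windows` function are hypothetical names introduced for the illustration.

```python
from typing import List, NamedTuple, Optional


class Window(NamedTuple):
    """One availability window: a node count and how long the nodes stay free."""
    nodes: int
    seconds: Optional[int]  # None means no known time limit


def fitting_windows(windows: List[Window], nodes_needed: int,
                    seconds_needed: int) -> List[Window]:
    """Return the windows in which the job could start immediately."""
    return [
        w for w in windows
        if w.nodes >= nodes_needed
        and (w.seconds is None or w.seconds >= seconds_needed)
    ]


if __name__ == "__main__":
    # Hypothetical availability: 32 nodes for 30 minutes, 8 nodes for
    # 4 hours, and 4 nodes with no time limit.
    available = [Window(32, 1800), Window(8, 14400), Window(4, None)]
    print(fitting_windows(available, nodes_needed=8, seconds_needed=3600))
```

A user whose job can run on either 8 nodes for an hour or 32 nodes for 15 minutes could test both shapes against the windows and submit whichever fits.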
A third area of user feedback is job scaling. Users often submit parallel jobs that scale only moderately, hoping that requesting more processors will make the job run faster and deliver results sooner. A job's completion time, however, is the sum of its queue time and its execution time. Users often fail to realize that a larger job may be more difficult to schedule, resulting in a longer queue time, and may run less efficiently, with a sublinear speedup. The increased queue-time delay, together with the limited improvement in execution time, generally results in larger jobs having a greater average turnaround time than smaller jobs performing the same work. Maui commands such as showgrid can provide real-time job efficiency and average queue-time statistics correlated to job attributes such as job size. The output of the mprof command can also be used to provide per-user job efficiency and average queue time correlated by job size and can alert administrators and users to this problem.
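The trade-off can be made concrete with a small model. The sketch below uses Amdahl's law as a stand-in for sublinear speedup, which is a modeling choice made here and not something the text prescribes. The parallel fraction and the queue times are hypothetical inputs chosen for illustration, not measurements from any cluster.

```python
def execution_time(serial_seconds: float, processors: int,
                   parallel_fraction: float) -> float:
    """Run time on `processors` CPUs under Amdahl's law (assumed model)."""
    return serial_seconds * ((1.0 - parallel_fraction)
                             + parallel_fraction / processors)


def efficiency(processors: int, parallel_fraction: float) -> float:
    """Parallel efficiency: speedup divided by processor count."""
    speedup = 1.0 / ((1.0 - parallel_fraction)
                     + parallel_fraction / processors)
    return speedup / processors


def turnaround(queue_seconds: float, serial_seconds: float,
               processors: int, parallel_fraction: float) -> float:
    """Turnaround time = queue time + execution time."""
    return queue_seconds + execution_time(serial_seconds, processors,
                                          parallel_fraction)


if __name__ == "__main__":
    # Hypothetical job: 10 hours of serial work, 90% of it parallelizable.
    # Hypothetical queue times: 30 minutes at 16 CPUs, 2 hours at 64 CPUs.
    for procs, queue in [(16, 1800.0), (64, 7200.0)]:
        total = turnaround(queue, 36000.0, procs, 0.9)
        print(f"{procs:3d} processors: efficiency "
              f"{efficiency(procs, 0.9):.2f}, turnaround {total:.0f} s")
```

With these assumed inputs the 64-processor request finishes its computation sooner but waits so much longer in the queue that its total turnaround is worse than that of the 16-processor request.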