HP VantagePoint Performance Agent and HP VantagePoint Performance Analyzer/UX

Some performance tools track and chart data over a long period of time. System administrators often call this exercise "capacity planning." The goal of capacity planning is to see which system resources have been consumed over a long period and to determine what adjustments or additions to the system would improve performance and prepare it for the future. We'll use HP VantagePoint Performance Agent (formerly MeasureWare Agent) and HP VantagePoint Performance Analyzer/UX (formerly PerfView Analyzer) together to look at the performance of a system. These tools run on HP-UX and are similar to many advanced tools that run on other UNIX variants.

The VantagePoint Performance Agent is installed on individual systems throughout a distributed environment, where it collects resource and performance measurement data. The VantagePoint Performance Analyzer/UX management console, which you would typically install on a management system, is then used to display the historical VantagePoint Performance Agent data. You can also set alarms that are triggered by exception conditions detected by the VantagePoint Performance Agent. For instance, if the agent detects an exception condition such as CPU utilization greater than 90%, it produces an alarm message. The alarm messages are then displayed with VantagePoint Performance Analyzer/UX.

We're going to use the VantagePoint Performance Analyzer/UX in our upcoming examples; however, there are really three VantagePoint Performance components:

Monitor | Provides alarm monitoring by accepting alarms from VantagePoint Performance and displaying them.
Planner | Provides forecasting by extrapolating VantagePoint Performance data.
Analyzer | Analyzes and displays VantagePoint Performance data from multiple systems. You can view the data from multiple systems simultaneously.

In our example, we will work with a single server rather than with data from several distributed systems. We'll take the VantagePoint Performance data, collected over roughly a one-week period, and display some of it.

HP VantagePoint Performance Agent produces log files that contain information about system resource consumption. The longer HP VantagePoint Performance Agent runs, the longer the period its log files cover. I am often called to review systems that are running poorly and to propose system upgrades. I usually run HP VantagePoint Performance Agent for a minimum of a week so that the logs cover a long enough period to yield useful data. For some systems, this time period is months. For other systems with a regular load, a week may be enough time.

After running VantagePoint Performance for a week, I invoked VantagePoint Performance Analyzer/UX to see the level of system resource utilization over that week. The graphs we'll review are CPU, Memory, and Disk. Figure 12-11 shows the Global CPU Summary for the week:

Figure 12-11. Global CPU Summary Screen

You can adjust every imaginable feature of this graph with VantagePoint Performance Analyzer/UX. Unfortunately, the color in this graph is lost in the book. On the computer screen, the colors allow you to tell the parameters apart. Total CPU utilization is always the top point in the graph, and it is the sum of system and user mode utilization.

Figure 12-11 shows classic CPU utilization, with prime hours reflecting high CPU utilization and non-prime hours reflecting low CPU utilization. In some respects, however, this graph can be deceiving.
Because a data point occurs only every three hours (hence the eight ticks per 24-hour period), you don't get a view of the actual CPU utilization during a much smaller window of time. We can't, for instance, see precisely what time in the morning the CPU becomes heavily used. We can see that it is between the second and third ticks, but that is a long period, between 6:00 and 9:00 am. The same lack of granularity applies at the end of the day. We see a clear fall-off in CPU utilization between the fifth and seventh ticks, but this does not give us a well-defined view. Figure 12-12 shows CPU utilization during a much shorter time window.

Figure 12-12. Global CPU Summary - Short Time Period

Figure 12-12 shows CPU utilization at a finer granularity, which makes clear the activity spikes that occur throughout the day. For instance, a clear login spike occurs at 8:30 am.

Memory utilization can also be graphed over the course of the week, as shown in Figure 12-13.

Figure 12-13. Global Memory Summary Screen

User memory utilization is the bottom line of the graph, and it roughly corresponds to the CPU utilization shown earlier: it is low during non-prime hours and high during prime hours. System memory utilization is the middle line of the graph, and it remains fairly steady throughout the week. Total memory utilization is always the top line of the graph, and it is the sum of system and user utilization. It rises and drops with user utilization, because system memory utilization remains roughly the same.

The three-hour interval between data points on this graph may not give us the granularity we require. Figure 12-14 shows memory utilization during a much shorter time window.

Figure 12-14. Global Memory Summary - Short Time Period

Figure 12-14 shows a finer granularity of memory utilization during the shorter time window.
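
The effect of the three-hour interval discussed above can be sketched in a few lines of Python. This is only an illustration, not how VantagePoint Performance Analyzer/UX computes its data points: it assumes that a coarse data point is a simple average of finer samples, the utilization values are invented, and the `summarize` function is a hypothetical helper.

```python
def summarize(samples, samples_per_point):
    """Average consecutive samples into coarser data points."""
    return [
        sum(samples[i:i + samples_per_point]) / samples_per_point
        for i in range(0, len(samples), samples_per_point)
    ]

# Invented utilization values (%), one sample every 30 minutes
# from 6:00 to 9:00. The last sample stands in for an 8:30 spike.
half_hourly = [10, 10, 12, 15, 20, 95]

# One data point for the whole three-hour window.
three_hour_point = summarize(half_hourly, 6)

# Three hourly data points for the same window.
hourly_points = summarize(half_hourly, 2)

print("three-hour view:", three_hour_point)
print("hourly view:    ", hourly_points)
```

Under these assumptions, the single three-hour point sits far below the peak sample, while the hourly points at least show that the load arrives in the last hour of the window. That is the reason for switching to a shorter time window before drawing conclusions from a weekly graph.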
In the shorter window of Figure 12-14, you can see precisely how memory utilization changes over roughly one day.

Disk utilization can also be graphed over the course of the week, as shown in Figure 12-15.

Figure 12-15. Global Disk Summary

Like the CPU and memory graphs, this graph covers an entire week of disk usage. Because many spikes occur on this graph, we would surely want to view and analyze much shorter time windows. Figure 12-16 shows disk utilization during a much shorter time window.

Figure 12-16. Global Disk Summary - Short Time Period

This much shorter time window, of roughly three hours, shows a lot more detail. There are tremendous spikes in disk activity in the middle of the night. These could take place for a variety of reasons, including batch job processing or system backup.

You are not limited to viewing parameters related to only one system resource at a time. You can also view the way many system resources are used simultaneously, as shown in Figure 12-17.

Figure 12-17. Global Summary History Screen

Many system resources are present on this graph, including CPU, disk, and memory. You would surely want to view a much shorter time period when displaying so many system resources simultaneously. Figure 12-18 shows the same parameters during a much shorter time window.

Figure 12-18. Global Summary - Short Time Period

Figure 12-18 shows the utilization of many system resources at a finer granularity. You can now see how the various system resources relate to one another.

You can find the status of VantagePoint Performance Analyzer/UX running on your system with a useful command called perfstat. The following example issues the perfstat command with the -? option to list all perfstat options:

    # perfstat -?
    usage: perfstat [options]

    Unix option   Function
    -----------   --------
    -?            List all perfstat options.
    -c            Show system configuration information.
    -e            Search for warnings and errors from performance tool status files.
    -f            List size of performance tool status files.
    -p            List active performance tool processes.
    -t            Display last few lines of performance tool status files.
    -v            List version strings for performance tool files.
    -z            Dump perfstat info to a file and tar tape.

Using the -c option, you get information about your system configuration, as shown in the following listing:

    # perfstat -c
    **********************************************************
    ** perfstat for rp-ux6 on Fri May 15 12:20:06 EDT
    **********************************************************

    system configuration information:

    uname -a: HP-UX ux6 B.11.00 E 9000/800 71763 8-user license

    mounted file systems with disk space shown:

    Filesystem           kbytes     used    avail %used Mounted on
    /dev/vg00/lvol3       86016    27675    54736   34% /
    /dev/vg00/lvol1       67733    44928    16031   74% /stand
    /dev/vg00/lvol8      163840    66995    90927   42% /var
    /dev/vg00/lvol7      499712   358775   132155   73% /usr
    /dev/rp06vgtmp/tmp  4319777  1099297  3134084   26% /tmp
    /dev/vg00/lvol6      270336   188902    76405   71% /opt
    /dev/vgroot1/var     640691    15636   605834    3% /newvar
    /dev/vgroot1/usr     486677   356866   115210   76% /newusr
    /dev/vgroot1/stand    67733    45109    15850   74% /newstand
    /dev/vgroot1/root     83733    21181    54178   28% /newroot
    /dev/vgroot1/opt     263253   188109    67246   74% /newopt
    /dev/vg00/lvol5       20480     1109    18168    6% /home

    LAN interfaces:

    Name  Mtu   Network      Address    Ipkts    Opkts
    lo0   4136  127.0.0.0    localhost  7442     7442
    lan0  1500  192.60.11.0  rp-ux6     7847831  12939169

    ************* (end of perfstat -c output) ****************

Using the -f option shows the size of the performance tool status files, as shown in the following listing:

    # perfstat -f
    **********************************************************
    ** perfstat for ux6 on Fri May 15 12:20:08 EDT
    **********************************************************

    ls -l list of performance tool status files in /var/opt/perf:

    -rw-rw-rw- 1 root root 7812 May 10 19:35 status.alarmgen
    -rw-r--r-- 1 root root    0 May 10 02:40 status.mi
    -rw-rw-rw- 1 root root 3100 May 10 02:40 status.perflbd
    -rw-rw-rw- 1 root root 3978 May 10 02:40 status.rep_server
    -rw-r--r-- 1 root root 6079 May 11 23:30 status.scope
    -rw-r--r-- 1 root root    0 Mar 31 07:26 status.ttd

    ************* (end of perfstat -f output) ****************

Using the -v option displays the version strings for the performance tools, as shown in the following listing:

    # perfstat -v
    **********************************************************
    ** perfstat for ux6 on Fri May 15 12:20:08 EDT
    **********************************************************

    listing version strings for performance tool files:

    NOTE: The following software version information can be compared
    with the version information shown in the /opt/perf/ReleaseNotes file(s).

    MeasureWare executables in the directory /opt/perf/bin
    scopeux      C.01.00     12/17/97  HP-UX 11.0+
    ttd          A.11.00.15  12/15/97  HP-UX 11.00
    perflbd      C.01.00     12/17/97  HP-UX 11.0+
    alarmgen     C.01.00     12/17/97  HP-UX 11.0+
    agdbserver   C.01.00     12/17/97  HP-UX 11.0+
    agsysdb      C.01.00     12/17/97  HP-UX 11.0+
    rep_server   C.01.00     12/17/97  HP-UX 11.0+
    extract      C.01.00     12/17/97  HP-UX 11.0+
    utility      C.01.00     12/17/97  HP-UX 11.0+
    mwa          A.10.52     12/05/97
    perfstat     A.11.01     11/19/97
    dsilog       C.01.00     12/17/97  HP-UX 11.0+
    sdlcomp      C.01.00     12/17/97  HP-UX 11.0+
    sdlexpt      C.01.00     12/17/97  HP-UX 11.0+
    sdlgendata   C.01.00     12/17/97  HP-UX 11.0+
    sdlutil      C.01.00     12/17/97  HP-UX 11.0+

    Measureware libraries in the directory /opt/perf/lib
    libmwa.sl    C.01.00     12/17/97  HP-UX 11.0+
    libarm.a     A.11.00.15  12/15/97  HP-UX 11.00
    libarm.sl    A.11.00.15  12/15/97  HP-UX 11.00

    Measureware metric description file in the directory /var/opt/perf
    metdesc      C.01.00     12/17/97

    All critical MeasureWare files are accessible

    libnums.sl   B.11.00.15  12/15/97  HP-UX 11.00
    midaemon     B.11.00.15  12/15/97  HP-UX 11.00
    glance       B.11.01     12/16/97  HP-UX 11.00
    gpm          B.11.01     12/16/97  HP-UX 11.00

    ************* (end of perfstat -v output) ****************
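
For capacity planning, the file system portion of the perfstat -c output is often worth a quick scan. The following Python sketch flags file systems at or above a chosen utilization level. It is a minimal illustration under stated assumptions: the sample rows are copied from the perfstat -c listing shown earlier, the 70% threshold is arbitrary, and `busy_filesystems` is a hypothetical helper, not part of any HP tool.

```python
# Rows copied from the "mounted file systems" part of the
# perfstat -c listing; the header line is kept to show it is skipped.
SAMPLE = """\
Filesystem kbytes used avail %used Mounted on
/dev/vg00/lvol3 86016 27675 54736 34% /
/dev/vg00/lvol1 67733 44928 16031 74% /stand
/dev/vg00/lvol8 163840 66995 90927 42% /var
/dev/vg00/lvol7 499712 358775 132155 73% /usr
/dev/vg00/lvol6 270336 188902 76405 71% /opt
/dev/vg00/lvol5 20480 1109 18168 6% /home
"""

def busy_filesystems(listing, threshold=70):
    """Return (mount point, %used) pairs at or above the threshold.

    The threshold is an assumption chosen for illustration.
    """
    result = []
    for line in listing.splitlines():
        fields = line.split()
        # A data row has exactly six fields and a "%" in the fifth.
        # Headers, banners, and blank lines fail this check.
        if len(fields) != 6 or not fields[4].endswith("%"):
            continue
        percent = int(fields[4].rstrip("%"))
        if percent >= threshold:
            result.append((fields[5], percent))
    return result

for mount, percent in busy_filesystems(SAMPLE):
    print(f"{mount}: {percent}% used")
```

With the sample rows, the sketch reports /stand, /usr, and /opt. In practice you would feed it the saved output of perfstat -c rather than a hard-coded string, and pick a threshold that suits the growth rate of each file system.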