Design for Trustworthy Software: Tools, Techniques, and Methodology of Developing Robust Software

Problem 1: Managing Complexity in System Conversion

A major research university began preparing for Y2K in 1991. Y2K became an issue in 1996 in higher education due to the need to enroll the class of 2000 and prepare their student loans. The university's long-term goal was to develop a new Y2K-compliant client/server enterprise administrative application suite sharing a common base of data. The new system would serve several thousand PC-based clients and several thousand Mac-based clients. It would include applications such as purchasing that hitherto had eluded automation. Unfortunately, in the early '90s no software vendor could supply software to handle a research university with 24,000 students. The existing student registration and enrollment systems were no longer supported by their vendors, which had been merged into larger companies. The research university market in the United States amounts to perhaps 50 to 75 institutions. It has not been pursued as a viable market by software firms, which target smaller, nonresearch institutions with enrollments of 1,500 to 6,000 students.

The university's choices were as follows, in order of developmental complexity:

  • Reverse-engineer all mainframe applications using Texas Instruments IEF and make them Y2K-compliant by "windowing." In other words, let xx = 19xx if xx ≥ 50 and 20xx if xx < 50. Make code fixes at the flowchart level without changing the individual application databases. This is a labor-intensive but "quick and dirty" solution and is the least complex.

  • Replace all mainframe applications with new client/server applications having a common base of data, using third-party software firms. Exclude the critical non-Y2K-compliant (and highly customized) student registration and student finance systems, which are unavailable from third-party vendors for a university with an enrollment of more than 20,000 students. These applications would be reengineered by contract programmers using Xcellerator, with four-digit year dates incorporated into their individual databases via COBOL filler columns. The code would be redocumented and reimplemented to use these noncontiguous four-digit year dates.

  • Replace all mainframe applications with Y2K-compliant client/server applications. Contract for new student systems with a vendor willing to develop new applications, or scale up smaller ones, to meet the university's enrollment. Two vendors offered to do so, but neither had actually produced such systems at the required scale. (Complexity prevents many enterprise software applications from being scaled up by a factor of 2 or more.)
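The windowing rule in the first option can be sketched as a small pivot function (a minimal illustration, assuming the 1950 pivot implied by the rule quoted above):

```python
def expand_year(yy: int, pivot: int = 50) -> int:
    """Y2K 'windowing': map a two-digit year to a four-digit year.

    Two-digit years at or above the pivot are taken as 19xx;
    those below the pivot are taken as 20xx.
    """
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return 1900 + yy if yy >= pivot else 2000 + yy

# expand_year(96) -> 1996
# expand_year(4)  -> 2004
```

Windowing leaves the stored two-digit dates untouched, which is exactly why it is quick, and exactly why it does nothing for the long-term goal.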

The following table summarizes the factors involved in this decision. The first choice was a low-cost but labor-intensive approach that would further customize mainframe applications that are no longer supported by their vendors. Although this could be done between 1992 and 1996 using advanced COBOL software reengineering tools, it would do absolutely nothing for the longer-term goal of achieving a new client/server system with a single base of data. In this case the goal would be reduced to meeting only the near-term necessity of being Y2K-compliant by 1996.

The second choice was to install a completely new suite of client/server applications from one or more vendors, all using the same relational database management system (RDBMS). The most flexibility was obtained by choosing Oracle as the RDBMS, because it held a 70% share of this market. Because software at the proper scale was not available for student registration and finance, those applications would be reengineered as Y2K-compliant client/server applications until commercial replacements became available.

The third choice would require trusting a vendor that had never built a registration system that could handle 24,000 students. One vendor was testing a system that could handle 11,000 students, and the other had never produced such a system. Both eagerly sought a contract with the university. The CIO thought this approach was too risky: the first vendor's system appeared unable to scale by more than a factor of 2, and the second firm had no experience in this application area. Hindsight later justified these concerns. The second firm went on to win a contract to build a student registration system for a research university with an enrollment of 56,000 students, but the project failed.

Option                     Capital Cost  Labor Cost  Technical Risk  Business Risk  Time to Implement
A: Retread mainframe apps  Low           High        Medium          High           Very high
B: Replace partially       Medium        Medium      Low             Low            Medium
C: Replace completely      High          Medium      Very high       High           High

Sometimes it's easier to cross a brook in two steps by using a stepping-stone in the middle than to make a great leap and risk falling into the water. The CIO, MIS director, and system development director recommended and implemented Option B and successfully completed the new system on time and within budget. Did they make the best decision? Examine the results for sensitivity. Could the risk of a complete replacement solution have somehow been managed?
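One way to examine these results for sensitivity is to map the qualitative ratings onto a numeric scale and compute a weighted score per option. The scale and the equal default weights below are illustrative assumptions, not values from the case study:

```python
# Illustrative sensitivity sketch; the numeric scale and weights are assumptions.
SCALE = {"Low": 1, "Medium": 2, "High": 3, "Very high": 4}

# Ratings from the Problem 1 table:
# (capital cost, labor cost, technical risk, business risk, time to implement)
OPTIONS = {
    "A: Retread mainframe apps": ("Low", "High", "Medium", "High", "Very high"),
    "B: Replace partially":      ("Medium", "Medium", "Low", "Low", "Medium"),
    "C: Replace completely":     ("High", "Medium", "Very high", "High", "High"),
}

def score(ratings, weights=(1, 1, 1, 1, 1)):
    """Weighted sum of cost/risk/time ratings; lower is better."""
    return sum(w * SCALE[r] for w, r in zip(weights, ratings))

for name, ratings in OPTIONS.items():
    print(name, score(ratings))   # A: 13, B: 8, C: 15 with equal weights
```

Re-running with, say, doubled risk weights `(1, 1, 2, 2, 1)` tests how robust Option B's advantage is to the subjective ratings; if B wins under every plausible weighting, the decision is insensitive to the estimates.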

Problem 2: Managing Software Complexity in a High-Tech Start-up Enterprise

In 1990 a technology transfer program was set up under a NASA contract at a small university to make technology developed at 1,200 national laboratories available to American business entrepreneurs. The goal of this system was to allow the vice president of technology or chief technologist of a small, high-tech American company to e-mail a query to the technology transfer center. The reformatted query would be forwarded to the appropriate national lab technology transfer databases. The center would return a suitable response and licensing information as a package. The query engine employed a technical thesaurus of 5,000 terms and an ancillary fuzzy logic processor with 30,000 terms. The CIO of the startup center was faced with the following choices, in order of complexity:

  • Use conventional RDBMS technology to access the laboratory databases (which were running Oracle, anyway), and then use fax, e-mail, FedEx, and so on to convey unstructured data. This choice was low-tech, clunky, and highly manual.

  • Purchase a high-performance, novel, fully featured (but unproven) object-oriented database management system (OODBMS) from a high-tech start-up firm. Implement the entire technology transfer system as truly state-of-the-art technology. It would return to the user a package of objects and unstructured data types, including patents, test data, reports, photographs, film clips, revisions, and updates.

  • Choose a multiapplication approach with a fit of the best proven technology in available library and RDBMS software packages for each aspect of the application. Stitch them together to make the overall system as automatic and electronic as possible, essentially emulating OODBMS technology.

The following table summarizes the options considered by the developers. The first choice was considered the easiest means of getting a national technology transfer system operating as soon as possible. The system would be essentially manual in that it required making hard copies of documents and sending them to potential users via traditional media. The start-up process would be short. The great danger, however, was that if the service proved successful, the system probably could not handle the expected volume, and costs would increase disproportionately.

The second choice, which the CIO chose, was to employ novel but untested object-oriented database management technology to develop an ideal system that could obtain multimedia files from the national laboratories and package them as unstructured data objects for electronic transmission. The primary risks were that the OODBMS product would not come to market, that the vendor would fail, or that the product would not perform well. In fact, all three happened within a few months of the start of development, forcing the team to fall back on the third option.

The third choice was to emulate OODBMS technology by a more-or-less compatible suite of proven but older technology products. All the development team's programming skills would then be employed in stitching together these tools around an object-oriented front end that dealt with the remote user by furnishing unstructured data objects in the reply to the original query.

Option                Capital Cost  Labor Cost  Technical Risk  Business Risk  Competitive Advantage  Time to Implement
A: RDBMS technology   Medium        Medium      Low             Low            Low                    Low
B: OODBMS technology  Medium        High        Medium          Medium         Medium                 Medium
C: OODBMS emulation   Medium        Low         High            High           High                   Low

The CIO of this start-up activity chose Option B but was forced to retreat to Option C when the novel OODBMS could not meet the required performance goals. The new software was plagued by internal complexity issues and turned out to be a bit beyond the state of the art. Run an analysis with this data, and rank for yourself the complexity of the three options in terms of combined technical risk and business risk. If the passage of 25 years has managed to lower technical risk to medium and business risk to low for Option B, would Option B now be viable, that is, significantly less complex than it was?

Problem 3: Complexity in Patient Record Systems

So far, the complexity of medical patient record (PR) systems has eluded electronic automation. Such systems are an unsolved problem given today's medical and computer technology. A complex patient record in a large medical center may contain manually completed forms, computer forms, medical test results, handwritten notes, referral letters, photographs, X-rays, CAT scans, MRI scans, PET scans, EKGs, and EEGs, and may be 5 inches thick! A major medical research center at a large state university won a grant to use OODBMS technology to package this amalgam of unstructured data types and their revisions into a 100% machineable (electronically packaged and transmitted) object. The goal was clear, but it was deemed very high-risk. However, a usable system was required at the end of the project. The project investigated three options, in perceived order of complexity:

  • Index electronically the physical file of multimedia documents and transmit only a highly abstracted index electronically. The physical documents would be forwarded manually as needed to support medical decisions. This approach was low-tech and very clunky, but it was much better than the existing PR system.

  • Develop an OODBMS-based PR system from scratch, optimized for medical records and the developing medical center. This was high-tech but very high-risk at the time (and probably still would be today!).

  • Choose hybrid relational/object technology such as the then newly announced Informix Datablade™ technology.

The following table summarizes the salient factors among these choices. The first choice would be a very low-risk approach to get something working that would provide some improvement. But it would not honor the grant's intent: to truly automate a patient record system. It was considered not as a realistic choice but as a baseline for risk, cost, and time to implement.

The second option was the ideal or the goal of choice but was a truly pioneering effort. The significant risk of failure as a research project alone would alienate the medical staff, who would not be content with a negative or vacuous result. The team had to produce something useful! Note that they could design to this goal and back off if complexity or lack of suitable technology rendered it infeasible.

The third option, although it was considered seriously from the beginning, was rejected as not being the stretch objective that the granting agency expected. Fortunately, it was carefully evaluated and well understood, because the development team was forced to default to it in the end.

Option                Development Cost  Capital Cost  Technical Risk  Business Risk  Competitive Advantage  Time to Implement
A: Index manually     Low               Low           Low             Low            Low                    Low
B: OODBMS             High              Low           High            High           High                   High
C: Relational/object  Medium            High          Medium          Low            Low                    Medium

Rank the true complexity of these options by performing an analysis and choosing one or more indicators of complexity in your formula.

The center chose to attempt Option B to honor the grant's intent. It included 25 NeXT workstations on a peer-to-peer network. The project was unsuccessful, primarily because of the medical staff's inability to agree on the system's functional requirements. Although this may be termed organizational complexity rather than software complexity, it was certainly a major design factor. The system developers fell back to Option C to get something working in the grant period. However, Option C lacked competitive advantage, because it could not be licensed to other medical centers or third-party vendors. From the sensitivity analysis chart, estimate what the Option B factors would have to become to make this project viable. (Assume that you could somehow convince the physicians to agree on a functional specification.)

Problem 4: Oil Well Drilling Decision System

Drilling for oil is a high-risk endeavor. Each well costs at least $10 million to drill, but on average only 35% of the information available is used to make a drilling decision. Oil well database persistence may be a worst case in the computer field. Some of the historical data that needs to be kept may go back 150 years to Indian treaties, letters patent, legal titles, geological data and reports, geological tomography, and test drill logs. In addition, any new drilling data must be kept for 50 years. Petroleum prospectors agree that drilling decisions would be very well-informed if 85% of the available data was used. However, the data is not readily accessible because it is in many different forms (media) at many different locations. A major U.S. petroleum company with significant computer resources undertook a research study with the goal of developing a drilling decision system using the latest available software technology at the time. It would attempt to include as much of the multimedia data as available. The options considered were as follows:

  • Build a sophisticated worldwide index to all available data, based on both location and similarity to other data sets and their drilling results. Abstract the data for later electronic transmission, retrieval, and analysis, but leave it where it is until it's needed for the drilling decision process for a new well.

  • Build an OODBMS that can incorporate all unstructured multimedia data as objects so that the drilling decision team has all the data available to make a fully informed decision.

  • Build a relational system that automates as much of the machineable data as possible but merely indexes the data elements that are not readily machineable to electronic form.

The factors employed in making this decision are summarized in the following table. The first choice was to simply automate only the index of data available and use it to retrieve data manually in either hard copy or electronic form at the time it was needed to make a drilling decision. Although this approach offered some improvement over the current state, it did not represent a suitable goal for a research project and would not have had a significant return on investment (ROI).

The second choice was truly a stretch objective given the state of software technology at the time. The Norwegian Computing Center had invented object-oriented technology in the form of Simula™ in the 1960s, originally for simulation applications such as managing Norway's forest reserves. Simula and other similar packages were not equipped to handle the kind of multimedia data sets involved in petroleum exploration. Still, if this project could be carried to success, the competitive advantage it would give the company would be incalculable.

The third choice was to push relational database management technology to the limit in spite of its inability at that time to handle complex and unstructured data types and to do revisions rather than just data set updates. The team knew it could be done and would be a major improvement, but would it be good enough?

Option     Development Cost  Technical Risk  Business Risk  Competitive Advantage  Estimated Complexity  Time to Implement
A: Index   Low               Low             Low            40%                    1                     Low
B: OODBMS  High              High            High           85%                    9                     High
C: RDBMS   Medium            Medium          Medium         55%                    5                     High

The team chose Option C but did not go into implementation because the ROI for such a system was not convincing to senior management. Which decision do you think was best then? How about today?
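Using the numbers in the table above, one crude comparison (our own illustrative metric, not the company's analysis) is the data usage gained over the 35% status quo per unit of estimated complexity:

```python
def gain_per_complexity(usage: int, complexity: int, baseline: int = 35) -> float:
    """Percentage points of data usage gained per unit of estimated complexity.

    `usage` and `complexity` come from the Problem 4 table; the 35% baseline
    is the share of available data used in drilling decisions today.
    """
    return (usage - baseline) / complexity

# (data usage %, estimated complexity) from the Problem 4 table
options = {"A: Index": (40, 1), "B: OODBMS": (85, 9), "C: RDBMS": (55, 5)}
for name, (usage, cx) in options.items():
    print(f"{name}: {gain_per_complexity(usage, cx):.1f}")
# A: 5.0, B: 5.6, C: 4.0
```

The ratios are surprisingly close, which hints at why no option produced a compelling ROI case for senior management.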

Problem 5: The ROI Issue

Note that the analysis in Problem 4 did not include an estimate of financial payback for the effort because there was so little experience with drilling decisions made with better information. What column(s) would you add to this table to be able to estimate the payback for each option? How sensitive is each analysis to the abstract estimate of system complexity in column 6?

Problem 6: An Abstract Complexity Analysis

Consider the perspective on complexity presented in Problem 2. Assume that a certain enterprise application could be built with three levels of modularity, as summarized in the following table. Option A uses relatively few large components, each of which is of rather high complexity. Option B uses five times as many smaller and much less complex components but involves a slightly higher degree of technical risk and more testing. Option C uses many small and simple or "atomic" components but has high extrinsic complexity due to the intercomponent communication required.

Option                      Module Size  Number Required  Relative Complexity  Technical Risk  Testing Effort  Time to Implement
A: Subsystem functionality  Large        30               7                    Low             Low             Low
B: Business functions       Medium       170              4                    Medium          Medium          Medium
C: Atomic functions         Small        2500             9                    High            High            Medium

Which is the preferred implementation strategy? Why? (Note that time to implement may be dramatically reduced in high-granularity or atomic module systems by a sophisticated proprietary development environment such as JD Edwards OneWorld™.)

Problem 7: Sensitivity to Complexity

How sensitive is the analysis of Problem 6 to the estimated complexity? Are the decision choices or implementation options well-differentiated by the estimate of relative complexity? Are they equally well-differentiated by the number of modules needed to complete an application? Why does having more modules in an application require more testing time?
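The last question can be made concrete. If any two modules may interact, the number of potential interfaces to test grows as n(n-1)/2, so testing effort rises much faster than module count. Using the module counts from the Problem 6 table:

```python
def potential_interfaces(n: int) -> int:
    """Upper bound on pairwise module interfaces: n choose 2."""
    return n * (n - 1) // 2

# Module counts from the Problem 6 table.
for name, n in [("A: Subsystem functionality", 30),
                ("B: Business functions", 170),
                ("C: Atomic functions", 2500)]:
    print(name, potential_interfaces(n))
# A: 435, B: 14365, C: 3123750
```

Module count grows about 83x from A to C, but potential interfaces grow about 7,000x; this quadratic growth in intercomponent communication is the "extrinsic complexity" that Problem 6 attributes to Option C.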
