Data Mining: Opportunities and Challenges

Chapter XV - Data Mining in Health Care Applications
by John Wang (ed) 
Idea Group Publishing 2003

Data mining has often been defined as a "process of extracting previously unknown, valid and actionable information from large databases and then using the information to make critical business decisions" (Cabena, Hadjinian, Stadler, Verhees, & Zanasi, 1998). This definition is based on the premise that an organization can link its myriad sources of data into a data warehouse (potentially including data marts). Analysis of these data sources can then progress to exploration using on-line analytical processing (OLAP), statistical analyses, and querying. One mechanism that can be used to move beyond exploration and permit organizations to engage in information discovery is the Sample, Explore, Modify, Model and Assess (SEMMA) methodology, developed by SAS Institute (http://www.sas.com/). In sum, SEMMA enables its users to: 1) draw a segment of the data to be mined (test data) and test it for preliminary outcomes; 2) assess this sample data for outliers, trends, etc.; 3) revise the sample data set for model testing, with models that can include neural networks, linear regression, and decision trees; 4) evaluate the model for accuracy; and 5) validate the model using the complete data set (Groth, 1998).
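
The five SEMMA steps listed above can be illustrated with a minimal Python sketch. It is not SAS code and does not reproduce any SAS tool; it only mirrors the order of the steps. Everything in it is an assumption made for illustration: the synthetic records, the function names (make_records, fit_line, flag_outliers, run_semma_sketch), the outlier rule (residuals more than five median absolute deviations from the median), and the choice of linear regression as the model and median absolute error as the accuracy measure.

```python
import random
import statistics


def make_records(n, seed=0):
    """Build synthetic (x, y) records: linear trend, noise, and a few bad rows."""
    rng = random.Random(seed)
    records = []
    for i in range(n):
        x = rng.uniform(0.0, 100.0)
        y = 3.0 * x + 20.0 + rng.gauss(0.0, 5.0)
        if i % 50 == 0:  # hypothetical data error in every 50th record
            y += 500.0
        records.append((x, y))
    return records


def fit_line(records):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = statistics.fmean([x for x, _ in records])
    mean_y = statistics.fmean([y for _, y in records])
    sxx = sum((x - mean_x) ** 2 for x, _ in records)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in records)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(records, slope, intercept):
    return [y - (slope * x + intercept) for x, y in records]


def flag_outliers(records, k=5.0):
    """Flag records whose residual lies more than k MADs from the median residual."""
    res = residuals(records, *fit_line(records))
    med = statistics.median(res)
    mad = statistics.median([abs(r - med) for r in res])
    return [abs(r - med) > k * mad for r in res]


def median_abs_error(records, slope, intercept):
    return statistics.median([abs(r) for r in residuals(records, slope, intercept)])


def run_semma_sketch(seed=0):
    full = make_records(2000, seed)          # stands in for the complete data set
    rng = random.Random(seed + 1)

    sample = rng.sample(full, 400)           # 1) Sample a segment of the data
    flags = flag_outliers(sample)            # 2) Explore the sample for outliers
    cleaned = [r for r, bad in zip(sample, flags) if not bad]  # 3) Modify
    cut = (3 * len(cleaned)) // 4
    train, holdout = cleaned[:cut], cleaned[cut:]
    slope, intercept = fit_line(train)       #    Model (linear regression)
    return {
        "removed": len(sample) - len(cleaned),
        "cleaned": cleaned,
        "slope": slope,
        "intercept": intercept,
        "holdout_error": median_abs_error(holdout, slope, intercept),  # 4) Assess
        "full_error": median_abs_error(full, slope, intercept),        # 5) Validate
    }


result = run_semma_sketch()
print(f"removed {result['removed']} suspect records from the sample")
print(f"fitted line: y = {result['slope']:.2f} * x + {result['intercept']:.2f}")
print(f"median absolute error on held-out sample: {result['holdout_error']:.2f}")
print(f"median absolute error on complete data:   {result['full_error']:.2f}")
```

Median absolute error is used here because the complete data set still contains the corrupted records, and a mean-based measure would be dominated by them. A real project would choose its own sampling scheme, cleaning rules, model family, and accuracy measure.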

In general, the rationale for mining data is a function of organizational needs and the firm's role in its industry. Moreover, researchers (Hirji, 2001; Keim, 2001) have established that data-mining applications, when implemented effectively, can support strategic planning and yield competitive advantage. To minimize the collection and storage of vast amounts of useless data, mining can identify and monitor which data are most critical to an organization, thereby enabling efficient collection, storage, and exploration of data (Keim, 2001). The efficiencies associated with mining the "most vital data" can result in improved business intelligence, heightened awareness of often-overlooked relationships in data, and discovery of patterns and rules among data elements (Hirji, 2001). Others (http://www.sas.com/; Groth, 1998) suggest that data mining permits improved precision when executing:

  • Fraud Detection and Abuse;

  • Profitability Analysis;

  • Customer (Patient) Profiling; and

  • Retention Management.

In particular, data mining offers the health care industry the capability to tackle imminent challenges germane to its domain. Among them are:

  • Fraud Detection and Abuse: identification of potential fraud and/or abuse; this is applicable to insurance claims processing, verification and payment

  • Profitability Analysis: determination of categories of profit and loss levels; this can be practical for analyses of diagnosis related groups (DRGs) and managed care enrollments

  • Patient Profiling: discovery of patients' health and lifestyle histories as they can impact medical coverage and service utilization

  • Retention Management: identification of loyal patients and services they use, as well as those patients that depart from a particular insurance group, community and profile segment
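
As a hedged illustration of the fraud and abuse item in the list above, the sketch below flags insurance claims whose billed amount is unusual for their procedure code. The claim records, the field names (id, code, amount), the function name flag_unusual_claims, and the threshold of five median absolute deviations are all hypothetical; the chapter does not prescribe a detection method, and a flagged claim is only a candidate for manual review, not evidence of fraud.

```python
import statistics
from collections import defaultdict


def flag_unusual_claims(claims, k=5.0):
    """Return claims whose amount is more than k MADs from the median
    amount billed under the same procedure code."""
    amounts_by_code = defaultdict(list)
    for claim in claims:
        amounts_by_code[claim["code"]].append(claim["amount"])

    flagged = []
    for claim in claims:
        amounts = amounts_by_code[claim["code"]]
        med = statistics.median(amounts)
        mad = statistics.median([abs(a - med) for a in amounts])
        # Skip codes with no spread (mad == 0) to avoid flagging everything.
        if mad > 0 and abs(claim["amount"] - med) > k * mad:
            flagged.append(claim)
    return flagged


# Hypothetical claims: procedure code "A" contains one very large amount.
claims = [
    {"id": 1, "code": "A", "amount": 100.0},
    {"id": 2, "code": "A", "amount": 105.0},
    {"id": 3, "code": "A", "amount": 98.0},
    {"id": 4, "code": "A", "amount": 102.0},
    {"id": 5, "code": "A", "amount": 110.0},
    {"id": 6, "code": "A", "amount": 950.0},
    {"id": 7, "code": "B", "amount": 40.0},
    {"id": 8, "code": "B", "amount": 42.0},
    {"id": 9, "code": "B", "amount": 41.0},
    {"id": 10, "code": "B", "amount": 39.0},
    {"id": 11, "code": "B", "amount": 43.0},
]

for claim in flag_unusual_claims(claims):
    print(f"review claim {claim['id']}: {claim['amount']:.2f} under code {claim['code']}")
```

Grouping by procedure code keeps an expensive but routine procedure from being flagged simply because it costs more than a cheap one; only amounts that stand out within their own group are reported.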
