SAS 9.1 Language Reference Concepts

Why Use the Hash Object?

The hash object provides an efficient, convenient mechanism for quick data storage and retrieval. The hash object stores and retrieves data based on lookup keys.

To use the DATA step Component Object Interface, follow these steps:

  1. Declare the hash object.

  2. Create an instance of (instantiate) the hash object.

  3. Initialize lookup keys and data.

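After you declare and instantiate a hash object, you can perform many tasks, including the following:

  - Store and retrieve data.

  - Replace and remove data.

  - Output a data set that contains the data in the hash object.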

For example, suppose that you have a large data set that contains numeric lab results that correspond to patient number and weight, and a small data set that contains patient numbers (a subset of those in the large data set). You can load the large data set into a hash object using the patient number as the key and the weight values as the data. You can then iterate over the small data set, using each patient number to look up that patient in the hash object, and output the patients whose weight is over a certain value to a different data set.
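The scenario above could be sketched roughly as follows. The data set names WORK.LABS and WORK.SUBSET, the variable names PATIENT and WEIGHT, the output data set HEAVY, and the threshold of 90 are all hypothetical, and the sketch assumes that WORK.LABS holds one row per patient.

```sas
data heavy;
   length patient 8;
   length weight 8;

   if _N_ = 1 then do;
      /* Load the large data set (hypothetical name) into the hash object */
      declare hash h(dataset: "work.labs", hashexp: 8);
      h.defineKey('patient');
      h.defineData('weight');
      h.defineDone();
      /* Avoid uninitialized variable notes */
      call missing(patient, weight);
   end;

   /* Iterate over the small data set (hypothetical name) */
   set work.subset;

   /* Look up the current patient; FIND returns 0 when the key exists */
   rc = h.find();

   /* Keep only patients over the assumed weight threshold */
   if (rc = 0) and (weight > 90) then output;
run;
```

The methods and argument tags used here are described in the sections that follow.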

Depending on the number of lookup keys and the size of the data set, the hash object lookup can be significantly faster than a standard format lookup.

Declaring and Instantiating a Hash Object

You declare a hash object using the DECLARE statement. After you declare the new hash object, use the _NEW_ statement to instantiate the object.

   declare hash myhash;
   myhash = _new_ hash();

The DECLARE statement tells the compiler that the variable MYHASH is of type hash. At this point, you have only declared the variable MYHASH. It has the potential to hold a component object of type hash. You should declare the hash object only once. The _NEW_ statement creates an instance of the hash object and assigns it to the variable MYHASH.

As an alternative to the two-step process of using the DECLARE and the _NEW_ statement to declare and instantiate a component object, you can use the DECLARE statement to declare and instantiate the component object in one step.

declare hash myhash();

The above statement is equivalent to the following code:

   declare hash myhash;
   myhash = _new_ hash();

For more information about the 'DECLARE Statement' and the '_NEW_ Statement', see SAS Language Reference: Dictionary.

Initializing Hash Object Data Using a Constructor

When you create a hash object, you might want to provide initialization data. A constructor is a method that you can use to instantiate a hash object and initialize the hash object data.

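The hash object constructor can have either of the following formats:

   declare hash variable_name(argument_tag-1: value-1 <, ...argument_tag-n: value-n>);

   variable_name = _new_ hash(argument_tag-1: value-1 <, ...argument_tag-n: value-n>);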

These are the valid hash object argument tags:

hashexp: n

dataset: 'dataset_name'

ordered: 'option'

For more information on the 'DECLARE Statement' and the '_NEW_ Statement', see SAS Language Reference: Dictionary.
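As an illustration of argument tags in a constructor, the following sketch combines hashexp with ordered. It assumes that ordered: 'd' requests descending key order when the hash object is written out, and the output data set name WORK.SORTED is hypothetical. The DEFINEKEY, DEFINEDATA, DEFINEDONE, ADD, and OUTPUT methods that it relies on are covered in later sections.

```sas
data _null_;
   length k $20;
   length d $20;

   if _N_ = 1 then do;
      /* Constructor with two argument tags */
      declare hash h(hashexp: 4, ordered: 'd');
      rc = h.defineKey('k');
      /* Include the key as data so that it appears in the output */
      rc = h.defineData('k', 'd');
      rc = h.defineDone();
      call missing(k, d);
   end;

   k = 'Homer';
   d = 'Odyssey';
   rc = h.add();

   k = 'Joyce';
   d = 'Ulysses';
   rc = h.add();

   /* Write the entries to a (hypothetical) data set in key order */
   rc = h.output(dataset: "work.sorted");
run;
```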

Defining Keys and Data

The hash object uses lookup keys to store and retrieve data. The keys and the data are DATA step variables that you use to initialize the hash object by using dot notation method calls. A key is defined by passing the key variable name to the DEFINEKEY method. Data is defined by passing the data variable name to the DEFINEDATA method. When all key and data variables have been defined, the DEFINEDONE method is called. Keys and data can consist of any number of character or numeric DATA step variables.

For example, the following code initializes a character key and a character data variable.

   length d $20;
   length k $20;

   if _N_ = 1 then do;
      declare hash h(hashexp: 4);
      rc = h.defineKey('k');
      rc = h.defineData('d');
      rc = h.defineDone();
   end;

You can have multiple key and data variables. You can store more than one data item with a particular key. For example, you could modify the previous example to store auxiliary numeric values with the character key and data. In this example, each key and each data item consists of a character value and a numeric value.

   length d1 8;
   length d2 $20;
   length k1 $20;
   length k2 8;

   if _N_ = 1 then do;
      declare hash h(hashexp: 4);
      rc = h.defineKey('k1', 'k2');
      rc = h.defineData('d1', 'd2');
      rc = h.defineDone();
   end;

For more information about the 'DEFINEDATA Method', the 'DEFINEDONE Method', and the 'DEFINEKEY Method', see SAS Language Reference: Dictionary.

Note: The hash object does not assign values to key variables (for example, h.find(key: 'abc')), and the SAS compiler cannot detect the implicit key and data variable assignments done by the hash object and the hash iterator. Therefore, if no explicit assignment to a key or data variable appears in the program, SAS will issue a note stating that the variable is uninitialized. To avoid receiving these notes, you can perform one of the following actions:

  - Set the NONOTES system option.

  - Initialize the variables.

  - Use the CALL MISSING routine with all the key and data variables as parameters, as in the following example.

   length d $20;
   length k $20;

   if _N_ = 1 then do;
      declare hash h(hashexp: 4);
      rc = h.defineKey('k');
      rc = h.defineData('d');
      rc = h.defineDone();
      call missing(k, d);
   end;

Storing and Retrieving Data

After you initialize the hash object's key and data variables, you can store data in the hash object using the ADD method, or you can use the dataset argument tag to quickly load a data set into the hash object.

You can then use the FIND method to search and retrieve data from the hash object.

For more information about the 'ADD Method' and the 'FIND Method', see SAS Language Reference: Dictionary.

Note: You can also use the hash iterator object to retrieve the hash object data, one data element at a time, in forward and reverse order. For more information, see 'Using the Hash Iterator Object'.
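A minimal sketch of forward iteration is shown below. It assumes the DECLARE HITER statement and the FIRST and NEXT iterator methods, with a return code of 0 indicating that an entry was retrieved. Because the iterator fills in only data variables, the key K is also listed in DEFINEDATA.

```sas
data _null_;
   length k $20;
   length d $20;

   if _N_ = 1 then do;
      declare hash h(hashexp: 4, ordered: 'a');
      /* Associate an iterator with the hash object named H */
      declare hiter hi('h');
      rc = h.defineKey('k');
      rc = h.defineData('k', 'd');
      rc = h.defineDone();
      call missing(k, d);
   end;

   k = 'Homer';
   d = 'Odyssey';
   rc = h.add();

   k = 'Joyce';
   d = 'Ulysses';
   rc = h.add();

   /* Walk the entries from first to last */
   rc = hi.first();
   do while (rc = 0);
      put k= d=;
      rc = hi.next();
   end;
run;
```

Reverse traversal would follow the same pattern, starting from the last entry and moving backward.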

Example 1: Using the ADD and FIND Methods to Store and Retrieve Data

The following example uses the ADD method to store the data in the hash object and associate the data with the key. The FIND method is then used to retrieve the data that is associated with the key value 'Homer'.

   data _null_;
      length d $20;
      length k $20;

      /* Declare the hash object and key and data variables */
      if _N_ = 1 then do;
         declare hash h(hashexp: 4);
         rc = h.defineKey('k');
         rc = h.defineData('d');
         rc = h.defineDone();
      end;

      /* Define constant value for key and data */
      k = 'Homer';
      d = 'Odyssey';
      /* Use the ADD method to add the key and data to the hash object */
      rc = h.add();
      if (rc ne 0) then
         put 'Add failed.';

      /* Define constant value for key and data */
      k = 'Joyce';
      d = 'Ulysses';
      /* Use the ADD method to add the key and data to the hash object */
      rc = h.add();
      if (rc ne 0) then
         put 'Add failed.';

      k = 'Homer';
      /* Use the FIND method to retrieve the data associated with 'Homer' key */
      rc = h.find();
      if (rc = 0) then
         put d=;
      else
         put 'Key Homer not found.';
   run;

The FIND method assigns the data value 'Odyssey', which is associated with the key value 'Homer', to the variable D.
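As a variation on Example 1, the key and data values can be passed directly to the methods rather than being assigned to the DATA step variables first. This sketch assumes the key: and data: argument tags of the ADD and FIND methods, following the h.find(key: 'abc') form mentioned in the earlier note. Because nothing in the program assigns K or D explicitly, CALL MISSING is included to avoid the uninitialized-variable notes.

```sas
data _null_;
   length d $20;
   length k $20;

   if _N_ = 1 then do;
      declare hash h(hashexp: 4);
      rc = h.defineKey('k');
      rc = h.defineData('d');
      rc = h.defineDone();
      /* K and D are never assigned explicitly in this program */
      call missing(k, d);
   end;

   /* Pass key and data values directly to the ADD method */
   rc = h.add(key: 'Homer', data: 'Odyssey');
   if (rc ne 0) then
      put 'Add failed.';

   rc = h.add(key: 'Joyce', data: 'Ulysses');
   if (rc ne 0) then
      put 'Add failed.';

   /* Pass the lookup key directly; a successful FIND fills in D */
   rc = h.find(key: 'Joyce');
   if (rc = 0) then
      put d=;
   else
      put 'Key Joyce not found.';
run;
```

If the lookup succeeds, D should hold the data stored with 'Joyce', while K itself is left unassigned by the call.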

Example 2: Loading a Data Set and Using the FIND Method to Retrieve Data

Assume the data set SMALL contains two numeric variables K (key) and S (data) and another data set, LARGE, contains a corresponding key variable K. The following code loads the SMALL data set into the hash object, and then searches the hash object for key matches on the variable K from the LARGE data set.

   data match;
      length k 8;
      length s 8;

      if _N_ = 1 then do;
         /* load SMALL data set into the hash object */
         declare hash h(dataset: "work.small", hashexp: 6);
         /* define SMALL data set variable K as key and S as value */
         h.defineKey('k');
         h.defineData('s');
         h.defineDone();
         /* avoid uninitialized variable notes */
         call missing(k, s);
      end;

      /* use the SET statement to iterate over the LARGE data set using */
      /* keys in the LARGE data set to match keys in the hash object */
      set large;
      rc = h.find();
      if (rc = 0) then
         output;
   run;

The dataset argument tag specifies the SMALL data set whose keys and data will be read and loaded by the hash object during the DEFINEDONE method. The FIND method is then used to retrieve the data.

Replacing and Removing Data

You can remove or replace data in the hash object.

In the following example, the REPLACE method replaces the data 'Odyssey' with 'Iliad' and the REMOVE method deletes the entire data entry associated with the 'Joyce' key from the hash object.

   data _null_;
      length d $20;
      length k $20;

      /* Declare the hash object and key and data variables */
      if _N_ = 1 then do;
         declare hash h(hashexp: 4);
         rc = h.defineKey('k');
         rc = h.defineData('d');
         rc = h.defineDone();
      end;

      /* Define constant value for key and data */
      k = 'Joyce';
      d = 'Ulysses';
      /* Use the ADD method to add the key and data to the hash object */
      rc = h.add();
      if (rc ne 0) then
         put 'Add failed.';

      /* Define constant value for key and data */
      k = 'Homer';
      d = 'Odyssey';
      /* Use the ADD method to add the key and data to the hash object */
      rc = h.add();
      if (rc ne 0) then
         put 'Add failed.';

      /* Use the REPLACE method to replace 'Odyssey' with 'Iliad' */
      k = 'Homer';
      d = 'Iliad';
      rc = h.replace();
      if (rc = 0) then
         put d=;
      else
         put 'Replace not successful.';

      /* Use the REMOVE method to remove the 'Joyce' key and data */
      k = 'Joyce';
      rc = h.remove();
      if (rc = 0) then
         put k 'removed from hash object';
      else
         put 'Deletion not successful.';
   run;

For more information on the 'REMOVE Method' and the 'REPLACE Method', see SAS Language Reference: Dictionary.

Saving Hash Object Data in a Data Set

You can create a data set that contains the data in a specified hash object by using the OUTPUT method. In the following example, two keys and data are added to the hash object and then output to the Work.out data set.

   data test;
      length d1 8;
      length d2 $20;
      length k1 $20;
      length k2 8;

      /* Declare the hash object and two key and data variables */
      if _N_ = 1 then do;
         declare hash h(hashexp: 4);
         rc = h.defineKey('k1', 'k2');
         rc = h.defineData('d1', 'd2');
         rc = h.defineDone();
      end;

      /* Define constant value for key and data */
      k1 = 'Joyce';
      k2 = 1001;
      d1 = 3;
      d2 = 'Ulysses';
      rc = h.add();

      /* Define constant value for key and data */
      k1 = 'Homer';
      k2 = 1002;
      d1 = 5;
      d2 = 'Odyssey';
      rc = h.add();

      /* Use the OUTPUT method to save the hash object data to the OUT data set */
      rc = h.output(dataset: "work.out");
   run;

   proc print data=work.out;
   run;

The following output shows the report that PROC PRINT generates.

Output 24.1: Data Set Created from the Hash Object

                    The SAS System                    1

     Obs    d1    d2

      1      5    Odyssey
      2      3    Ulysses

Note that the hash object keys are not stored as part of the output data set. If you want to include the keys in the output data set, you must define the keys as data in the DEFINEDATA method. In the previous example, the DEFINEDATA method would be written this way:

rc = h.defineData('k1', 'k2', 'd1', 'd2');

For more information on the 'OUTPUT Method', see SAS Language Reference: Dictionary.