SAS/STAT 9.1 User's Guide, Volumes 1-7

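The following statements are available in the ACECLUS procedure.

PROC ACECLUS PROPORTION=p | THRESHOLD=t < options > ;
   BY variables ;
   FREQ variable ;
   VAR variables ;
   WEIGHT variable ;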

Usually you need only the VAR statement in addition to the required PROC ACECLUS statement. The optional BY, FREQ, VAR, and WEIGHT statements are described in alphabetical order after the PROC ACECLUS statement.

PROC ACECLUS Statement

The PROC ACECLUS statement starts the ACECLUS procedure. The options available with the PROC ACECLUS statement are summarized in Table 16.2 and discussed in the following sections. If you specify the METHOD=COUNT option, you must also specify either the PROPORTION= or the MPAIRS= option. Otherwise, you must specify either the PROPORTION= or the THRESHOLD= option.
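As an illustration of this rule, the following sketch shows one call for each method. The data set name work.measures and the variables x1-x3 are hypothetical, and the numeric values are arbitrary placeholders rather than recommendations.

```sas
/* METHOD=THRESHOLD: supply PROPORTION= or THRESHOLD=.          */
/* work.measures and x1-x3 are hypothetical names.              */
proc aceclus data=work.measures method=threshold threshold=0.5;
   var x1-x3;
run;

/* METHOD=COUNT: supply PROPORTION= or MPAIRS=.                 */
proc aceclus data=work.measures method=count mpairs=50;
   var x1-x3;
run;
```

PROPORTION= is accepted with either method, so it is the one option that satisfies the requirement in both cases.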

Table 16.2: Summary of PROC ACECLUS Statement Options

| Task | Option | Description |
|---|---|---|
| Specify clustering options | METHOD= | specify the clustering method |
| | MPAIRS= | specify number of pairs for estimating within-cluster covariance (when you specify the option METHOD=COUNT) |
| | PROPORTION= | specify proportion of pairs for estimating within-cluster covariance |
| | THRESHOLD= | specify the threshold for including pairs in the estimation of the within-cluster covariance |
| Specify input and output data sets | DATA= | specify input data set name |
| | OUT= | specify output data set name |
| | OUTSTAT= | specify output data set name containing various statistics |
| Specify iteration options | ABSOLUTE | use absolute instead of relative threshold |
| | CONVERGE= | specify convergence criterion |
| | INITIAL= | specify initial estimate of within-cluster covariance matrix |
| | MAXITER= | specify maximum number of iterations |
| | METRIC= | specify metric in which computations are performed |
| | SINGULAR= | specify singularity criterion |
| Specify canonical analysis options | N= | specify number of canonical variables |
| | PREFIX= | specify prefix for naming canonical variables |
| Control displayed output | NOPRINT | suppress the display of the output |
| | PP | produce PP-plot of distances between pairs from last iteration |
| | QQ | produce QQ-plot of power transformation of distances between pairs from last iteration |
| | SHORT | omit all output except for iteration history and eigenvalue table |

The following list gives the syntax of each option, in alphabetical order. Alternative spellings of the same option or value are shown on one line, separated by a vertical bar.

ABSOLUTE

CONVERGE=c

DATA=SAS-data-set

INITIAL=name

MAXITER=n

METHOD=COUNT | C

METHOD=THRESHOLD | T

METRIC=name

MPAIRS=m

N=n

NOPRINT

OUT=SAS-data-set

OUTSTAT=SAS-data-set

PROPORTION=p | PERCENT=p | P=p

PP

PREFIX=name

QQ

SHORT

SINGULAR=g | SING=g

THRESHOLD=t | T=t

BY Statement

You can specify a BY statement with PROC ACECLUS to obtain separate analyses on observations in groups defined by the BY variables. When a BY statement appears, the procedure expects the input data set to be sorted in order of the BY variables.

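If your input data set is not sorted in ascending order, use one of the following alternatives:

Sort the data using the SORT procedure with a similar BY statement.

Specify the BY statement option NOTSORTED or DESCENDING in the BY statement for the ACECLUS procedure. The NOTSORTED option does not mean that the data are unsorted but rather that the data are arranged in groups (according to values of the BY variables) and that these groups are not necessarily in alphabetical or increasing numeric order.

Create an index on the BY variables using the DATASETS procedure.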

If you specify the INITIAL=INPUT= option and the INITIAL=INPUT= data set does not contain any of the BY variables, the entire INITIAL=INPUT= data set provides the initial value for the matrix A for each BY group in the DATA= data set.

If the INITIAL=INPUT= data set contains some but not all of the BY variables, or if some BY variables do not have the same type or length in the INITIAL=INPUT= data set as in the DATA= data set, then PROC ACECLUS displays an error message and stops.

If all the BY variables appear in the INITIAL=INPUT= data set with the same type and length as in the DATA= data set, then each BY group in the INITIAL=INPUT= data set provides the initial value for A for the corresponding BY group in the DATA= data set. All BY groups in the DATA= data set must also appear in the INITIAL=INPUT= data set. The BY groups in the INITIAL=INPUT= data set must be in the same order as in the DATA= data set. If you specify NOTSORTED in the BY statement, identical BY groups must occur in the same order in both data sets. If you do not specify NOTSORTED, some BY groups can appear in the INITIAL=INPUT= data set, but not in the DATA= data set; such BY groups are not used in the analysis.

For more information on the BY statement, refer to the discussion in SAS Language Reference: Concepts. For more information on the DATASETS procedure, refer to the discussion in the SAS Procedures Guide.
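A minimal sketch of BY-group processing follows, assuming a hypothetical data set work.measures with a grouping variable region and analysis variables x1-x3. The data are sorted first so that the BY groups arrive in ascending order, as the procedure expects.

```sas
/* Sort by the BY variable (hypothetical names throughout). */
proc sort data=work.measures;
   by region;
run;

/* One separate analysis is produced for each value of region. */
proc aceclus data=work.measures proportion=0.03;
   by region;
   var x1-x3;
run;
```

The PROPORTION= value here is an arbitrary placeholder chosen only to make the call complete.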

FREQ Statement

If a variable in your data set represents the frequency of occurrence for the observation, include the name of that variable in the FREQ statement. The procedure then treats the data set as if each observation appears n times, where n is the value of the FREQ variable for the observation. If a value of the FREQ variable is not integral, it is truncated to the largest integer not exceeding the given value. Observations with FREQ values less than one are not included in the analysis. The total number of observations is considered equal to the sum of the FREQ variable.
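The following sketch shows where the FREQ statement goes. It assumes a hypothetical data set work.counted in which the variable count holds the frequency of each observation; the comments restate the rules described above.

```sas
/* count is a hypothetical frequency variable.                   */
/* A value such as 3.7 is truncated to 3; an observation with    */
/* count less than 1 (for example 0.5) is left out of the        */
/* analysis.                                                     */
proc aceclus data=work.counted proportion=0.03;
   freq count;
   var x1-x3;
run;
```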

VAR Statement

The VAR statement specifies the numeric variables to be analyzed. If the VAR statement is omitted, all numeric variables not specified in other statements are analyzed.

WEIGHT Statement

If you want to specify relative weights for each observation in the input data set, place the weights in a variable in the data set and specify that variable name in a WEIGHT statement. This is often done when the variance associated with each observation is different and the values of the weight variable are proportional to the reciprocals of the variances. The values of the WEIGHT variable can be non-integral and are not truncated. An observation is used in the analysis only if the value of the WEIGHT variable is greater than zero.

The WEIGHT and FREQ statements have a similar effect, except in calculating the divisor of the A matrix.
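As a sketch of the reciprocal-variance weighting mentioned above, the following code assumes a hypothetical data set work.measures that carries a per-observation variance in the variable obs_var. The weight variable w and the PROPORTION= value are likewise illustrative.

```sas
/* Build weights proportional to 1/variance (hypothetical names). */
data work.weighted;
   set work.measures;
   /* Leave w missing when no positive variance is available;     */
   /* only observations with w greater than zero are analyzed.    */
   if obs_var > 0 then w = 1 / obs_var;
run;

proc aceclus data=work.weighted proportion=0.03;
   weight w;
   var x1-x3;
run;
```

Unlike FREQ values, the weights are used as given and are not truncated to integers.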
