SAS/STAT 9.1 User's Guide, Volume 2

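The following statements are available in the FASTCLUS procedure:

PROC FASTCLUS MAXCLUSTERS=n | RADIUS=t < options > ;

VAR variables ;

ID variable ;

FREQ variable ;

WEIGHT variable ;

BY variables ;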

Usually you need only the VAR statement in addition to the PROC FASTCLUS statement. The BY, FREQ, ID, VAR, and WEIGHT statements are described in alphabetical order after the PROC FASTCLUS statement.
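As an illustration of the minimal form described above, the following sketch runs PROC FASTCLUS with only a VAR statement. The data set name work.points and the variables x and y are hypothetical and exist only for this example.

```sas
/* Hypothetical data: two numeric variables, x and y */
data work.points;
   input x y @@;
   datalines;
1 1  1 2  2 1  8 8  9 8  8 9
;

/* Minimal call: MAXCLUSTERS= on the PROC statement plus a VAR statement */
proc fastclus data=work.points maxclusters=2;
   var x y;
run;
```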

PROC FASTCLUS Statement

You must specify either the MAXCLUSTERS= or the RADIUS= argument in the PROC FASTCLUS statement.

MAXCLUSTERS= n

MAXC= n

RADIUS= t

R= t

BINS= n

CLUSTER= name

CONVERGE= c

CONV= c

DATA= SAS-data-set

DELETE= n

DISTANCE | DIST

DRIFT

HC= c

HP= p1 <p2>

IMPUTE

INSTAT= SAS-data-set

IRLS

LEAST= p | MAX

L= p | MAX

LIST

MAXITER= n

MEAN= SAS-data-set

NOMISS

NOPRINT

OUT= SAS-data-set

OUTITER

OUTSEED= SAS-data-set

OUTS= SAS-data-set

OUTSTAT= SAS-data-set

RANDOM= n

REPLACE= FULL | PART | NONE | RANDOM

SEED= SAS-data-set

SHORT

STRICT

STRICT= s

SUMMARY

VARDEF= DF | N | WDF | WEIGHT | WGT
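The sketch below shows how a few of the options listed above might be combined on the PROC FASTCLUS statement. The data set names (work.points, work.clusters, work.seeds) and the option values are assumptions chosen for illustration, not recommended settings.

```sas
/* Hypothetical data: two numeric variables, x and y */
data work.points;
   input x y @@;
   datalines;
1 1  1 2  2 1  8 8  9 8  8 9
;

/* Request at most two clusters, allow several iterations, and save results */
proc fastclus data=work.points
              maxclusters=2       /* required: MAXCLUSTERS= or RADIUS=   */
              maxiter=10          /* iteration limit (illustrative)      */
              out=work.clusters   /* observations with cluster results   */
              outseed=work.seeds; /* final cluster seeds                 */
   var x y;
run;

/* Alternative: specify RADIUS= instead of MAXCLUSTERS= */
proc fastclus data=work.points radius=3;
   var x y;
run;
```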

BY Statement

You can specify a BY statement with PROC FASTCLUS to obtain separate analyses on observations in groups defined by the BY variables. When a BY statement appears, the procedure expects the input data set to be sorted in order of the BY variables.

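If your input data set is not sorted in ascending order, use one of the following alternatives: sort the data using the SORT procedure with a similar BY statement; specify the NOTSORTED or DESCENDING option in the BY statement for the FASTCLUS procedure; or create an index on the BY variables using the DATASETS procedure.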

If you specify the SEED= option and the SEED= data set does not contain any of the BY variables, then the entire SEED= data set is used to obtain initial cluster seeds for each BY group in the DATA= data set.

If the SEED= data set contains some but not all of the BY variables, or if some BY variables do not have the same type or length in the SEED= data set as in the DATA= data set, then PROC FASTCLUS displays an error message and stops.

If all the BY variables appear in the SEED= data set with the same type and length as in the DATA= data set, then each BY group in the SEED= data set is used to obtain initial cluster seeds for the corresponding BY group in the DATA= data set. All BY groups in the DATA= data set must also appear in the SEED= data set. The BY groups in the SEED= data set must be in the same order as in the DATA= data set. If you specify the NOTSORTED option in the BY statement, there must be exactly the same BY groups in the same order in both data sets. If you do not specify NOTSORTED, some BY groups can appear in the SEED= data set but not in the DATA= data set; such BY groups are not used in the analysis.
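A minimal sketch of the first case above, in which the SEED= data set contains none of the BY variables, so the same seeds are used for every BY group. The data sets work.obs and work.seeds and the variables region, x, and y are hypothetical.

```sas
/* Hypothetical input data with a BY variable, region */
data work.obs;
   input region $ x y;
   datalines;
B 2 2
B 3 2
B 7 7
A 1 1
A 1 2
A 9 9
;

/* Hypothetical seeds: no BY variable, so all rows apply to each BY group */
data work.seeds;
   input x y;
   datalines;
1 1
9 9
;

/* Sort the input data by the BY variable before the analysis */
proc sort data=work.obs;
   by region;
run;

/* One analysis per region, each starting from the same two seeds */
proc fastclus data=work.obs seed=work.seeds maxclusters=2 out=work.clusters;
   var x y;
   by region;
run;
```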

For more information on the BY statement, refer to the discussion in SAS Language Reference: Concepts. For more information on the DATASETS procedure, refer to the discussion in the SAS Procedures Guide.

FREQ Statement

If a variable in the data set represents the frequency of occurrence for the other values in the observation, include the variable's name in a FREQ statement. The procedure then treats the data set as if each observation appears n times, where n is the value of the FREQ variable for the observation.

If the value of the FREQ variable is missing or 0, the observation is not used in the analysis. The exact values of the FREQ variable are used in computations; frequency values are not truncated to integers. The total number of observations is considered to be equal to the sum of the FREQ variable when the procedure determines degrees of freedom for significance probabilities.

The WEIGHT and FREQ statements have a similar effect, except in determining the number of observations for significance tests.
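The following sketch illustrates the FREQ statement with pre-aggregated data. The data set work.counts and the variable n are hypothetical; each row stands for n identical observations.

```sas
/* Hypothetical aggregated data: n is the number of times each (x, y) occurs */
data work.counts;
   input x y n;
   datalines;
1 1 3
2 1 5
8 9 2
9 8 4
;

/* Each row is treated as if it appeared n times */
proc fastclus data=work.counts maxclusters=2;
   var x y;
   freq n;
run;
```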

ID Statement

The ID variable, which can be character or numeric, identifies observations on the output when you specify the LIST option.
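As a small illustration, the sketch below pairs the ID statement with the LIST option so that the listing identifies each observation by name. The data set work.cities and the variable name are hypothetical.

```sas
/* Hypothetical data with a character identifier */
data work.cities;
   input name $ x y;
   datalines;
Alpha 1 1
Beta  2 1
Gamma 8 9
Delta 9 8
;

/* LIST requests a listing of observations; ID labels them by name */
proc fastclus data=work.cities maxclusters=2 list;
   var x y;
   id name;
run;
```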

VAR Statement

The VAR statement lists the numeric variables to be used in the cluster analysis. If you omit the VAR statement, all numeric variables not listed in other statements are used.

WEIGHT Statement

The values of the WEIGHT variable are used to compute weighted cluster means. The WEIGHT and FREQ statements have a similar effect, except the WEIGHT statement does not alter the degrees of freedom or the number of observations. The WEIGHT variable can take nonintegral values. An observation is used in the analysis only if the value of the WEIGHT variable is greater than zero.
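The sketch below shows a WEIGHT statement with nonintegral weights. The data set work.weighted and the variable w are hypothetical; the last row has a weight of zero and so, per the rule above, is not used in the analysis.

```sas
/* Hypothetical data: w is a nonnegative, possibly nonintegral weight */
data work.weighted;
   input x y w;
   datalines;
1 1 0.5
2 1 1.5
8 9 2.0
9 8 0
;

/* Cluster means are weighted by w; the row with w = 0 is excluded */
proc fastclus data=work.weighted maxclusters=2;
   var x y;
   weight w;
run;
```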
