Centroid
Mean value for the discriminant Z scores of all objects within a particular category or group. For example, a two-group discriminant analysis has two centroids, one for the objects in each of the two groups.
 
 
     
 
  Chebychev inequality
For a random variable $X$ with mean $\mu$ and variance $\sigma^2$ we have $P(|X - \mu| \ge \varepsilon) \le \sigma^2/\varepsilon^2$ for all $\varepsilon > 0$. This follows from Markov's inequality applied to $(X - \mu)^2$.
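A minimal sketch of that derivation, in the notation above:
\[
P(|X-\mu| \ge \varepsilon) = \operatorname{E}\bigl[\mathbf{1}\{(X-\mu)^2 \ge \varepsilon^2\}\bigr] \le \operatorname{E}\!\left[\frac{(X-\mu)^2}{\varepsilon^2}\right] = \frac{\sigma^2}{\varepsilon^2}.
\]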
 
 
     
 
  Classification Trees
Classifiers which partition the examples on one feature at a time.
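As an illustrative sketch only (scikit-learn assumed available; the toy data and feature names below are invented), a tree classifier is built by splitting on one feature at a time:

    # Illustrative sketch (invented data): a decision tree splits on one feature at a time.
    from sklearn.tree import DecisionTreeClassifier, export_text

    X = [[170, 60], [180, 85], [160, 55], [175, 90]]   # two features per example
    y = ["A", "B", "A", "B"]                           # class labels

    tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
    print(export_text(tree, feature_names=["height", "weight"]))  # each node tests a single feature
    print(tree.predict([[165, 58]]))                   # assign a class to a new example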
 
 
     
 
  Classifier
A rule to assign a class (or 'doubt' or 'outlier') to new examples.
 
 
     
 
  Codebook Vectors
Representative examples of a probability distribution. The term comes from vector quantization.
 
 
     
 
  Collinearity
Relationship between two (collinearity) or more (multicollinearity) variables. Variables exhibit complete collinearity if their correlation coefficient is 1 and a complete lack of collinearity if their correlation coefficient is 0.
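A small numerical sketch (invented data, NumPy assumed) of the two extremes:

    # Invented data: x2 is an exact linear function of x1 (complete collinearity),
    # while x3 is generated independently of x1 (essentially no collinearity).
    import numpy as np

    rng = np.random.default_rng(0)
    x1 = rng.normal(size=100)
    x2 = 3.0 * x1 + 2.0
    x3 = rng.normal(size=100)

    print(np.corrcoef(x1, x2)[0, 1])   # 1.0
    print(np.corrcoef(x1, x3)[0, 1])   # close to 0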
 
 
     
 
  Compact set
A subset $C \subset \mathbb{R}^m$ is compact if it is closed and bounded, that is $C \subset [-M, M]^m$ for some $M$. Compact sets are also called compacta.
 
 
     
 
  Condition Index
Measure of the relative amount of variance associated with an eigenvalue so that a large condition index indicates a high degree of collinearity.
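One common way of computing condition indices, shown here as a sketch with invented data (conventions vary, e.g. whether an intercept column is included and how the columns are scaled):

    # Sketch: condition indices as the square root of (largest eigenvalue / each eigenvalue)
    # of X'X after scaling the columns of X to unit length.  Data are invented.
    import numpy as np

    rng = np.random.default_rng(1)
    x1 = rng.normal(size=50)
    x2 = x1 + 0.01 * rng.normal(size=50)        # nearly collinear with x1
    X = np.column_stack([np.ones(50), x1, x2])  # intercept column plus two predictors

    Xs = X / np.linalg.norm(X, axis=0)          # scale each column to unit length
    eigvals = np.linalg.eigvalsh(Xs.T @ Xs)
    print(np.sqrt(eigvals.max() / eigvals))     # large values signal strong collinearity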
 
 
     
 
  Consistent
An estimator is consistent if in large samples it converges to the true parameter value (when there is one).
 
 
     
 
  Concave
A function $f$ is concave if $-f$ is convex.
 
 
     
 
  Convex
A function $f$ is convex if $f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)$ for all $x$, $y$ and all $0 \le \lambda \le 1$; equivalently, the region above its graph is convex.
 
 
     
 
  Cook's Distance
Summary measure of the influence of a single case (observation) based on the total changes in all other residuals when the case is deleted from the estimation process. Large values (usually greater than 1) indicate substantial influence by the case in affecting the estimated regression coefficients.
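One common form of the statistic, stated here as a sketch (with $e_i$ the $i$th residual, $h_{ii}$ its leverage, $s^2$ the residual mean square, and $p$ the number of estimated coefficients including the intercept):
\[
D_i = \frac{e_i^2}{p\, s^2}\cdot\frac{h_{ii}}{(1 - h_{ii})^2}.
\]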
 
 
     
 
  COVRATIO
Measure of the influence of a single observation on the entire set of estimated regression coefficients. A value close to 1 indicates little influence. If the absolute value of COVRATIO minus 1 exceeds 3p/n (where p is the number of independent variables plus 1, and n is the sample size), the observation is deemed influential on this measure.
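For example, with three independent variables (so p = 4) and n = 100 observations, the cutoff is 3 × 4/100 = 0.12, and observations with COVRATIO below 0.88 or above 1.12 would be flagged as influential.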
 
 
     
 
  Cross-Validation
A method of evaluating parameters or classifiers by dividing the training set into several parts, and in turn using one part to test the procedure fitted to the remaining parts. Sometimes used to refer to leave-one-out (or ordinary) cross-validation, where every example is dropped in turn. This term is much abused; it does not mean the use of a test set or validation set. Generalized cross-validation is a measure of the performance of a regularized classifier;
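A minimal sketch of the basic procedure (5 parts, invented data, scikit-learn assumed), in which each part in turn is used to test a classifier fitted to the remaining parts:

    # Sketch of 5-fold cross-validation on invented data: the training set is split
    # into 5 parts, and each part in turn is held out to test a classifier fitted
    # to the other 4 parts.
    import numpy as np
    from sklearn.model_selection import KFold
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(2)
    X = rng.normal(size=(100, 3))
    y = (X[:, 0] + 0.5 * rng.normal(size=100) > 0).astype(int)

    scores = []
    for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
        clf = LogisticRegression().fit(X[train_idx], y[train_idx])
        scores.append(clf.score(X[test_idx], y[test_idx]))   # accuracy on the held-out part
    print(np.mean(scores))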