Sampling Paradigm In the terminology of Dawid (1976), modelling the class-conditional densities and, perhaps, the prior probabilities of the classes.

 Sherman-Morrison-Woodbury Formula Given a non-singular $n \times n$ matrix $A$ and column vectors $b$ and $d$ we have $(A + bd^T)^{-1} = A^{-1} - A^{-1} b d^T A^{-1} / (1 + d^T A^{-1} b)$ provided $1 + d^T A^{-1} b \neq 0$. If $B$ and $D$ are $n \times m$ matrices for $m \le n$ then $(A + BD^T)^{-1} = A^{-1} - A^{-1} B (I + D^T A^{-1} B)^{-1} D^T A^{-1}$ provided the $m \times m$ matrix $I + D^T A^{-1} B$ is invertible (Golub & Van Loan, 1989, p. 51).
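The rank-one form of the formula can be checked numerically; a minimal sketch (the matrix and vectors are arbitrary illustrative values):

```python
import numpy as np

# Verify the rank-one Sherman-Morrison identity
# (A + b d^T)^{-1} = A^{-1} - A^{-1} b d^T A^{-1} / (1 + d^T A^{-1} b),
# valid when the scalar denominator is non-zero.
rng = np.random.default_rng(0)
A = np.eye(4) + 0.1 * rng.standard_normal((4, 4))
b = rng.standard_normal((4, 1))
d = rng.standard_normal((4, 1))

Ainv = np.linalg.inv(A)
denom = 1.0 + (d.T @ Ainv @ b).item()
update = Ainv - (Ainv @ b @ d.T @ Ainv) / denom   # formula
direct = np.linalg.inv(A + b @ d.T)               # direct inversion
assert np.allclose(update, direct)
```

The practical point is that when $A^{-1}$ is already known, the update costs $O(n^2)$ rather than the $O(n^3)$ of a fresh inversion.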

 Shrinkage Methods of estimation 'shrink' an estimator by moving it towards some fixed value (or an overall mean). Ridge regression shrinks regression coefficients towards zero, apart from the constant. The idea is that the shrunken estimator has more bias but lower variance, and hence better generalization. The James-Stein example (Cox & Hinkley, 1974, §11.8) shows that this idea works even for the mean of a normal distribution in p ≥ 3 dimensions.
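A minimal ridge-regression sketch (the data and ridge constant are illustrative assumptions, not taken from the text): the coefficients solve $(X^T X + kI)\beta = X^T y$, and centring the data first leaves the constant unshrunk.

```python
import numpy as np

# Ridge regression: shrink coefficients towards zero as k grows.
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.standard_normal(50)

Xc = X - X.mean(axis=0)   # centre so the constant is not penalized
yc = y - y.mean()

def ridge(k):
    p = Xc.shape[1]
    return np.linalg.solve(Xc.T @ Xc + k * np.eye(p), Xc.T @ yc)

beta_ols = ridge(0.0)       # k = 0 recovers least squares
beta_shrunk = ridge(100.0)  # large k shrinks hard towards zero
assert np.linalg.norm(beta_shrunk) < np.linalg.norm(beta_ols)
```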

 Simulated Annealing is a method of combinatorial optimization based on taking a series of random steps in the search space. See Ripley (1987) or Aarts & Korst (1989).
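A toy sketch of the random-step idea, on an artificial problem (the objective, cooling schedule, and constants are illustrative choices, not recommendations):

```python
import math
import random

# Simulated annealing: minimize the number of 1-bits in a binary string
# by random single-bit flips, accepting uphill moves with probability
# exp(-delta / T) while the temperature T is gradually lowered.
random.seed(0)
cost = lambda x: sum(x)                  # toy combinatorial objective
x = [random.randint(0, 1) for _ in range(20)]
T = 2.0
while T > 0.01:
    i = random.randrange(len(x))         # propose a random step
    y = x[:]
    y[i] ^= 1
    delta = cost(y) - cost(x)
    if delta <= 0 or random.random() < math.exp(-delta / T):
        x = y                            # accept (possibly uphill)
    T *= 0.99                            # cool slowly
```

Early on, high T lets the search escape local minima; as T falls the acceptance rule becomes essentially greedy.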

 Singular Value Decomposition of a real $n \times p$ matrix $A$: $A = U \Lambda V^T$, where $\Lambda$ is a diagonal matrix with decreasing non-negative entries, U is an n × p matrix with orthonormal columns, and V is a p × p orthogonal matrix (Golub & Van Loan, 1989).
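The stated properties can be checked with NumPy's thin SVD (the matrix is an arbitrary example):

```python
import numpy as np

# numpy.linalg.svd returns A = U diag(s) V^T with singular values s in
# decreasing order; U has orthonormal columns and V is orthogonal.
rng = np.random.default_rng(2)
A = rng.standard_normal((6, 3))                    # n x p with n >= p
U, s, Vt = np.linalg.svd(A, full_matrices=False)   # thin SVD: U is n x p
assert np.all(np.diff(s) <= 0) and np.all(s >= 0)  # decreasing, non-negative
assert np.allclose(U.T @ U, np.eye(3))             # orthonormal columns
assert np.allclose(Vt @ Vt.T, np.eye(3))           # V orthogonal
assert np.allclose((U * s) @ Vt, A)                # reconstruction
```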

 SOFM, SOM Self-organizing (feature) map of Kohonen.

 Softmax Given outputs $y_k$ for each of K classes, assign posterior probabilities as $p_k = \exp(y_k) / \sum_j \exp(y_j)$. The term comes from Bridle (1990a, b), but the idea is that of multiple logistic regression.
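A minimal sketch of the mapping (subtracting the maximum output first is a standard overflow guard, not part of the definition):

```python
import math

# Softmax: map K real-valued outputs y_k to posterior probabilities
# p_k = exp(y_k) / sum_j exp(y_j).
def softmax(y):
    m = max(y)                           # overflow guard
    e = [math.exp(v - m) for v in y]
    s = sum(e)
    return [v / s for v in e]

p = softmax([2.0, 1.0, 0.1])
assert abs(sum(p) - 1.0) < 1e-12         # probabilities sum to one
assert p[0] > p[1] > p[2]                # ordering of outputs preserved
```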

 Specific Variance Variance of each variable unique to that variable and not explained by or associated with the other variables in the factor analysis.

 Splines are used in function approximation and smoothing. They are constructed by joining functions defined over a partition of the space: the simplest case is polynomials on adjoining intervals.
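A sketch of the simplest case mentioned above, degree-1 polynomials joined continuously at the knots (the knots and values are arbitrary):

```python
# Piecewise-linear spline: linear polynomials on adjoining intervals,
# joined at the knots so the function is continuous.
def linear_spline(knots, values, x):
    """Evaluate the piecewise-linear interpolant at x (knots ascending)."""
    pieces = zip(zip(knots, values), zip(knots[1:], values[1:]))
    for (x0, y0), (x1, y1) in pieces:
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return (1 - t) * y0 + t * y1
    raise ValueError("x outside the knot range")

knots = [0.0, 1.0, 2.0]
values = [0.0, 1.0, 0.0]
assert linear_spline(knots, values, 0.5) == 0.5   # on the first piece
assert linear_spline(knots, values, 1.5) == 0.5   # on the second piece
```

Cubic splines follow the same join-at-the-knots idea but also match first and second derivatives across each knot, giving a smooth fit.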

 Stacked Generalization A method of using cross-validation to choose a combination of classifiers. The term is from Wolpert (1992); the idea goes back at least to M. Stone (1974).

 Steepest Descent A method of minimization which takes steps along the direction of steepest descent, the negative of the gradient vector. For maximization the method is known as steepest ascent or hill-climbing.
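A minimal sketch with a fixed step size (the objective and constants are illustrative; practical implementations choose the step by a line search):

```python
# Steepest descent: repeatedly step along the negative gradient.
def steepest_descent(grad, x, step=0.1, iters=200):
    for _ in range(iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]  # move downhill
    return x

# f(x, y) = (x - 3)^2 + (y + 1)^2, minimized at (3, -1)
grad = lambda x: [2 * (x[0] - 3), 2 * (x[1] + 1)]
x = steepest_descent(grad, [0.0, 0.0])
assert abs(x[0] - 3) < 1e-6 and abs(x[1] + 1) < 1e-6
```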

 Stochastic Approximation aims to find the value $x^*$ solving $f(x) = 0$, but although we can measure $f(x)$, the result will be measured with error. After taking many measurements $f(x_i)$ for $x_i$ with $f(x_i)$ near zero we will be able to find accurate estimators of $x^*$. There are also versions which aim to find the maximizer of $f$.
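A sketch of the Robbins-Monro scheme, the classical stochastic-approximation procedure (the target function, noise level, and step sizes $a_n = 1/n$ are illustrative assumptions):

```python
import random

# Robbins-Monro: find x* with f(x*) = 0 from noisy measurements
# f(x) + noise, via the recursion x <- x - a_n * measurement with
# decreasing step sizes a_n = 1 / n.
random.seed(0)
f = lambda x: x - 2.0                       # true root at x* = 2
x = 0.0
for n in range(1, 5001):
    noisy = f(x) + random.gauss(0.0, 0.5)   # measurement with error
    x -= (1.0 / n) * noisy                  # stochastic-approximation step
```

The decreasing steps average out the measurement noise, so the iterates settle down near the root despite never observing $f$ exactly.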

 Summated Scales Method of combining several variables that measure the same concept into a single variable in an attempt to increase the reliability of the measurement. In most instances, the separate variables are summed and then their total or average score is used in the analysis.

 Supervised Learning Choosing a classifier from a training set of correctly classified examples.

 Surrogate Variable Selection of the single variable with the highest factor loading to represent a factor in the data reduction stage, instead of using a summated scale or factor score.