












Sampling Paradigm
In the terminology of Dawid (1976), modelling the class-conditional
densities and, perhaps, the prior probabilities of the classes.








Sherman–Morrison–Woodbury Formula
Given a non-singular n × n matrix A and column vectors b and d, we have
    (A + b d^T)^{-1} = A^{-1} - A^{-1} b d^T A^{-1} / (1 + d^T A^{-1} b),
provided 1 + d^T A^{-1} b ≠ 0. If B and D are n × m matrices for m ≤ n, then
    (A + B D^T)^{-1} = A^{-1} - A^{-1} B (I + D^T A^{-1} B)^{-1} D^T A^{-1},
provided the m × m matrix I + D^T A^{-1} B is invertible (Golub & Van Loan,
1989, p. 51).
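The rank-one (Sherman–Morrison) case is easy to check numerically; a minimal pure-Python sketch with an assumed 2 × 2 example:

```python
# Numerical check of the rank-one (Sherman-Morrison) case:
# (A + b d^T)^{-1} = A^{-1} - A^{-1} b d^T A^{-1} / (1 + d^T A^{-1} b),
# using plain Python lists for 2x2 matrices (toy example, for illustration).

def inv2(M):
    """Inverse of a 2x2 matrix by the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def sherman_morrison(A, b, d):
    """Inverse of A + b d^T via the rank-one update formula."""
    Ainv = inv2(A)
    u = [sum(Ainv[i][j] * b[j] for j in range(2)) for i in range(2)]  # A^{-1} b
    w = [sum(d[i] * Ainv[i][j] for i in range(2)) for j in range(2)]  # d^T A^{-1}
    denom = 1 + sum(d[i] * u[i] for i in range(2))                    # 1 + d^T A^{-1} b
    return [[Ainv[i][j] - u[i] * w[j] / denom for j in range(2)]
            for i in range(2)]

A = [[4.0, 1.0], [2.0, 3.0]]
b = [1.0, 0.5]
d = [0.5, 1.0]
# Invert A + b d^T directly for comparison.
M = [[A[i][j] + b[i] * d[j] for j in range(2)] for i in range(2)]
direct = inv2(M)
update = sherman_morrison(A, b, d)
```

The two inverses agree to rounding error, which is the point of the formula: the update costs only matrix–vector products once A^{-1} is known.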








Shrinkage Methods
of estimation 'shrink' an estimator by moving it towards some
fixed value (or an overall mean). Ridge regression shrinks regression
coefficients towards zero, apart from the constant. The idea
is that the shrunken estimator has more bias but lower variance, and
hence better generalization. The James–Stein example (Cox & Hinkley,
1974, §11.8) shows that this idea works even for the mean of a normal
distribution in p ≥ 3 dimensions.
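A minimal sketch of the ridge idea, assuming a toy one-dimensional regression through the origin (so there is no constant term to leave unshrunk): the ridge slope sum(x·y) / (sum(x²) + λ) moves towards zero as λ grows.

```python
# Toy illustration (assumed data): ridge regression with one predictor and
# no intercept shrinks the least-squares slope towards zero as the
# penalty lambda increases.

def ridge_slope(xs, ys, lam):
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.1, 5.9]
ols = ridge_slope(xs, ys, 0.0)      # ordinary least squares (lambda = 0)
shrunk = ridge_slope(xs, ys, 5.0)   # same data, shrunken towards zero
```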








Simulated Annealing
is a method of combinatorial optimization based on taking a
series of random steps in the search space. See Ripley (1987)
or Aarts & Korst (1989). 
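A minimal sketch, assuming a toy integer search space: minimize (x − 7)² by random ±1 steps, accepting uphill moves with probability exp(−δ/T) under a slowly decreasing temperature.

```python
import math
import random

# Simulated-annealing sketch on an assumed toy problem: random +/-1 steps
# over the integers, downhill moves always accepted, uphill moves accepted
# with probability exp(-delta / T) as the temperature T is lowered.

def anneal(f, x0, steps=5000, t0=10.0, seed=0):
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    for k in range(steps):
        t = t0 * (1 - k / steps) + 1e-9      # linearly cooling temperature
        y = x + rng.choice((-1, 1))          # random step in the search space
        fy = f(y)
        delta = fy - fx
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, fx = y, fy                    # accept the move
    return x

best = anneal(lambda x: (x - 7) ** 2, x0=-50)
```

The occasional uphill moves are what let the method escape local minima in genuinely combinatorial problems; this toy objective has only one minimum.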








Singular Value Decomposition
of a real n × p matrix X: X = U Λ V^T, where Λ is a diagonal matrix with
decreasing non-negative entries, U is an n × p matrix with orthonormal
columns, and V is a p × p orthogonal matrix (Golub & Van Loan, 1989).
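One way to see the decomposition at work, assuming a toy 2 × 2 matrix: the singular values (the diagonal entries of the middle factor) are the square roots of the eigenvalues of X^T X.

```python
import math

# Pure-Python sketch (assumed 2x2 toy case): the singular values of X are
# the square roots of the eigenvalues of the symmetric matrix X^T X.

def singular_values_2x2(X):
    (a, b), (c, d) = X
    # Entries of S = X^T X: [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    # Eigenvalues of a symmetric 2x2 matrix via the quadratic formula.
    mean = (p + r) / 2
    disc = math.sqrt(((p - r) / 2) ** 2 + q * q)
    # Returned in decreasing order, matching the convention above.
    return [math.sqrt(mean + disc), math.sqrt(mean - disc)]

svals = singular_values_2x2([[3.0, 0.0], [4.0, 5.0]])
```

For this matrix the singular values are 3√5 and √5; their product equals |det X| = 15, a quick sanity check.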









SOFM, SOM
Self-organizing (feature) map of Kohonen.








Softmax
Given outputs y_k for each of K classes, assign posterior probabilities
    p(k | x) = exp(y_k) / Σ_j exp(y_j).
The term comes from Bridle (1990a, b), but the idea is that of multiple
logistic regression.
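A minimal sketch of the map itself, with assumed toy outputs; subtracting the maximum output before exponentiating is a common numerical-stability trick, not part of the definition.

```python
import math

# Softmax: convert K raw outputs y_k into probabilities
# p_k = exp(y_k) / sum_j exp(y_j).  Toy outputs are assumed.

def softmax(y):
    m = max(y)                             # shift for numerical stability
    exps = [math.exp(v - m) for v in y]    # shifting cancels in the ratio
    total = sum(exps)
    return [e / total for e in exps]

p = softmax([2.0, 1.0, 0.1])
```

The outputs are positive, sum to one, and preserve the ordering of the raw outputs, so the largest output gets the largest posterior probability.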









Specific Variance
Variance of each variable unique to that variable and not explained by
or associated with other variables in the factor analysis.








Splines
are used in function approximation and smoothing. They are constructed
by joining functions defined over a partition of the space:
the simplest case is polynomials on adjoining intervals. 
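The simplest case can be sketched as a linear spline: degree-1 polynomials joined continuously at the knots (the knots and values below are assumed toy data).

```python
# Linear spline sketch (assumed toy knots and values): piecewise
# polynomials of degree 1, joined continuously at the knots.

def linear_spline(knots, values):
    """Return the continuous piecewise-linear function through (knots, values)."""
    def f(x):
        # Find the interval [knots[i], knots[i+1]] containing x.
        for i in range(len(knots) - 1):
            if knots[i] <= x <= knots[i + 1]:
                t = (x - knots[i]) / (knots[i + 1] - knots[i])
                return (1 - t) * values[i] + t * values[i + 1]
        raise ValueError("x outside the knot range")
    return f

f = linear_spline([0.0, 1.0, 3.0], [0.0, 2.0, 0.0])
```

Higher-order splines (e.g. cubics) follow the same joining idea but also match derivatives at the knots, giving smoother fits.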








Stacked Generalization
A method of using cross-validation to choose a combination of
classifiers. The term is from Wolpert (1992); the idea goes
back at least to M. Stone (1974).









Steepest Descent
A method of minimization which takes steps along the direction of
steepest descent, the negative of the gradient vector. For maximization
the method is known as steepest ascent or hill-climbing.
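A minimal sketch with an assumed toy objective f(x, y) = (x − 1)² + 2(y + 3)²; each step moves against the gradient at a fixed rate.

```python
# Steepest descent on an assumed toy quadratic
# f(x, y) = (x - 1)^2 + 2*(y + 3)^2, minimized at (1, -3).

def grad(x, y):
    """Gradient of the toy objective."""
    return (2 * (x - 1), 4 * (y + 3))

def steepest_descent(x, y, rate=0.1, steps=200):
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - rate * gx, y - rate * gy   # step along -gradient
    return x, y

x, y = steepest_descent(0.0, 0.0)
```

A fixed step size works here because the quadratic is well-conditioned; in general the step length is chosen by a line search.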








Stochastic Approximation
aims to find the value θ* solving f(θ) = 0; although we can measure
f(θ), the result is measured with error. After taking many measurements
at values of θ with f(θ) near zero, we will be able to find accurate
estimators of θ*. There are also versions which aim to find the
maximizer of f.









Summated Scales
Method of combining several variables that measure the same
concept into a single variable in an attempt to increase the
reliability of the measurement. In most instances, the separate
variables are summed and then their total or average score is
used in the analysis. 








Supervised Learning
Choosing a classifier from a training set of correctly classified
examples. 
















Surrogate Variable
Selection of a single variable with the highest factor loading to
represent a factor in the data-reduction stage instead of using a
summated scale or factor score.








