Of course, one of the best ways of learning how to use Netlab is to run and examine the demo programs. Every major function in the library has at least one associated demonstration. Each demo has the prefix dem in its name.
Two demos that are particularly useful when getting started are:
- demnlab is a simple GUI that allows you to select and run Netlab demos
- demtrain demonstrates how a neural network can be trained to solve classification and regression problems: it reads in datasets in the Netlab data format.
Models and Structures
Every model of any complexity (all except K nearest neighbour) is manipulated as a Matlab data structure. This ensures that all the relevant information is kept together, and prevents the user from passing models to the wrong function. Each model has a `constructor' function that builds and initialises the relevant data structure. Each model has a three letter prefix that is used for all of its associated functions.
For example, the Gaussian mixture model has the prefix gmm. A call to the constructor
net = gmm(1, 2, 'spherical');
generates the following data structure
priors: [0.5000 0.5000]
centres: [2x1 double]
covars: [1 1]
For simplicity, we have not developed the code in a fully object-oriented fashion. To access a field in this structure, for example the centres, use the . operator:
net.centres
The models have been designed to have compatible data structures. The Gaussian mixture model is used when training RBF networks and in Mixture Density Networks.
Because models are represented by a single data structure, it is very easy to save and load them. The command
save gmm.net net
saves the Gaussian mixture model net to the file gmm.net.
Training and Optimisers
The optimisers are compatible with the Matlab optimisation toolbox. In particular, they use the options vector to control the way the algorithm works, set termination conditions, and so on. (For more information on the meaning of the fields in this vector, see the Matlab help on foptions and the help for each optimisation function.) There is one significant difference in the treatment of termination criteria. If the tolerance is set to 0, then in Netlab the optimisation function will continue until the required number of steps has been taken, whereas in Matlab the value is overwritten with the default 1e-4. All Netlab optimisation routines assume that there is a function available that computes the gradient of the function being optimised.
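As an illustration of the options vector, here is a minimal sketch in plain Matlab. It assumes the standard foptions layout, in which entries 2 and 3 hold the tolerances on the parameters and on the function value; the variable name fixed_length is made up for this example. Check the help for each optimiser before relying on a particular entry.

```matlab
% Sketch of an options vector in the foptions layout (18 entries).
options = zeros(1, 18);
options(1)  = 1;      % display error values during optimisation
options(2)  = 1e-4;   % tolerance on the parameters (assumed foptions meaning)
options(3)  = 1e-4;   % tolerance on the function value (assumed foptions meaning)
options(14) = 200;    % maximum number of training cycles

% With the tolerances set to 0, a Netlab optimiser should, as described
% above, keep going until all options(14) cycles have been used.
fixed_length = options;
fixed_length(2:3) = 0;
```

Keeping a second copy of the vector makes it easy to compare a run that stops on tolerance with one that uses the full number of cycles.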
Because the optimisers (with the exception of the on-line gradient descent optimiser olgd, which is specifically for training network models) are general purpose, they operate in `parameter vector' space. However, the model parameters are held in the corresponding data structure. To apply the optimisers to network training, three utility functions, among them netopt, are provided. Examples of their use can be found in demmlp1.m. For example, use the following line of code to train a network data structure with input data x and target data t using the quasi-Newton training algorithm:
[net, options] = netopt(net, options, x, t, 'quasinew');
This mechanism relies upon the following assumptions:
- The network data structure net contains the function prefix (for example, mlp) in a field called type.
- The function that computes the error is called <prefix>err, for example mlperr.
- The function that computes the error gradient is called <prefix>grad, for example mlpgrad.
- There is a function <prefix>pak (for example, mlppak) which packs all the component parameter vectors in the network data structure into one single vector.
- There is a function <prefix>unpak (for example, mlpunpak) which unpacks a single parameter vector into its component vectors in a network data structure.
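The conventions above can be illustrated with a toy model. This is a minimal sketch, not Netlab code: the prefix toy, the fields w and b, and the handles toypak and toyunpak are all hypothetical, and anonymous functions stand in for what would be separate .m files in a real extension.

```matlab
% Hypothetical toy model with one weight matrix and one bias vector.
net.type = 'toy';             % function prefix stored in the type field
net.w = [1 2; 3 4; 5 6];      % 3x2 weights
net.b = [7 8];                % 1x2 biases

% Function names that a generic wrapper would build from the prefix.
errfn  = [net.type 'err'];    % 'toyerr'
gradfn = [net.type 'grad'];   % 'toygrad'

% pak: flatten all parameters into a single row vector.
toypak = @(n) [n.w(:)', n.b];

% unpak: copy a parameter vector back into the fields of the structure.
nw = numel(net.w);
toyunpak = @(n, p) setfield(setfield(n, 'w', reshape(p(1:nw), size(n.w))), ...
                            'b', p(nw+1:end));

p = toypak(net);              % 1x8 parameter vector
net2 = toyunpak(net, 2 * p);  % same structure with every parameter doubled
```

An optimiser only ever sees the vector p. A wrapper can unpack it into the structure, call the functions named by errfn and gradfn (for instance with feval), and pack the result again.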
If you extend Netlab by adding new models, you should follow these conventions if you want to use the optimisation functions to train them.
A Simple Program
The "Hello world" equivalent in Netlab is a program that generates some data, trains an MLP, and plots its predictions.
% Generate the matrix of inputs x and targets t.
ndata = 20;
x = [0:1/(ndata - 1):1]';
t = sin(2*pi*x) + 0.2*randn(ndata, 1);
% Set up network parameters.
net = mlp(1, 3, 1, 'linear');
% Set up vector of options for the optimiser.
options = zeros(1,18);
options(1) = 1; % This provides display of error values.
options(9) = 1; % Check the gradient calculations.
options(14) = 100; % Number of training cycles.
% Train using scaled conjugate gradients.
[net, options] = netopt(net, options, x, t, 'scg');
% Plot the trained network predictions.
plotvals = [0:0.01:1]';
y = mlpfwd(net, plotvals);
plot(plotvals, y, 'ob')
A fuller version of this program is contained in demmlp1.m.
Data Format and Sample Datasets
As well as the usual Matlab data file formats, Netlab provides two utility functions, datread and datwrite, to read and write data files respectively. The data is stored in a text file with a short header followed by the data itself. The header gives three values: nin specifies the number of input variables, nout the number of output variables, and ndata the number of data points. Each subsequent line corresponds to a single example, with the first nin values for the input variables, and the remaining nout values for the output variables. For unsupervised learning problems, nout can be zero.
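The row layout described above can be sketched in plain Matlab. This example does not call datread and does not parse a header. It only shows how each data row splits into inputs and targets once nin and nout are known. The variable raw and the XOR values are made up for illustration.

```matlab
% Header values for a small, made-up XOR dataset.
nin = 2;      % number of input variables
nout = 1;     % number of output variables
ndata = 4;    % number of data points

% One row per example: nin input values followed by nout target values.
raw = [0 0 0;
       0 1 1;
       1 0 1;
       1 1 0];

% Split each row into its input part and its target part.
x = raw(:, 1:nin);
t = raw(:, nin+1:nin+nout);
```

With nout set to 0, the same indexing gives an empty target matrix with ndata rows, which matches the unsupervised case.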
Netlab is supplied with two datasets: the ubiquitous exclusive or problem xor.dat, and oil pipeline data oilTrn.dat. The latter is synthetic data modelling non-intrusive measurements on a pipeline transporting a mixture of oil, water and gas. The inputs are twelve measures of gamma beam attenuation, and the two outputs represent the fraction of water and the fraction of oil. Each dataset contains 500 examples. This link gives references to papers that use the data.