What if you want to use, for example, a different distance function or a different way of adapting a vector? No worries, here are some hints to get you started.
Most of the functions in the package (som_training, codebook labeling, etc.) take their parameters from the teach_params structure that is passed to them. The structure is defined as follows (in lvq_pak.h):
    struct teach_params {
      short topol;
      short neigh;
      short alpha_type;
      MAPDIST_FUNCTION *mapdist;  /* calculates distance between two units */
      DIST_FUNCTION *dist;        /* calculates distance between two vectors */
      NEIGH_ADAPT *neigh_adapt;   /* adapts weights */
      VECTOR_ADAPT *vector_adapt; /* adapt one vector */
      WINNER_FUNCTION *winner;    /* function to find winner */
      ALPHA_FUNC *alpha_func;
      float radius;               /* initial radius (for SOM) */
      float alpha;                /* initial alpha value */
      long length;                /* length of training */
      int knn;                    /* nearest neighbours */
      struct entries *codes;
      struct entries *data;
      struct snapshot_info *snapshot;
      time_t start_time, end_time;
    };

The different function types are defined in lvq_pak.h:
    typedef float MAPDIST_FUNCTION(int bx, int by, int tx, int ty);

where (bx,by) and (tx,ty) are the coordinates of two units on the map. Functions of this type are used by the neighbourhood adaptation functions.
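As an illustration, a replacement map distance function only has to match the typedef above. The sketch below is not part of the package: the name cityblock_mapdist and the choice of a city-block metric on a rectangular lattice with unit spacing are assumptions made for the example.

```c
#include <stdlib.h>  /* abs */

/* Same signature as the MAPDIST_FUNCTION typedef quoted from lvq_pak.h. */
typedef float MAPDIST_FUNCTION(int bx, int by, int tx, int ty);

/* Hypothetical example: city-block distance between two map units,
   assuming a rectangular lattice with unit spacing. */
float cityblock_mapdist(int bx, int by, int tx, int ty)
{
    return (float)(abs(bx - tx) + abs(by - ty));
}

/* A pointer of the typedef'd type accepts the function as it is. */
MAPDIST_FUNCTION *my_mapdist = cityblock_mapdist;
```

Presumably such a function would be stored in the mapdist member of teach_params before training, so that the neighbourhood adaptation functions pick it up.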
    typedef float DIST_FUNCTION(struct data_entry *v1, struct data_entry *v2, int dim);

where v1 and v2 are two entries and dim is their dimension. Note that the functions for finding winners do not use these functions.
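A custom vector distance could look roughly like the following. This is a sketch under assumptions: struct data_entry is reduced here to a stand-in holding only a float array called points (the real definition in lvq_pak.h has more members), and cityblock_dist is a hypothetical name.

```c
/* Simplified stand-in for struct data_entry. Only a float array named
   points is assumed; the real structure in lvq_pak.h has more members. */
struct data_entry {
    float *points;
};

typedef float DIST_FUNCTION(struct data_entry *v1, struct data_entry *v2, int dim);

/* Hypothetical example: city-block (L1) distance between two vectors. */
float cityblock_dist(struct data_entry *v1, struct data_entry *v2, int dim)
{
    float sum = 0.0f;
    int i;

    for (i = 0; i < dim; i++) {
        float d = v1->points[i] - v2->points[i];
        sum += (d < 0.0f) ? -d : d;   /* absolute value of the component difference */
    }
    return sum;
}
```

Since the winner-finding functions do not call the dist member, a new metric used for training would probably also need a matching winner function.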
    typedef void NEIGH_ADAPT(struct teach_params *teach, struct data_entry *sample, int bx, int by, float radius, float alpha);

where the teaching parameters are in teach, sample is the data sample, (bx,by) are the coordinates of the winning unit, and radius and alpha are the training radius and alpha for this sample (both decrease during teaching).
This function takes a data sample and moves vectors within the adaptation region (a circle centered at (bx,by) with radius radius) towards the sample. The vector adaptation function and map distance function are taken from the teach-structure.
Examples of this kind of function are bubble_adapt and gaussian_adapt (in som_rout.c), which adapt codebooks with bubble and Gaussian neighbourhoods, respectively.
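The idea of a neighbourhood adaptation function can be sketched as follows. This is not the code from som_rout.c: the structures are cut-down stand-ins (in particular, the codebook is assumed to be a plain array stored row by row, and teach_params holds only the three members used), and all demo_ names are hypothetical.

```c
/* Simplified stand-ins for the lvq_pak.h types. The real structures have
   more members and a different codebook layout; this is an assumption
   made to keep the sketch self-contained. */
struct data_entry { float *points; };

struct entries {
    int xdim, ydim;            /* map size in units */
    int dimension;             /* length of each vector */
    struct data_entry *units;  /* xdim * ydim units stored row by row */
};

typedef float MAPDIST_FUNCTION(int bx, int by, int tx, int ty);
typedef void VECTOR_ADAPT(struct data_entry *c, struct data_entry *s, int d, float a);

struct teach_params {          /* only the members this sketch needs */
    MAPDIST_FUNCTION *mapdist;
    VECTOR_ADAPT *vector_adapt;
    struct entries *codes;
};

/* Hypothetical helper: city-block distance between two map units. */
float demo_mapdist(int bx, int by, int tx, int ty)
{
    int dx = bx > tx ? bx - tx : tx - bx;
    int dy = by > ty ? by - ty : ty - by;
    return (float)(dx + dy);
}

/* Hypothetical helper: move c towards s by the fraction a. */
void demo_vector_adapt(struct data_entry *c, struct data_entry *s, int d, float a)
{
    int i;
    for (i = 0; i < d; i++)
        c->points[i] += a * (s->points[i] - c->points[i]);
}

/* Bubble-style adaptation: every unit whose map distance from the winner
   (bx,by) is at most radius is adapted with the same alpha. The signature
   follows the NEIGH_ADAPT typedef. */
void demo_bubble_adapt(struct teach_params *teach, struct data_entry *sample,
                       int bx, int by, float radius, float alpha)
{
    struct entries *codes = teach->codes;
    int x, y;

    for (y = 0; y < codes->ydim; y++)
        for (x = 0; x < codes->xdim; x++)
            if (teach->mapdist(bx, by, x, y) <= radius)
                teach->vector_adapt(&codes->units[y * codes->xdim + x],
                                    sample, codes->dimension, alpha);
}
```

Note how the map distance and vector adaptation functions are looked up through teach rather than called directly, which is what makes them replaceable.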
    typedef void VECTOR_ADAPT(struct data_entry *c, struct data_entry *s, int d, float a);

where c is the codebook vector, s is the sample vector, d is the dimension and a is the alpha. The adapt_vector function is an example of this type of function. This function is used in the neighbourhood adaptation functions and in the LVQ training algorithms.
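A minimal sketch of a function of this type is shown below. It applies the usual update c = c + a * (s - c) to each component; whether the package's own adapt_vector does exactly this is not stated here, so treat it as an assumption. The struct is again a stand-in with only a points array, and my_adapt_vector is a hypothetical name.

```c
/* Simplified stand-in for struct data_entry (only the points array is assumed). */
struct data_entry {
    float *points;
};

typedef void VECTOR_ADAPT(struct data_entry *c, struct data_entry *s, int d, float a);

/* Hypothetical example: move the codebook vector c towards the sample s.
   With a = 0 the vector is unchanged, with a = 1 it becomes equal to s. */
void my_adapt_vector(struct data_entry *c, struct data_entry *s, int d, float a)
{
    int i;

    for (i = 0; i < d; i++)
        c->points[i] += a * (s->points[i] - c->points[i]);
}
```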
    typedef int WINNER_FUNCTION(struct entries *codes, struct data_entry *sample, struct winner_info *w, int knn);

where codes is the codebook and sample is the sample vector. The information about the winners is stored in the winner_info array pointed to by w, with the best match in w[0]. The knn parameter is the number of best matching units to store (k nearest neighbours). SOM_PAK only uses a knn of 1; in LVQ_PAK the number can vary.
Examples of this kind of function are find_winner_euc and find_winner_knn, which both use Euclidean distance to look for best matches. The difference between the two is that find_winner_euc only finds the single best matching unit (i.e. knn = 1).
The winner_info structure is defined as follows:
    struct winner_info {
      long index;
      struct data_entry *winner;
      float diff;
    };

The index member is the number of the winning unit counted from the start of the codebook. This is used to get the winning unit's coordinates on the map. The winner member is a pointer to the winning entry, and diff is the difference between the winning vector and the sample. In this case it is the squared distance between the two vectors.
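To show how the pieces fit together, here is a rough sketch of a winner function that fills in w[0]. It is not the package's find_winner_euc: the codebook is assumed to be a plain array (a stand-in for struct entries), knn is ignored, and the return value (the number of winners stored) is an assumption, since its meaning is not described above.

```c
#include <stddef.h>  /* NULL */

/* Simplified stand-ins; the real definitions in lvq_pak.h differ. */
struct data_entry { float *points; };

struct entries {
    long num_entries;          /* number of codebook vectors */
    int dimension;             /* length of each vector */
    struct data_entry *units;  /* codebook vectors as a plain array */
};

struct winner_info {
    long index;
    struct data_entry *winner;
    float diff;
};

/* Hypothetical example: find the single best match using squared Euclidean
   distance. Returns the number of winners stored (0 or 1), which is an
   assumption of this sketch. */
int demo_find_winner(struct entries *codes, struct data_entry *sample,
                     struct winner_info *w, int knn)
{
    long i;
    int d;

    (void)knn;                 /* this sketch only looks for one winner */
    w[0].index = -1;
    w[0].winner = NULL;
    w[0].diff = -1.0f;

    for (i = 0; i < codes->num_entries; i++) {
        float diff = 0.0f;
        for (d = 0; d < codes->dimension; d++) {
            float t = codes->units[i].points[d] - sample->points[d];
            diff += t * t;
        }
        if (w[0].winner == NULL || diff < w[0].diff) {
            w[0].index = i;
            w[0].winner = &codes->units[i];
            w[0].diff = diff;  /* squared distance, not its square root */
        }
    }
    return w[0].winner != NULL ? 1 : 0;
}
```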
    typedef float ALPHA_FUNC(long iter, long length, float alpha);

where iter is the number of the current iteration, length is the total length of the process and alpha is the initial alpha value. Examples: linear_alpha which produces a linearly decreasing alpha and inverse_t_alpha which decreases according to the inverse of the iterations done.
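A linearly decreasing schedule of this type might be written as below. The function demo_linear_alpha is a hypothetical illustration of the idea and is not claimed to be identical to the package's linear_alpha.

```c
typedef float ALPHA_FUNC(long iter, long length, float alpha);

/* Hypothetical example: alpha falls linearly from its initial value at
   iter = 0 to zero at iter = length. */
float demo_linear_alpha(long iter, long length, float alpha)
{
    return alpha * (float)(length - iter) / (float)length;
}
```

With an initial alpha of 0.5 and a length of 100, this gives 0.5 at the first iteration, 0.25 halfway through and 0 at the end.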