................ SHORT DOC .............................................
POP: Optimal Partitionning
Reference:
M. Petitjean, "Agregation des similarites: une solution oubliee",
RAIRO Oper. Res. 2002,36[1],101-108 (http://www.edpsciences.org)
Author email: michel.petitjean@cea.fr
POP reads, either an (N*N) array of signed similarities,
or an array containing N lines (individuals) and P columns
(categorical variables), which will be converted into
signed similarities. Then POP computes an optimal partition.
Input parameters: PGM , INI , OPT , N , P
(these 5 parameters are input at keyboard on a single line)
(allowed separators: space, comma, semi-colon, slash)
-----------------------------------------------------
PGM: indicates which kind of array will be read:
PGM = 0 : array of signed similarites S(i,j)
(i and j varying from 1 to N)
PGM = 1 : array of similarites S(i,j), to be transformed by:
S(i,j) := 2 * S(i,j) - P
PGM = 2 : array of N lines (individuals) and P columns,
each column being associated to a categorical variable;
the categories are coded with integers.
PGM = 3 : read the P weights (each for one categorical variable),
then read data as for PGM=2.
INI: indicates what will be done prior computing partitions:
INI = -1 : initial sorting via algebrically decreasing costs
INI = 0 : stop after the initial partition has been computed
(useful for high N values)
INI = +1 : initial sorting via arithmetically decreasing costs
OPT: indicates how many optimal partitions are needed:
OPT = 1 : only one optimal partition is needed
OPT = 2 : all optimal partitions are enumerated
N: number of individuals
P: number of categorical variables (ignored when PGM=0)
Input data filename at keyboard
(enter an empty line if the data will be themselves input at keyboard)
----------------------------------------------------------------------
Reading file (or input at keyborad):
-----------------------------------
- Only when PGM=3, read the P weights (on a single line, free format)
- Read N lines of the data array (free format)
Output: the optimal cost, and the optimal partition(s)
------------------------------------------------------
Remarks:
The computing time grows exponentially with N, but computing
the initial partition is achieved in O(N^3).
Simulations shows that this latter appears often to be optimal.
The line buffer of version 1.1 is limited to 255 characters per line.
In version 1.2, it is extended to 255*255 characters per data line.
................ END SHORT DOC .........................................