................ SHORT DOC ............................................. POP: Optimal Partitionning Reference: M. Petitjean, "Agregation des similarites: une solution oubliee", RAIRO Oper. Res. 2002,36[1],101-108 (http://www.edpsciences.org) Author email: michel.petitjean@cea.fr POP reads, either an (N*N) array of signed similarities, or an array containing N lines (individuals) and P columns (categorical variables), which will be converted into signed similarities. Then POP computes an optimal partition. Input parameters: PGM , INI , OPT , N , P (these 5 parameters are input at keyboard on a single line) (allowed separators: space, comma, semi-colon, slash) ----------------------------------------------------- PGM: indicates which kind of array will be read: PGM = 0 : array of signed similarites S(i,j) (i and j varying from 1 to N) PGM = 1 : array of similarites S(i,j), to be transformed by: S(i,j) := 2 * S(i,j) - P PGM = 2 : array of N lines (individuals) and P columns, each column being associated to a categorical variable; the categories are coded with integers. PGM = 3 : read the P weights (each for one categorical variable), then read data as for PGM=2. INI: indicates what will be done prior computing partitions: INI = -1 : initial sorting via algebrically decreasing costs INI = 0 : stop after the initial partition has been computed (useful for high N values) INI = +1 : initial sorting via arithmetically decreasing costs OPT: indicates how many optimal partitions are needed: OPT = 1 : only one optimal partition is needed OPT = 2 : all optimal partitions are enumerated N: number of individuals P: number of categorical variables (ignored when PGM=0) Input data filename at keyboard (enter an empty line if the data will be themselves input at keyboard) ---------------------------------------------------------------------- Reading file (or input at keyborad): ----------------------------------- - Only when PGM=3, read the P weights (on a single line, free format) - Read N lines of the data array (free format) Output: the optimal cost, and the optimal partition(s) ------------------------------------------------------ Remarks: The computing time grows exponentially with N, but computing the initial partition is achieved in O(N^3). Simulations shows that this latter appears often to be optimal. The line buffer of version 1.1 is limited to 255 characters per line. In version 1.2, it is extended to 255*255 characters per data line. ................ END SHORT DOC .........................................