................ SHORT DOC .............................................

CSR: The Combined SDM/RMS Algorithm for spatial alignment of two molecules.

Reference for the main CSR algorithm:
M. Petitjean, Interactive Maximal Common 3D Substructure Searching
with the Combined SDM/RMS Algorithm, Comput.Chem. 1998,22[6],463-465.

Reference of the RMS numerical algorithm:
[1] M. Petitjean, On the Root Mean Square Quantitative Chirality and
Quantitative Symmetry Measures, J.Math.Phys. 1999,40[9],4587-4595
(see appendix of ref. [1]; see also the generalization in appendix A.5
in J.Math.Phys. 2002,43[8],4147-4157)

Reference of the SDM algorithm:
[2] M. Petitjean, Interactive Maximal Common 3D Substructure Searching
with the Combined SDM/RMS Algorithm, Comput.Chem. 1998,22[6],463-465

Author email: petitjean.chiral@gmail.com

CSR reads the cartesian coordinates of two molecules, then optimally
rotates and translates molecule 2 onto molecule 1 to find the maximal
common 3D motif. The two input molecules should be concatenated into
a single file prior to execution.

Input data and parameters:
-------------------------

INPUT FORMAT:
   BIO : Biosym (MSI) files
   CAS : Reserved for internal purposes
   HIN : Hyperchem-type files
   ISU : Reserved for internal purposes
   MDL : Cambridge Crystallographic Model files
   ML2 : SYBYL Mol2 files
   PDB : Protein Data Bank or Nucleic Acid Data Bank files
         (only HEADER, ATOM, ENDMDL and END records are recognized)
   SDF : Symyx Mol/SDF files
         (data between 'M END' and '$$$$' are treated as comments)
   XYZ : n+2 lines. Line 1: n; line 2: free comment;
         next n lines: label or atomic symbol, x, y, z
         (separator: spaces; no tabulation allowed).

INPUT MOLEC FILE NAME: name of the input file containing both molecules

OUTPUT MOLEC FILE NAME: name of the output file containing the
   optimally rotated and translated molecule 2

IMOL1: sequential position number of molecule 1 in the input molecules file

IMOL2: sequential position number of molecule 2 in the input molecules file

ITERMX: maximum number of iterations; recommended value:
   about 200 for small molecules (<100 atoms),
   about 2000 for a hundred to a thousand atoms,
   and 20000 for larger molecules

CUT-OFF DIST: This parameter does NOT affect the results; it only saves
   space and time. As a rule of thumb, this value should be close to
   a bond length, e.g. about 1.5 to 2 for small inorganic molecules,
   0.9 to 1.2 for full proteins, 4 to 5 for C-alpha protein backbones.

Output results:
--------------

The size N of the common 3D motif, and the r.m.s. between the N pairs
of atoms, followed by the one-to-one correspondence between the N atoms
of molecule 1 and the N atoms of molecule 2.

The new coordinates of the optimally rotated and translated molecule 2.

Remarks:
-------

The number of atoms is currently limited to 15000 for each molecule.
The source has to be recompiled to read larger molecules.

To operate on C-alpha protein backbones, the other atoms should be
removed prior to execution.

The computing time is roughly proportional to the product n1*n2 of the
numbers of atoms of the two molecules, and proportional to the number
of iterations (reading and writing files not included).

The generated file containing the output moved molecule 2 is empty for
the CAS, MDL and BIO formats, and the message "EERCO2 = 1" is displayed.

................ END SHORT DOC .........................................
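
Illustration (not part of CSR): the XYZ layout and the requirement that
both molecules be concatenated into a single input file can be sketched
in Python as below. This is a minimal sketch under stated assumptions:
the function names (xyz_block, concatenate_xyz, suggested_itermx), the
six-decimal coordinate formatting and the sample coordinates are
hypothetical choices made for this example, and the exact boundaries
used for the ITERMX tiers are one reading of the recommendation above.

```python
from typing import Sequence, Tuple

# Hypothetical atom record used only in this sketch: (label, x, y, z).
Atom = Tuple[str, float, float, float]


def xyz_block(atoms: Sequence[Atom], comment: str = "") -> str:
    """Format one molecule as n+2 lines: n, a free comment, then n atom lines."""
    if "\n" in comment:
        raise ValueError("the comment must fit on a single line")
    lines = [str(len(atoms)), comment]
    for label, x, y, z in atoms:
        # Fields are separated by spaces only (no tabulation allowed),
        # so a label must not contain any whitespace itself.
        if not label or any(ch.isspace() for ch in label):
            raise ValueError(f"invalid atom label: {label!r}")
        lines.append(f"{label} {x:.6f} {y:.6f} {z:.6f}")
    return "\n".join(lines) + "\n"


def concatenate_xyz(molecules: Sequence[Tuple[Sequence[Atom], str]]) -> str:
    """Concatenate several molecules into the text of one multi-molecule file.

    The position of a molecule in the sequence (starting at 1) is assumed
    to be what IMOL1 and IMOL2 refer to.
    """
    return "".join(xyz_block(atoms, comment) for atoms, comment in molecules)


def suggested_itermx(n_atoms: int) -> int:
    """Return the ITERMX value recommended in the short doc for a given size."""
    if n_atoms < 100:
        return 200
    if n_atoms <= 1000:
        return 2000
    return 20000


if __name__ == "__main__":
    # Arbitrary three-atom example; the coordinates are illustrative only.
    mol = [("O", 0.0, 0.0, 0.0), ("H", 0.96, 0.0, 0.0), ("H", -0.24, 0.93, 0.0)]
    print(concatenate_xyz([(mol, "molecule 1"), (mol, "molecule 2")]), end="")
    print("suggested ITERMX:", suggested_itermx(len(mol)))
```

With a file written this way, IMOL1 = 1 and IMOL2 = 2 would select the
two molecules in the order in which they were concatenated.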