APPENDIX 3 Force-Field File Format


Organization of the Force-field File

Force field parameters are obtained by BatchMin at execute time prior to performing energy calculations. Force-field parameters may be obtained from a co-process, using the BMFF mechanism (see Appendix 5) or, more commonly, may be read in from a file, the force-field file. This appendix describes the latter mechanism. The force-field file tells the program which of the various possible energetic equations the particular force field uses for each type of interaction and what the values of the parameters are (force constants, natural values of internal coordinates, etc.) for different combinations of atoms. Making changes to the parameter set used by a calculation is simply a matter of editing the force field file to change parameters or adding new ones.

See the description of the FFLD command for more information about the force fields we supply.

Force-field files have several sections:

· The Introductory Section tells the program which equations and conversion factors to use in computing energies. This section also lists a variety of alternatives and selections which may be chosen by making minor editing changes to the file. Finally, special atom symbols may be defined that will be equivalenced to multiple standard atom types for the convenience of further specification.
· The Main Parameter Section contains parameters which can be assigned based on knowing the local environment of the atoms involved in an interaction (a stretch, bend, torsion or non-bonded atom pair) out to at most a distance of two covalent bonds.
· The Substructure Section contains parameters which can be assigned only by knowing how atoms in an interaction are placed in a larger context; for example, some AMBER* parameters for atoms in standard amino acids are assigned differently in this section depending on which amino acid the atom is part of. At the left of each line is a two-character field which indicates the format or contents of the line. This two-character field is most often interpreted as an integer, I, though in certain situations it contains character data. If I is a negative number, it indicates the Fortran format for reading the group of subsequent lines, as follows:
-1 I2, 4X,A3,4X,I2,4X,F11.5 Energy Equation Switches
-2 I2, 2X,3(A2,1X,A1,1X),A2,3X,3(F9.4,2X),4(2(A2),1X) Normal Interactions
-2 I2, 2X,4(A2,2X),4X,3(F9.4,2X) Special Substructure Interactions
-3 I2, 2X,75A1 Special Substructure Linear Notation Format
-4 I2,10F9.4,5(1X,A1),1X,I3,1X,A1,A30 Special Substructure Atomic Charges
-5 I2, 2X,A2,2X,29(A2,1X) Atom Equivalencing Specifications
-6 2X,A2,2X,6(F9.4,2X),2A2,1X,A2 VDW interactions and single atom charges

I=-2 has different meanings depending on whether whether the lines in question occur in the Main Parameter Section or the Substructure Section of the force field. To the end of these formats is appended

,2X,5(1X,A1),1X,I3,1X,A1,A30.

This addition allows reading of alternate and selection information relating to the quality and origin of the parameters.

Values of I greater than or equal to zero or containing character data specify the nature, rather than the format, of the data to be read, the format having already been specified by a preceding line with I<0. The meaning of these values of I is discussed in the appropriate section below.

Introductory Section

BMFF force-fields have several lines at the very top giving BatchMin information about the coprocess to be launched. These are discussed in Appendix 5.

An excerpt from the introductory section of force-field is shown in Figure 1 for mm2.fld, the MM2 force-field.

Equation Specification

The section extending from the "STR" to the "HYD" line above tells the program which equations to use for the various types of interaction and give factors which convert from published force field units to kJ/mole. These data are encoded in lines beginning with I=0. Later, in the Main Parameter and Substructure sections, parameters will be specified in whatever units the original author used in his publications.

In general, each force-field equation has certain universal constants which do not vary with the atom types in the interaction. It is possible to specify these on the equation-specification line; however, this facility is now used only for BMFF force fields.

Stretch

0 None
1 Symmetric (Hooke's law) stretch
2 MM2 asymmetric stretch
3 MM3 symmetric stretch

Bend

0 None
1 Symmetric (Hooke's law) bend
2 MM2 asymmetric bend
3 MM3 asymmetric bend
4 MMFF assymmetric bend

Stretch-Bend

0 None
1 Linear, one constant for both bonds (MM2, MM3)
2 Linear, separate constants for each bone (MMFF)

Torsion

0 None
1 MM2-style cosine torsional potential
2 AMBER-style cosine torsional potential

Improper Torsion (Out-of-plane)

0 None
1 Harmonic out-of-plane (MM2)
2 Improper torsion
4 Harmonic Wilson-angle displacement (MMFF)

Van-der-Waals

1 7,11-Lennard-Jones equation; Eps, R as defined in AMBER
2 6,12-Lennard Jones Equation; Eps, R as defined in AMBER
3 MM3 Hill Equation; Eps, R
4 MM2 Hill Equation; Eps, R
5 Lennard Jones Equation; Eps, R as defined in OPLS
6 Lennard Jones Equation; Alpha, R, N as defined in CHARMM
7 MMFF buffered 7,12 potential

Coulombic

0 Turned off
1 Constant-dielectric electrostatics
2 Distance-dependent (R) dielectric electrostatics
3 MMFF buffered constant dielectric constant
4 MMFF buffered, distance-dependent dielectric constant

Hydrogen-bonding

0 6,12-Lennard Jones for hydrogen bonds
1 10,12-Lennard Jones hydrogen bonds
2 10,12 Lennard-Jones with angular depence

Special Treatment of 1-4 Interactions

0 Skip all 1,4 electrostatics and van der Waals
1 Scale 1,4 va- der-Waals terms by factor given
2 Scale 1,4 Coulombic terms by factor given
3 Scale both 1,4 van der Waals and electrostatics by factor given
4 Do not scale 1,4 nonbonded interactions

Fixed-atom Tethering Potential (when FXAT specified)

0 None
1 Harmonic

ALT and SEL Specification

The lower parts (ALT, SEL) allow selection among varous parameter types (O for original, M for modified, etc) and among various alternative parameter sets (e.g., Allinger MM2 vs Osawa MM2').

The SEL (selection) descriptors above allow you to include (or exclude) the lines of the force fields having certain labels. For example, associated with SEL 1 are three symbols O (for Original), M (for Modified) and A (for Added) which denote parameters which are original with the field, modified from the original values or totally new entries. In the SEL lines above, all such parameters are turned on, i.e. to be included into the force field which is read by the modeling programs. If you wish to use only original parameters, delete the lines which select Modified or Added parameters (i.e., SEL 1 M and SEL 1 A). The O, M and A symbols go with selection switch 1. SEL 2 has associated symbols 1, 2 and 3, signifying high-, medium- and low-quality parameters. If the SEL 2 2 and SEL 2 3 lines above are removed, the only high quality parameters will be used in the calculation (however, there are no high quality parameters as yet for many common substructures). You can add your own SEL switches starting with SEL 3 and choose any symbols you wish for switching. The symbols you use are the same in the SEL lines above and in the corresponding entries of the force field below. MacroModel allows a total of 5 different SEL's.

The ALT (alternative) descriptors are similar to the SEL descriptors except that they select single possibilities from among various options. We have defined ALT 1 to toggle between the original MM2 parameters (signified by the symbol a) and the Osawa MM2' parameters (signified by the symbol b). By default, a is selected. You may notice that there are seemingly duplicate entries in the force field with slightly different parameters (MM2 and MM2'), these are switched between by selecting the ALT symbol you want.

Figure 2 shows how SEL and ALT options are encoded later, on an interaction-by-interaction basis, in the Main Parameter and Subststructure sections of the force field. Each interaction may have values specified for the various SEL and ALT options. The SEL values should be aligned below the columns labeled 1 2 3 4 5. The ALT number goes below Alt the particular ALT symbol goes below the a. Comments should be used (30 characters maximum) since they are printed out along with the parameters in the output .mmo files containing detailed energy listings. These comments are quite helpful in checking parameters.

Atom-type Equivalencies

It is possible to equivalence several atom types in any given force field by some simple additions in the force field file. This equivalencing allows the use of a special 2-character descriptor to refer to any one of a set of atoms. These 2-character descriptors may be used anywhere in the force field to avoid creation of multiple entries for related interaction types. An example would define CU as the various forms of sp2 carbons (C2, CD, CE) and then substitute CU whenever the parameters for C2, CD and CE are all the same. Such a substitution would replace 3 different interaction descriptors with one interaction descriptor. The AMBER force field makes extensive use of such equivalencing.

The atom equivalence should be specified in the force field just before the stretch interactions. The format is first to type '-5' in columns 1 and 2. The next lines contain the atom equivalence data in the Fortran format (I2,2X,A2,2X, 29(A2,1X)). Up to 29 atom types can be equivalenced. After all the atom equivalences are specified, end input by putting a C, or any letter, in column 2 of the next line; for example:

-5
 1 CX  C1 C2 C3
 2 CY  CA CB CC
 3 NP  NA NB
 C
This indicates that CX is the name of an equivalency class that includes atom types C1, C2 and C3. Thus, if CX is specified in an interaction in the force field, then, whenever atom type C,r C2 or C3 is found in the matching position of the molecule under study, the interactions specified will be used.

Atom equivalencing is not allowed for wild card atom types, e.g., C0, N0, O0, 00.

Main Parameter Section

Interactions

There are separate subsections for stretching, bending, torsional, improper torsional, van der Waals and hydrogen bonding parameters. These sections distinguish parameters by the types of atoms and bonds used. These main atom and bond types are listed at the left of each parameter line. Each stretch, bend and torsional array in the molecule is tested against lines in the force field until a match is found. When a set of atoms and bonds in the molecule defining a stretch, bend or torsion matches those in the parameter line, then the associated parameters from that line of the force field file are used for that interaction. For a match to be found, the atom types and bond types must be the same in the molecule and in the parameter line.

Within the Main Parameter Section of the force-field, lines beginning with I>0 have the following meanings:

1 Constants for stretching interactions
2 Constants for bending interactions
3 Constants stretch-bend interactions
4 Constants for dihedral interactions
5 Constants for improper torsional (or OPB) interactions
6 Van der Waals constants for pairs of atoms
7 Constants for hydrogen bonding interactions Toward the right in these lines optional atom descriptors may be given. These list additional atom types which must be attached to each of the main atoms at the left of the parameter line for a valid match. Each descriptor consists of a 4-character field (e.g., 0000 or C200) specifies two attached optional atoms. 00 is the wild card atom type symbol. Thus 0000 specifies that there are no required attachments to the corresponding main atom (i.e. just match the main atoms and bonds). C200 says that a C2 (an sp2 carbon) must be bound to the main atom in question. A C2C3 indicates that both a C2 (sp2 carbon) and a C3 (sp3 carbon) must be bound. Optional atom fields of each force field file entry must not be left blank in the force field file. If no special attachments to any main atoms are required, then use a 0000 of each atom in the interaction. For stretches there 2 optional atom fields (corresponding to the two main atoms of the stretch interaction). For a bend, there are 3 optional atom fields, for a torsion, there are 4 such fields. Consider the torsional entry from a force field file below as an example:

                             v1         v2          v3
 4  C3 - C3 - C3 - H1      0.0000     0.0000     0.3500  0000 H1H1 0000 0000 
    ^    ^    ^    ^                                     ^    ^    ^    ^
   atm1 atm2 atm3 atm4          Optional atoms for atom: 1    2    3    4
The entry above specifies that anytime a C3 - C3 - C3 - H1 torsional array is found in which two hydrogens (H1's) are attached to main atom 2, then the torsional force constants given (V1 = 0.0, V2 = 0.0 and V3 = 0.35) will be used. Now consider the following torsion force field entry:

 4  C3 - C3 - C3 - H1      0.0000     0.0000     0.2670  0000 0000 0000 0000 
    ^    ^    ^    ^                                      ^    ^    ^    ^
   atm1 atm2 atm3 atm4           Optional atoms for atom: 1    2    3    4
The entry above notes that when any C3 - C3 - C3 - H1 array is found, then the force constant V3 =0.267 will be used regardless of the substitution of any atom (including atom 2). This is because all optional atom entries are 0000. Both of the above lines may appear in a force field because matching is done top down: the first time an acceptable match is found, the parameters from the matching force field line will be used.

For some torsional arrays, it may be desirable to add higher order parameters. This may be done for V4-V6 terms by adding an additional line of torsional parameters and beginning the line with a '54' as shown below:

 4  C3 - C3 - C3 - H1      0.0000     0.0000     0.2670  0000 00Si 0000 0000 
54                         0.1000     0.0000     0.5000   
The parameters above specify V3 = 0.267, V4 = 0.100 and V6 = 0.500.

Parameters in the file must be ordered from most specific to most general since the program will assign the parameters to interactions the first time a valid match is found between the molecule and the parameter line. Thus in the example of the two torsional interactions above, the more specific one given first must come first in the force field file. That will allow the program to apply the correct v3 term (0.350) if there are two hydrogens (H1) on atom 2 - otherwise C3-C3-C3-H1 interactions will use V3=0.267.

The above paragraphs describe how to specify up to two atoms alpha to (one covalent bond removed from) each atom in the interaction. By adding up to three additional lines of optional atom descriptors,up to three beta atoms (connected to an on-interaction atom atom via two bonds) may be specified for each alpha atom. Consider the three O-H stretching entries below:

 1  O3 - H2                1.0100     5.0000    -1.0000 1C200 0000    Entry 1
                                                         O200 0000
 1  O3 - H2                1.0100     5.0000    -1.0000  C200 0000    Entry 2
 1  O3 - H2                1.0100     5.0000    -1.0000  0000 0000    Entry 3
The third entry shown here would be the O-H stretch of a simple saturated alcohol. The second would specify the O-H stretch of, say, a phenol, which has a C2 (an sp2 carbon) connected to the oxygen. The first entry would not match a phenol, since it specifies that the C2 alpha to the oxygen must be bonded to an O2 (sp2 oxygen). This would match a carboxylic acid.

The lines specifying beta atoms have continuation characters immediately preceding the first set of alpha interactions. In the first entry above, the 1 in the string 1C200 tells the program to read a single line of beta atoms next. This character should be replaced by a 2 or a 3 if there are two or three lines of beta atoms. Beta optional atoms appear directly beneath the alpha optional atom to which they are attached. An alpha atom of type 00 may not have beta atoms specified; however, wild-card types for specific atomic numbers, such as C0, may have beta atoms. Within each group of alpha and beta atoms, more specific combinations must be specified first. For example, if it is desired to match interaction atom X, attached to -A-B and also attached to another A, the -A-B combination must be given before (to the left) of the unqualified -A attachment.

When adding new parameters or modifying old ones, it is wise to use a test structure to be sure the new parameters are being found. If they are not, then you may be using an incorrect atom or bond type or a match is occurring from an entry higher up in the force field file for the stretch, bend or torsional parts of the force field. You may easily test for ordering by putting your entry at the top of the appropriate section (stretch, bend, etc). You should also rerun the energy test structures supplied to be sure you have not changed other parameters in error. Also be sure to make the appropriate changes in the interaction descriptors at the far right of each entry so that the parameter set will be correctly identified (as we have defined them) as (Column 1) O (original), M (modified), or A (added) and as (Column 2) 1 (high quality parameter), 2 (tentative parameter) or 3 (generalized parameter). We also add a brief comment and the name of (or reference to) the parameter author. You should add the appropriate comments, your initials and location for each entry you add or modify to keep track of the parameters you have added. You will want to be able to add your new parameters to any force fields we supply in the future. Note that force field lines are up to 130 characters so set your editor to display the entire line.

Some excerpts from mm2.fld follow, by way of example. Figure 3 exhibits several stretch interactions.

A positive bond moment will make the first atom listed in the bond entry positively charged. In the forth entry above, the value of -0.32 Debye will make the C3 negative and the H1 positive.

In the first line of the stretch section, a special C3 - C3 interaction is defined by the optional alpha descriptors O3O3 which require that both the first C3 and the second C3 have attached O3s, as in ethylene glycol. This is not a standard MM2 stretch and is therefore marked in the SEL 1 field with an A . The SEL 2 field is marked1 since the parameter is a high quality one. The second and third lines contain MM2 and MM2' parameters and are distinguished at the right hand end of the lines in the labeling of the second line by the ALT specifications 1 a and 1 b. As described above, if ALT 1 is given the a value in the Introductory Section of the force field, MM2 (not MM2') parameters will be used.

Some bend parameters are shown in Figure 4. The first few parameters shown are original (O), high quality (1) MM2 parameters. Also, the environmental dependence of the bending angles (methyl vs. methylene vs. methyne) is handled with optional descriptors for the central atom. The last few parameters shown are highly generalized bend interactions about various sorts of central atoms. The use of wild-card atom types (00) and bond types (*) means that one of these interactions will match nearly any bend having the given central atom. These are given at the end of the bend section, so that more specific interactions will match first. These entries are labeled A (added) and3 (low quality).

A portion of the stretch-bend part of the force-field is shown in Figure 5. Torsions are shown in Figure 6. The same considerations hold for these interaction types as hold for stretches and bends. Of course, not all force fields have all interaction types; for example, the AMBER force fields have no stretch-bend interactions.

Low quality (3) torsional parameters are the most likely to give incorrect results in molecular mechanics. If generalized torsions are used in a calculation, then quantitative results should not be expected and new torsional parameters should be devised to fit the molecule on the basis of experimental (or ab-initio quantum-mechanical) data. Minimally, the results of a calculation with low quality torsion should be compared with experimental data to determine how poor the calculation is. See the MacroModel Technical Manual for some comments regarding parameter derivation.

Figure 7 shows an excerpt from the section of the force-field giving out-of-plane bending parameters. When the these interactions are handled using improper torsions, the atoms are listed with the central atom (e.g., the carbon of a carbonyl group) first in the entry; that is, if the atoms are listed in this section in the sequence ABCD, the dihedral angle calculated will really be BACD. When these interactions are being handled using Wilson angles, the two "anchor" atoms are given first, then the "apex" atom, then the "end"atom (Wilson, E. B., Decius, J. C. and Cross, P. C., Molecular Vibrations, Dover Publications, New York (1955), p. 59).

A section of the van-der-Waals portion of the force field is shown in Figure 8. This section gives single-atom van-der-Waals parameters which will be combined using some combing rule. The parameters given are r0 and .

Note the use of optional atoms to distinguish MM2 atom types and the Lp descriptor at the right to indicate which atoms get lone pairs; for example, an O3 not bound to P, S or C(sp2) gets a lone pair. MacroModel does not use lone pairs on enol, ether or ester oxygens this gives better results with hydrogen bonding. In the past it was not possible to specify beta atoms for this portion of the force field, but this is now allowed.

New Feature IconStarting with BatchMin 6.0, formal charges for atom types are obtained from the atom.typ file, rather than from this section of the force-field file.

Following the van-der-Waals parameters, there are two more parts of the Main Interaction Section of the force field. Figure 9 shows some of the special van-der-Waals interactions. These encode overrides of the normal van-der-Waals combining rules for atom-type pairs. Figure10 shows some hydrogen-bonding parameters. Even when there is no special functional form invoked for hydrogen bonding, the parameters for whatever standard nonbonded potential is being used is derived from this section for atoms that can hydrogen bond.

Adding New Parameters

New parameters generally come from one of three possible sources: new published work, quantum-mechanical calculations and educated guesses based on analogy to systems similar to that being parameterized. Some examples of parameters based on quantum-mechanical calculations are given in the MacroModel Technical Manual.

New parameters from MM2, MM3 and OPLSA publications may be added directly to the force field files without modification. If specific charges are desired, you may convert the desired charges to a dipole entry in the stretch section of the field using the following formula:

µ = L0 * Q / 0.2082

Here, µ is the dipole moment in debyes, L0 is the natural length of the bond (in Angstroms, Å) given in the force field and Q is the desired charge (+ on one end and- on the other). A positive dipole moment means that the first main atom in the force field entry will have the positive charge and the second atom will get an equal but negative charge. For the opposite polarity, use a negative dipole moment. Again, test by doing an energy calculation on a structure in MacroModel and then use the ELECT button (ENRGY mode) to display the resulting partial charges.

For stretches and bends, new values may be assigned to fit experimental data as far as bond lengths and angles are concerned. Such data commonly comes from high quality x-ray crystal structures. Force constants can be approximated by analogy with similar structures in the force field or from infrared stretching frequencies:

K(stretch-MM2,MM3) = 5.3x10-7 2 M1 M2 / (M1+M2)

K(stretch-AMBER) = 3.0x10-5 2 M1 M2 / (M1+M2)

where is the IR frequency in wave numbers (cm-1) and M1 and M2 are the atomic masses of the atoms involved in the stretch. Alternatively, you may transfer force constants from one field to another. AMBER stretching force constants are approximately 60 times those of MM2 or MM3.

For bending, infrared frequencies may also be used (if they are available for the scissoring mode) from the following approximate expression:

K(bend-MM2,MM3) = 3.0x10-7 2 M1 M3 / (M1+M3)

K(bend-AMBER) = 3.4x10-5 2 M1 M3 / (M1+M3)

where is the IR frequency in wave numbers (cm-1) and M1 and M3 are the atomic masses of the first and third atoms involved in the bend. AMBER bending constants are approximately 120 times those given in MM2 or MM3 for the same bending array.

While stretch (and sometimes bend) parameters may be moved from one force field to another without creating large errors, torsional parameters are highly force field dependent for the same array of atoms. Consequently, new torsions must be parameterized in the context of the field to be used. Often, data from one well-parameterized force field (e.g. MM2 or MM3) can be used as the basis for parameterization of another force field.

Nonbonded parameters for unusual atoms may be approximated according to the following equation (Slater and Kirkwood, Phys. Rev., 37, 682 (1931); Weiner and Kollman, J. Comp. Chem., 2, 287 (1981)). The van-der-Walls is given by:

= 1.4154 2r-6 (/Nel)1/2

where r is the covalent radius (ionic radius for ions) in Angstroms, is its polarizability (Å3) and Nel is its effective number of electrons for the atom (generally equal to the actual number of electrons for first and second row elements).

While these protocols work well, it is easy to corrupt the force field by careless addition of new parameters or modifications of old ones. If you make any additions to the force fields, be sure and test that the additions you make do not change the energetic results of a adequately wide variety of test structures. In particular, unless it is your intent to have your new parameters alter even the structures which are well parameterized with the force-field as we supply it, you should check to make sure that the energy tests described in Appendix 1 still work. properly.

Special Notes for AMBER

AMBER94 does not differ from the published force-field in any significant way.

The MacroModel implementation of AMBER* is identical to original AMBER with the exception of additional parameters which we have added. We supply the field by default with a constant dielectric treatment, the united-atom AMBER charge set and Kollman's 6,12-Lennard Jones hydrogen bonding treatment. Other options (e.g. distance dependent dielectric) may be set by modifying the first section of the AMBER force field file or via button selections from within MacroModel.

The AMBER force field we supply has a special substructure notation which allows a single residue substructure to match both united atom and all atom structures and assign the appropriate charges as well as other parameters. Although the linear notation gives only the united atom description of the molecule, the program will interpret the notation both as an actual united atom representation and as the all atom representation with the appropriate number of explicit hydrogens. The charges given for the residues are in the united atom representation for both the united atom and the all atom force field parameter sets. Thus, for the all atom charge set, the charges given are for the heavy atoms plus any attached hydrogens. To get the actual complete all atom charge set, the program uses dipoles from the main section of the field (mainly for heteroatom-H's) or from the substructure (mainly for C-H's).

By using a combination of bond dipoles and formally united atom charges, MacroModel reproduces both AMBER field charges exactly and also provides reasonable charges for substituted or modified residues. As such, it allows the use either charge set (from the All atom or United atom paper) and, independently, hydrogens on carbons or united atoms. As we supply the field, the charges correspond to those given in the older AMBER paper (i.e. the united atom paper), but you may switch to charges from the newer paper (i.e. the all- atom paper) by modifying the ALT 1 line at the top of the force field file.

You will notice that many of the parameters for some of the more complex amino acid sidechains (phenol, indole, imidazole, etc) are given separately from the actual residue itself in the substructure section of the force field file. This allows any such substructure to be given the same parameter set as used in the actual amino acids. Thus indole itself gets the same parameter set as the indole of tryptophan. Such substructures are labeled as 'C' (for Continue, see below) which means that the atoms matching the substructure may also be matched against substructures futher down in the force field (i.e. where the actual residue might be found). The residues themselves are labeled 'S' (for Stop). This means that once atoms are matched (as a complete residue), then they are removed from candidacy for matching with substructures which come below them in the field. This distinction speeds substructure matching since each such match of a 'S' substructure removes atoms which must be tested from subsequent matching.

Amber Torsional Specification

There are three differences between the MacroModel and the AMBER torsional parameters. These differences do not affect the actual functional forms used by the program, but only the way in which parameters are specified.

First, AMBER uses a different entry for each 2-fold (n=2) and each 3-fold (n=3) barrier; MacroModel uses a single entry with both 2-fold (V2) and 3-fold (V3) barriers included. Second, AMBER uses only positive torsional potentials but uses an angular offset () to adjust the position of minima and maxima. For =0° or =180°, this is best accomplished in MacroModel through the use of positive and negative Vn values, as follows.

AMBER n=3, = 0°: use V3/2 directly

AMBER n=3, =180°: use -V3/2 for V3/2

AMBER n=2, = 0: use -V2/2 for V2/2

AMBER n=2, =180: use V2/2 directly

Some AMBER publications give torsional parameters as Vn/2 and some as Vn. In MacroModel force-field files, always use the Vn/2 value.

We also implement AMBER torsional offsets explicitly for arbitrary offsets; however, the implementation is not as efficient as that described in the preceeding two paragraphs. Thus, for example, you may, in principle, specify an n=2 potential using a 180° offset by means of a positive value of Vn/2 and an explicit value of 180°, but you are advised instead to use the formulation given above. On the other hand, for offsets equal to neither 0 nor 180o, there is really no choice but to use an explicit specification.

To specify offsets explicitly for V1, V2 and V3, use a continuation force-field line beginning with 64 and place the offsets in the same columns as the V values for the main torsion line. Similarly, to specify offsets for V4, V5, and V6, use a continuation line beginning with 74 and insert the offsets in the same location. (Recall that V4, V5 and V6 themselves use a continuation line beginning with 54.) An offset specified as 0 is taken to mean "no offset specified", so that if you wish to use a V2 potential with a zero offset and a V3 potential with a 31° offset, use a negative value of V2/2, as described above, on the first torsion line for this interaction; place the positive value of V3/2 on the same line. On a continuation line beginning with 64, indicate a V2 offset of zero and a V3 offset of 31°.

AMBER publications give total barriers for generalized torsional arrays - arrays with wild card atoms (the X atom type in AMBER publications) at positions 1 and 4. MacroModel, however, uses the MM2 method of storing only the component torsional barriers except in the substructure part of the field. Thus only generalized AMBER torsional barriers should be divided by the multiplicity of the torsional array where the multiplicity if equal to the number of substitutents on the 2nd atom (excluding the 3rd atom) multiplied times the number of substituents on the 3rd atom (excluding the 2nd atom) in a torsional array of four atoms: 1-2-3-4. For the central bond of hexamethyl ethane for example, the multiplicity would be 3*3 or 9. For the peptide linkage, the multiplicity would be 2*2 or 4. Note that the multiplicity is computed based on the actual number of substituents so the multiplicity of a pair of united atom methylenes (e.g., x - CB - CB - x (MacroModel) or X - C2 - C2 - X (AMBER) would be only one since only single substituents (X) are attached to the methylene carbons. If you are confused, compare some of the entries in the MacroModel AMBER force field with those published in the AMBER papers. Remember that if a torsional array is described in AMBER publications without wild card X atoms, then the barriers are simply adjusted by sign as described above, but not divided by multiplicity.

Regarding AMBER-style generalized torsions, MacroModel follows the AMBER protocol of not including the multiplicity for generalized torsions in substructures. only. Thus the substucture torsional barriers are the same as published in the AMBER papers if they are in the force field file in the form of 00 nn nn 00 where nn is some atom number. The beginning and ending 00's are wildcard atom types.

Special Notes for MM2 and MM3

The MM2 field we supply is similar to but not identical with that provided by Allinger in 1987. The MM3 parameter set is the 1991 version. The stretch, bend, torsion and off-diagonal (e.g. stretch-bend) equations are identical with those of the authentic fields. Included in the MacroModel fields are a number of additional parameters to handle substructures other than those parameterized by Allinger. Such parameters are labeled at the right end of the parameter line as A for Addition. We have followed Allinger's protocol of distinguishing parameter quality by noting his final values as 1 and other, less reliable ones, as 2 or3' (also at the right of each parameter line). When you report the results of a MacroModel MM2 or MM3 calculation, then you should report what nonstandard, additional parameters (if any) are used.

More significant departures involve charges. MacroModel uses the partial charge (optional in MM2) treatment of electrostatics instead of the MM2/MM3 standard dipole-dipole electrostatics. Partial charges for the actual MM2 or MM3 parameters come from the classical definition of a bond dipole and thus correspond to the MM2 or MM3 dipoles exactly. Electrostatic energies will however be different from those given by the authentic Allinger programs and will differ the most when the charged/dipolar groupings are close together. As we provide them, MM2 and MM3 force fields use distance-dependent dielectric electrostatics.

The MacroModel implementation of MM2/MM3 includes approximate parameters for hydrogen bonds which were adjusted to mimic AMBER hydrogen bonding potentials. Part of the hydrogen bonding implementation included the removal of lone pairs from ester and aryl ether oxygens which gave too high a hydrogen bonding energy - this removal of lone pairs from ether oxygens bound to sp2 centers is another departure from the official MM2.

Whereas MM2 and MM3 use a SCF pi calculation to obtain parameters for conjugated systems, MacroModel uses specific torsional parameters from the force field for conjugated dienes, enones, arenes, etc. Where available, these torsional parameters were developed to fit data from Allinger's papers or from MM3. Otherwise, they were developed to fit ab initio data. Aromatic ring parameters are given in the special substructure section of the field.

A final relatively insignificant difference is the way out-of-plane bending is handled. In MM2 and MM3, an out-of-plane distance is computed whereas in MacroModel we use an improper torsion with a barrier which mimics the MM2/3 out of plane energy for moderate excursions from planarity.

Substructure Section

Use of Substructures

The last section of the force field file lists special substructures. A force-field substructure is a contiguously connected chemical entity, such as an amino-acid residue or an aromatic nucleus. The connectivity is given by a simple linear notation. For each substructure, parameters are given that override any parameters obtained for the corresponding interactions from the Main Parameter Section of the force-field. These parameters may lie wholly within the substructure or they may include off-substructure atoms. Single-atom van-der-Waals parameters may not be specified within substructures and charges may be specified either by means of bond dipoles, as in the Main Parameter Section, or else by means of explicit charges. In contrast to the Main Parameter Section of the force-field, parameters for special substructures must be entered with the most specific parameters last, rather than first;; in other words, the last parameter match found is the one that is used. This is also true for the substructure matching process: if a given portion of the molecule under study is matched by several force-field substructures, the substructure that appears last in the force-field is normally used. For substructure matching, but not for parameter matching within substructures, this behavior may be overridden in the force-field specification of the substructure.

Within the Substructure Section of the force-field, lines with I>0 or beginning with character data have the following meanings; this list augments the values shown in the Main Parameter Section. Recall that I is a two-character field; single values shown must be right-aligned in this field.

8 constants for charges in special substructures
9 Substructure linear notation (see section below)
C Data lines beginning with a blank followed by a C or S mean "continue" or "stop" in the substructure-matching process.
aT Data lines beginning with two ASCII characters (e.g., aT or aR) indicate a geometrical constraint line, as described in the section, Geometry Dependent Parameters.
an Data lines beginning with a character followed by an integer encode interactions subjected to geometry dependence, as described in the section, Geometry Dependent Parameters. The letter refers to the corresponding geometric constraint line and the integer specifies the interaction type (e.g., 1 for stretch) as usual.

Pay Close Attention IconSingle-atom van-der-Waals parameters may not appear in substructures.

Substructure Linear Notation

In addition to the normal one- to four-atom descriptors of stretch, bend and torsional substructures used in molecular mechanics, MacroModel provides for description of more complex substructures. These are the "special substructures" given at the end of a MacroModel force field files. The description of the atomic connectivity is given by a simple linear notation which resembles SMILES/SMARTS notation. Atoms and bonds are described as in normal interactions but up to 50 atoms (130 characters maximum) may comprise a substructure and complicated connectivity, including wild-card and optional atoms, may be specified.

Structures are described as a list of connected atoms and bonds; all atoms and bonds in the structure must be explicitly included. Ethyl vinyl ether would be:

C2=C2-O3-C3-C3

The atom types are normal MacroModel atom types: C2 is sp2 carbon, C3 is sp3 carbon and O3 is non-carbonyl oxygen. Simple generalized atom descriptors may also be used; then the formula above would be:

C=C-O-C-C

You may also use the equivalenced atom types to represent many different types of atoms. Generalized bond descriptors may also be used; for example, an asterisk (*) stands for a bond of any order:

C*C-O-C-C

This could represent diethyl ether, ethyl vinyl ether or ethoxy acetylene.

Ring Closure

Ring-forming bonds are added as a bond type followed by a number which spedifies which atoms are bound together. Cyclopentane is:

C-C-C-C-C-1

Thus the last (fifth) atom is bound to the first (-1); this signifies a single bond to the first atom in the substructure. Ring closure bonds can go only to atoms earlier in the substructure. Norbornane would be represented:

C-C-C-C(-C-1)-C-C-1

Note the use of parentheses to identify atoms off the linear backbone as represented. Quinoline could be:

N=C-C=C-C=C(-1)-C=C-C=C-5

For the quinoline above to match, it would have to have the connectivity and bonding exactly as drawn above. Thus the user would always have to draw the correct resonance structure for the parameters to be found. As an alternative, we generally specify the atom type more exactly to designate the atomic hybridization and use wild card bonds for all the connectivity. Thus any resonance form of quinoline would match the following:

N2*C2*C2*C2*C2*C2(*1)*C2*C2*C2*C2*5

Chain Branching

As a further example of the use of parentheses, methylcyclobutane could have any of the following linear notations:

C(-C)-C-C-C-1
C-C-C-C(-C)-1
C-C-C-C(-1)-C
C-C(-C)-C-C-1
Parentheses may be nested. Isopropylbenzene could be:

C=C(-C(-C)-C)-C=C-C=C-1

Threonine could be:

N-C(-C(-C)-O-H)-C=O

Optional atoms

Optional atoms may also be specified within the linear notation so that the substructure will match whether or not the optional atom is found. Optional atoms with associated bonds are put in square brackets ([ ]). In the substructure below, serine would be identified whether or not the optional hydrogens were present in the structure:

N[(-H)]-C(-C-O[-H])-C=O

Parameters will be assigned for optional atoms if the atoms are explicitly present in the molecule. If the optional atoms are absent, the constants involving those atoms will not be used in reassigning the interaction constants previously assigned by the first part of the force field (1-4 atom interactions). Another way to accomplish the same end is to omit the optional atoms but to include the parameters involving the hydrogens as off-substructure interactions (see below).

The atoms in the substructure are referred to by number when parameters are assigned. An atom is simply referred to by its position in the linear notation; for example, the serine nitrogen would be number one and the carbonyl oxygen would be number 8.

Substructure Examples

Figure 11 shows a substructure specification for a cyclopropane nucleus. It may be useful to know that the -3 and -2 that appear alone on lines specify the format for the reading of subsequent lines. The C, 9, 1, 2, etc. introducing other lines specify the nature of the information to be read. The C specifies that the title of a substructure is to be read, with the default ("continue") matching behavior; an S ("stop") in place of C would prevent the matched atoms in the structure being studied from being candidates for further substructure matches. In other words, S would override the "use last substructure match" default behavior for this particular substructure. 9 introduces a linear substructure specification within the -3 format block and and 1, 2, etc. introduce parameter specifications as follows, within the -2 format block:

Substructure interaction lines can specify optional connected atoms, but if they do, then non-wildcard atom-types in the optional-atom list will be matched only against atoms not in the substructure. In this context, a wildcard atom-type is either 00, which matches any atom type, or else one of the wildcard types for an element; for example, C0 for any carbon. The numbers in the interaction proper (such as 1 2 in the first stretch shown) correspond to the numbers in the linear substructure notation. Atom types, when shown, correspond to matching atoms off the substructure.

The following shows a substructure for benzenoids, giving benzene-like bond lengths, angles and force constants. Wild-card bond types are used to allow matching of any resonance form. Since the substructure is matched a total of 12 different ways, all aromatic bonds will be parameterized by this substructure (with length = 1.39 Å, force constant = 8.0667), etc. The torsion is generalized (the first and last entries are00, the wild-card atom type), so it represents the total barrier (recall that this handling is special for generalized torsions in substructures). The barriers used by the individual components would thus be V2 = 36.8/4 kcal/mol.

The usual torsional parameter handling in substructures is that all parameters (V1, V2 and V3) for a matched torsional interaction are overwritten by the torsional parameters from a torsional substructure parameter line. It is possible to overwrite only specific components (e.g., only the V2 component) by using the special parameter entry 99.9999 for the other components; this prevents overwriting. For example, the torsional parameter line below would replace only the V2 barrier (with 2.5 kcal/mol) of an existing torsional interaction.

Note again that the V2 barrier specified here is the total barrier, because of the use of wild-card atom types.

Figure 12 shows how a specific set of charges (perhaps derived from ab-initio quantum-mechanical studies) can be specified directly in a substructure. The purpose of the substructure above is to load the partial charges given in the line beginning 8. The charges are placed on the substructure atoms in the order in which the atoms appear in the substructure notation. When using partial charges directly instead of bond dipoles, be sure that the total charge of the substructure sums to 0.0 for neutral substructures. Also check the final charge set using MacroModel (ELECT button) to test your charge data.

Geometry-dependent Parameters

It is occasionally desirable to make the parameters used for certain degrees of freedom dependent upon the geometry of some other degree of freedom. We allow this, in a limited sense. First, the degree of freedom which dictates the parameters must be configurational; that is, we do not provide a mechanism for "on-the-fly" reparameterization of atoms, but only for parameterizing some degree of freedom based on the initial value of some other. Second, we provide three specific forms of this dependence. First of all, a parameter for an interaction can depend on a torsion value, and secondly a parameter for an interaction can depend on whether the interaction is within a ring; finally, a parameter can depend on the initial value of an angle. The following examples illustrate several cases.

This shows how parameters may be selected depending on the value of a torsion. Here, we wish to load one exocyclic O3-C3 bond length (1.3Å) for an a-anomeric (axial) sugar and a different bond length (1.4Å) for a b-anomeric (equatorial) sugar. The entry beginning aT defines a geometrical constraint which is to be tested just before the parameters are loaded. The T defines the constraint as a torsion. The a could be any character (a-z,A-Z) and is used to associate particular subsequent interaction lines with the constraint. The 1 2 3 4 torsion angle, because it is proceded with an a, will be subjected to the aT specification. The constraint test will passed if the torsion angle of the equivalent atoms in the molecule is in the range 180° ±60°. For -anomeric sugars, this torsion angle would be approximately 180° (anti). In the example, the 1-2 bond (the first O3-C3 in the substructure) will be given a natural length of 1.3Å by default, but if the torsional constraint labeled "a" is met, it will be given the value 1.4 Angstrom. This is specified by prefixing the interaction line containing this bond-stretch parameter value by the letter "a". The the more specific (here a1') interaction line must come second in the list of substructure force field entries, since in these entries the last match found overrides the others. If the default value were given last, it would always be used.

The following excerpt exhibits a more complex torsional dependence of several parameters, and also a dependence on whether several atoms are present in a ring.

Here, parameters may be dependent upon three different torsions, given in the lines beginning with uT, vT and wT, or on whether the atoms in the molecule corresponding to substructure atoms 2 and 10 are in a ring of size eight or smaller, as specified in the line beginning with rR.

The equilibrium bond length between substructure atom 4 and an off-substructure H1 is set to 1.0712 Angstrom by default. If the optional C3 is present on atom 4, it is given the value 1.0736 Angstrom, unless the 2-3-4-5 torsion falls between 120o and 240o. In this situation, the 4-H1 bond-length is set back to the default value. The two other torsional dependencies shown are straightforward.

The lines beginning with r2 specify a test against the ring constraint. The 2-3-10 bending force-constant is set to 0.06 by default, but is set to zero instead if atoms 2 and 10 fall within a ring of size 8 or less. This condition will be met in the internal Diels-Alder transition state. The H1-3-10 bending force-constant (where H1 is an off-substructure hydrogen) is subject to a similar test.

The same mechanism can be used to specify the angular dependence of a parameter. The letter A in the second column of a force-field substructure line denotes such a dependence. The first real-number field specifies a central value, and the second specifies a range about it. This mechanism can be used, for example, to set equilibrium bond angles in a trigonal bipyramidal complex to 90°, 120° or 180°, depending on whether each ligand pair is axial-equatorial, equatorial-equatorial or axial-axial. Usually, however, one uses the points-on-a-sphere approach (see VDWB) to model such systems.




If you have any questions about running MacroModel®, please send email to the following address:
mmod@still3.chem.columbia.edu
Copyright © 1997, MacroModel® Development Group. All rights reserved. Last updated: 03/14/97 13:03:16