5. Input options. The input.uspex file

The input.uspex file has a JSON-like syntax and a hierarchical structure that reflects the modular nature of the USPEX program. Below we describe the most important input parameters. Most parameters have reliable default values, which allows input files to be extremely short. Options that have no default must always be specified. The online utilities at https://uspex-team.org/online_utilities/ help to prepare the input and to analyze some of the results; Section 6 of this Manual briefly discusses them.

To make the structure of parameters and blocks in the input.uspex file easier to see, we have prepared an example template that includes every block and parameter. You can find it in Section 7.1.

5.1. The input.uspex file syntax

The input.uspex file consists of a main section and a number of definition sections. The main section is mandatory and precedes the definition sections. Definition sections are optional, and there can be any number of them. Each definition section starts with a #define name line.

{...}

#define name1
{...}

#define name2
{...}

#define name3
{...}
...

The main input section is described in Section 5.2, Section 5.3, Section 5.5, Section 5.6 and Section 5.7.

Definition sections correspond to parameters of ab initio calculations (Section 5.8), submission parameters (Section 5.9), molecule definitions (Section 5.10) and environment definitions (Section 5.11).
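The layout described above can be illustrated with a short Python sketch that splits the text of an input file into the main section and its named definition sections. This is only an illustration of the file layout, not the parser used by USPEX. The function name split_sections is hypothetical, and the sketch assumes that a definition section starts at any line beginning with #define followed by a name.

```python
def split_sections(text):
    """Split input text into (main_section, {name: definition_section})."""
    main_lines = []
    definitions = {}
    current = main_lines  # lines belong to the main section until the first #define
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#define "):
            # Everything after "#define" is taken as the section name.
            name = stripped.split(None, 1)[1].strip()
            current = definitions.setdefault(name, [])
        else:
            current.append(line)
    main = "\n".join(main_lines).strip()
    return main, {k: "\n".join(v).strip() for k, v in definitions.items()}


example = """{
    numGenerations : 50
    stages : [vasp1]
}

#define vasp1
{
    type : vasp
    commandExecutable : 'vasp'
}
"""

main, defs = split_sections(example)
print(sorted(defs))  # names of the definition sections found
```

Lines such as #SBATCH inside a task manager header are not treated as section starts, because only the #define prefix is matched.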

Each section is a dictionary of key-value pairs.

{
    key1 : value1
    key2 : value2
    key3 : value3
    ...
}

Such pairs correspond to individual parameters or to blocks of parameters. Input parameters are grouped into blocks recursively, and each block reflects one aspect of the calculation, so preparing an input file should be quite intuitive.

Parameters can have numeric, string, sequence or map values.

numGenerations : 50,
stages: [vasp1 vasp2 vasp3]
ionDistances: {'C C': 2.0 'C H': 1.2 'H H': 0.7}

Blocks of parameters look as follows:

compositionSpace: {
    symbols: ...
    blocks: ...
}

There are two types of sequences: lists [] and tuples (). Tuples are expected to have a fixed length, whereas lists can have arbitrary length.

5.2. General calculation parameters

The most general algorithm used in all USPEX modes is iteration. At each iteration step the program deals with a number of systems, also called individuals. The set of these systems is called a population or generation. The iteration is controlled by two parameters: numGenerations and stopCrit.


numGenerations:

Maximum number of generations allowed for the simulation. The simulation can terminate earlier if the best structure remains unchanged for stopCrit generations.

Default:

Format:

numGenerations : 50

stopCrit :

The simulation is stopped if the best structure has not changed for stopCrit generations, or when numGenerations generations have been completed, whichever happens first.

Default:

Format:

stopCrit : 20
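The interplay of numGenerations and stopCrit can be sketched in Python as follows. This is a minimal illustration, not USPEX code: the function should_stop and the list best_history are hypothetical names, and "unchanged for stopCrit generations" is interpreted here as the same best structure appearing in each of the last stopCrit generations.

```python
def should_stop(best_history, num_generations, stop_crit):
    """best_history[i] identifies the best structure after generation i + 1."""
    generations_done = len(best_history)
    if generations_done >= num_generations:
        return True  # the generation budget is used up
    if generations_done < stop_crit:
        return False  # not enough generations yet to apply stopCrit
    # Has the same best structure been reported for the last stop_crit generations?
    last = best_history[-stop_crit:]
    return all(b == last[0] for b in last)


print(should_stop(["A", "B", "B", "B"], num_generations=50, stop_crit=3))
```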

For each population, a USPEX job consists of two parts: creation and relaxation.

Creation is controlled by the optimizer block. The content of this block depends on the type of optimization. For a global optimum search it is GlobalOptimizer. In later releases, a VCNEB optimizer type will be available for optimization of a single phase transition pathway.


optimizer :

Specifies the type of calculation and its parameters.

Available types:
  • GlobalOptimizer: global search algorithm, see Section 5.3.

  • ModelOptimizer: model training algorithm, see Section 5.4.

Default:

Format:

optimizer : {type: GlobalOptimizer ...}

For population relaxation USPEX employs a two-level parallelization scheme. The first level of parallelization is performed within the structure relaxation codes. The second level distributes the calculation over the individuals in the same population, which is possible because structures within the same generation are independent of each other.


stages :

List of relaxation stages to be applied to each individual in the population. The names of definition sections should be used in this list. See Section 5.8 for information on writing these sections.

Default:

Format:

stages : [glp glp glp]

numParallelCalcs :

Specifies how many structure relaxations you want to run in parallel.

Default:

Format:

numParallelCalcs : 10
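The second level of parallelization (over individuals) can be pictured with the following Python sketch. It is an analogy only: USPEX runs external relaxation codes, possibly through a task manager, whereas this sketch uses a thread pool and a stand-in relax function. Both relax and relax_population are hypothetical names introduced for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def relax(individual):
    """Stand-in for one structure relaxation (normally an external code)."""
    return {"id": individual, "enthalpy": -1.0 * individual}


def relax_population(population, num_parallel_calcs):
    # At most num_parallel_calcs relaxations are in flight at any time.
    with ThreadPoolExecutor(max_workers=num_parallel_calcs) as pool:
        # map() returns results in the order of the input population.
        return list(pool.map(relax, population))


results = relax_population(range(1, 6), num_parallel_calcs=2)
print([r["id"] for r in results])
```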

The content of the output can be customized as convenient.


output :

Specifies the content and title of the columns in the output. These columns will be used whenever details of an individual will be printed (Individuals, goodStructures, etc.).

Default: depends on calculation type

Format:

output: { columns: [   (simpleMoleculeUtility.composition 'Composition')
              (radialDistributionUtility.structureOrder 'Structure order')
              (radialDistributionUtility.averageOrder 'Average order')
              (radialDistributionUtility.quasientropy 'Quasientropy')]
          }

outputRefreshDelay :

Time (in seconds) between file updates in the results folder.

Default: 120

Format:

outputRefreshDelay : 30

5.4. Model training

In model training mode USPEX samples the target space (e.g., crystal structures in a certain compositional range) and retrains a machine learning model (e.g., an interatomic potential) on the training set, generation by generation. Training stops when the algorithm can no longer produce individuals sufficiently different from those already used in training the model.

{
    type: ModelOptimizer
    target: {...}
    popSize: ...
    initialPopSize: ...
    fractions: {...}
    model: {...}
}

target :

Specifies the target space of the search.

Default:

Available types:
  • Atomistic – same target space as in structure prediction (see Section 5.6). This target deals with structures consisting of atoms, including molecular crystals, in various dimensions and environments (such as a substrate).

Format:

target : {type: Atomistic ...}

popSize

The number of structures in each generation; size of initial generation can be set separately, if needed.

Format:
popSize: 100

initialPopSize

The number of structures in the initial generation.

Default: equal to popSize

Format:
initialPopSize: 1000

fractions

This parameter defines the fraction of structures produced by each generator.

Format:
fractions : {
            randTop: 0.5
            randSym: 0.3
            randSymPyxtal: 0.2
            }

model

Specifies model to be trained.

Available types:
  • External – the model is implemented in a stand-alone program, and an interface is used to communicate with it. The name of a definition section should be used in the interface field. See Section 5.8 for information on writing this section.

Format:
model : {
            type: External
            interface: ...
        }

5.5. Optimization type

Optimization type specification is designed as a powerful way of writing functions inside the input file. Such functions look like:

(func arg1 arg2 ...)

Here arg1, arg2 can themselves be functions, so the construction can be recursive. For example:

(pareto (aging enthalpy) (negate radialDistributionUtility.structureOrder))

This means that the quantity to be optimized is calculated for each structure by the following procedure. The enthalpy is taken directly from the structure, and the structureOrder parameter is calculated using the radialDistributionUtility module. Then the aging function is applied to the enthalpy and the negate function to structureOrder. Finally, the optimization parameter is determined by the pareto function with two arguments: the aged enthalpy and the negated structure order.
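The nesting of functions in this example can be mimicked with a small Python sketch. It is a simplified illustration under stated assumptions, not the USPEX implementation: lower values are assumed to be better on every axis, the aging penalty is omitted because its form is not described here, and pareto_front is a hypothetical helper that only returns the non-dominated structures.

```python
def negate(value):
    """Take the opposite value, so that maximizing becomes minimizing."""
    return -value


def pareto_front(points):
    """Return indices of non-dominated points, assuming lower is better on every axis."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(q[k] <= p[k] for k in range(len(p)))
            and any(q[k] < p[k] for k in range(len(p)))
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# (enthalpy, structureOrder) for four made-up structures
structures = [(-10.0, 0.9), (-9.5, 0.95), (-9.0, 0.5), (-10.0, 0.8)]
objectives = [(h, negate(order)) for h, order in structures]
print(pareto_front(objectives))  # indices of structures on the Pareto front
```

In this made-up data set the first two structures are on the front: neither has both a lower enthalpy and a higher order than the other.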

The following functions are currently available:

Function    Description
negate      takes the opposite value
aging       applies penalties to old structures (not tunable in the current release)
pareto      calculates the Pareto front in the space of the given arguments

Available base parameters:

Name                                        Description
enthalpy                                    enthalpy of the system
enthalpyCCH                                 enthalpy per block above the convex hull (for variable composition)
enthalpyCS                                  enthalpy per block above the best structure of the same composition
simpleMoleculeUtility.density               structure density
cellUtility.volume                          volume of the unit cell (for 3D)
cellUtility.area                            area of the unit cell (for 2D)
cellUtility.length                          period of the unit cell (for 1D)
bondUtility.hardness                        structure hardness
radialDistributionUtility.structureOrder    degree of order
radialDistributionUtility.quasientropy      structural quasientropy
elasticML.bulkModulus                       elastic bulk modulus calculated using an ML model
elasticML.shearModulus                      elastic shear modulus calculated using an ML model
elasticML.youngsModulus                     elastic Young’s modulus calculated using an ML model
elasticML.poissonsRatio                     elastic Poisson’s ratio calculated using an ML model
elasticML.pughsRatio                        elastic Pugh’s modulus ratio calculated using an ML model
elasticML.vickersHardness                   elastic Vickers hardness calculated using an ML model
elasticML.fractureToughness                 elastic fracture toughness calculated using an ML model

Providing only a base parameter as optType is also a valid option.

5.6. Crystal structure prediction

When the target type in Section 5.3 is set to Atomistic, USPEX performs global optimization in the space of structures consisting of atoms. Historically, the first property that USPEX optimized was the enthalpy, and the search was for stable crystal structures. Now this target covers atomic and molecular crystals in various dimensions and environments.

A typical target block in this mode looks like:

{
    type: Atomistic
    compositionSpace: {symbols : [Mg Al O] blocks: [[4 8 16]]}
    conditions: {externalPressure : 100}
}

Besides the obligatory compositionSpace block, there are a number of optional parameter blocks.


compositionSpace

Describes the identity of each type of atom or molecule and specifies the number of species of each type. For details see Section 5.6.1.

cellUtility

Properties of the unit cell. For details see Section 5.6.2.

bondUtility

Constants and constraints related to interatomic distances. For details see Section 5.6.3.

environmentUtility

Set up environments (such as substrates) used in the calculation. For details see Section 5.6.4.

radialDistributionUtility

Properties for radial distribution fingerprint calculation. For details see Section 5.6.5.

conditions

Properties of environment in general (currently only external pressure). For details see Section 5.6.6.

heredity

Properties of heredity variation operator. For details see Section 5.6.7.

randSym

Properties of symmetrical random structure generator. For details see Section 5.6.8.

randSymPyXtal

Properties of PyXtal random structure generator. For details see Section 5.6.9.

randTop

Properties of topological random structure generator. For details see Section 5.6.10.

permutation

Properties of permutation variation operator. For details see Section 5.6.11.

transmutation

Properties of transmutation variation operator. For details see Section 5.6.12.

softmodemutation

Properties of softmutation variation operator. For details see Section 5.6.13.

seeds

Properties of seeds for the calculation. For details see Section 5.6.14.


5.6.1. Composition space

Examples:

Fixed-composition prediction for \(Mg_4 Al_8 O_{16}\).

{
    symbols : [Mg Al O]
    blocks: [[4 8 16]]
}

Fixed-composition prediction with two molecules of type mol_1 and two of type mol_2 in the unit cell.

{
    symbols : [mol_1 mol_2]
    blocks: [[2 2]]
}

Variable-composition prediction for the \(MgO-Al_{2}O_{3}\) system, where the minimum number of atoms is 8 and the maximum 20.

{
    symbols : [Mg Al O]
    blocks: [[1 0 1] [0 2 3]]
    minAt : 8
    maxAt : 20
}

Variable-composition prediction for the \(MgO-Al_2 O_3\) system with minAt set to 2 and maxAt set to 20, and with the additional constraint that the block \(Al_2 O_3\) must be present at least 1 and at most 3 times.

{
    symbols : [Mg Al O]
    blocks: [[1 0 1] [0 2 3]]
    range: [(0 10) (1 3)]
    minAt : 2
    maxAt : 20
}

Variable-composition prediction for the \(MgO-Al_2 O_3\) system, in which the block \(MgO\) must be present at least 2 and at most 5 times, and the block \(Al_2 O_3\) at least 1 and at most 3 times.

{
    symbols : [Mg Al O]
    blocks: [[1 0 1] [0 2 3]]
    range: [(2 5) (1 3)]
}

symbols

Describes the identity of each type of atom or molecule. The value is a list of chemical elements or molecule names.

Default: no default

Format:
symbols : [mol_1 mol_2 O]

Molecule names should be defined in molecule definition sections (see Section 5.10).


blocks

Specifies the compositional building blocks. Each block gives the number of units of each entry of symbols, in the same order.

Default: no default

Format:
blocks : [[1 0 1] [0 2 3]]

range

Specifies, for each block, the allowed range of the number of times it is taken.

Default: If maxAt is set, then each block can be taken from 0 to maxAt divided by the size of the block times; otherwise each block is taken once.

Format:
range : [(1 10) (1 10)]

maxAt

Maximum number of atoms.

Default: If range is set, then the maximum of the upper bound in range for each block times the size of the block.

Format:
maxAt : 40

minAt

Minimum number of atoms.

Default: If range is set, then the sum of the lower bound in range for each block times the size of the block.

Format:
minAt : 2
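To see how blocks, minAt and maxAt together delimit a variable-composition search, the following Python sketch lists the block counts that satisfy the constraints for the \(MgO-Al_2 O_3\) example. It is a simplified illustration for atomic systems, not the enumeration performed by USPEX. The function enumerate_compositions is a hypothetical name, and the use of integer division for the default range is an assumption of this sketch.

```python
from itertools import product


def enumerate_compositions(blocks, min_at, max_at, ranges=None):
    """List (block counts, atoms per symbol) with min_at <= total atoms <= max_at."""
    sizes = [sum(b) for b in blocks]
    if ranges is None:
        # Default range as described in the text: from 0 to maxAt divided by
        # the block size (integer division is an assumption of this sketch).
        ranges = [(0, max_at // s) for s in sizes]
    result = []
    for counts in product(*(range(lo, hi + 1) for lo, hi in ranges)):
        atoms = [
            sum(n * b[i] for n, b in zip(counts, blocks))
            for i in range(len(blocks[0]))
        ]
        if min_at <= sum(atoms) <= max_at:
            result.append((counts, atoms))
    return result


# symbols [Mg Al O], blocks [[1 0 1] [0 2 3]], minAt 8, maxAt 20
compositions = enumerate_compositions([[1, 0, 1], [0, 2, 3]], min_at=8, max_at=20)
for counts, atoms in compositions[:3]:
    print(counts, atoms)
```

For instance, one MgO block and two \(Al_2 O_3\) blocks give the atom counts [1, 4, 7], i.e. 12 atoms in total, which lies within the allowed window.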

5.6.2. Unit cell properties

cellVolume

Initial volume of the unit cell.

Default: cell volumes do not have to be specified; USPEX has an algorithm that makes reasonable estimates at any pressure.

Format:
cellVolume : 125.0

Note

This volume is only used as an initial guess to speed up structure relaxation and does not affect the results, because each structure is fully optimized and adopts the volume corresponding to the (free) energy minimum.

Note

You can also use the online program at https://uspex-team.org/online_utilities/volume_estimation, or input the volumes manually.

Note

If you study molecular crystals under pressure, you might sometimes need to increase the initial volumes somewhat, in order to be able to generate initial random structures.


cellVectors

Unit cell vectors.

Note

These are used in fixed-cell calculations.

Note

You should provide as many cell vectors as the dimensionality of the problem.

Default: by default cell vectors are variable.

Format:
cellVectors: [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

cellParameters

Unit cell parameters: lengths a, b, c and angles alpha, beta, gamma.

Note

This parameter is used in fixed-cell calculations.

Default: by default cell parameters are variable.

Format:
cellParameters: {a: 2.474, b: 8.121, c: 6.138, alpha: 90.0, beta: 90.0, gamma: 90.0}

dim

Specifies the dimensionality of the system.

Default: 3

Format:
dim : 3

thickness

Thickness of the surface region or film (for 2D), or maximum linear size of the nanoparticle (for 0D).

Default: 2.0 \(\text{Å}\)

Format:
thickness: 3.5

supercellDegree

Maximum multiplications of the surface cell, to allow for complex reconstructions.

Default: 1

Format:
supercellDegree: 2

axis

Vector normal to the plane of the 2D structure.

Default:

Format:
axis : (0.0 0.0 1.0)

5.6.4. Environment utility

environments

List of environments. Names used here should be defined in the corresponding definition sections, as described in Section 5.11. Every individual created through random structure generation gets an environment picked up from this list. Individuals created through other variation operators inherit their environments from parents.

Format:
environments:  [substrate]

5.6.5. Radial distribution based fingerprint

For details on fingerprint functions we refer the reader to Ref. 19.

Rmax

Distance cutoff (in Å).

Default: 10.0

Format:
Rmax : 10

delta

Discretization (in Å) of the fingerprint.

Default: 0.08

Format:
delta : 0.08

sigma

Gaussian broadening of interatomic distances (in Å).

Default: 0.03

Format:
sigma : 0.03

tolerance

Specifies the minimum distance between the fingerprints of two structures for them to be considered non-identical, both when selecting parents for the production of child structures and when selecting survivors. The appropriate value depends on the precision of structure relaxation and the physics of the system (for instance, in alloy ordering problems, fingerprints belonging to different structures will be very similar, and the tolerance should be made small).

Default: 0.008

Format:
tolerance : 0.2

legacy

If true, switches to the old-style cosine distance between structures.

Default: False

Format:
legacy : True
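The role of tolerance can be illustrated with a Python sketch that compares two fingerprint vectors using a cosine distance. This is a sketch under assumptions: it uses the textbook form 1 - cos of the angle between the vectors, while the exact normalization used by USPEX, and the distance used when legacy is False, are not specified here. The names cosine_distance and are_identical are hypothetical, and taking the number of fingerprint bins as Rmax divided by delta is an assumed reading of "discretization".

```python
import math


def cosine_distance(f1, f2):
    """Textbook cosine distance 1 - cos(angle) between two fingerprint vectors."""
    dot = sum(a * b for a, b in zip(f1, f2))
    norm1 = math.sqrt(sum(a * a for a in f1))
    norm2 = math.sqrt(sum(b * b for b in f2))
    return 1.0 - dot / (norm1 * norm2)


def are_identical(f1, f2, tolerance=0.008):
    """Structures closer than tolerance are treated as the same structure."""
    return cosine_distance(f1, f2) < tolerance


# Assumed number of fingerprint bins for the default Rmax and delta.
n_bins = round(10.0 / 0.08)
print(n_bins)
```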

5.6.6. Conditions

externalPressure

Specifies external pressure at which you want to find structures, in GPa.

Default: 0.0

Format:
externalPressure: 0.00001

Note

Do not specify the pressure in the relaxation files in the Specific folder.


5.6.7. Heredity

nslabs

The heredity operator can be applied in the traditional way, in which both parent structures are cut into two pieces, or in the ‘zebra’ way, in which the parents are cut into several slabs. nslabs determines the number of slabs into which the parents are cut.

Default:

2 for fixed composition. For variable composition, the length of the structure in the direction orthogonal to the cut is divided by the average diameter of an atom (twice the covalent radius).

Format:
nslabs: 5

5.6.8. Symmetrical random generator

nsym

Possible symmetry groups for the symmetric random structure generator: space groups for crystals, layer groups for 2D crystals, wallpaper groups for surfaces, or point groups for clusters. A certain number of structures will be produced using randomly selected groups from this list, with randomly generated lattice parameters and atomic coordinates. During this process special Wyckoff sites can be produced from general positions (Fig. 5.6.1; see Ref. 14 for details).

Default:
  • For 3D crystals: 2-230

Format:
nsym: '16-74'

Fig. 5.6.1 Example of random symmetric structure generation and merging atoms onto special Wyckoff positions.


splitInto

Defines the number of identical subcells or pseudosubcells in the unit cell. If you do not want to use splitting, just use the value 1, or delete the block. Use splitting only for systems with >25-30 atoms/cell.

Default: 1

Format:
splitInto: [2, 4]

Subcells introduce extra translational (pseudo)symmetry. In addition to this, each subcell can be built using symmetric random structure generator developed by A.R. Oganov and H.T. Stokes and implemented by H.T. Stokes (see Reference 14).


5.6.9. RandSymPyXtal

nsym

Possible symmetry groups for the symmetric random structure generator: space groups for crystals, layer groups for 2D crystals, wallpaper groups for surfaces, or point groups for clusters. A certain number of structures will be produced using randomly selected groups from this list, with randomly generated lattice parameters and atomic coordinates. During this process special Wyckoff sites can be produced from general positions (Fig. 5.6.1).

Default:
  • For 3D crystals: 2-230

Format:
nsym: '16-74'

5.6.10. Topological random generator

maxSupersize

When generating a structure with a given number of atoms, the topological random generator takes from its database of topologies all entries whose number of nodes matches the required number of atoms up to a factor of maxSupersize.

Default:
  • 4

Format:
maxSupersize: 9

supercells

List of allowed supercells. When generating a structure with a given number of atoms, the topological random generator takes from its database of topologies all entries that give the required number of atoms when replicated according to one of the supercells.

Default:
  • all supercells are allowed, up to maxSupersize.

Format:
supercells: [(2 2 1) (1 2 2) (2 1 2)]

5.6.11. Permutation

howManySwaps

For permutation, the number of pairwise swaps will be randomly drawn from a uniform distribution between 1 and howManySwaps.

Default:

0.5\(\times\)(maximum number of possible swaps). If atoms \(N_a\) and \(N_b\), and atoms \(N_c\) and \(N_d\) are swappable, then the total number of possible swaps is \(\min(N_a,N_b)+\min(N_c,N_d)\), and the default for howManySwaps is \(0.5\times[\min(N_a,N_b)+\min(N_c,N_d)]\). In most cases, it is a good idea to rely on this default.

Format:
howManySwaps: 5
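The default rule for howManySwaps can be checked with a few lines of Python. This is a worked example of the formula above rather than USPEX code; the function names are hypothetical, and how a non-integer default is rounded is not stated in this manual, so the sketch leaves the value unrounded.

```python
import random


def default_how_many_swaps(swappable_pairs):
    """swappable_pairs: (N_a, N_b) atom counts for each swappable pair of types."""
    max_swaps = sum(min(n_a, n_b) for n_a, n_b in swappable_pairs)
    return 0.5 * max_swaps


def draw_number_of_swaps(how_many_swaps, rng=random):
    """Uniform integer draw between 1 and howManySwaps, both inclusive."""
    return rng.randint(1, how_many_swaps)


# 4 atoms of type a swappable with 8 of type b, 6 of type c swappable with 2 of type d
print(default_how_many_swaps([(4, 8), (6, 2)]))  # 0.5 * (4 + 2) = 3.0
print(draw_number_of_swaps(5))
```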

specificSwaps

Specifies which atom types you allow to swap in permutation.

Default: No specific swaps and all atoms are permutable

Format:
specificSwaps: [1 2]

Note

In this case, atoms of type 1 can be swapped with atoms of type 2. If you want to try all possible swaps, just leave a blank line inside this keyblock, or delete the block.


5.6.12. Transmutation

In this operator, a randomly selected atom is transmuted into another chemical species present in the system - the new chemical identity is chosen randomly.

howManyTrans

Maximum fraction of atoms in the structure that are transmuted (0.1 = 10%). The number of atoms to be transmuted is drawn randomly from a uniform distribution between one atom and the fraction howManyTrans of all atoms.

Default: 0.2

Format:
howManyTrans: 0.2

5.6.13. Softmutation

degree

The maximum displacement in softmutation, in Å. The displacement vectors for softmutation are scaled so that the largest displacement magnitude equals degree.

Default: 3\(\times\)(average atomic radius)

Format:
degree: 0.1
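The scaling of displacement vectors described for degree amounts to a single multiplication, as the following Python sketch shows. The function scale_displacements and the example vectors are hypothetical; only the rule "largest displacement magnitude equals degree" is taken from the text.

```python
import math


def scale_displacements(vectors, degree):
    """Rescale displacement vectors so that the largest magnitude equals degree."""
    largest = max(math.sqrt(sum(c * c for c in v)) for v in vectors)
    if largest == 0.0:
        return [list(v) for v in vectors]  # nothing to scale
    factor = degree / largest
    return [[c * factor for c in v] for v in vectors]


# Made-up displacement pattern for three atoms
mode = [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0), (0.0, 0.5, 0.0)]
scaled = scale_displacements(mode, degree=0.1)
print(scaled)
```

The relative sizes of the displacements are preserved; only their overall amplitude changes.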

5.6.14. Seeds

This feature requires additional input files to be provided.

generations

List of generations into which seeds will be injected during the search.

Default: No default

Format:
generations : [0 5 10]

seedsFolders

List of paths to folders that contain the seed files to be injected during the search, in the same order as generations above.

Default:

Format:
seedsFolders : ['./Seeds/1' './Seeds/2' './Seeds/3']

5.7. Evolutionary algorithm USPEX

When doing a global optimization search (see Section 5.3), one can choose between several algorithms. In the current release only the evolutionary algorithm, which is the most efficient and reliable one, is implemented. Particle swarm optimization and evolutionary metadynamics will be made available soon.

globalParentsPool

When true, the algorithm chooses parents from the pool of all good structures. When false, parents are chosen only from the previous generation.

Default: False for fixed composition. True for variable composition.

Format:
globalParentsPool: True

popSize

The number of structures in each generation; size of initial generation can be set separately, if needed.

Default: \(2 \times N\) rounded to the closest 10, where \(N\) is the number of atoms/cell (or maxAt for variable composition). The upper limit is 60. Usually, you can trust these default settings.

Format:
popSize: 20

initialPopSize

The number of structures in the initial generation.

Default: equal to popSize

Format:
initialPopSize: 20

Note

In most situations, we suggest that these two parameters be equal. Sometimes (especially in variable-composition calculations) it may be useful to specify initialPopSize to be larger than popSize. It is also possible to have a smaller initialPopSize, if one wants to produce the first generation from seed structures.
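The default rule for the population size (twice the number of atoms per cell, rounded to the closest 10, with an upper limit of 60) can be written as a short Python function. This is a worked example of the stated rule, not USPEX code. The function name is hypothetical, and no lower limit is applied for very small cells because none is stated in this manual.

```python
def default_pop_size(n_atoms):
    """2 x N rounded to the closest multiple of 10, capped at 60."""
    rounded = ((2 * n_atoms + 5) // 10) * 10  # 2 x N is even, so ties cannot occur
    return min(rounded, 60)


for n in (8, 13, 28, 40):
    print(n, default_pop_size(n))
```

For example, 13 atoms per cell give 2 x 13 = 26, which rounds to 30, while 40 atoms per cell give 80, which is capped at 60.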


bestFrac

Fraction of the current generation that shall be used as potential parents to produce the next generation.

Default: 0.7

Format:
bestFrac: 0.7

Note

This is an important parameter; values between 0.5 and 0.8 are reasonable.


howManyDiverse

Defines how many good and diverse structures will survive into the next generation.

Default: 0.15\(\times\) popSize

Format:
howManyDiverse: 3

The population is clustered into the specified number of groups, and the fittest representative of each group is taken.


optType

This keyblock specifies the property that you wish to use for selecting potential parents for new generation. See Section 5.5.

Default: same as in global optimizer section.

Format:
optType : (aging enthalpy)

Fitness can differ from the property to be optimized if the aging procedure is used.


fractions

This parameter defines the allowed fraction of structures produced by each variation operator. The first value is the minimum fraction of structures in each generation produced by this operator, the second is the maximum fraction, and the third is the weight (not a fraction) of the operator in the first generation.

Format:
fractions : {
            heredity: (0.3 0.7 0.5)
            softmodemutation: (0.1 0.3 0.2)
            randTop: (0.1 0.5 0.3)
            }

The fractions of operators evolve during the calculation, so that the more successful operators gain weight at the expense of the less successful operators, but within the limits specified here.
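Before starting a run it can be useful to check that the limits given in fractions are mutually consistent. The following Python sketch performs such a check under two assumptions that are not stated in this manual: the listed operators are the only ones in use, and their fractions must be able to add up to 1. The function check_fractions is a hypothetical helper, not part of USPEX.

```python
def check_fractions(fractions):
    """fractions: {operator: (min_fraction, max_fraction, initial_weight)}."""
    problems = []
    for name, (lo, hi, weight) in fractions.items():
        if not 0.0 <= lo <= hi <= 1.0:
            problems.append(f"{name}: need 0 <= min <= max <= 1, got ({lo}, {hi})")
        if weight < 0.0:
            problems.append(f"{name}: weight must be non-negative")
    # The limits can only be satisfied together if 1 lies between the two sums.
    if sum(lo for lo, _, _ in fractions.values()) > 1.0:
        problems.append("sum of minimum fractions exceeds 1")
    if sum(hi for _, hi, _ in fractions.values()) < 1.0:
        problems.append("sum of maximum fractions is below 1")
    return problems


example = {
    "heredity": (0.3, 0.7, 0.5),
    "softmodemutation": (0.1, 0.3, 0.2),
    "randTop": (0.1, 0.5, 0.3),
}
print(check_fractions(example))  # an empty list means no problems were found
```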


5.8. Details of ab initio calculations sections

A typical ab initio definition section looks like:

#define vasp1
{
type: vasp
commandExecutable: 'vasp'
kresol : 0.12
vacuumSize: 10    #only for 0- 1- or 2- dimensional calculations
}

It contains the type of interface to the external program package, common parameters (such as commandExecutable, see Section 5.8.1), package-specific parameters (such as kresol), and vacuumSize, which should be specified only for low-dimensional systems and not for 3D crystals.

The definition name (vasp1 in the example above) should be used in the stages parameter of the general parameters (Section 5.2).

The available types of interfaces are described in Sections 5.8.2–5.8.9 (VASP, GULP, LAMMPS, Quantum Espresso, Abinit, FHIaims, MOPAC and MLIP).

5.8.1. Common interface parameters.

commandExecutable

Specifies the executable for a given code.

Default: no default, has to be specified by the user.

Format:
commandExecutable : 'mpirun vasp'

taskManager

Specifies name of task manager to be used for submission.

Default: By default no task manager is used.

Format:
taskManager : TM

Note

Usually you define one task manager and use it in several stages, but it can be useful to use different task managers for different stages. For details see Section 5.9.


tag

Optional identifier for stage.

Default: equal to index number of stage.

Format:
tag : final

Note

This is useful if you want to highlight or distinguish a stage. The tag will appear as a suffix of the job name in the job submission system and of the calculation folder created for this job. If this parameter is set, then files in the Specific/ folder should use it as their suffix instead of the index number of the stage, for example INCAR_final instead of INCAR_4.


targetProperties

List of properties to be evaluated in this stage.

Default: [enthalpy structure]

Format:

targetProperties : [structure energy forces]

stageType

Specifies complex stage type.

Available types:

execute – just executes an external program using the interface

atomistic – first assembles the atomic structure from components, then executes an external program using the interface

populationProcessor – extracts a set of subsystems from an individual and runs its internal stages for each element, see Section 5.8.10

Default: atomistic

Format:

stageType : populationProcessor

5.8.2. VASP interface parameters.

sleepTime

Specifies sleep delay between checks of completion for this stage in seconds.

Default: 30

Format:
sleepTime : 2

kresol

Specifies the reciprocal-space resolution for k-points generation.

Format:
kresol : 0.12

Note

Using different values for each step of structure relaxation, starting with cruder (i.e., larger) values and ending with high resolution, dramatically speeds up calculations, especially for metals, where very many k-points are needed. This parameter is important for ab initio calculations (though some codes, e.g. VASP and SIESTA, now have similar tricks).


vacuumSize

Specify vacuum region size for calculations of nanoparticles, films or surfaces.

Default: 10.0

Format:
vacuumSize : 11.0

5.8.3. GULP interface parameters.

sleepTime

Specifies sleep delay between checks of completion for this stage in seconds.

Default: 10

Format:
sleepTime : 2

vacuumSize

Specify vacuum region size for calculations of nanoparticles, films or surfaces.

Default: 10.0

Format:
vacuumSize : 11.0

5.8.4. LAMMPS interface parameters.

sleepTime

Specifies sleep delay between checks of completion for this stage in seconds.

Default: 10

Format:
sleepTime : 2

vacuumSize

Specify vacuum region size for calculations of nanoparticles, films or surfaces.

Default: 10.0

Format:
vacuumSize : 11.0

mlip

Optional. If provided, specifies the MLIP potential file.

Format:
mlip : './Training/La_H_24g_vasp.mtp'

5.8.5. Quantum Espresso interface parameters.

sleepTime

Specifies sleep delay between checks of completion for this stage in seconds.

Default: 30

Format:
sleepTime : 2

kresol

Specifies the reciprocal-space resolution for k-points generation.

Format:
kresol : 0.12

Note

Using different values for each step of structure relaxation, starting with cruder (i.e., larger) values and ending with high resolution, dramatically speeds up calculations, especially for metals, where very many k-points are needed. This parameter is important for ab initio calculations (though some codes, e.g. VASP and SIESTA, now have similar tricks).


vacuumSize

Specify vacuum region size for calculations of nanoparticles, films or surfaces.

Default: 10.0

Format:
vacuumSize : 11.0

5.8.6. Abinit interface parameters.

sleepTime

Specifies sleep delay between checks of completion for this stage in seconds.

Default: 30

Format:
sleepTime : 2

kresol

Specifies the reciprocal-space resolution for k-points generation.

Default:

Format:
kresol : 0.12

Note

Using different values for each step of structure relaxation, starting with cruder (i.e., larger) values and ending with high resolution, dramatically speeds up calculations, especially for metals, where very many k-points are needed. This parameter is important for ab initio calculations (though some codes, e.g. VASP and SIESTA, now have similar tricks).


vacuumSize

Specify vacuum region size for calculations of nanoparticles, films or surfaces.

Default: 10.0

Format:
vacuumSize : 11.0

5.8.7. FHIaims interface parameters.

sleepTime

Specifies sleep delay between checks of completion for this stage in seconds.

Default: 30

Format:
sleepTime : 2

kresol

Specifies the reciprocal-space resolution for k-points generation.

Default:

Format:
kresol : 0.12

5.8.8. MOPAC interface parameters.

sleepTime

Specifies sleep delay between checks of completion for this stage in seconds.

Default: 1

Format:
sleepTime : 2

5.8.9. MLIP interface parameters.

sleepTime

Specifies sleep delay between checks of completion for this stage in seconds.

Default: 10

Format:
sleepTime : 2

mode

Specifies the MLIP mode.

Available types:

select_add – actively selects configurations to be added to the current training set

train – fits an MTP

Format:
mode : train

potential

Specifies the potential file.

Format:
potential : './Training/La_H_24g_vasp.mtp'

trainingSet

Specifies the training set file.

Format:
trainingSet : './Training/set.cfg'

specorder

Specifies order of species.

Format:
specorder : [La H]

5.8.10. Population processor parameters.

This is not an interface to a third-party program, so it does not provide the parameters commandExecutable, taskManager, type and targetProperties.

The population processor extracts a set of structures from an individual and treats it as a population. For each element it starts a nested sequence of stages and controls parallel execution.

stages

List of stages to be run on each element.

Format:
stages : [vasp_forces]

inputKey

Specifies the name of the attribute of an individual to be treated as a nested population.

Format:
inputKey : trajectory

numParallelCalcs

Specifies how many structure relaxations you want to run in parallel for this nested loop.

Format:
numParallelCalcs : 10

5.9. Task manager definition

A task manager definition looks like:

#define TM
{
type: SBATCH
header: "#!/bin/sh
#SBATCH -p debug
#SBATCH -N 1
#SBATCH -n 1

"
}

Type can be:

  • SBATCH – for the Slurm submission system,

  • QSUB – for the Torque submission system,

  • BSUB – for the LSF submission system.

The header should contain the part of the submission script required by the supercomputer you are going to use. Usually it contains information on the submission queue and on the number of nodes and cores that the job will consume. To determine the exact content of the header, consult the usage guide or the administrator of the supercomputer.

5.10. Molecules definitions

Each molecule definition looks like:

#define mol_name
{filename : 'MOL_FILE'}

The defined molecule name (like mol_name) should then be used in the compositionSpace block (see Section 5.6.1).

For a molecular crystal, the MOL_FILE file describes the internal geometry of the molecule from which the structure is built. The Z Matrix is created from the information given in the MOL_FILE file, i.e., bond lengths and all necessary angles are calculated from the Cartesian coordinates. The lengths and angles that are important should be used for the creation of the Z Matrix; this is exactly what columns 5–7 specify. Let’s look at the file for benzene \(C_6 H_6\):


Fig. 5.10.1 Sample of MOL_FILE file and illustration of the corresponding molecular structure.

The 1\(^{st}\) atom is C, its coordinates are defined without reference to other atoms (“0 0 0”).

The 2\(^{nd}\) atom is C, its coordinates (in molecular coordinate frame) in Z Matrix will be set only by its distance from the 1\(^{st}\) atom (i.e. C described above), but no angles — (“1 0 0”).

The 3\(^{rd}\) atom is H, its coordinates will be set by its distance from the 1\(^{st}\) atom, and the bond angle 3-1-2, but not by torsion angle — hence we use “1 2 0”.

The 4\(^{th}\) atom is C, its coordinates will be set by its distance from the 1\(^{st}\) atom, bond angle 4-1-2, and torsion angle 4-1-2-3; hence, we use “1 2 3”, and so forth, until we reach the final, 12\(^{th}\) atom, which is H, defined by its distance from the 9\(^{th}\) atom (C), bond angle 12-9-6 and torsion angle 12-9-6-8; hence “9 6 8”.

The final column is the flexibility flag for the torsion angle. For example, in C4, the torsion angle is defined by 4-1-2-3. This flag should be either 1 or 0 for the first three atoms, and 0 for the others, if the molecule is rigid. If any other flexible torsion angle exists, specify 1 for this column.
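The internal coordinates that make up a Z Matrix (bond length, bond angle and torsion angle) follow from the Cartesian coordinates by standard vector geometry, as the Python sketch below shows. It illustrates the geometry only and is not the conversion routine used by USPEX; all function names are hypothetical, and the sign convention of the torsion angle is one common choice that may differ from the one used internally.

```python
import math


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _norm(a):
    return math.sqrt(_dot(a, a))


def bond_length(p_i, p_j):
    """Distance between atoms i and j."""
    return _norm(_sub(p_i, p_j))


def bond_angle(p_i, p_j, p_k):
    """Angle i-j-k in degrees, with atom j at the vertex."""
    v1, v2 = _sub(p_i, p_j), _sub(p_k, p_j)
    cos_a = _dot(v1, v2) / (_norm(v1) * _norm(v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))


def torsion_angle(p_i, p_j, p_k, p_l):
    """Signed torsion (dihedral) angle i-j-k-l in degrees."""
    b1, b2, b3 = _sub(p_j, p_i), _sub(p_k, p_j), _sub(p_l, p_k)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    b2_hat = [c / _norm(b2) for c in b2]
    y = _dot(b2_hat, _cross(n1, n2))
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Four made-up points: the fourth is rotated by 90 degrees about the middle bond
a, b, c, d = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)
print(bond_length(a, b), bond_angle(a, b, c), torsion_angle(a, b, c, d))
```

In the notation of the text, atom 4 of benzene would be described by bond_length for the pair 4-1, bond_angle for 4-1-2 and torsion_angle for 4-1-2-3.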

5.10.1. How to prepare the MOL files

There are plenty of programs which can generate Zmatrix style files, such as Molden, Avogadro, and so on. Experienced users might have their own way to prepare these files. For the users’ convenience, we have created an online utility to allow one to generate the USPEX-style MOL file just from a file in XYZ format. Please try this utility at https://uspex-team.org/online_utilities/zmatrix/

5.11. Environment definitions

An environment definition looks like:

#define substrate
{
type: substrate
file: './POSCAR_SUBSTRATE'
pbc: (1 1 0)
bufferThickness: 3.0
}

The only supported type at present is substrate. The file should contain the substrate region in VASP5 POSCAR format, see Fig. 5.11.1.



Fig. 5.11.1 Surface model used in USPEX.


Besides file, you need to specify the following parameters.

pbc

Periodic boundary conditions. 1 indicates that periodic boundary conditions apply along that direction, and 0 indicates a vacuum direction.

Format:
pbc: (1 1 0)

bufferThickness

Thickness of the buffer region in substrate. This region is part of POSCAR_SUBSTRATE, and is allowed to relax. See Fig. 5.11.1.

Format:
bufferThickness:  3.5