Introduction¶
GitHub: https://www.github.com/lukasturcani/stk
Slack: https://t.co/LCPzWhvsVO
Installation¶
To get stk, you can install it with pip:
$ pip install stk
Make sure you also install rdkit, which is a dependency:
$ conda install -c rdkit rdkit
Overview¶
stk is a Python library which allows the construction,
manipulation, property calculation and automatic design of molecules.
For quick navigation through the modules of stk, use the
Module Index.
Among other things, stk allows you to construct molecules like this
![https://i.imgur.com/HI5cciM.png](https://i.imgur.com/HI5cciM.png)
and many others.
The key idea behind stk is that the construction of a molecule can
be broken down into two fundamental pieces of information: its
building blocks and its topology graph. The building blocks of a
molecule are molecules, or molecular fragments, which are used for
construction. The smallest possible building block is a single atom
and constructed molecules can become the building blocks of other
constructed molecules. The topology graph is an abstract representation
of a constructed molecule. The nodes of the graph represent the
positions of the building blocks and the edges of the graph represent
which building blocks have bonds formed between them during
construction.
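To make the idea of a topology graph concrete, here is a minimal sketch in plain Python. It is not stk's own implementation: the `Vertex` and `Edge` classes and the `neighbors` helper are hypothetical names chosen for illustration. Vertices hold the positions where building blocks are placed, and edges record which vertices end up bonded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vertex:
    # Hypothetical class: an id and the position of a building block.
    id: int
    position: tuple


@dataclass(frozen=True)
class Edge:
    # Hypothetical class: the ids of the two vertices joined by a bond.
    vertex1: int
    vertex2: int


def neighbors(vertex_id, edges):
    """Return the ids of the vertices bonded to `vertex_id`, sorted."""
    found = []
    for edge in edges:
        if edge.vertex1 == vertex_id:
            found.append(edge.vertex2)
        elif edge.vertex2 == vertex_id:
            found.append(edge.vertex1)
    return sorted(found)


# Three building blocks placed in a line and bonded end to end.
vertices = [
    Vertex(0, (0.0, 0.0, 0.0)),
    Vertex(1, (1.0, 0.0, 0.0)),
    Vertex(2, (2.0, 0.0, 0.0)),
]
edges = [Edge(0, 1), Edge(1, 2)]

middle_neighbors = neighbors(1, edges)
end_neighbors = neighbors(0, edges)
```

The graph itself says nothing about which molecules sit on the vertices, which is why the same graph can be reused with different building blocks.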
To use stk you only have to choose which building blocks and
topology graph you wish to use, and stk will take care of everything
else. Take, for example, the construction of a linear polymer:
import stk

polymer = stk.ConstructedMolecule(
    building_blocks=[
        stk.BuildingBlock('BrCCBr', ['bromine']),
        stk.BuildingBlock('BrCNCBr', ['bromine'])
    ],
    topology_graph=stk.polymer.Linear(
        repeating_unit='ABBBA',
        num_repeating_units=2
    )
)

# You can write the molecule to a file if you want to view it.
polymer.write('polymer.mol')
which will produce
![https://i.imgur.com/XmKRRun.png](https://i.imgur.com/XmKRRun.png)
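As a rough illustration of how the repeating unit in the polymer example is read, the sketch below expands 'ABBBA' with two repeats into the full sequence of building blocks. It assumes that the letters map to the building blocks in the order they are passed, so that A is the first and B is the second. The `expand_repeating_unit` helper is hypothetical and not part of stk.

```python
def expand_repeating_unit(repeating_unit, num_repeating_units):
    """Return the full sequence of building block letters."""
    return repeating_unit * num_repeating_units


# Assumed mapping: letters follow the order of the building_blocks list.
building_blocks = {'A': 'BrCCBr', 'B': 'BrCNCBr'}

sequence = expand_repeating_unit('ABBBA', 2)
smiles_sequence = [building_blocks[letter] for letter in sequence]

print(sequence)  # ABBBAABBBA
```

Under this assumption the polymer is built from ten building blocks in total, four of the first kind and six of the second.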
Because the topology graph is an idealized representation of the
constructed molecule, the bonds formed during construction often have
unrealistic lengths. This means that constructed molecules will need to
undergo structure optimization. There is no single correct way to go
about this, because the appropriate methodology for structure
optimization will depend on various factors, such as the nature of the
constructed molecule, the desired accuracy and time constraints.
stk provides objects called optimizers, which provide a simple
and consistent interface to different optimization methodologies, and
can act as an API for external chemistry software. Alternatively,
stk allows you to write constructed molecules in common chemical
file formats, which can be used as input for computational chemistry
software, if you wish to do this manually.
![https://i.imgur.com/UlCnTj9.png](https://i.imgur.com/UlCnTj9.png)
The general construction workflow of stk.
The abstraction provided by the topology graph has a number of powerful benefits. Firstly, because every vertex is responsible for the placement of a building block, it is extremely easy to construct different structural isomers of the constructed molecule. The vertex can be told to perform different transformations on the building block, so that its orientation in the constructed molecule changes. For the end user, selecting the transformation and therefore structural isomer is relatively easy. Take the example of an organic cage, which can be constructed with the following code
# Create the building blocks.
bb1 = stk.BuildingBlock('O=CC(C=O)C(Cl)C=O', ['aldehyde'])
bb2 = stk.BuildingBlock('O=CC(C=O)C=O', ['aldehyde'])
bb3 = stk.BuildingBlock('NCC(Cl)N', ['amine'])
bb4 = stk.BuildingBlock('NCCN', ['amine'])

# Create the topology graph.
tetrahedron = stk.cage.FourPlusSix()

# Because there are multiple building blocks with the same
# number of functional groups, they need to be explicitly
# placed on vertices, as there are multiple valid combinations.
building_block_vertices = {
    bb1: tetrahedron.vertices[:1],
    bb2: tetrahedron.vertices[1:4],
    bb3: tetrahedron.vertices[4:5],
    bb4: tetrahedron.vertices[5:]
}

# Create the molecule.
cage = stk.ConstructedMolecule(
    building_blocks=[bb1, bb2, bb3, bb4],
    topology_graph=tetrahedron,
    building_block_vertices=building_block_vertices
)

# You can write the molecule to a file if you want to view it.
cage.write('cage.mol')
and looks like this
![https://i.imgur.com/MAFrzAl.png](https://i.imgur.com/MAFrzAl.png)
You can see that the green atoms on adjacent building blocks point toward different edges. However, if you specify a different edge to align with, the building block will be rotated
# Vertex 0 gets aligned to the third edge it's connected to.
isomer_graph = stk.cage.FourPlusSix(vertex_alignments={0: 2})

building_block_vertices = {
    bb1: isomer_graph.vertices[:1],
    bb2: isomer_graph.vertices[1:4],
    bb3: isomer_graph.vertices[4:5],
    bb4: isomer_graph.vertices[5:]
}

# Use the graph that carries the new alignment, so that the
# vertices in building_block_vertices belong to it.
isomer = stk.ConstructedMolecule(
    building_blocks=[bb1, bb2, bb3, bb4],
    topology_graph=isomer_graph,
    building_block_vertices=building_block_vertices
)
isomer.write('cage_isomer.mol')
![https://i.imgur.com/cg9n69u.png](https://i.imgur.com/cg9n69u.png)
The same thing can be done to any other building block on the cage to perform a rotation on it. You can also write a loop, to create all the structural isomers of a single cage in one swoop
import itertools as it

edges = (
    range(len(v.edges)) for v in stk.cage.FourPlusSix.vertex_data
)

# Create 5184 structural isomers.
isomers = []
for i, aligners in enumerate(it.product(*edges)):
    tetrahedron = stk.cage.FourPlusSix(
        vertex_alignments={
            vertex.id: edge
            for vertex, edge
            in zip(stk.cage.FourPlusSix.vertex_data, aligners)
        }
    )
    isomer = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3, bb4],
        topology_graph=tetrahedron,
        building_block_vertices={
            bb1: tetrahedron.vertices[:1],
            bb2: tetrahedron.vertices[1:4],
            bb3: tetrahedron.vertices[4:5],
            bb4: tetrahedron.vertices[5:]
        }
    )
    isomers.append(isomer)
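The figure of 5184 quoted in the loop above can be checked with the standard library alone. The sketch assumes that the four vertices holding the three-functional-group aldehydes each connect to three edges, and that the six vertices holding the diamines each connect to two. These edge counts are inferred from the building blocks in the example, not read from stk.

```python
import itertools as it
import math

# Assumed number of edges per vertex: four 3-connected vertices
# followed by six 2-connected vertices.
edges_per_vertex = [3] * 4 + [2] * 6

# One alignment choice per vertex, as in the loop over it.product().
alignment_choices = [range(n) for n in edges_per_vertex]
num_isomers = sum(1 for _ in it.product(*alignment_choices))

# The count is the product of the per-vertex choices: 3**4 * 2**6.
assert num_isomers == math.prod(edges_per_vertex)
print(num_isomers)  # 5184
```

This counts every combination of alignments. Some combinations may be related by the symmetry of the cage, so the number of chemically distinct isomers can be smaller.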
The second major benefit of the topology graph is that the vertices and
edges can hold additional state useful for the construction of a
molecule. An example of this is in the construction of different
structural isomers, but another can be seen in the construction of
periodic systems. For example, stk allows you to construct
covalent organic frameworks. With the topology graph this is trivial
to implement: simply label some of the edges as periodic and they
will construct periodic bonds instead of regular ones.
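One way to picture the extra state held by a periodic edge is sketched below. The `Edge` class and its `periodicity` field are hypothetical and only illustrate the idea of labelling an edge. The assumption here is that the label records how many unit cells the bond crosses along each lattice direction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    # Hypothetical class: ids of the two vertices joined by the edge.
    vertex1: int
    vertex2: int
    # Unit cells crossed along each lattice direction. The default,
    # (0, 0, 0), means a regular bond inside one unit cell.
    periodicity: tuple = (0, 0, 0)

    def is_periodic(self):
        return any(self.periodicity)


edges = [
    Edge(0, 1),
    # This bond leaves the cell along the first lattice direction.
    Edge(1, 0, periodicity=(1, 0, 0)),
]

periodic_edges = [edge for edge in edges if edge.is_periodic()]
```

Because the label lives on the edge, the vertices and building blocks need no special handling for periodic systems in this sketch.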
The third benefit of the topology graph is that it allows users to easily modify the construction of molecules by placing different building blocks on different vertices. The user can use the building_block_vertices parameter with any topology graph.
The fourth benefit of the topology graph is that the construction of
a molecule is broken down into independent steps. Each vertex
represents a single, independent operation on a building block while
each edge represents a single, independent operation on a collection
of building blocks. As a result, each vertex and edge represents a
single operation, which can be executed in parallel. This allows
stk to scale efficiently to large topology graphs and take
advantage of multiple cores even during the construction of a single
molecule.
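The following sketch shows why independent per-vertex operations are easy to run concurrently. It does not reflect how stk schedules its work: `place_building_block` is a hypothetical stand-in for the real placement step. A thread pool is used only to keep the example short, and CPU-bound work in Python would normally need a process pool to use several cores.

```python
from concurrent.futures import ThreadPoolExecutor


def place_building_block(vertex):
    """Hypothetical placement step that depends on one vertex only."""
    vertex_id, position = vertex
    # Stand-in for the real work: scale the vertex position.
    return vertex_id, tuple(2.0 * x for x in position)


vertices = [(i, (float(i), 0.0, 0.0)) for i in range(8)]

# No vertex needs the result of another, so the operations can be
# handed to a pool of workers in any order.
with ThreadPoolExecutor(max_workers=4) as executor:
    placed = dict(executor.map(place_building_block, vertices))

# Running the same operations one after another gives the same result.
serial = dict(map(place_building_block, vertices))
```

Since the concurrent and serial results match, the choice between them is purely a matter of performance.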
Property Calculation¶
stk provides a number of calculators to carry out property
calculations. When these are not sufficient, stk molecules can
be converted to and from rdkit molecules, giving access to
additional property calculation and cheminformatics facilities.
Working With Multiple Molecules¶
It is often the case that construction and property calculation
need to be performed on molecules in bulk and in parallel. For this,
stk provides the Population, which is a specialized
container providing these facilities.
Automatic Molecular Design¶
To perform automatic design, stk includes an evolutionary
algorithm, which can make use of the construction facilities in
stk but is not required to.
What Next?¶
A good place to start is the basic examples, which will allow
you to get a feel for stk. Further examples of molecular
construction can be seen by looking at the different topology graphs.
The documentation of the various topology graph classes in stk
also contains usage examples. More advanced examples can be seen in
the cookbook, and if you want to experiment with automated molecular
design you can look into how to write an input file for the
evolutionary algorithm. If stk does not have a topology graph for a
molecule you would like to construct, you can always implement a new
one yourself. Alternatively, if you would like to request an extension
to stk, or you have any other question about stk, feel free to
message me on your favourite platform or file an issue on GitHub.