Introduction

GitHub: https://www.github.com/lukasturcani/stk

Slack: https://t.co/LCPzWhvsVO

Installation

To get stk, you can install it with pip:

$ pip install stk

Make sure you also install rdkit, which is a dependency:

$ conda install -c rdkit rdkit

Overview

stk is a Python library which allows the construction, manipulation, property calculation and automatic design of molecules.

For quick navigation through the modules of stk, use the Module Index.

Among other things, stk allows you to construct molecules like this

https://i.imgur.com/HI5cciM.png

and many others.

The key idea behind stk is that the construction of a molecule can be broken down into two fundamental pieces of information, its building blocks and its topology graph. The building blocks of a molecule are molecules, or molecular fragments, which are used for construction. The smallest possible building block is a single atom and constructed molecules can become the building blocks of other constructed molecules. The topology graph is an abstract representation of a constructed molecule. The nodes of the graph represent the positions of the building blocks and the edges of the graph represent which building blocks have bonds formed between them during construction.
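To make the idea concrete, the sketch below models a topology graph as plain Python data. It is a toy illustration under stated assumptions, not how stk implements topology graphs: the names Vertex, Edge, TopologyGraph and neighbors are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vertex:
    # Hypothetical: the position at which a building block is placed.
    id: int
    position: tuple


@dataclass(frozen=True)
class Edge:
    # Hypothetical: a bond formed between the building blocks
    # placed on the two vertices with these ids.
    vertex_ids: tuple


@dataclass(frozen=True)
class TopologyGraph:
    vertices: tuple
    edges: tuple

    def neighbors(self, vertex_id):
        """Return the ids of the vertices bonded to vertex_id."""
        result = []
        for edge in self.edges:
            if vertex_id in edge.vertex_ids:
                result.extend(
                    v for v in edge.vertex_ids if v != vertex_id
                )
        return result


# A chain of three building block positions: 0 - 1 - 2.
chain = TopologyGraph(
    vertices=(
        Vertex(0, (0.0, 0.0, 0.0)),
        Vertex(1, (1.0, 0.0, 0.0)),
        Vertex(2, (2.0, 0.0, 0.0)),
    ),
    edges=(Edge((0, 1)), Edge((1, 2))),
)
print(chain.neighbors(1))
```

The graph says nothing about which molecules are used. It only records where building blocks go and which of them are bonded, which is why the same graph can be reused with different building blocks.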

To use stk you only have to choose which building blocks and topology graph you wish to use, and stk will take care of everything else. Take, for example, the construction of a linear polymer

import stk

polymer = stk.ConstructedMolecule(
    building_blocks=[
        stk.BuildingBlock('BrCCBr', ['bromine']),
        stk.BuildingBlock('BrCNCBr', ['bromine'])
    ],
    topology_graph=stk.polymer.Linear(
        repeating_unit='ABBBA',
        num_repeating_units=2
    )
)
# You can write the molecule to a file if you want to view it.
polymer.write('polymer.mol')

which will produce

https://i.imgur.com/XmKRRun.png
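The following sketch shows one way to read the repeating_unit and num_repeating_units arguments used above. It assumes that each letter refers to a building block by its position in the building_blocks list ('A' is the first, 'B' the second) and that the repeating unit is simply repeated end to end. The function expand_repeating_unit is hypothetical and is not part of stk.

```python
def expand_repeating_unit(
    repeating_unit,
    num_repeating_units,
    building_blocks,
):
    """Return the building blocks in the order they appear in the chain.

    Assumption: 'A' maps to building_blocks[0], 'B' to
    building_blocks[1], and so on.
    """
    sequence = repeating_unit * num_repeating_units
    return [
        building_blocks[ord(letter) - ord('A')]
        for letter in sequence
    ]


# SMILES strings stand in for the building block objects.
chain = expand_repeating_unit(
    repeating_unit='ABBBA',
    num_repeating_units=2,
    building_blocks=['BrCCBr', 'BrCNCBr'],
)
print(len(chain))
print(chain)
```

Under these assumptions the polymer above is a chain of ten building blocks, four of the first kind and six of the second.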

Because the topology graph is an idealized representation of the constructed molecule, the bonds formed during construction often have unrealistic lengths. This means that constructed molecules will need to undergo structure optimization. There is no single correct way to go about this, because the appropriate methodology for structure optimization will depend on various factors, such as the nature of the constructed molecule, the desired accuracy and time constraints. stk provides objects called optimizers, which provide a simple and consistent interface to different optimization methodologies, and can act as an API for external chemistry software. Alternatively, if you wish to do this manually, stk allows you to write constructed molecules in common chemical file formats, which can be used as input for computational chemistry software.

https://i.imgur.com/UlCnTj9.png

The general construction workflow of stk.

The abstraction provided by the topology graph has a number of powerful benefits. Firstly, because every vertex is responsible for the placement of a building block, it is extremely easy to construct different structural isomers of the constructed molecule. The vertex can be told to perform different transformations on the building block, so that its orientation in the constructed molecule changes. For the end user, selecting the transformation and therefore structural isomer is relatively easy. Take the example of an organic cage, which can be constructed with the following code

# Create the building blocks.
bb1 = stk.BuildingBlock('O=CC(C=O)C(Cl)C=O', ['aldehyde'])
bb2 = stk.BuildingBlock('O=CC(C=O)C=O', ['aldehyde'])
bb3 = stk.BuildingBlock('NCC(Cl)N', ['amine'])
bb4 = stk.BuildingBlock('NCCN', ['amine'])

# Create the topology graph.
tetrahedron = stk.cage.FourPlusSix()

# Because there are multiple building blocks with the same
# number of functional groups, they need to be explicitly
# placed on vertices, as there are multiple valid combinations.
building_block_vertices = {
    bb1: tetrahedron.vertices[:1],
    bb2: tetrahedron.vertices[1:4],
    bb3: tetrahedron.vertices[4:5],
    bb4: tetrahedron.vertices[5:]
}

# Create the molecule.
cage = stk.ConstructedMolecule(
    building_blocks=[bb1, bb2, bb3, bb4],
    topology_graph=tetrahedron,
    building_block_vertices=building_block_vertices
)
# You can write the molecule to a file if you want to view it.
cage.write('cage.mol')

and looks like this

https://i.imgur.com/MAFrzAl.png

You can see that the green atoms on adjacent building blocks point toward different edges. However, if you specify a different edge to align with, the building block will be rotated

# Vertex 0 gets aligned to the third edge it's connected to.
isomer_graph = stk.cage.FourPlusSix(vertex_alignments={0: 2})
building_block_vertices = {
    bb1: isomer_graph.vertices[:1],
    bb2: isomer_graph.vertices[1:4],
    bb3: isomer_graph.vertices[4:5],
    bb4: isomer_graph.vertices[5:]
}
isomer = stk.ConstructedMolecule(
    building_blocks=[bb1, bb2, bb3, bb4],
    topology_graph=isomer_graph,
    building_block_vertices=building_block_vertices
)
isomer.write('cage_isomer.mol')

https://i.imgur.com/cg9n69u.png

The same thing can be done to any other building block on the cage to perform a rotation on it. You can also write a loop to create all the structural isomers of a single cage in one go

import itertools as it

edges = (
    range(len(v.edges)) for v in stk.cage.FourPlusSix.vertex_data
)
# Create 5184 structural isomers.
isomers = []
for aligners in it.product(*edges):
    tetrahedron = stk.cage.FourPlusSix(
        vertex_alignments={
            vertex.id: edge
            for vertex, edge
            in zip(stk.cage.FourPlusSix.vertex_data, aligners)
        }
    )
    isomer = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3, bb4],
        topology_graph=tetrahedron,
        building_block_vertices={
            bb1: tetrahedron.vertices[:1],
            bb2: tetrahedron.vertices[1:4],
            bb3: tetrahedron.vertices[4:5],
            bb4: tetrahedron.vertices[5:]
        }
    )
    isomers.append(isomer)
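The figure of 5184 in the loop above can be checked without stk. The sketch below assumes the four-plus-six topology has four vertices connected to three edges each and six vertices connected to two edges each. Every vertex independently picks one of its edges to align with, so the number of combinations is the product of the edge counts.

```python
import itertools
import math

# Assumed edge counts per vertex: four vertices with three edges,
# followed by six vertices with two edges.
edges_per_vertex = [3] * 4 + [2] * 6

# One alignment choice per vertex, exactly as in the loop above.
alignments = itertools.product(
    *(range(num_edges) for num_edges in edges_per_vertex)
)
num_isomers = sum(1 for _ in alignments)

# 3**4 * 2**6 = 81 * 64 = 5184
assert num_isomers == math.prod(edges_per_vertex)
print(num_isomers)
```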

The second major benefit of the topology graph is that the vertices and edges can hold additional state useful for the construction of a molecule. An example of this is in the construction of different structural isomers, but another can be seen in the construction of periodic systems. For example, stk allows you to construct covalent organic frameworks. With the topology graph this is trivial to implement: simply label some of the edges as periodic and they will construct periodic bonds instead of regular ones.
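As a rough illustration of edges holding extra state, the sketch below attaches a periodicity label to a toy edge class. It is not stk's implementation: the Edge class, its periodicity field and the is_periodic method are hypothetical names chosen for this example, and the label is assumed to record how many unit cells the bond crosses along each lattice direction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    # Hypothetical: ids of the two vertices joined by this edge.
    vertex_ids: tuple
    # Hypothetical: number of unit cells crossed along each
    # lattice direction. (0, 0, 0) means a regular bond.
    periodicity: tuple = (0, 0, 0)

    def is_periodic(self):
        return any(self.periodicity)


edges = [
    # A regular bond inside the unit cell.
    Edge((0, 1)),
    # A bond from vertex 1 to the image of vertex 0 in the
    # neighbouring cell along the first lattice direction.
    Edge((1, 0), periodicity=(1, 0, 0)),
]

periodic_edges = [edge for edge in edges if edge.is_periodic()]
print(len(periodic_edges))
```

The graph structure is unchanged. Only the label on the edge decides whether a regular or a periodic bond is built.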

The third benefit of the topology graph is that it allows users to easily modify the construction of molecules by placing different building blocks on different vertices. The user can use the building_block_vertices parameter with any topology graph.

The fourth benefit of the topology graph is that the construction of a molecule is broken down into independent steps. Each vertex represents a single, independent operation on a building block, while each edge represents a single, independent operation on a collection of building blocks. As a result, the operations of different vertices and edges can be executed in parallel. This allows stk to scale efficiently to large topology graphs and take advantage of multiple cores even during the construction of a single molecule.
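The sketch below illustrates why independent per-vertex operations are easy to run concurrently. It is a toy example and does not reflect how stk schedules work internally: the place function and the task layout are hypothetical. Each task moves a building block so that its centroid sits on a vertex, and no task reads or writes the data of another.

```python
from concurrent.futures import ThreadPoolExecutor


def place(task):
    """Translate atoms so that their centroid sits on the vertex."""
    vertex_position, atom_positions = task
    num_atoms = len(atom_positions)
    centroid = tuple(
        sum(axis) / num_atoms for axis in zip(*atom_positions)
    )
    shift = tuple(v - c for v, c in zip(vertex_position, centroid))
    return [
        tuple(a + s for a, s in zip(atom, shift))
        for atom in atom_positions
    ]


# Each task pairs a vertex position with the atomic positions of
# the building block placed on that vertex.
tasks = [
    ((0.0, 0.0, 0.0), [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]),
    ((5.0, 0.0, 0.0), [(0.0, 1.0, 0.0), (0.0, 3.0, 0.0)]),
]

# The tasks share no state, so they can be mapped over a pool.
with ThreadPoolExecutor() as executor:
    placed = list(executor.map(place, tasks))

print(placed)
```

A thread pool keeps the example short. For CPU-bound work in CPython, a process pool would be the more likely choice for using multiple cores.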

Property Calculation

stk provides a number of calculators to carry out property calculations. When these are not sufficient, stk molecules can be converted to and from rdkit molecules, which provide additional property calculation and cheminformatics facilities.

Working With Multiple Molecules

It is often the case that construction and property calculation need to be performed on molecules in bulk and in parallel. For this, stk provides the Population, which is a specialized container providing these facilities.

Automatic Molecular Design

To perform automatic design, stk includes an evolutionary algorithm, which can make use of the construction facilities in stk but is not required to.

What Next?

A good place to start is the basic examples, which will allow you to get a feel for stk. Further examples of molecular construction can be seen by looking at the different topology graphs. The documentation of the various topology graph classes in stk also contains usage examples. More advanced examples can be seen in the cookbook, and if you want to experiment with automated molecular design you can look into how to write an input file for the evolutionary algorithm. If stk does not have a topology graph for a molecule you would like to construct, you can always implement a new one yourself. Alternatively, if you would like to request an extension to stk, or you have any other question about stk, feel free to message me on your favourite platform or file an issue on GitHub.