Constructed Molecule
-
class ConstructedMolecule(building_blocks, topology_graph, building_block_vertices=None, use_cache=False)
Bases: stk.molecular.molecules.molecule.Molecule
Represents constructed molecules.
A ConstructedMolecule requires at least 2 basic pieces of information: which building block molecules are used to construct the molecule, and what the TopologyGraph of the constructed molecule is. The construction of the molecular structure is performed by TopologyGraph.construct(). This method does not have to be called explicitly by the user; it is called automatically during initialization.
The building block molecules used for construction can be BuildingBlock instances, other ConstructedMolecule instances, or a combination of both.
Each TopologyGraph subclass may add additional attributes to the ConstructedMolecule, which will be described within its documentation.
-
atoms
The atoms of the molecule. Each Atom instance is guaranteed to have two attributes. The first is building_block, which holds the building block Molecule from which that Atom came. If the Atom did not come from a building block, but was added by a reaction, the value of this attribute will be None.
The second attribute is building_block_id. This will be the same value on all atoms that came from the building block. Note that if a building block is used multiple times during construction, the building_block_id will be different each time it is used.
- Type
tuple of Atom
-
building_block_vertices
Maps the Molecule instances used for construction, which can be either BuildingBlock or ConstructedMolecule, to the Vertex objects they are placed on during construction. The dict has the form

    building_block_vertices = {
        BuildingBlock(...): [Vertex(...), Vertex(...)],
        BuildingBlock(...): [
            Vertex(...),
            Vertex(...),
            Vertex(...),
        ],
        ConstructedMolecule(...): [Vertex(...)]
    }

- Type
dict
-
building_block_counter
A counter keeping track of how many times each building block molecule appears in the ConstructedMolecule.
- Type
collections.Counter
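As a rough illustration of what such a counter holds, the sketch below counts placements with collections.Counter. The strings standing in for building blocks and the placements list are hypothetical; in stk the keys would be the building block molecules themselves.

```python
from collections import Counter

# Hypothetical stand-ins for building block molecules (assumption:
# real keys would be BuildingBlock or ConstructedMolecule instances).
bb1, bb2 = 'diamine', 'trialdehyde'

# One entry per vertex of an imagined topology graph, naming the
# building block placed on that vertex.
placements = [bb2, bb2, bb2, bb2, bb1, bb1, bb1, bb1, bb1, bb1]

# Count how many times each building block appears.
building_block_counter = Counter(placements)
print(building_block_counter[bb1], building_block_counter[bb2])
```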
-
topology_graph
Defines the topology graph of the ConstructedMolecule and is responsible for constructing it.
- Type
TopologyGraph
-
construction_bonds
Holds the bonds in bonds which were added by the construction process.
- Type
tuple of Bond
-
func_groups
The remnants of building block functional groups present in the molecule. They track which atoms belonged to functional groups in the building block molecules. The id of each FunctionalGroup should match its index in func_groups.
- Type
tuple of FunctionalGroup
Examples
Initialization
A ConstructedMolecule can be created from a set of building blocks and a TopologyGraph

    import stk

    bb1 = stk.BuildingBlock('NCCCN', ['amine'])
    bb2 = stk.BuildingBlock('O=CC(C=O)CC=O', ['aldehyde'])
    tetrahedron = stk.cage.FourPlusSix()
    cage1 = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2],
        topology_graph=tetrahedron
    )

A ConstructedMolecule can be used to construct other ConstructedMolecule instances

    benzene = stk.BuildingBlock('c1ccccc1')
    cage_complex = stk.ConstructedMolecule(
        building_blocks=[cage1, benzene],
        topology_graph=stk.host_guest.Complex()
    )

During initialization it is possible to force building blocks to be placed on specific vertices of the TopologyGraph

    bb3 = stk.BuildingBlock('NCOCN', ['amine'])
    bb4 = stk.BuildingBlock('NCOCCCOCN', ['amine'])
    cage2 = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3, bb4],
        topology_graph=tetrahedron,
        building_block_vertices={
            bb1: tetrahedron.vertices[4:6],
            bb3: tetrahedron.vertices[6:7],
            bb4: tetrahedron.vertices[7:8]
        }
    )

Building blocks with the wrong number of functional groups.
If the building block has too many functional groups, you can remove some in order to use it

    chain = stk.polymer.Linear('AB', [0, 0], 3)

    # This won't work, bb2 has 3 functional groups but 2 are needed
    # for monomers in a linear polymer chain.
    failed = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2],
        topology_graph=chain
    )

    # Remove one of the functional groups and you will be able to
    # construct the chain.
    bb2.func_groups = (bb2.func_groups[0], bb2.func_groups[2])
    polymer = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2],
        topology_graph=chain
    )

Methods
apply_displacement(self, displacement) – Shift the centroid by displacement.
apply_rotation_about_axis(self, angle, axis, …) – Rotate by angle about axis on the origin.
apply_rotation_between_vectors(self, start, …) – Rotate by a rotation from start to target.
apply_rotation_to_minimize_angle(self, …) – Rotate to minimize the angle between start and target.
clone(self) – Return a clone.
dump(self, path[, include_attrs, …]) – Write a dict representation to a file.
get_atom_distance(self, atom1_id, atom2_id) – Return the distance between 2 atoms.
get_atom_positions(self[, atom_ids]) – Yield the positions of atoms.
get_building_blocks(self) – Yield the building blocks.
get_cached_mol(identity_key[, default]) – Get a molecule from the cache.
get_center_of_mass(self[, atom_ids]) – Return the center of mass.
get_centroid(self[, atom_ids]) – Return the centroid.
get_direction(self[, atom_ids]) – Return a vector of best fit through the atoms.
get_identity_key(self) – Return the identity key.
get_maximum_diameter(self[, atom_ids]) – Return the maximum diameter.
get_plane_normal(self[, atom_ids]) – Return the normal to the plane of best fit.
get_position_matrix(self) – Return a matrix holding the atomic positions.
has_cached_mol(identity_key) – True if molecule with identity_key is cached.
init_from_dict(mol_dict[, use_cache]) – Initialize from a dict representation.
load(path[, use_cache]) – Initialize from a dump file.
set_centroid(self, position[, atom_ids]) – Set the centroid to position.
set_position_matrix(self, position_matrix) – Set the coordinates to those in position_matrix.
to_dict(self[, include_attrs, …]) – Return a dict representation.
to_rdkit_mol(self) – Return an rdkit representation.
update_cache(self) – Update attributes of the cached molecule.
update_from_file(self, path) – Update the structure from a file.
update_from_rdkit_mol(self, mol) – Update the structure to match mol.
write(self, path[, atom_ids]) – Write the structure to a file.
-
__init__(self, building_blocks, topology_graph, building_block_vertices=None, use_cache=False)
Initialize a ConstructedMolecule.
- Parameters
building_blocks (list of Molecule) – The BuildingBlock and ConstructedMolecule instances which represent the building block molecules used for construction. Only one instance is present per building block molecule, even if multiples of that building block join up to form the ConstructedMolecule.
topology_graph (TopologyGraph) – Defines the topology graph of the ConstructedMolecule and constructs it.
building_block_vertices (dict, optional) – Maps each Molecule in building_blocks to the Vertex instances in topology_graph it is placed on. Each BuildingBlock and ConstructedMolecule can be mapped to multiple Vertex objects. See the examples section in the ConstructedMolecule class docstring to help understand how this parameter is used. If None, building block molecules will be assigned to vertices at random.
use_cache (bool, optional) – If True, a new ConstructedMolecule will not be made if a cached and identical one already exists; the one which already exists will be returned. If True and a cached, identical ConstructedMolecule does not yet exist, the created one will be added to the cache.
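The use_cache behaviour described above can be pictured with a small stand-alone sketch. The CachedThing class, its create() factory and the string keys are hypothetical and are not part of stk; they only mirror the rule "return the cached object if an identical one exists, otherwise build it and store it".

```python
class CachedThing:
    """Hypothetical class illustrating the use_cache pattern; not stk code."""

    _cache = {}

    def __init__(self, key):
        self.key = key

    @classmethod
    def create(cls, key, use_cache=False):
        # With use_cache=True, hand back the existing object if there is one.
        if use_cache and key in cls._cache:
            return cls._cache[key]
        obj = cls(key)
        # With use_cache=True, remember newly built objects for next time.
        if use_cache:
            cls._cache[key] = obj
        return obj


first = CachedThing.create('cage', use_cache=True)
second = CachedThing.create('cage', use_cache=True)   # served from the cache
third = CachedThing.create('cage')                    # use_cache=False: always new
```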
-
apply_displacement(self, displacement)
Shift the centroid by displacement.
- Parameters
displacement (numpy.ndarray) – A displacement vector applied to the molecule.
- Returns
The molecule.
- Return type
Molecule
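To make the geometry concrete, here is a minimal pure-Python sketch of a displacement: every position is shifted by the same vector, so the centroid moves by exactly that vector. The helper names and plain tuples are assumptions for illustration; the method itself works on the molecule's numpy position data.

```python
def centroid(positions):
    """Mean position of a list of (x, y, z) tuples."""
    n = len(positions)
    return tuple(sum(p[i] for p in positions) / n for i in range(3))


def displace(positions, displacement):
    """Return new positions, each shifted by the displacement vector."""
    return [
        tuple(c + d for c, d in zip(p, displacement))
        for p in positions
    ]


positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 0.0)]
shifted = displace(positions, (1.0, -1.0, 0.5))
old_centroid = centroid(positions)
new_centroid = centroid(shifted)
```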
-
apply_rotation_about_axis(self, angle, axis, origin)
Rotate by angle about axis on the origin.
- Parameters
angle (float) – The size of the rotation in radians.
axis (numpy.ndarray) – The axis about which the rotation happens.
origin (numpy.ndarray) – The origin about which the rotation happens.
- Returns
The molecule.
- Return type
Molecule
-
apply_rotation_between_vectors(self, start, target, origin)
Rotate by a rotation from start to target.
Given two direction vectors, start and target, this method applies the rotation required to transform start into target onto the molecule. The rotation occurs about the origin.
For example, if the start and target vectors are 45 degrees apart, a 45 degree rotation will be applied to the molecule. The rotation will be along the appropriate direction.
As long as you can associate a geometric feature of the molecule with a vector, the molecule can be rotated so that this vector is aligned with target. The defined vector can be virtually anything, which means that any geometric feature of the molecule can be aligned with any arbitrary axis.
- Parameters
start (numpy.ndarray) – A vector which is to be rotated so that it transforms into the target vector.
target (numpy.ndarray) – The vector onto which start is rotated.
origin (numpy.ndarray) – The point about which the rotation occurs.
- Returns
The molecule.
- Return type
Molecule
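One common way to build such a rotation is Rodrigues' rotation formula, with the axis taken as the cross product of start and target. The sketch below applies it to a single point in pure Python; it is an illustration under that assumption, not necessarily how the library computes the rotation, and the helper names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)


def rotate_between_vectors(point, start, target, origin):
    """Rotate point about origin by the rotation taking start to target."""
    s, t = normalize(start), normalize(target)
    axis = cross(s, t)
    sin_a = math.sqrt(dot(axis, axis))
    cos_a = dot(s, t)
    if sin_a < 1e-12:
        if cos_a > 0:
            return tuple(point)  # already aligned, nothing to do
        # This sketch does not pick an axis for the antiparallel case.
        raise ValueError('antiparallel vectors need an explicit axis')
    k = tuple(x / sin_a for x in axis)
    p = tuple(a - b for a, b in zip(point, origin))
    k_cross_p = cross(k, p)
    k_dot_p = dot(k, p)
    # Rodrigues' formula: p*cos + (k x p)*sin + k*(k.p)*(1 - cos)
    rotated = tuple(
        p[i] * cos_a + k_cross_p[i] * sin_a + k[i] * k_dot_p * (1 - cos_a)
        for i in range(3)
    )
    return tuple(r + o for r, o in zip(rotated, origin))


moved = rotate_between_vectors(
    point=(1.0, 0.0, 0.0),
    start=(1.0, 0.0, 0.0),
    target=(0.0, 1.0, 0.0),
    origin=(0.0, 0.0, 0.0),
)
```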
-
apply_rotation_to_minimize_angle(self, start, target, axis, origin)
Rotate to minimize the angle between start and target.
Note that this function will not necessarily overlay the start and target vectors. This is because the possible rotation is restricted to the axis.
- Parameters
start (numpy.ndarray) – The vector which is rotated.
target (numpy.ndarray) – The vector which is stationary.
axis (numpy.ndarray) – The vector about which the rotation happens.
origin (numpy.ndarray) – The origin about which the rotation happens.
- Returns
The molecule.
- Return type
Molecule
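The restriction to a fixed axis can be illustrated by computing the rotation angle directly. In the sketch below (pure Python, hypothetical helper names), start and target are projected onto the plane perpendicular to axis, and the signed angle between the projections is the rotation that brings start as close to target as the axis allows. This is one standard way to solve the problem and is assumed here, not taken from the library source.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def angle_to_minimize(start, target, axis):
    """Signed rotation angle about axis, in radians."""
    n = math.sqrt(dot(axis, axis))
    k = tuple(x / n for x in axis)
    # Remove the components along the axis; rotation cannot change them.
    ps = tuple(s - dot(start, k) * a for s, a in zip(start, k))
    pt = tuple(t - dot(target, k) * a for t, a in zip(target, k))
    # Signed angle from the projected start to the projected target.
    return math.atan2(dot(k, cross(ps, pt)), dot(ps, pt))


# start has a component along the axis, so it can never overlay target;
# the best achievable result is a 45 degree angle after a 90 degree turn.
angle = angle_to_minimize(
    start=(1.0, 0.0, 1.0),
    target=(0.0, 1.0, 0.0),
    axis=(0.0, 0.0, 1.0),
)
```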
-
clone(self)
Return a clone.
- Returns
The clone.
- Return type
-
dump(self, path, include_attrs=None, ignore_missing_attrs=False)
Write a dict representation to a file.
- Parameters
path (str) – The full path to the file to which the dict should be written.
include_attrs (list of str, optional) – The names of attributes of the molecule to be added to the representation. Each attribute is saved as a string using repr().
ignore_missing_attrs (bool, optional) – If False and an attribute in include_attrs is not held by the Molecule, an error will be raised.
- Returns
None
- Return type
NoneType
-
get_atom_distance(self, atom1_id, atom2_id)
Return the distance between 2 atoms.
This method does not account for the van der Waals radius of atoms.
- Parameters
atom1_id (int) – The id of the first atom.
atom2_id (int) – The id of the second atom.
- Returns
The distance between the first and second atoms.
- Return type
float
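In other words, the value is the plain Euclidean distance between the two atomic positions. A minimal sketch, assuming positions are stored in a dict keyed by atom id (a hypothetical layout chosen for illustration):

```python
import math

# Hypothetical atom positions keyed by atom id.
positions = {
    0: (0.0, 0.0, 0.0),
    1: (3.0, 4.0, 0.0),
}


def atom_distance(positions, atom1_id, atom2_id):
    """Centre-to-centre distance; atomic radii are ignored."""
    return math.dist(positions[atom1_id], positions[atom2_id])


distance = atom_distance(positions, 0, 1)
```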
-
get_atom_positions(self, atom_ids=None)
Yield the positions of atoms.
- Parameters
atom_ids (iterable of int, optional) – The ids of the atoms whose positions are desired. If None, then the positions of all atoms will be yielded.
- Yields
numpy.ndarray – The x, y and z coordinates of an atom.
-
get_building_blocks(self)
Yield the building blocks.
- Yields
Molecule – A building block of the ConstructedMolecule.
-
classmethod get_cached_mol(identity_key, default=None)
Get a molecule from the cache.
- Parameters
identity_key (object) – The identity key of the molecule to return.
default (object, optional) – Returned if identity_key is not found in the cache. If None, an error will be raised if identity_key is not found in the cache.
- Returns
The cached molecule.
- Return type
-
get_center_of_mass(self, atom_ids=None)
Return the center of mass.
- Parameters
atom_ids (iterable of int, optional) – The ids of atoms which should be used to calculate the center of mass. If None, then all atoms will be used.
- Returns
The coordinates of the center of mass.
- Return type
numpy.ndarray
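The center of mass is the mass-weighted mean of the atomic positions. The sketch below shows the arithmetic in pure Python; the masses are round illustrative numbers, not real atomic masses, and the function name is hypothetical.

```python
def center_of_mass(masses, positions):
    """Mass-weighted mean of (x, y, z) positions."""
    total = sum(masses)
    return tuple(
        sum(m * p[i] for m, p in zip(masses, positions)) / total
        for i in range(3)
    )


# Illustrative values only: a heavy particle at the origin and a
# lighter one 4 units away along x.
masses = [12.0, 4.0]
positions = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
com = center_of_mass(masses, positions)
```

The result sits closer to the heavier particle, which is what distinguishes it from the centroid (the unweighted mean).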
-
get_centroid(self, atom_ids=None)
Return the centroid.
- Parameters
atom_ids (iterable of int, optional) – The ids of atoms which are used to calculate the centroid. If None, then all atoms will be used.
- Returns
The centroid of atoms specified by atom_ids.
- Return type
numpy.ndarray
-
get_direction(self, atom_ids=None)
Return a vector of best fit through the atoms.
- Parameters
atom_ids (iterable of int, optional) – The ids of atoms which should be used to calculate the vector. If None, then all atoms will be used.
- Returns
The vector of best fit.
- Return type
numpy.ndarray
-
get_identity_key(self)
Return the identity key.
The identity key will be equal for two molecules which stk sees as identical. The identity key does not take the conformation into account but it does account for isomerism.
- Returns
A hashable object which represents the identity of the molecule.
- Return type
object
-
get_maximum_diameter(self, atom_ids=None)
Return the maximum diameter.
This method does not account for the van der Waals radius of atoms.
- Parameters
atom_ids (iterable of int, optional) – The ids of atoms which are considered when looking for the maximum diameter. If None, then all atoms in the molecule are considered.
- Returns
The maximum diameter in the molecule.
- Return type
float
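A natural reading of "maximum diameter" is the largest distance between any pair of the selected atoms. The brute-force sketch below assumes that reading; the function name and sample coordinates are hypothetical, and a real implementation would likely use vectorized numpy or scipy routines instead.

```python
import math
from itertools import combinations


def maximum_diameter(positions):
    """Largest centre-to-centre distance over all pairs of positions."""
    return max(math.dist(a, b) for a, b in combinations(positions, 2))


positions = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 0.0, 5.0),
    (0.0, 2.0, 0.0),
]
diameter = maximum_diameter(positions)
```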
-
get_plane_normal(self, atom_ids=None)
Return the normal to the plane of best fit.
- Parameters
atom_ids (iterable of int, optional) – The ids of atoms which should be used to calculate the plane. If None, then all atoms will be used.
- Returns
A vector normal to the plane of the molecule.
- Return type
numpy.ndarray
-
get_position_matrix(self)
Return a matrix holding the atomic positions.
- Returns
The array has the shape (n, 3). Each row holds the x, y and z coordinates of an atom.
- Return type
numpy.ndarray
-
classmethod has_cached_mol(identity_key)
True if molecule with identity_key is cached.
- Parameters
identity_key (object) – The identity key of a molecule.
- Returns
True if a molecule with identity_key is cached.
- Return type
bool
-
classmethod init_from_dict(mol_dict, use_cache=False)
Initialize from a dict representation.
The Molecule returned has the class specified in mol_dict, not Molecule.
- Parameters
mol_dict (dict) – A dict holding the dict representation of a molecule, generated by to_dict().
use_cache (bool, optional) – If True, a new instance will not be made if a cached and identical one already exists; the one which already exists will be returned. If True and a cached, identical instance does not yet exist, the created one will be added to the cache.
- Returns
The molecule represented by mol_dict.
- Return type
Molecule
-
classmethod load(path, use_cache=False)
Initialize from a dump file.
The Molecule returned has the class specified in the file, not Molecule. You can use this if you don’t know what class the instance in the loaded molecule is or should be.
- Parameters
path (str) – The full path holding a dumped molecule.
use_cache (bool, optional) – If True, a new instance will not be made if a cached and identical one already exists; the one which already exists will be returned. If True and a cached, identical instance does not yet exist, the created one will be added to the cache.
- Returns
The molecule held in path.
- Return type
Molecule
-
set_centroid(self, position, atom_ids=None)
Set the centroid to position.
- Parameters
position (numpy.ndarray) – This array holds the position on which the centroid of the molecule is going to be placed.
atom_ids (iterable of int, optional) – The ids of atoms which should have their centroid set to position. If None, then all atoms are used.
- Returns
The molecule.
- Return type
Molecule
-
set_position_matrix(self, position_matrix)
Set the coordinates to those in position_matrix.
- Parameters
position_matrix (numpy.ndarray) – A position matrix of the molecule. The shape of the matrix is (n, 3).
- Returns
The molecule.
- Return type
Molecule
-
to_dict(self, include_attrs=None, ignore_missing_attrs=False)
Return a dict representation.
- Parameters
include_attrs (list of str, optional) – The names of additional attributes of the molecule to be added to the dict. Each attribute is saved as a string using repr(). These attributes are also passed down recursively to the building block molecules.
ignore_missing_attrs (bool, optional) – If False and an attribute in include_attrs is not held by the ConstructedMolecule, an error will be raised.
- Returns
A dict which represents the molecule.
- Return type
dict
-
to_rdkit_mol(self)
Return an rdkit representation.
- Returns
The molecule in rdkit format.
- Return type
rdkit.Mol
-
update_cache(self)
Update attributes of the cached molecule.
If there is no identical molecule in the cache, then this molecule is added.
When using multiprocessing, modified copies of the original molecules are created. In order to ensure that the cached molecules have their attributes updated to the values of the copies, this method must be run on the copies.
- Returns
None
- Return type
NoneType
-
update_from_file(self, path)
Update the structure from a file.
Multiple file types are supported, namely:
.mol, .sdf - MDL V2000 and V3000 files
.xyz - XYZ files
.mae - Schrodinger Maestro files
.coord - Turbomole files
-
update_from_rdkit_mol(self, mol)
Update the structure to match mol.
- Parameters
mol (rdkit.Mol) – The rdkit molecule to use for the structure update.
- Returns
The molecule.
- Return type
Molecule
-
write(self, path, atom_ids=None)
Write the structure to a file.
This function will write the format based on the extension of path.
.mol, .sdf - MDL V3000 MOL file
.xyz - XYZ file
.pdb - PDB file
- Parameters
path (str) – The path to which the molecule should be written.
atom_ids (iterable of int, optional) – The atom ids of atoms to write. If None, then all atoms are written.
- Returns
The molecule.
- Return type
Molecule
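Choosing the output format from the file extension is usually done with a small lookup table. The sketch below shows that dispatch pattern for the XYZ case only; the WRITERS table, the render() and to_xyz() helpers and the sample coordinates are hypothetical and do not reflect the library's internals.

```python
import os


def to_xyz(elements, positions):
    """Build XYZ text: atom count, comment line, then one atom per line."""
    lines = [str(len(elements)), '']
    for element, (x, y, z) in zip(elements, positions):
        lines.append(f'{element} {x:.4f} {y:.4f} {z:.4f}')
    return '\n'.join(lines) + '\n'


# Hypothetical dispatch table: file extension -> function producing text.
WRITERS = {'.xyz': to_xyz}


def render(path, elements, positions):
    """Pick a writer from the extension of path and return the file text."""
    extension = os.path.splitext(path)[1]
    if extension not in WRITERS:
        raise ValueError(f'unsupported file extension: {extension!r}')
    return WRITERS[extension](elements, positions)


# Illustrative coordinates, not an optimized geometry.
text = render(
    'water.xyz',
    ['O', 'H', 'H'],
    [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)],
)
```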
-