Populations
Population objects are containers specialized to perform operations on groups of stk molecules, often in parallel.
-
class EAPopulation(*args)
Bases: stk.populations.Population
A population which also stores fitness values of molecules.
-
direct_members
Held here are direct members of the Population. In other words, these are the molecules not held by any subpopulations. As a result, not all members of a Population are stored in this attribute.
- Type
list of Molecule
-
subpopulations
A list holding the subpopulations.
- Type
list of Population
Methods
add_members(self, molecules[, duplicate_key]) – Add Molecule instances to the Population.
add_subpopulation(self, population) – Add a clone of population to subpopulations.
clone(self) – Return a clone.
close_process_pool(self) – Close an open process pool.
dump(self, path[, include_attrs, …]) – Dump the Population to a file.
get_fitness_values(self) – Return the fitness values of molecules.
init_all(building_blocks, topology_graphs[, …]) – Make all possible molecules from groups of building blocks.
init_diverse(building_blocks, …[, …]) – Construct a chemically diverse Population.
init_from_list(pop_list[, use_cache]) – Initialize a population from a list representation.
init_random(building_blocks, …[, …]) – Construct molecules for a random Population.
load(path[, use_cache]) – Initialize a Population from one dumped to a file.
open_process_pool(self[, num_processes]) – Open a process pool.
optimize(self, optimizer[, num_processes]) – Optimize the structures of molecules in the population.
remove_duplicates(self[, …]) – Remove duplicates from the population.
remove_members(self, key) – Remove all members where key(member) is True.
set_fitness_values_from_calculators(self, …) – Set the fitness values of molecules.
set_fitness_values_from_dict(self, …) – Set the fitness values of molecules.
set_mol_ids(self, n[, overwrite]) – Give each member of the population an id starting from n.
to_list(self[, include_attrs, …]) – Convert the population to a list representation.
write(self, path) – Write the .mol files of members to a directory.
-
__init__(self, *args)
Initialize a Population.
- Parameters
*args (Molecule, Population) – A population is initialized with the Molecule and Population instances it should hold.
Examples
    import stk

    bb1 = stk.BuildingBlock('CCC')
    bb2 = stk.BuildingBlock('NCCNCNC')
    bb3 = stk.BuildingBlock('[Br]CCC[Br]')
    pop1 = stk.Population(bb1, bb2, bb3)

    bb4 = stk.BuildingBlock('NNCCCN')
    # pop2 has pop1 as a subpopulation and bb4 as a direct
    # member.
    pop2 = stk.Population(pop1, bb4)
-
add_members(self, molecules, duplicate_key=None)
Add Molecule instances to the Population.
The added Molecule instances become direct members of the population; they are not placed into any subpopulations.
- Parameters
molecules (iterable of Molecule) – The molecules to be added as direct members.
duplicate_key (callable, optional) – If not None, duplicate_key(mol) is evaluated on each molecule in molecules. If a molecule with the same duplicate_key value is already present in the population, the molecule is not added.
- Returns
None
- Return type
NoneType
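The effect of duplicate_key can be pictured with a small stand-alone sketch. It does not use stk: add_members below is a hypothetical free function working on a plain list, the strings merely stand in for Molecule instances, and skipping duplicates inside the added batch itself is a choice made by this sketch rather than documented behaviour.

```python
def add_members(members, molecules, duplicate_key=None):
    """Append molecules to members, skipping duplicates if a key is given."""
    if duplicate_key is None:
        members.extend(molecules)
        return
    # Keys of the molecules which are already held.
    seen = {duplicate_key(mol) for mol in members}
    for mol in molecules:
        key = duplicate_key(mol)
        if key not in seen:
            seen.add(key)
            members.append(mol)


members = ['CCC', 'NCCN']
# Case-insensitive comparison is only an illustrative choice of key.
add_members(members, ['nccn', 'CCO', 'CCO'], duplicate_key=str.upper)
print(members)
```

Only 'CCO' is appended here, because 'nccn' maps to a key that is already present.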
-
add_subpopulation(self, population)
Add a clone of population to subpopulations.
Only a clone of the population container is made. The molecules it holds are not copied.
- Parameters
population (Population) – The population to be added as a subpopulation.
- Returns
None
- Return type
NoneType
-
clone(self)
Return a clone.
The clone shares its Molecule objects with the original; copies of the Molecule objects are not made.
- Returns
The clone.
- Return type
Examples
    import stk

    # Make an initial population.
    pop = stk.Population(stk.BuildingBlock('NCCN'))

    # Make a clone.
    clone = pop.clone()
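What "sharing" means for a clone can be sketched without stk. Container below is a hypothetical stand-in for a nested population, written under the assumption that every level of nesting gets a new container while the held objects are reused.

```python
class Container:
    """Hypothetical stand-in for a nested population container."""

    def __init__(self, *args):
        self.direct_members = [a for a in args if not isinstance(a, Container)]
        self.subpopulations = [a for a in args if isinstance(a, Container)]

    def clone(self):
        # New containers are created at every level, but the member
        # objects themselves are reused rather than copied.
        return Container(
            *self.direct_members,
            *(sub.clone() for sub in self.subpopulations),
        )


mol = object()
pop = Container(mol, Container(object()))
duplicate = pop.clone()
print(duplicate is pop, duplicate.direct_members[0] is mol)
```

Changing the clone's member lists leaves the original untouched, while any change made to a shared member object is visible through both containers.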
-
close_process_pool
(self)¶ Close an open process pool.
- Returns
The population.
- Return type
-
dump(self, path, include_attrs=None, ignore_missing_attrs=False)
Dump the Population to a file.
- Parameters
path (str) – The full path of the file to which the Population should be dumped.
include_attrs (list of str, optional) – The names of attributes of the molecules to be added to the JSON. Each attribute is saved as a string using repr().
ignore_missing_attrs (bool, optional) – If False and an attribute in include_attrs is not held by a Molecule, an error will be raised.
- Returns
None
- Return type
NoneType
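The interplay of include_attrs and ignore_missing_attrs can be illustrated with a stand-alone sketch. Mol and mol_to_dict are hypothetical names, the record layout is invented for the example, and silently skipping a missing attribute when ignore_missing_attrs is True is an assumption of the sketch.

```python
import json


class Mol:
    """Hypothetical molecule stand-in carrying arbitrary attributes."""

    def __init__(self, name, **attrs):
        self.name = name
        self.__dict__.update(attrs)


def mol_to_dict(mol, include_attrs=None, ignore_missing_attrs=False):
    record = {'name': mol.name}
    for attr in include_attrs or []:
        if not hasattr(mol, attr):
            if ignore_missing_attrs:
                continue
            raise AttributeError(f'{mol.name} has no attribute {attr!r}')
        # Each extra attribute is stored as its repr() string.
        record[attr] = repr(getattr(mol, attr))
    return record


mols = [Mol('a', fitness=1.5), Mol('b')]
dumped = json.dumps(
    [mol_to_dict(m, ['fitness'], ignore_missing_attrs=True) for m in mols]
)
print(dumped)
```

Because values are stored through repr(), the float 1.5 ends up in the JSON as the string "1.5" and has to be converted back by whoever reads the file.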
-
get_fitness_values(self)
Return the fitness values of molecules.
- Returns
Maps a Molecule to its fitness value.
- Return type
dict
-
classmethod
init_all(building_blocks, topology_graphs, num_processes=None, duplicates=False, use_cache=False)
Make all possible molecules from groups of building blocks.
- Parameters
building_blocks (list of Molecule) – A list holding nested building blocks, for example
    bbs1 = [
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        ...
    ]
    bbs2 = [
        stk.ConstructedMolecule(...),
        stk.BuildingBlock(...),
        ...,
    ]
    bbs3 = [
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        ...
    ]

    building_blocks = [bbs1, bbs2, bbs3]
To construct a new ConstructedMolecule, a Molecule is picked from each of the sublists in building_blocks. The picked Molecule instances are then supplied to ConstructedMolecule
    # mol is a new ConstructedMolecule. bb1 is selected
    # from bbs1, bb2 is selected from bbs2 and bb3 is
    # selected from bbs3.
    mol = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3],
        topology_graph=topology_pick
    )
The order in which a Molecule instance is given to the ConstructedMolecule is determined by the sublist of building_blocks it was picked from. Note that the number of sublists in building_blocks is not fixed. It merely has to be compatible with the topology_graphs.
topology_graphs (list of TopologyGraph) – The topology graphs of the ConstructedMolecule instances being made.
num_processes (int, optional) – The number of parallel processes to create when constructing the molecules. If None, creates a process for each core on the computer.
duplicates (bool, optional) – If False, duplicate structures are removed from the population.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
A Population holding ConstructedMolecule instances.
- Return type
Examples
Construct all possible cage molecules from some precursors
    import stk

    amines = [
        stk.BuildingBlock('NCCCN', ['amine']),
        stk.BuildingBlock('NCCCCCN', ['amine']),
        stk.BuildingBlock('NCCOCCN', ['amine']),
    ]
    aldehydes = [
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=C(C=O)COCC=O', ['aldehyde']),
    ]
    # A total of 9 cages will be created.
    cages = stk.Population.init_all(
        building_blocks=[amines, aldehydes],
        topology_graphs=[stk.cage.FourPlusSix()]
    )
Use the constructed cages and a new bunch of building blocks to create all possible cage complexes.
    encapsulants = [
        stk.BuildingBlock('[Br][Br]'),
        stk.BuildingBlock('[F][F]'),
    ]
    # Every combination of cage and encapsulant.
    complexes = stk.Population.init_all(
        building_blocks=[cages, encapsulants],
        topology_graphs=[stk.host_guest_complex.Complex()]
    )
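The phrase "all possible molecules" amounts to a Cartesian product over the sublists and the topology graphs. The sketch below shows that enumeration with itertools.product. Plain strings stand in for Molecule and TopologyGraph instances, and the order in which stk itself walks the combinations is not something this sketch claims to reproduce.

```python
from itertools import product

# Plain strings stand in for Molecule and TopologyGraph instances.
amines = ['amine1', 'amine2', 'amine3']
aldehydes = ['aldehyde1', 'aldehyde2', 'aldehyde3']
building_blocks = [amines, aldehydes]
topology_graphs = ['FourPlusSix']

# One pick per sublist, combined with every topology graph.
combinations = [
    {'building_blocks': bbs, 'topology_graph': topology}
    for *bbs, topology in product(*building_blocks, topology_graphs)
]
print(len(combinations))
```

Three amines, three aldehydes and one topology graph give 3 x 3 x 1 = 9 combinations, which matches the count quoted in the example above.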
-
classmethod
init_diverse(building_blocks, topology_graphs, size, random_seed=None, use_cache=False)
Construct a chemically diverse Population.
All constructed molecules are held in direct_members.
In order to construct a ConstructedMolecule, a random Molecule is selected from each sublist in building_blocks. Once the first construction is complete, the next Molecule selected from each sublist is the one with the most different Morgan fingerprint to the prior one. The third construction uses randomly selected Molecule objects again and so on. This is done until size ConstructedMolecule instances have been constructed.
- Parameters
building_blocks (list of Molecule) – A list holding nested building blocks, for example
    bbs1 = [
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        ...
    ]
    bbs2 = [
        stk.ConstructedMolecule(...),
        stk.BuildingBlock(...),
        ...
    ]
    bbs3 = [
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        ...
    ]

    building_blocks = [bbs1, bbs2, bbs3]
To construct a new ConstructedMolecule, a Molecule is picked from each of the sublists in building_blocks. The picked Molecule instances are then supplied to the ConstructedMolecule
    # mol is a new ConstructedMolecule. bb1 is selected
    # from bbs1, bb2 is selected from bbs2 and bb3 is
    # selected from bbs3.
    mol = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3],
        topology_graph=topology_pick
    )
The order in which a Molecule instance is given to ConstructedMolecule is determined by the sublist of building_blocks it was picked from. Note that the number of sublists in building_blocks is not fixed. It merely has to be compatible with the topology_graphs.
topology_graphs (iterable of TopologyGraph) – An iterable holding topology graphs which should be randomly selected for the construction of a ConstructedMolecule.
size (int) – The desired size of the Population.
random_seed (int, optional) – Seed for the random number generator to get replicable results.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
A population filled with the constructed molecules.
- Return type
Examples
Construct a diverse Population of cage molecules from some precursors
    import stk

    amines = [
        stk.BuildingBlock('NCCCN', ['amine']),
        stk.BuildingBlock('NCCCCCN', ['amine']),
        stk.BuildingBlock('NCCOCCN', ['amine']),
    ]
    aldehydes = [
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=C(C=O)COCC=O', ['aldehyde']),
    ]
    # A total of 4 cages will be created.
    cages = stk.Population.init_diverse(
        building_blocks=[amines, aldehydes],
        topology_graphs=[stk.cage.FourPlusSix()],
        size=4
    )
Use the constructed cages and a new bunch of building blocks to create some diverse cage complexes.
    encapsulants = [
        stk.BuildingBlock('[Br][Br]'),
        stk.BuildingBlock('[F][F]'),
    ]
    # 4 combinations of cage and encapsulant.
    complexes = stk.Population.init_diverse(
        building_blocks=[cages, encapsulants],
        topology_graphs=[stk.host_guest_complex.Complex()],
        size=4
    )
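The alternating selection scheme described for init_diverse (a random pick, then the most different pick, then random again) can be sketched as follows. This is an illustration only: fingerprints are modelled as frozensets of on-bits rather than real Morgan fingerprints, Tanimoto similarity is an assumed measure of "most different", and diverse_picks and tanimoto are hypothetical helper names. No attempt is made to avoid repeated combinations.

```python
import random


def tanimoto(fp1, fp2):
    """Similarity of two fingerprints represented as sets of on-bits."""
    union = fp1 | fp2
    return len(fp1 & fp2) / len(union) if union else 1.0


def diverse_picks(sublists, size, random_seed=None):
    """Return `size` tuples, each holding one fingerprint per sublist."""
    generator = random.Random(random_seed)
    picks = []
    previous = None
    while len(picks) < size:
        if previous is None:
            # Odd-numbered constructions: a random member per sublist.
            current = tuple(generator.choice(sub) for sub in sublists)
            previous = current
        else:
            # Even-numbered constructions: the member least similar
            # to the one picked from the same sublist last time.
            current = tuple(
                min(sub, key=lambda fp: tanimoto(fp, prev))
                for sub, prev in zip(sublists, previous)
            )
            previous = None
        picks.append(current)
    return picks


amines = [frozenset({1, 2, 3}), frozenset({1, 2, 4}), frozenset({7, 8, 9})]
aldehydes = [frozenset({1, 5}), frozenset({5, 6}), frozenset({10, 11})]
picks = diverse_picks([amines, aldehydes], size=4, random_seed=4)
print(len(picks))
```

Passing the same random_seed reproduces the same sequence of picks, which is the purpose of the random_seed parameter documented above.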
-
classmethod
init_from_list(pop_list, use_cache=False)
Initialize a population from a list representation.
- Parameters
pop_list (list) – A list which represents a Population, like the ones created by to_list(). For example, in
    pop_list = [{...}, [{...}], [{...}, {...}], {...}]
pop_list represents the Population, sublists represent its subpopulations and each dict {...} represents a member.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
The population represented by pop_list.
- Return type
-
classmethod
init_random(building_blocks, topology_graphs, size, random_seed=None, use_cache=False)
Construct molecules for a random Population.
All molecules are held in direct_members.
From the supplied building blocks a random Molecule is selected from each sublist to form a ConstructedMolecule. This is done until size ConstructedMolecule objects have been constructed.
- Parameters
building_blocks (list of Molecule) – A list holding nested building blocks, for example
    bbs1 = [
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        ...
    ]
    bbs2 = [
        stk.ConstructedMolecule(...),
        stk.BuildingBlock(...),
        ...
    ]
    bbs3 = [
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        ...
    ]

    building_blocks = [bbs1, bbs2, bbs3]
To construct a new ConstructedMolecule, a Molecule is picked from each of the sublists in building_blocks. The picked Molecule instances are then supplied to ConstructedMolecule
    # mol is a new ConstructedMolecule. bb1 is selected
    # from bbs1, bb2 is selected from bbs2 and bb3 is
    # selected from bbs3.
    mol = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3],
        topology_graph=topology_pick
    )
The order in which a Molecule instance is given to the ConstructedMolecule is determined by the sublist of building_blocks it was picked from. Note that the number of sublists in building_blocks is not fixed. It merely has to be compatible with the topology_graphs.
topology_graphs (iterable of TopologyGraph) – An iterable holding topology graphs which should be randomly selected during initialization of ConstructedMolecule.
size (int) – The size of the population to be initialized.
random_seed (int, optional) – Seed for the random number generator to get replicable results.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
A population filled with random ConstructedMolecule instances.
- Return type
Examples
Construct 5 random cage molecules from some precursors
    import stk

    amines = [
        stk.BuildingBlock('NCCCN', ['amine']),
        stk.BuildingBlock('NCCCCCN', ['amine']),
        stk.BuildingBlock('NCCOCCN', ['amine']),
    ]
    aldehydes = [
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=C(C=O)COCC=O', ['aldehyde']),
    ]
    # A total of 5 cages will be created.
    cages = stk.Population.init_random(
        building_blocks=[amines, aldehydes],
        topology_graphs=[stk.cage.FourPlusSix()],
        size=5
    )
Use the constructed cages and a new bunch of building blocks to create some random cage complexes.
    encapsulants = [
        stk.BuildingBlock('[Br][Br]'),
        stk.BuildingBlock('[F][F]'),
    ]
    # Random combinations of cage and encapsulant.
    complexes = stk.Population.init_random(
        building_blocks=[cages, encapsulants],
        topology_graphs=[stk.host_guest_complex.Complex()],
        size=5
    )
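How random_seed makes the selection replicable can be shown with a stand-alone sketch. random_picks is a hypothetical helper, plain strings stand in for Molecule and TopologyGraph instances, and the order of the random draws is an arbitrary choice of this sketch.

```python
import random


def random_picks(sublists, topology_graphs, size, random_seed=None):
    """Pick one member per sublist and one topology graph, `size` times."""
    generator = random.Random(random_seed)
    picks = []
    for _ in range(size):
        bbs = [generator.choice(sub) for sub in sublists]
        topology = generator.choice(topology_graphs)
        picks.append((bbs, topology))
    return picks


amines = ['amine1', 'amine2', 'amine3']
aldehydes = ['aldehyde1', 'aldehyde2', 'aldehyde3']
first = random_picks([amines, aldehydes], ['FourPlusSix'], 5, random_seed=12)
second = random_picks([amines, aldehydes], ['FourPlusSix'], 5, random_seed=12)
print(first == second)
```

Using a dedicated random.Random instance keeps the seed local to the call, so the global random state is left alone.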
-
classmethod
load(path, use_cache=False)
Initialize a Population from one dumped to a file.
- Parameters
path (str) – The full path of the file holding the dumped population.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
The population stored in the dump file.
- Return type
-
open_process_pool(self, num_processes=None)
Open a process pool.
- Parameters
num_processes (int, optional) – The number of processes in the pool. If None, then creates a process for each core on the computer.
- Returns
The population.
- Return type
- Raises
RuntimeError – If a process pool is already open.
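One way a container can hold a pool open, hand itself back for use in a "with" statement and refuse a second pool is sketched below. PooledContainer and its map method are hypothetical, and ThreadPool is used in place of worker processes purely to keep the sketch lightweight.

```python
from multiprocessing.pool import ThreadPool


class PooledContainer:
    """Hypothetical container that can hold one worker pool open."""

    def __init__(self, *members):
        self.members = list(members)
        self._pool = None

    def open_process_pool(self, num_processes=None):
        if self._pool is not None:
            raise RuntimeError('A process pool is already open.')
        # ThreadPool keeps this sketch lightweight; a real
        # implementation would use worker processes instead.
        self._pool = ThreadPool(num_processes)
        return self

    def close_process_pool(self):
        self._pool.close()
        self._pool.join()
        self._pool = None
        return self

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close_process_pool()

    def map(self, function):
        # Reuse the open pool if there is one, otherwise run serially.
        if self._pool is not None:
            return self._pool.map(function, self.members)
        return [function(member) for member in self.members]


population = PooledContainer(1, 2, 3)
with population.open_process_pool(2):
    squares = population.map(lambda x: x * x)
    cubes = population.map(lambda x: x ** 3)
print(squares, cubes)
```

Returning the container from open_process_pool is what allows the call to appear directly after "with"; the pool is closed when the block exits, even if an exception is raised inside it.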
-
optimize(self, optimizer, num_processes=None)
Optimize the structures of molecules in the population.
The molecules are optimized serially if num_processes is 1 and in parallel otherwise. The serial version may be faster in cases where all molecules have already been optimized and the optimizer will skip them. In this case creating a parallel process pool creates unnecessary overhead.
- Parameters
optimizer (Optimizer) – The optimizer used to carry out the optimizations.
num_processes (int, optional) – The number of parallel processes to create. Optimization will run serially if 1. If None, creates a process for each core on the computer. This parameter will be ignored if the population has an open process pool.
- Returns
None
- Return type
NoneType
-
remove_duplicates
(self, across_subpopulations=True, key=<built-in function id>)¶ Remove duplicates from the population.
The question of which molecule is preserved when duplicates are removed is difficult to answer. The iteration through a population is depth-first, so a rule such as “the molecule in the topmost population is preserved” is not the case here. Rather, the first molecule found is preserved.
However, this question is only relevant if duplicates in different subpopulations are being removed. In this case it is assumed that it is more important to have a single instance than to worry about which subpopulation it is in.
If the duplicates are being removed from within subpopulations, each subpopulation will end up with a single instance of all molecules held before. There is no “choice”.
- Parameters
across_subpopulations (bool, optional) – When False, duplicates are only removed from within a given subpopulation. If True, all duplicates are removed, regardless of which subpopulation they are in.
key (callable, optional) – Two molecules are considered the same if the values returned by key(molecule) are the same.
- Returns
None
- Return type
NoneType
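The "first molecule found is preserved" rule can be sketched with a hypothetical Container class. The sketch assumes that iteration visits direct members before subpopulations, and that with across_subpopulations=False each container's direct members are checked on their own; neither detail is taken from the stk source.

```python
class Container:
    """Hypothetical nested container: direct members plus subcontainers."""

    def __init__(self, *args):
        self.direct_members = [a for a in args if not isinstance(a, Container)]
        self.subpopulations = [a for a in args if isinstance(a, Container)]

    def remove_duplicates(self, across_subpopulations=True, key=id, _seen=None):
        # A fresh set per container restricts the check to that container.
        share = across_subpopulations and _seen is not None
        seen = _seen if share else set()
        kept = []
        for member in self.direct_members:
            if key(member) not in seen:
                seen.add(key(member))
                kept.append(member)
        self.direct_members = kept
        for sub in self.subpopulations:
            sub.remove_duplicates(across_subpopulations, key, seen)


pop = Container('a', 'b', 'a', Container('b', 'c', 'c'))
pop.remove_duplicates(key=str)

local = Container('a', 'b', 'a', Container('b', 'c', 'c'))
local.remove_duplicates(across_subpopulations=False, key=str)
print(pop.subpopulations[0].direct_members)
print(local.subpopulations[0].direct_members)
```

In the first call 'b' survives only in the outer container, because it is encountered there first. In the second call the subpopulation keeps its own 'b'.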
-
remove_members(self, key)
Remove all members where key(member) is True.
- Parameters
key (callable) – A callable which takes 1 argument. Each member of the population is passed as the argument to key in turn. If the result is True then the member is removed from the population.
- Returns
None
- Return type
NoneType
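A minimal sketch of key-based removal follows. The population is modelled as a nested dict with hypothetical keys 'direct_members' and 'subpopulations', integers stand in for molecules, and applying the filter at every depth is an assumption of the sketch.

```python
def remove_members(container, key):
    """Drop every member for which key(member) is truthy, at all depths."""
    container['direct_members'] = [
        member for member in container['direct_members'] if not key(member)
    ]
    for sub in container['subpopulations']:
        remove_members(sub, key)


# Integers stand in for molecules, e.g. their atom counts.
pop = {
    'direct_members': [4, 12, 7],
    'subpopulations': [{'direct_members': [30, 2], 'subpopulations': []}],
}
remove_members(pop, key=lambda atom_count: atom_count > 10)
print(pop['direct_members'])
```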
-
set_fitness_values_from_calculators(self, fitness_calculator, fitness_normalizer=None, num_processes=None)
Set the fitness values of molecules.
- Parameters
fitness_calculator (FitnessCalculator) – Used to calculate the initial fitness values.
fitness_normalizer (FitnessNormalizer, optional) – Used to normalize the fitness values.
num_processes (int, optional) – The number of parallel processes to create. Calculations will run serially if 1. If None, creates a process for each core on the computer. This parameter will be ignored if the population has an open process pool.
- Returns
The population is returned.
- Return type
-
set_fitness_values_from_dict(self, fitness_values)
Set the fitness values of molecules.
- Parameters
fitness_values (dict) – Maps molecules in the population to their fitness values.
- Returns
The population is returned.
- Return type
-
set_mol_ids(self, n, overwrite=False)
Give each member of the population an id starting from n.
This method adds an id attribute to each Molecule instance held by the population.
- Parameters
n (int) – A number. Members of this Population are given a unique number as an id, starting from n and incremented by one between members.
overwrite (bool, optional) – If True, existing ids are replaced.
- Returns
The value of the last id assigned, plus 1.
- Return type
int
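Returning "the last id assigned, plus 1" makes it easy to number several batches without gaps. The sketch below uses a hypothetical Mol class and a free function. That a member with an existing id is skipped without using up a number when overwrite is False is an assumption of the sketch.

```python
class Mol:
    """Hypothetical molecule stand-in."""


def set_mol_ids(members, n, overwrite=False):
    for member in members:
        if overwrite or not hasattr(member, 'id'):
            member.id = n
            n += 1
    # One past the last id handed out, ready for the next batch.
    return n


first_batch = [Mol(), Mol(), Mol()]
next_id = set_mol_ids(first_batch, 0)

# The first member already has an id, so only the new one is numbered.
second_batch = [first_batch[0], Mol()]
final_id = set_mol_ids(second_batch, next_id)
print(next_id, final_id)
```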
-
to_list(self, include_attrs=None, ignore_missing_attrs=False)
Convert the population to a list representation.
- Parameters
include_attrs (list of str, optional) – The names of attributes to be added to the molecular representations. Each attribute is saved as a string using repr().
ignore_missing_attrs (bool, optional) – If False and an attribute in include_attrs is not held by a Molecule, an error will be raised.
- Returns
A list representation of the Population.
- Return type
list
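The list representation, in which a dict stands for a member and a nested list stands for a subpopulation, can be sketched as a round trip. The functions to_list and from_list below are hypothetical free functions, the population is modelled as a nested dict, and the single 'name' field is invented for the example.

```python
def to_list(container):
    """Members become dicts, subpopulations become nested lists."""
    members = [{'name': name} for name in container['direct_members']]
    subs = [to_list(sub) for sub in container['subpopulations']]
    return members + subs


def from_list(pop_list):
    """Rebuild the nested container from its list representation."""
    container = {'direct_members': [], 'subpopulations': []}
    for item in pop_list:
        if isinstance(item, dict):
            container['direct_members'].append(item['name'])
        else:
            container['subpopulations'].append(from_list(item))
    return container


pop = {
    'direct_members': ['a', 'd'],
    'subpopulations': [
        {'direct_members': ['b'], 'subpopulations': []},
        {'direct_members': ['c', 'e'], 'subpopulations': []},
    ],
}
pop_list = to_list(pop)
print(pop_list)
```

Because the representation consists only of lists, dicts and strings, it can be written to and read from JSON directly.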
-
write(self, path)
Write the .mol files of members to a directory.
- Parameters
path (str) – The full path of the directory into which the .mol files are written.
- Returns
None
- Return type
NoneType
-
-
class Population(*args)
Bases: object
A container for Molecule objects.
Population instances can be nested.
In addition to holding Molecule objects, the Population class can be used to create large numbers of these instances through the class methods beginning with “init”.
Molecule instances held by a Population can have their structures optimized in parallel through the optimize() method.
It supports all expected and necessary container operations such as iteration, indexing and membership checks (via the in operator).
-
direct_members
Held here are direct members of the Population. In other words, these are the molecules not held by any subpopulations. As a result, not all members of a Population are stored in this attribute.
- Type
list of Molecule
-
subpopulations
A list holding the subpopulations.
- Type
list of Population
Examples
A Population can be iterated through just like a list
    import stk

    # Create a population.
    pop = stk.Population(
        stk.BuildingBlock(...),
        stk.ConstructedMolecule(...),
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        stk.Population(
            stk.BuildingBlock(...),
            stk.ConstructedMolecule(...)
        ),
        stk.ConstructedMolecule(...),
        stk.BuildingBlock(...)
    )

    for member in pop:
        do_stuff(member)
When iterating through a Population you will also iterate through nested members, that is, members which are held by subpopulations. If you only wish to iterate through direct members, you can
    for member in pop.direct_members:
        do_stuff(member)
You can also get access to members by using indices. Indices have access to all members in the population
    first_member = pop[0]
    second_member = pop[1]
Indices will first access direct members of the population and then access members in the subpopulations. Indices access nested members depth-first
    pop2 = stk.Population(bb1, bb2, stk.Population(bb3, bb4))
    # Get bb1.
    pop2[0]
    # Get bb2.
    pop2[1]
    # Get bb3.
    pop2[2]
    # Get bb4.
    pop2[3]
You can get a subpopulation by taking a slice
    # new_pop is a new Population instance and has no nesting.
    new_pop = pop[2:4]
You can take the length of a population to get the total number of members
len(pop)
Adding populations creates a new population with both of the added populations as subpopulations
    # added has no direct members and two subpopulations, pop and
    # pop2.
    added = pop + pop2
Subtracting populations creates a new, flat population.
    # subbed has all objects in pop except those also found in
    # pop2.
    subbed = pop - pop2
You can check if an object is already present in the population.
    bb1 = stk.BuildingBlock(...)
    bb2 = stk.BuildingBlock(...)
    pop3 = stk.Population(bb1)

    # Returns True.
    bb1 in pop3
    # Returns False.
    bb2 in pop3
    # Returns True.
    bb2 not in pop3
If you want to run multiple optimize() calls in a row, use the “with” statement. This keeps a single process pool open, and means you do not create a new one for each optimize() call. It also automatically closes the pool for you when the block exits
    population = stk.Population(...)

    # Keep a process pool open through the "with" statement.
    with population.open_process_pool(8):
        # All optimize calls within this block will use the
        # same process pool.
        population.optimize(stk.UFF())
        population.add_members(...)
        population.optimize(stk.UFF())
    # Process pool is automatically cleaned up when the block
    # exits.
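The container behaviour described in these examples (iteration over nested members, indexing, slicing, length, addition and subtraction) can be reproduced in a few lines of plain Python. Container below is a hypothetical stand-in, not the stk implementation, and strings take the place of Molecule instances. Membership checks are not defined explicitly because Python falls back to iteration for the in operator.

```python
class Container:
    """Hypothetical nested container mirroring the examples above."""

    def __init__(self, *args):
        self.direct_members = [a for a in args if not isinstance(a, Container)]
        self.subpopulations = [a for a in args if isinstance(a, Container)]

    def __iter__(self):
        # Direct members first, then each subpopulation in turn.
        yield from self.direct_members
        for sub in self.subpopulations:
            yield from sub

    def __len__(self):
        return sum(1 for _ in self)

    def __getitem__(self, index):
        members = list(self)
        if isinstance(index, slice):
            # A slice gives a new container without any nesting.
            return Container(*members[index])
        return members[index]

    def __add__(self, other):
        # Both operands become subpopulations of the result.
        return Container(self, other)

    def __sub__(self, other):
        # A flat container of the members not found in other.
        return Container(*(m for m in self if m not in other))


pop = Container('bb1', 'bb2', Container('bb3', 'bb4'))
other = Container('bb2', 'bb4')
added = pop + other
subbed = pop - other
print(list(pop), len(pop), pop[2])
```

Building everything on top of a single __iter__ keeps indexing, length and membership consistent with the iteration order.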
Methods
add_members(self, molecules[, duplicate_key]) – Add Molecule instances to the Population.
add_subpopulation(self, population) – Add a clone of population to subpopulations.
clone(self) – Return a clone.
close_process_pool(self) – Close an open process pool.
dump(self, path[, include_attrs, …]) – Dump the Population to a file.
init_all(building_blocks, topology_graphs[, …]) – Make all possible molecules from groups of building blocks.
init_diverse(building_blocks, …[, …]) – Construct a chemically diverse Population.
init_from_list(pop_list[, use_cache]) – Initialize a population from a list representation.
init_random(building_blocks, …[, …]) – Construct molecules for a random Population.
load(path[, use_cache]) – Initialize a Population from one dumped to a file.
open_process_pool(self[, num_processes]) – Open a process pool.
optimize(self, optimizer[, num_processes]) – Optimize the structures of molecules in the population.
remove_duplicates(self[, …]) – Remove duplicates from the population.
remove_members(self, key) – Remove all members where key(member) is True.
set_mol_ids(self, n[, overwrite]) – Give each member of the population an id starting from n.
to_list(self[, include_attrs, …]) – Convert the population to a list representation.
write(self, path) – Write the .mol files of members to a directory.
-
__init__(self, *args)
Initialize a Population.
- Parameters
*args (Molecule, Population) – A population is initialized with the Molecule and Population instances it should hold.
Examples
    import stk

    bb1 = stk.BuildingBlock('CCC')
    bb2 = stk.BuildingBlock('NCCNCNC')
    bb3 = stk.BuildingBlock('[Br]CCC[Br]')
    pop1 = stk.Population(bb1, bb2, bb3)

    bb4 = stk.BuildingBlock('NNCCCN')
    # pop2 has pop1 as a subpopulation and bb4 as a direct
    # member.
    pop2 = stk.Population(pop1, bb4)
-
add_members(self, molecules, duplicate_key=None)
Add Molecule instances to the Population.
The added Molecule instances become direct members of the population; they are not placed into any subpopulations.
- Parameters
molecules (iterable of Molecule) – The molecules to be added as direct members.
duplicate_key (callable, optional) – If not None, duplicate_key(mol) is evaluated on each molecule in molecules. If a molecule with the same duplicate_key value is already present in the population, the molecule is not added.
- Returns
None
- Return type
NoneType
-
add_subpopulation(self, population)
Add a clone of population to subpopulations.
Only a clone of the population container is made. The molecules it holds are not copied.
- Parameters
population (Population) – The population to be added as a subpopulation.
- Returns
None
- Return type
NoneType
-
clone(self)
Return a clone.
The clone shares its Molecule objects with the original; copies of the Molecule objects are not made.
- Returns
The clone.
- Return type
Examples
    import stk

    # Make an initial population.
    pop = stk.Population(stk.BuildingBlock('NCCN'))

    # Make a clone.
    clone = pop.clone()
-
close_process_pool
(self)¶ Close an open process pool.
- Returns
The population.
- Return type
-
dump(self, path, include_attrs=None, ignore_missing_attrs=False)
Dump the Population to a file.
- Parameters
path (str) – The full path of the file to which the Population should be dumped.
include_attrs (list of str, optional) – The names of attributes of the molecules to be added to the JSON. Each attribute is saved as a string using repr().
ignore_missing_attrs (bool, optional) – If False and an attribute in include_attrs is not held by a Molecule, an error will be raised.
- Returns
None
- Return type
NoneType
-
classmethod
init_all(building_blocks, topology_graphs, num_processes=None, duplicates=False, use_cache=False)
Make all possible molecules from groups of building blocks.
- Parameters
building_blocks (list of Molecule) – A list holding nested building blocks, for example
    bbs1 = [
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        ...
    ]
    bbs2 = [
        stk.ConstructedMolecule(...),
        stk.BuildingBlock(...),
        ...,
    ]
    bbs3 = [
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        ...
    ]

    building_blocks = [bbs1, bbs2, bbs3]
To construct a new ConstructedMolecule, a Molecule is picked from each of the sublists in building_blocks. The picked Molecule instances are then supplied to ConstructedMolecule
    # mol is a new ConstructedMolecule. bb1 is selected
    # from bbs1, bb2 is selected from bbs2 and bb3 is
    # selected from bbs3.
    mol = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3],
        topology_graph=topology_pick
    )
The order in which a Molecule instance is given to the ConstructedMolecule is determined by the sublist of building_blocks it was picked from. Note that the number of sublists in building_blocks is not fixed. It merely has to be compatible with the topology_graphs.
topology_graphs (list of TopologyGraph) – The topology graphs of the ConstructedMolecule instances being made.
num_processes (int, optional) – The number of parallel processes to create when constructing the molecules. If None, creates a process for each core on the computer.
duplicates (bool, optional) – If False, duplicate structures are removed from the population.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
A Population holding ConstructedMolecule instances.
- Return type
Examples
Construct all possible cage molecules from some precursors
    import stk

    amines = [
        stk.BuildingBlock('NCCCN', ['amine']),
        stk.BuildingBlock('NCCCCCN', ['amine']),
        stk.BuildingBlock('NCCOCCN', ['amine']),
    ]
    aldehydes = [
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=C(C=O)COCC=O', ['aldehyde']),
    ]
    # A total of 9 cages will be created.
    cages = stk.Population.init_all(
        building_blocks=[amines, aldehydes],
        topology_graphs=[stk.cage.FourPlusSix()]
    )
Use the constructed cages and a new bunch of building blocks to create all possible cage complexes.
    encapsulants = [
        stk.BuildingBlock('[Br][Br]'),
        stk.BuildingBlock('[F][F]'),
    ]
    # Every combination of cage and encapsulant.
    complexes = stk.Population.init_all(
        building_blocks=[cages, encapsulants],
        topology_graphs=[stk.host_guest_complex.Complex()]
    )
-
classmethod
init_diverse(building_blocks, topology_graphs, size, random_seed=None, use_cache=False)
Construct a chemically diverse Population.
All constructed molecules are held in direct_members.
In order to construct a ConstructedMolecule, a random Molecule is selected from each sublist in building_blocks. Once the first construction is complete, the next Molecule selected from each sublist is the one with the most different Morgan fingerprint to the prior one. The third construction uses randomly selected Molecule objects again and so on. This is done until size ConstructedMolecule instances have been constructed.
- Parameters
building_blocks (list of Molecule) – A list holding nested building blocks, for example
    bbs1 = [
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        ...
    ]
    bbs2 = [
        stk.ConstructedMolecule(...),
        stk.BuildingBlock(...),
        ...
    ]
    bbs3 = [
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        ...
    ]

    building_blocks = [bbs1, bbs2, bbs3]
To construct a new ConstructedMolecule, a Molecule is picked from each of the sublists in building_blocks. The picked Molecule instances are then supplied to the ConstructedMolecule
    # mol is a new ConstructedMolecule. bb1 is selected
    # from bbs1, bb2 is selected from bbs2 and bb3 is
    # selected from bbs3.
    mol = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3],
        topology_graph=topology_pick
    )
The order in which a Molecule instance is given to ConstructedMolecule is determined by the sublist of building_blocks it was picked from. Note that the number of sublists in building_blocks is not fixed. It merely has to be compatible with the topology_graphs.
topology_graphs (iterable of TopologyGraph) – An iterable holding topology graphs which should be randomly selected for the construction of a ConstructedMolecule.
size (int) – The desired size of the Population.
random_seed (int, optional) – Seed for the random number generator to get replicable results.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
A population filled with the constructed molecules.
- Return type
Examples
Construct a diverse Population of cage molecules from some precursors
    import stk

    amines = [
        stk.BuildingBlock('NCCCN', ['amine']),
        stk.BuildingBlock('NCCCCCN', ['amine']),
        stk.BuildingBlock('NCCOCCN', ['amine']),
    ]
    aldehydes = [
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=C(C=O)COCC=O', ['aldehyde']),
    ]
    # A total of 4 cages will be created.
    cages = stk.Population.init_diverse(
        building_blocks=[amines, aldehydes],
        topology_graphs=[stk.cage.FourPlusSix()],
        size=4
    )
Use the constructed cages and a new bunch of building blocks to create some diverse cage complexes.
    encapsulants = [
        stk.BuildingBlock('[Br][Br]'),
        stk.BuildingBlock('[F][F]'),
    ]
    # 4 combinations of cage and encapsulant.
    complexes = stk.Population.init_diverse(
        building_blocks=[cages, encapsulants],
        topology_graphs=[stk.host_guest_complex.Complex()],
        size=4
    )
-
classmethod
init_from_list(pop_list, use_cache=False)
Initialize a population from a list representation.
- Parameters
pop_list (list) – A list which represents a Population, like the ones created by to_list(). For example, in
    pop_list = [{...}, [{...}], [{...}, {...}], {...}]
pop_list represents the Population, sublists represent its subpopulations and each dict {...} represents a member.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
The population represented by pop_list.
- Return type
-
classmethod
init_random(building_blocks, topology_graphs, size, random_seed=None, use_cache=False)
Construct molecules for a random Population.
All molecules are held in direct_members.
From the supplied building blocks a random Molecule is selected from each sublist to form a ConstructedMolecule. This is done until size ConstructedMolecule objects have been constructed.
- Parameters
building_blocks (list of Molecule) – A list holding nested building blocks, for example
    bbs1 = [
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        ...
    ]
    bbs2 = [
        stk.ConstructedMolecule(...),
        stk.BuildingBlock(...),
        ...
    ]
    bbs3 = [
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        ...
    ]

    building_blocks = [bbs1, bbs2, bbs3]
To construct a new ConstructedMolecule, a Molecule is picked from each of the sublists in building_blocks. The picked Molecule instances are then supplied to ConstructedMolecule
    # mol is a new ConstructedMolecule. bb1 is selected
    # from bbs1, bb2 is selected from bbs2 and bb3 is
    # selected from bbs3.
    mol = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3],
        topology_graph=topology_pick
    )
The order in which a Molecule instance is given to the ConstructedMolecule is determined by the sublist of building_blocks it was picked from. Note that the number of sublists in building_blocks is not fixed. It merely has to be compatible with the topology_graphs.
topology_graphs (iterable of TopologyGraph) – An iterable holding topology graphs which should be randomly selected during initialization of ConstructedMolecule.
size (int) – The size of the population to be initialized.
random_seed (int, optional) – Seed for the random number generator to get replicable results.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
A population filled with random ConstructedMolecule instances.
- Return type
Population
Examples
Construct 5 random cage molecules from some precursors
import stk

amines = [
    stk.BuildingBlock('NCCCN', ['amine']),
    stk.BuildingBlock('NCCCCCN', ['amine']),
    stk.BuildingBlock('NCCOCCN', ['amine']),
]
aldehydes = [
    stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
    stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
    stk.BuildingBlock('O=C(C=O)COCC=O', ['aldehyde']),
]
# A total of 5 cages will be created.
cages = stk.Population.init_random(
    building_blocks=[amines, aldehydes],
    topology_graphs=[stk.cage.FourPlusSix()],
    size=5
)
Use the constructed cages and a new bunch of building blocks to create some random cage complexes.
encapsulants = [
    stk.BuildingBlock('[Br][Br]'),
    stk.BuildingBlock('[F][F]'),
]
# Random combinations of cage and encapsulant.
complexes = stk.Population.init_random(
    building_blocks=[cages, encapsulants],
    topology_graphs=[stk.host_guest_complex.Complex()],
    size=5
)
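The selection step described for init_random (one pick per sublist, plus a random topology graph, repeated size times) can be sketched in plain Python. This is an illustration under assumptions, not stk's code: pick_random_combinations is a hypothetical helper, strings stand in for molecules and topology graphs, and the sketch may pick the same combination more than once, which may or may not match how stk handles repeats.

```python
import random


def pick_random_combinations(building_blocks, topology_graphs, size,
                             random_seed=None):
    """Return `size` (picked_blocks, topology) pairs chosen at random."""
    generator = random.Random(random_seed)
    # Materialize the iterable so it can be sampled repeatedly.
    topology_graphs = list(topology_graphs)
    picks = []
    for _ in range(size):
        # One pick per sublist, kept in sublist order.
        blocks = [generator.choice(sublist) for sublist in building_blocks]
        topology = generator.choice(topology_graphs)
        picks.append((blocks, topology))
    return picks


# Placeholder strings, not stk objects.
amines = ['NCCCN', 'NCCCCCN', 'NCCOCCN']
aldehydes = ['O=CCC(C=O)CC=O', 'O=C(C=O)COCC=O']
picks = pick_random_combinations(
    building_blocks=[amines, aldehydes],
    topology_graphs=['FourPlusSix'],
    size=5,
    random_seed=4,
)
```

Passing the same random_seed reproduces the same picks, which is the purpose of the random_seed parameter described above.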
-
classmethod
load
(path, use_cache=False)¶ Initialize a Population from one dumped to a file.
- Parameters
path (str) – The full path of the file holding the dumped population.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
The population stored in the dump file.
- Return type
Population
-
open_process_pool
(self, num_processes=None)¶ Open a process pool.
- Parameters
num_processes (int, optional) – The number of processes in the pool. If None, a process is created for each core on the computer.
- Returns
The population.
- Return type
Population
- Raises
RuntimeError – If a process pool is already open.
-
optimize
(self, optimizer, num_processes=None)¶ Optimize the structures of molecules in the population.
The molecules are optimized serially if num_processes is 1 and in parallel otherwise. The serial version may be faster when all molecules have already been optimized and the optimizer will skip them, because creating a process pool would then only add overhead.
- Parameters
optimizer (Optimizer) – The optimizer used to carry out the optimizations.
num_processes (int, optional) – The number of parallel processes to create. Optimization runs serially if 1. If None, a process is created for each core on the computer. This parameter is ignored if the population has an open process pool.
- Returns
None
- Return type
NoneType
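The dispatch rule for num_processes can be sketched as follows. This is a hedged illustration, not the library's implementation: run_over_members and its open_pool argument are hypothetical, the standard library's multiprocessing.Pool is used as a stand-in for whatever pool stk uses, and results are returned here whereas optimize() itself returns None.

```python
import multiprocessing


def run_over_members(func, members, num_processes=None, open_pool=None):
    """Apply func to every member, serially or in a process pool."""
    if open_pool is not None:
        # An already open pool takes precedence over num_processes.
        return open_pool.map(func, members)
    if num_processes == 1:
        # Serial path: no pool start-up cost.
        return [func(member) for member in members]
    # Pool(None) starts one worker per core reported by os.cpu_count().
    with multiprocessing.Pool(num_processes) as pool:
        return pool.map(func, members)


# Serial call; abs is a placeholder for an optimization function.
results = run_over_members(abs, [-1, 2, -3], num_processes=1)
```

When func does almost no work per member, as with molecules the optimizer skips, the serial path avoids paying for worker start-up and for pickling members between processes.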
-
remove_duplicates
(self, across_subpopulations=True, key=<built-in function id>)¶ Remove duplicates from the population.
The question of which molecule is preserved when duplicates are removed is difficult to answer. The iteration through a population is depth-first, so a rule such as “the molecule in the topmost population is preserved” is not the case here. Rather, the first molecule found is preserved.
However, this question is only relevant if duplicates in different subpopulations are being removed. In this case it is assumed that it is more important to have a single instance than to worry about which subpopulation it is in.
If the duplicates are being removed from within subpopulations, each subpopulation will end up with a single instance of all molecules held before. There is no “choice”.
- Parameters
across_subpopulations (bool, optional) – When False, duplicates are only removed from within a given subpopulation. If True, all duplicates are removed, regardless of which subpopulation they are in.
key (callable, optional) – Two molecules are considered the same if the values returned by key(molecule) are the same.
- Returns
None
- Return type
NoneType
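The "first molecule found is preserved" rule and the effect of across_subpopulations can be shown on a toy nested structure. The sketch below is written under assumptions and is not stk's code: SketchPopulation is a hypothetical stand-in, strings play the role of molecules, and the traversal visits a population's direct members before descending into each subpopulation in turn.

```python
from dataclasses import dataclass, field


@dataclass
class SketchPopulation:
    # Hypothetical stand-in for a population; not an stk class.
    direct_members: list = field(default_factory=list)
    subpopulations: list = field(default_factory=list)


def remove_duplicates(pop, across_subpopulations=True, key=id, _seen=None):
    """Drop members whose key was already seen, modifying pop in place."""
    if _seen is None or not across_subpopulations:
        # A fresh set limits the comparison to this population only.
        _seen = set()
    kept = []
    for member in pop.direct_members:
        member_key = key(member)
        if member_key not in _seen:
            _seen.add(member_key)
            kept.append(member)
    pop.direct_members = kept
    for sub in pop.subpopulations:
        # With across_subpopulations=True the same set is shared downward.
        remove_duplicates(sub, across_subpopulations, key, _seen)


def make_population():
    return SketchPopulation(
        direct_members=['A', 'B', 'A'],
        subpopulations=[SketchPopulation(direct_members=['B', 'C', 'C'])],
    )


everywhere = make_population()
remove_duplicates(everywhere, across_subpopulations=True, key=str)

within_only = make_population()
remove_duplicates(within_only, across_subpopulations=False, key=str)
```

In the first call the 'B' in the subpopulation is dropped because it was already seen at the top level. In the second call each population is deduplicated on its own, so that 'B' survives.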
-
remove_members
(self, key)¶ Remove all members where key(member) is True.
- Parameters
key (callable) – A callable which takes 1 argument. Each member of the population is passed as the argument to key in turn. If the result is True, the member is removed from the population.
- Returns
None
- Return type
NoneType
-
set_mol_ids
(self, n, overwrite=False)¶ Give each member of the population an id starting from n.
This method adds an id attribute to each Molecule instance held by the population.
- Parameters
n (int) – A number. Members of this Population are given a unique number as an id, starting from n and incremented by one between members.
overwrite (bool, optional) – If True, existing ids are replaced.
- Returns
The value of the last id assigned, plus 1.
- Return type
int
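The numbering scheme and the return value can be sketched on plain objects. This is a minimal illustration, not the stk implementation: set_ids is a hypothetical function, and it assumes that a member which keeps its existing id (overwrite=False) does not consume a number.

```python
from types import SimpleNamespace


def set_ids(members, n, overwrite=False):
    """Assign consecutive ids starting from n; return the next free id."""
    for member in members:
        # Assumption: members that already have an id are skipped
        # unless overwrite is True.
        if overwrite or not hasattr(member, 'id'):
            member.id = n
            n += 1
    return n


# Placeholder objects standing in for molecules.
members = [SimpleNamespace(), SimpleNamespace(id=99), SimpleNamespace()]
next_id = set_ids(members, 10)
```

Because the return value is one past the last id assigned, it can be passed as n to a later call so that ids stay unique across several populations.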
-
to_list
(self, include_attrs=None, ignore_missing_attrs=False)¶ Convert the population to a list representation.
- Parameters
include_attrs (list of str, optional) – The names of attributes to be added to the molecular representations. Each attribute is saved as a string using repr().
ignore_missing_attrs (bool, optional) – If False and an attribute in include_attrs is not held by a Molecule, an error will be raised.
- Returns
A list representation of the Population.
- Return type
list
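The handling of include_attrs and ignore_missing_attrs can be sketched as below. This is an illustration under assumptions, not stk's serialization: members_to_dicts is a hypothetical helper, nesting of subpopulations is left out, the real molecular dict contains more than the listed attributes, and AttributeError is this sketch's choice since the type of error raised is not stated above.

```python
from types import SimpleNamespace


def members_to_dicts(members, include_attrs=None,
                     ignore_missing_attrs=False):
    """Return one dict per member holding repr() of the chosen attributes."""
    dicts = []
    for member in members:
        member_dict = {}
        for name in include_attrs or ():
            if hasattr(member, name):
                # Attributes are stored as strings produced by repr().
                member_dict[name] = repr(getattr(member, name))
            elif not ignore_missing_attrs:
                raise AttributeError(f'member has no attribute {name!r}')
        dicts.append(member_dict)
    return dicts


# Placeholder objects standing in for molecules.
with_fitness = SimpleNamespace(fitness=1.5)
without_fitness = SimpleNamespace()
dumped = members_to_dicts(
    [with_fitness, without_fitness],
    include_attrs=['fitness'],
    ignore_missing_attrs=True,
)
```

Since values are stored through repr(), an attribute survives a round trip only if its repr() is something the loading side can turn back into the original value.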
-
write
(self, path)¶ Write the .mol files of members to a directory.
- Parameters
path (str) – The full path of the directory into which the .mol files are written.
- Returns
None
- Return type
NoneType
-