Populations¶
Population objects are containers specialized to perform
operations on groups of stk molecules, often in parallel.
-
class EAPopulation(*args)¶
Bases: stk.populations.Population
A population which also stores fitness values of molecules.
-
direct_members¶
Held here are direct members of the Population. In other words, these are the molecules not held by any subpopulations. As a result, not all members of a Population are stored in this attribute.
- Type
list of Molecule
-
subpopulations¶
A list holding the subpopulations.
- Type
list of Population
Methods
add_members(self, molecules[, duplicate_key]) Add Molecule instances to the Population.
add_subpopulation(self, population) Add a clone of population to subpopulations.
clone(self) Return a clone.
close_process_pool(self) Close an open process pool.
dump(self, path[, include_attrs, …]) Dump the Population to a file.
get_fitness_values(self) Return the fitness values of molecules.
init_all(building_blocks, topology_graphs[, …]) Make all possible molecules from groups of building blocks.
init_diverse(building_blocks, …[, …]) Construct a chemically diverse Population.
init_from_list(pop_list[, use_cache]) Initialize a population from a list representation.
init_random(building_blocks, …[, …]) Construct molecules for a random Population.
load(path[, use_cache]) Initialize a Population from one dumped to a file.
open_process_pool(self[, num_processes]) Open a process pool.
optimize(self, optimizer[, num_processes]) Optimize the structures of molecules in the population.
remove_duplicates(self[, …]) Remove duplicates from the population.
remove_members(self, key) Remove all members where key(member) is True.
set_fitness_values_from_calculators(self, …) Set the fitness values of molecules.
set_fitness_values_from_dict(self, …) Set the fitness values of molecules.
set_mol_ids(self, n[, overwrite]) Give each member of the population an id starting from n.
to_list(self[, include_attrs, …]) Convert the population to a list representation.
write(self, path) Write the .mol files of members to a directory.
-
__init__(self, *args)¶
Initialize a Population.
- Parameters
*args (Molecule, Population) – A population is initialized with the Molecule and Population instances it should hold.
Examples
    import stk

    bb1 = stk.BuildingBlock('CCC')
    bb2 = stk.BuildingBlock('NCCNCNC')
    bb3 = stk.BuildingBlock('[Br]CCC[Br]')
    pop1 = stk.Population(bb1, bb2, bb3)

    bb4 = stk.BuildingBlock('NNCCCN')
    # pop2 has pop1 as a subpopulation and bb4 as a direct
    # member.
    pop2 = stk.Population(pop1, bb4)
-
add_members(self, molecules, duplicate_key=None)¶
Add Molecule instances to the Population.
The added Molecule instances become direct members of the population; they are not placed into any subpopulations.
- Parameters
molecules (iterable of Molecule) – The molecules to be added as direct members.
duplicate_key (callable, optional) – If not None, duplicate_key(mol) is evaluated on each molecule in molecules. If a molecule with the same duplicate_key value is already present in the population, the molecule is not added.
- Returns
None
- Return type
NoneType
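To illustrate how duplicate_key filtering can behave, here is a minimal sketch that uses plain strings in place of Molecule objects. The helper add_unique is hypothetical and is not stk's implementation; in particular, skipping repeats within the incoming batch is an assumption.

```python
def add_unique(members, molecules, duplicate_key=None):
    """Append molecules to members, skipping duplicate-key collisions."""
    if duplicate_key is None:
        # No key supplied: everything is added.
        members.extend(molecules)
        return
    # Keys of the molecules that are already held.
    seen = {duplicate_key(member) for member in members}
    for molecule in molecules:
        key = duplicate_key(molecule)
        if key not in seen:
            seen.add(key)
            members.append(molecule)


# Strings stand in for Molecule objects (assumption for illustration).
members = ["CCC", "NCCN"]
add_unique(members, ["NCCN", "CCO", "CCO"], duplicate_key=str.upper)
print(members)
```

A key such as a canonical identifier would treat two separately built but chemically identical molecules as the same member.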
-
add_subpopulation(self, population)¶
Add a clone of population to subpopulations.
Only a clone of the population container is made. The molecules it holds are not copied.
- Parameters
population (Population) – The population to be added as a subpopulation.
- Returns
None
- Return type
NoneType
-
clone(self)¶
Return a clone.
The clone shares its Molecule objects with the original; copies of the Molecule objects are not made.
- Returns
The clone.
- Return type
Examples
    import stk

    # Make an initial population.
    pop = stk.Population(stk.BuildingBlock('NCCN'))
    # Make a clone.
    clone = pop.clone()
-
close_process_pool(self)¶ Close an open process pool.
- Returns
The population.
- Return type
-
dump(self, path, include_attrs=None, ignore_missing_attrs=False)¶
Dump the Population to a file.
- Parameters
path (str) – The full path of the file to which the Population should be dumped.
include_attrs (list of str, optional) – The names of attributes of the molecules to be added to the JSON. Each attribute is saved as a string using repr().
ignore_missing_attrs (bool, optional) – If False and an attribute in include_attrs is not held by a Molecule, an error will be raised.
- Returns
None
- Return type
NoneType
-
get_fitness_values(self)¶ Return the fitness values of molecules.
- Returns
Maps a Molecule to its fitness value.
- Return type
dict
-
classmethod
init_all(building_blocks, topology_graphs, num_processes=None, duplicates=False, use_cache=False)¶ Make all possible molecules from groups of building blocks.
- Parameters
building_blocks (list of Molecule) –
A list holding nested building blocks, for example
    bbs1 = [stk.BuildingBlock(...), stk.BuildingBlock(...), ...]
    bbs2 = [stk.ConstructedMolecule(...), stk.BuildingBlock(...), ...]
    bbs3 = [stk.BuildingBlock(...), stk.BuildingBlock(...), ...]
    building_blocks = [bbs1, bbs2, bbs3]
To construct a new ConstructedMolecule, a Molecule is picked from each of the sublists in building_blocks. The picked Molecule instances are then supplied to ConstructedMolecule
    # mol is a new ConstructedMolecule. bb1 is selected
    # from bbs1, bb2 is selected from bbs2 and bb3 is
    # selected from bbs3.
    mol = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3],
        topology_graph=topology_pick
    )
The order in which a Molecule instance is given to the ConstructedMolecule is determined by the sublist of building_blocks it was picked from. Note that the number of sublists in building_blocks is not fixed. It merely has to be compatible with the topology_graphs.
topology_graphs (list of TopologyGraph) – The topology graphs of the ConstructedMolecule instances being made.
num_processes (int, optional) – The number of parallel processes to create when constructing the molecules. If None, creates a process for each core on the computer.
duplicates (bool, optional) – If False, duplicate structures are removed from the population.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
A Population holding ConstructedMolecule instances.
- Return type
Examples
Construct all possible cage molecules from some precursors
    import stk

    amines = [
        stk.BuildingBlock('NCCCN', ['amine']),
        stk.BuildingBlock('NCCCCCN', ['amine']),
        stk.BuildingBlock('NCCOCCN', ['amine']),
    ]
    aldehydes = [
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=C(C=O)COCC=O', ['aldehyde']),
    ]
    # A total of 9 cages will be created.
    cages = stk.Population.init_all(
        building_blocks=[amines, aldehydes],
        topology_graphs=[stk.cage.FourPlusSix()]
    )
Use the constructed cages and a new bunch of building blocks to create all possible cage complexes.
    encapsulants = [
        stk.BuildingBlock('[Br][Br]'),
        stk.BuildingBlock('[F][F]'),
    ]
    # Every combination of cage and encapsulant.
    complexes = stk.Population.init_all(
        building_blocks=[cages, encapsulants],
        topology_graphs=[stk.host_guest_complex.Complex()]
    )
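The "all possible molecules" enumeration described for init_all amounts to a Cartesian product over the sublists of building_blocks, repeated for each topology graph. The following is a minimal sketch under that reading; enumerate_constructions is a hypothetical helper, strings stand in for building blocks and topology graphs, and the iteration order is an assumption rather than stk's documented behaviour.

```python
import itertools


def enumerate_constructions(building_blocks, topology_graphs):
    """Yield (picked_blocks, topology_graph) for every combination."""
    for topology_graph in topology_graphs:
        # One building block is picked from each sublist, in sublist order.
        for picked in itertools.product(*building_blocks):
            yield list(picked), topology_graph


# Placeholder names (assumptions for illustration).
amines = ["amine_a", "amine_b", "amine_c"]
aldehydes = ["aldehyde_a", "aldehyde_b", "aldehyde_c"]

combos = list(enumerate_constructions([amines, aldehydes], ["four_plus_six"]))
print(len(combos))
```

Because the count is the product of the sublist sizes times the number of topology graphs, adding one more sublist or topology graph multiplies the number of constructions.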
-
classmethod
init_diverse(building_blocks, topology_graphs, size, random_seed=None, use_cache=False)¶
Construct a chemically diverse Population.
All constructed molecules are held in direct_members.
In order to construct a ConstructedMolecule, a random Molecule is selected from each sublist in building_blocks. Once the first construction is complete, the next Molecule selected from each sublist is the one with the most different Morgan fingerprint to the prior one. The third construction uses randomly selected Molecule objects again and so on. This is done until size ConstructedMolecule instances have been constructed.
- Parameters
building_blocks (list of Molecule) –
A list holding nested building blocks, for example
    bbs1 = [stk.BuildingBlock(...), stk.BuildingBlock(...), ...]
    bbs2 = [stk.ConstructedMolecule(...), stk.BuildingBlock(...), ...]
    bbs3 = [stk.BuildingBlock(...), stk.BuildingBlock(...), ...]
    building_blocks = [bbs1, bbs2, bbs3]
To construct a new ConstructedMolecule, a Molecule is picked from each of the sublists in building_blocks. The picked Molecule instances are then supplied to the ConstructedMolecule
    # mol is a new ConstructedMolecule. bb1 is selected
    # from bbs1, bb2 is selected from bbs2 and bb3 is
    # selected from bbs3.
    mol = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3],
        topology_graph=topology_pick
    )
The order in which a Molecule instance is given to ConstructedMolecule is determined by the sublist of building_blocks it was picked from. Note that the number of sublists in building_blocks is not fixed. It merely has to be compatible with the topology_graphs.
topology_graphs (iterable of TopologyGraph) – An iterable holding topology graphs which should be randomly selected for the construction of a ConstructedMolecule.
size (int) – The desired size of the Population.
random_seed (int, optional) – Seed for the random number generator to get replicable results.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
A population filled with the constructed molecules.
- Return type
Examples
Construct a diverse Population of cage molecules from some precursors
    import stk

    amines = [
        stk.BuildingBlock('NCCCN', ['amine']),
        stk.BuildingBlock('NCCCCCN', ['amine']),
        stk.BuildingBlock('NCCOCCN', ['amine']),
    ]
    aldehydes = [
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=C(C=O)COCC=O', ['aldehyde']),
    ]
    # A total of 4 cages will be created.
    cages = stk.Population.init_diverse(
        building_blocks=[amines, aldehydes],
        topology_graphs=[stk.cage.FourPlusSix()],
        size=4
    )
Use the constructed cages and a new bunch of building blocks to create some diverse cage complexes.
    encapsulants = [
        stk.BuildingBlock('[Br][Br]'),
        stk.BuildingBlock('[F][F]'),
    ]
    # 4 combinations of cage and encapsulant.
    complexes = stk.Population.init_diverse(
        building_blocks=[cages, encapsulants],
        topology_graphs=[stk.host_guest_complex.Complex()],
        size=4
    )
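The alternating selection described for init_diverse (a random pick, then the most different pick, then random again) can be sketched without any chemistry toolkit. This is only an illustration: pick_diverse and dissimilarity are hypothetical names, strings stand in for molecules, and a character-set Jaccard distance is swapped in for the Morgan fingerprint comparison. Handling of repeated picks is omitted.

```python
import random


def dissimilarity(a, b):
    """Toy stand-in for fingerprint difference: 1 - Jaccard similarity."""
    set_a, set_b = set(a), set(b)
    return 1 - len(set_a & set_b) / len(set_a | set_b)


def pick_diverse(building_blocks, size, random_seed=None):
    """Return size picks, alternating random and most-different choices."""
    generator = random.Random(random_seed)
    picks = []
    previous = None
    for i in range(size):
        if i % 2 == 0:
            # Even steps: a random block from each sublist.
            current = [generator.choice(sub) for sub in building_blocks]
        else:
            # Odd steps: the block most different from the prior pick.
            current = [
                max(sub, key=lambda bb: dissimilarity(bb, prev))
                for sub, prev in zip(building_blocks, previous)
            ]
        picks.append(current)
        previous = current
    return picks


blocks = [["NCCN", "NCCCN", "OCCO"], ["O=CC=O", "BrCCBr"]]
picks = pick_diverse(blocks, size=4, random_seed=1)
print(picks)
```

Passing the same random_seed reproduces the same sequence of picks, which mirrors the purpose of the random_seed parameter.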
-
classmethod
init_from_list(pop_list, use_cache=False)¶
Initialize a population from a list representation.
- Parameters
pop_list (list) –
A list which represents a Population, like the ones created by to_list(). For example, in
    pop_list = [{...}, [{...}], [{...}, {...}], {...}]
pop_list represents the Population, sublists represent its subpopulations and each dict {...} represents a member.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
The population represented by pop_list.
- Return type
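The nesting convention of pop_list (a dict is a member, a list is a subpopulation) can be read recursively. The sketch below only demonstrates that convention; parse_pop_list is a hypothetical helper, and the {"id": ...} dicts are placeholders for the real molecular representations, whose contents are not shown here.

```python
def parse_pop_list(pop_list):
    """Split a list representation into members and subpopulations."""
    return {
        # Each dict represents one direct member.
        "direct_members": [
            item for item in pop_list if isinstance(item, dict)
        ],
        # Each nested list represents one subpopulation.
        "subpopulations": [
            parse_pop_list(item)
            for item in pop_list
            if isinstance(item, list)
        ],
    }


# Placeholder member dicts (assumption for illustration).
pop_list = [{"id": 1}, [{"id": 2}], [{"id": 3}, {"id": 4}], {"id": 5}]
parsed = parse_pop_list(pop_list)
print(len(parsed["direct_members"]), len(parsed["subpopulations"]))
```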
-
classmethod
init_random(building_blocks, topology_graphs, size, random_seed=None, use_cache=False)¶
Construct molecules for a random Population.
All molecules are held in direct_members.
From the supplied building blocks a random Molecule is selected from each sublist to form a ConstructedMolecule. This is done until size ConstructedMolecule objects have been constructed.
- Parameters
building_blocks (list of Molecule) –
A list holding nested building blocks, for example
    bbs1 = [stk.BuildingBlock(...), stk.BuildingBlock(...), ...]
    bbs2 = [stk.ConstructedMolecule(...), stk.BuildingBlock(...), ...]
    bbs3 = [stk.BuildingBlock(...), stk.BuildingBlock(...), ...]
    building_blocks = [bbs1, bbs2, bbs3]
To construct a new ConstructedMolecule, a Molecule is picked from each of the sublists in building_blocks. The picked Molecule instances are then supplied to ConstructedMolecule
    # mol is a new ConstructedMolecule. bb1 is selected
    # from bbs1, bb2 is selected from bbs2 and bb3 is
    # selected from bbs3.
    mol = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3],
        topology_graph=topology_pick
    )
The order in which a Molecule instance is given to the ConstructedMolecule is determined by the sublist of building_blocks it was picked from. Note that the number of sublists in building_blocks is not fixed. It merely has to be compatible with the topology_graphs.
topology_graphs (iterable of TopologyGraph) – An iterable holding topology graphs which should be randomly selected during initialization of ConstructedMolecule.
size (int) – The size of the population to be initialized.
random_seed (int, optional) – Seed for the random number generator to get replicable results.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
A population filled with random ConstructedMolecule instances.
- Return type
Examples
Construct 5 random cage molecules from some precursors
    import stk

    amines = [
        stk.BuildingBlock('NCCCN', ['amine']),
        stk.BuildingBlock('NCCCCCN', ['amine']),
        stk.BuildingBlock('NCCOCCN', ['amine']),
    ]
    aldehydes = [
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=C(C=O)COCC=O', ['aldehyde']),
    ]
    # A total of 5 cages will be created.
    cages = stk.Population.init_random(
        building_blocks=[amines, aldehydes],
        topology_graphs=[stk.cage.FourPlusSix()],
        size=5
    )
Use the constructed cages and a new bunch of building blocks to create some random cage complexes.
    encapsulants = [
        stk.BuildingBlock('[Br][Br]'),
        stk.BuildingBlock('[F][F]'),
    ]
    # Random combinations of cage and encapsulant.
    complexes = stk.Population.init_random(
        building_blocks=[cages, encapsulants],
        topology_graphs=[stk.host_guest_complex.Complex()],
        size=5
    )
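The effect of random_seed on random selection can be shown with a small stand-alone sketch. pick_random is a hypothetical helper that mimics the described selection with strings in place of molecules and topology graphs; it is not stk's implementation.

```python
import random


def pick_random(building_blocks, topology_graphs, size, random_seed=None):
    """Return size random (picked_blocks, topology_graph) pairs."""
    generator = random.Random(random_seed)
    topology_graphs = list(topology_graphs)
    picks = []
    for _ in range(size):
        # One random block from each sublist, in sublist order.
        picked = [generator.choice(sub) for sub in building_blocks]
        picks.append((picked, generator.choice(topology_graphs)))
    return picks


# Placeholder names (assumptions for illustration).
amines = ["amine_a", "amine_b", "amine_c"]
aldehydes = ["aldehyde_a", "aldehyde_b"]

first = pick_random([amines, aldehydes], ["four_plus_six"], 5, random_seed=7)
second = pick_random([amines, aldehydes], ["four_plus_six"], 5, random_seed=7)
print(first == second)
```

Using a dedicated random.Random instance keeps the selection reproducible without touching the global random state.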
-
classmethod
load(path, use_cache=False)¶
Initialize a Population from one dumped to a file.
- Parameters
path (str) – The full path of the file holding the dumped population.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
The population stored in the dump file.
- Return type
-
open_process_pool(self, num_processes=None)¶ Open a process pool.
- Parameters
num_processes (int, optional) – The number of processes in the pool. If None, a process is created for each core on the computer.
- Returns
The population.
- Return type
- Raises
RuntimeError – If a process pool is already open.
-
optimize(self, optimizer, num_processes=None)¶ Optimize the structures of molecules in the population.
The molecules are optimized serially or in parallel depending on whether num_processes is 1 or more than 1. The serial version may be faster in cases where all molecules have already been optimized and the optimizer will skip them. In this case creating a parallel process pool adds unnecessary overhead.
- Parameters
optimizer (Optimizer) – The optimizer used to carry out the optimizations.
num_processes (int, optional) – The number of parallel processes to create. Optimization will run serially if 1. If None, creates a process for each core on the computer. This parameter will be ignored if the population has an open process pool.
- Returns
None
- Return type
NoneType
-
remove_duplicates(self, across_subpopulations=True, key=<built-in function id>)¶ Remove duplicates from the population.
The question of which molecule is preserved when duplicates are removed is difficult to answer. The iteration through a population is depth-first, so a rule such as “the molecule in the topmost population is preserved” is not the case here. Rather, the first molecule found is preserved.
However, this question is only relevant if duplicates in different subpopulations are being removed. In this case it is assumed that it is more important to have a single instance than to worry about which subpopulation it is in.
If the duplicates are being removed from within subpopulations, each subpopulation will end up with a single instance of all molecules held before. There is no “choice”.
- Parameters
across_subpopulations (bool, optional) – When False, duplicates are only removed from within a given subpopulation. If True, all duplicates are removed, regardless of which subpopulation they are in.
key (callable, optional) – Two molecules are considered the same if the values returned by key(molecule) are the same.
- Returns
None
- Return type
NoneType
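The "first molecule found is preserved" rule can be made concrete with a small stand-in container. Everything here is an assumption for illustration: Pop is a hypothetical class with only direct_members and subpopulations, strings stand in for molecules, and the traversal visits direct members before descending into each subpopulation in turn.

```python
class Pop:
    """Hypothetical stand-in for a nested population."""

    def __init__(self, members, subpopulations=()):
        self.direct_members = list(members)
        self.subpopulations = list(subpopulations)


def remove_duplicates(pop, across_subpopulations=True, key=id, _seen=None):
    """Keep only the first member found for each key value."""
    if _seen is None or not across_subpopulations:
        # A fresh record per population when not deduplicating across.
        _seen = set()
    kept = []
    for member in pop.direct_members:
        value = key(member)
        if value not in _seen:
            _seen.add(value)
            kept.append(member)
    pop.direct_members = kept
    # Depth-first: finish each subpopulation before the next one.
    for sub in pop.subpopulations:
        remove_duplicates(sub, across_subpopulations, key, _seen)


deep = Pop(["c"])
pop = Pop(["a"], [Pop(["b", "a"], [deep]), Pop(["c", "d"])])
remove_duplicates(pop, key=lambda m: m)
print(deep.direct_members, pop.subpopulations[1].direct_members)
```

In this sketch "c" survives in the deeply nested population because it is reached first, even though another copy sits in a shallower subpopulation.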
-
remove_members(self, key)¶
Remove all members where key(member) is True.
- Parameters
key (callable) – A callable which takes 1 argument. Each member of the population is passed as the argument to key in turn. If the result is True then the member is removed from the population.
- Returns
None
- Return type
NoneType
-
set_fitness_values_from_calculators(self, fitness_calculator, fitness_normalizer=None, num_processes=None)¶ Set the fitness values of molecules.
- Parameters
fitness_calculator (FitnessCalculator) – Used to calculate the initial fitness values.
fitness_normalizer (FitnessNormalizer, optional) – Used to normalize the fitness values.
num_processes (int, optional) – The number of parallel processes to create. Calculations will run serially if 1. If None, creates a process for each core on the computer. This parameter will be ignored if the population has an open process pool.
- Returns
The population is returned.
- Return type
-
set_fitness_values_from_dict(self, fitness_values)¶ Set the fitness values of molecules.
- Parameters
fitness_values (dict) – Maps molecules in the population to their fitness values.
- Returns
The population is returned.
- Return type
-
set_mol_ids(self, n, overwrite=False)¶ Give each member of the population an id starting from n.
This method adds an id attribute to each Molecule instance held by the population.
- Parameters
n (int) – A number. Members of this Population are given a unique number as an id, starting from n and incremented by one between members.
overwrite (bool, optional) – If True, existing ids are replaced.
- Returns
The value of the last id assigned, plus 1.
- Return type
int
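The return value ("last id assigned, plus 1") makes it easy to continue numbering in a later call. The sketch below shows that pattern with dicts standing in for Molecule objects; assign_ids is a hypothetical helper, and leaving existing ids untouched when overwrite is False is an assumption based on the parameter description.

```python
def assign_ids(members, n, overwrite=False):
    """Give members consecutive ids from n; return the next free id."""
    for member in members:
        if overwrite or "id" not in member:
            member["id"] = n
            n += 1
    return n


# Dicts stand in for Molecule objects (assumption for illustration).
members = [{}, {"id": 99}, {}]
next_id = assign_ids(members, 10)
print(members, next_id)

# With overwrite=True every member is renumbered.
restart = assign_ids(members, 0, overwrite=True)
```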
-
to_list(self, include_attrs=None, ignore_missing_attrs=False)¶
Convert the population to a list representation.
- Parameters
include_attrs (list of str, optional) – The names of attributes to be added to the molecular representations. Each attribute is saved as a string using repr().
ignore_missing_attrs (bool, optional) – If False and an attribute in include_attrs is not held by a Molecule, an error will be raised.
- Returns
A list representation of the Population.
- Return type
list
-
write(self, path)¶
Write the .mol files of members to a directory.
- Parameters
path (str) – The full path of the directory into which the .mol files are written.
- Returns
None
- Return type
NoneType
-
-
class Population(*args)¶
Bases: object
A container for Molecule objects.
Population instances can be nested.
In addition to holding Molecule objects, the Population class can be used to create large numbers of these instances through the class methods beginning with “init”.
Molecule instances held by a Population can have their structures optimized in parallel through the optimize() method.
It supports all expected and necessary container operations such as iteration, indexing and membership checks (via the in operator).
-
direct_members¶
Held here are direct members of the Population. In other words, these are the molecules not held by any subpopulations. As a result, not all members of a Population are stored in this attribute.
- Type
list of Molecule
-
subpopulations¶
A list holding the subpopulations.
- Type
list of Population
Examples
A Population can be iterated through just like a list
    import stk

    # Create a population.
    pop = stk.Population(
        stk.BuildingBlock(...),
        stk.ConstructedMolecule(...),
        stk.BuildingBlock(...),
        stk.BuildingBlock(...),
        stk.Population(
            stk.BuildingBlock(...),
            stk.ConstructedMolecule(...)
        ),
        stk.ConstructedMolecule(...),
        stk.BuildingBlock(...)
    )

    for member in pop:
        do_stuff(member)
When iterating through a Population you will also iterate through nested members, that is members which are held by subpopulations. If you only wish to iterate through direct members, you can
    for member in pop.direct_members:
        do_stuff(member)
You can also get access to members by using indices. Indices have access to all members in the population
    first_member = pop[0]
    second_member = pop[1]
Indices will first access direct members of the population and then access members in the subpopulations. Indices access nested members depth-first
    pop2 = stk.Population(bb1, bb2, stk.Population(bb3, bb4))
    # Get bb1.
    pop2[0]
    # Get bb2.
    pop2[1]
    # Get bb3.
    pop2[2]
    # Get bb4.
    pop2[3]
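The index order described above (direct members first, then each subpopulation depth-first) corresponds to a simple recursive traversal. The sketch below models a population as a (direct_members, subpopulations) tuple with strings as members; both the representation and the iter_members helper are assumptions for illustration only.

```python
def iter_members(population):
    """Yield direct members, then members of each subpopulation."""
    direct_members, subpopulations = population
    yield from direct_members
    for sub in subpopulations:
        # Fully exhaust one subpopulation before moving to the next.
        yield from iter_members(sub)


# Mirrors stk.Population(bb1, bb2, stk.Population(bb3, bb4)).
pop2 = (["bb1", "bb2"], [(["bb3", "bb4"], [])])
flat = list(iter_members(pop2))
print(flat[2])

# A deeper nesting: "c" is reached before "d".
nested = (["a"], [(["b"], [(["c"], [])]), (["d"], [])])
order = list(iter_members(nested))
```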
You can get a subpopulation by taking a slice
    # new_pop is a new Population instance and has no nesting.
    new_pop = pop[2:4]
You can take the length of a population to get the total number of members
len(pop)
Adding populations creates a new population with both of the added populations as subpopulations
    # added has no direct members and two subpopulations, pop and
    # pop2.
    added = pop + pop2
Subtracting populations creates a new, flat population.
    # subbed has all objects in pop except those also found in
    # pop2.
    subbed = pop - pop2
You can check if an object is already present in the population.
    bb1 = stk.BuildingBlock(...)
    bb2 = stk.BuildingBlock(...)
    pop3 = stk.Population(bb1)

    # Returns True.
    bb1 in pop3
    # Returns False.
    bb2 in pop3
    # Returns True.
    bb2 not in pop3
If you want to run multiple optimize() calls in a row, use the “with” statement. This keeps a single process pool open, and means you do not create a new one for each optimize() call. It also automatically closes the pool for you when the block exits
    population = stk.Population(...)
    # Keep a process pool open through the "with" statement.
    with population.open_process_pool(8):
        # All optimize calls within this block will use the
        # same process pool.
        population.optimize(stk.UFF())
        population.add_members(...)
        population.optimize(stk.UFF())
    # Process pool is automatically cleaned up when the block
    # exits.
Methods
add_members(self, molecules[, duplicate_key]) Add Molecule instances to the Population.
add_subpopulation(self, population) Add a clone of population to subpopulations.
clone(self) Return a clone.
close_process_pool(self) Close an open process pool.
dump(self, path[, include_attrs, …]) Dump the Population to a file.
init_all(building_blocks, topology_graphs[, …]) Make all possible molecules from groups of building blocks.
init_diverse(building_blocks, …[, …]) Construct a chemically diverse Population.
init_from_list(pop_list[, use_cache]) Initialize a population from a list representation.
init_random(building_blocks, …[, …]) Construct molecules for a random Population.
load(path[, use_cache]) Initialize a Population from one dumped to a file.
open_process_pool(self[, num_processes]) Open a process pool.
optimize(self, optimizer[, num_processes]) Optimize the structures of molecules in the population.
remove_duplicates(self[, …]) Remove duplicates from the population.
remove_members(self, key) Remove all members where key(member) is True.
set_mol_ids(self, n[, overwrite]) Give each member of the population an id starting from n.
to_list(self[, include_attrs, …]) Convert the population to a list representation.
write(self, path) Write the .mol files of members to a directory.
-
__init__(self, *args)¶
Initialize a Population.
- Parameters
*args (Molecule, Population) – A population is initialized with the Molecule and Population instances it should hold.
Examples
    import stk

    bb1 = stk.BuildingBlock('CCC')
    bb2 = stk.BuildingBlock('NCCNCNC')
    bb3 = stk.BuildingBlock('[Br]CCC[Br]')
    pop1 = stk.Population(bb1, bb2, bb3)

    bb4 = stk.BuildingBlock('NNCCCN')
    # pop2 has pop1 as a subpopulation and bb4 as a direct
    # member.
    pop2 = stk.Population(pop1, bb4)
-
add_members(self, molecules, duplicate_key=None)¶
Add Molecule instances to the Population.
The added Molecule instances become direct members of the population; they are not placed into any subpopulations.
- Parameters
molecules (iterable of Molecule) – The molecules to be added as direct members.
duplicate_key (callable, optional) – If not None, duplicate_key(mol) is evaluated on each molecule in molecules. If a molecule with the same duplicate_key value is already present in the population, the molecule is not added.
- Returns
None
- Return type
NoneType
-
add_subpopulation(self, population)¶
Add a clone of population to subpopulations.
Only a clone of the population container is made. The molecules it holds are not copied.
- Parameters
population (Population) – The population to be added as a subpopulation.
- Returns
None
- Return type
NoneType
-
clone(self)¶
Return a clone.
The clone shares its Molecule objects with the original; copies of the Molecule objects are not made.
- Returns
The clone.
- Return type
Examples
    import stk

    # Make an initial population.
    pop = stk.Population(stk.BuildingBlock('NCCN'))
    # Make a clone.
    clone = pop.clone()
-
close_process_pool(self)¶ Close an open process pool.
- Returns
The population.
- Return type
-
dump(self, path, include_attrs=None, ignore_missing_attrs=False)¶
Dump the Population to a file.
- Parameters
path (str) – The full path of the file to which the Population should be dumped.
include_attrs (list of str, optional) – The names of attributes of the molecules to be added to the JSON. Each attribute is saved as a string using repr().
ignore_missing_attrs (bool, optional) – If False and an attribute in include_attrs is not held by a Molecule, an error will be raised.
- Returns
None
- Return type
NoneType
-
classmethod
init_all(building_blocks, topology_graphs, num_processes=None, duplicates=False, use_cache=False)¶ Make all possible molecules from groups of building blocks.
- Parameters
building_blocks (list of Molecule) –
A list holding nested building blocks, for example
    bbs1 = [stk.BuildingBlock(...), stk.BuildingBlock(...), ...]
    bbs2 = [stk.ConstructedMolecule(...), stk.BuildingBlock(...), ...]
    bbs3 = [stk.BuildingBlock(...), stk.BuildingBlock(...), ...]
    building_blocks = [bbs1, bbs2, bbs3]
To construct a new ConstructedMolecule, a Molecule is picked from each of the sublists in building_blocks. The picked Molecule instances are then supplied to ConstructedMolecule
    # mol is a new ConstructedMolecule. bb1 is selected
    # from bbs1, bb2 is selected from bbs2 and bb3 is
    # selected from bbs3.
    mol = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3],
        topology_graph=topology_pick
    )
The order in which a Molecule instance is given to the ConstructedMolecule is determined by the sublist of building_blocks it was picked from. Note that the number of sublists in building_blocks is not fixed. It merely has to be compatible with the topology_graphs.
topology_graphs (list of TopologyGraph) – The topology graphs of the ConstructedMolecule instances being made.
num_processes (int, optional) – The number of parallel processes to create when constructing the molecules. If None, creates a process for each core on the computer.
duplicates (bool, optional) – If False, duplicate structures are removed from the population.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
A Population holding ConstructedMolecule instances.
- Return type
Examples
Construct all possible cage molecules from some precursors
    import stk

    amines = [
        stk.BuildingBlock('NCCCN', ['amine']),
        stk.BuildingBlock('NCCCCCN', ['amine']),
        stk.BuildingBlock('NCCOCCN', ['amine']),
    ]
    aldehydes = [
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=C(C=O)COCC=O', ['aldehyde']),
    ]
    # A total of 9 cages will be created.
    cages = stk.Population.init_all(
        building_blocks=[amines, aldehydes],
        topology_graphs=[stk.cage.FourPlusSix()]
    )
Use the constructed cages and a new bunch of building blocks to create all possible cage complexes.
    encapsulants = [
        stk.BuildingBlock('[Br][Br]'),
        stk.BuildingBlock('[F][F]'),
    ]
    # Every combination of cage and encapsulant.
    complexes = stk.Population.init_all(
        building_blocks=[cages, encapsulants],
        topology_graphs=[stk.host_guest_complex.Complex()]
    )
-
classmethod
init_diverse(building_blocks, topology_graphs, size, random_seed=None, use_cache=False)¶
Construct a chemically diverse Population.
All constructed molecules are held in direct_members.
In order to construct a ConstructedMolecule, a random Molecule is selected from each sublist in building_blocks. Once the first construction is complete, the next Molecule selected from each sublist is the one with the most different Morgan fingerprint to the prior one. The third construction uses randomly selected Molecule objects again and so on. This is done until size ConstructedMolecule instances have been constructed.
- Parameters
building_blocks (list of Molecule) –
A list holding nested building blocks, for example
    bbs1 = [stk.BuildingBlock(...), stk.BuildingBlock(...), ...]
    bbs2 = [stk.ConstructedMolecule(...), stk.BuildingBlock(...), ...]
    bbs3 = [stk.BuildingBlock(...), stk.BuildingBlock(...), ...]
    building_blocks = [bbs1, bbs2, bbs3]
To construct a new ConstructedMolecule, a Molecule is picked from each of the sublists in building_blocks. The picked Molecule instances are then supplied to the ConstructedMolecule
    # mol is a new ConstructedMolecule. bb1 is selected
    # from bbs1, bb2 is selected from bbs2 and bb3 is
    # selected from bbs3.
    mol = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3],
        topology_graph=topology_pick
    )
The order in which a Molecule instance is given to ConstructedMolecule is determined by the sublist of building_blocks it was picked from. Note that the number of sublists in building_blocks is not fixed. It merely has to be compatible with the topology_graphs.
topology_graphs (iterable of TopologyGraph) – An iterable holding topology graphs which should be randomly selected for the construction of a ConstructedMolecule.
size (int) – The desired size of the Population.
random_seed (int, optional) – Seed for the random number generator to get replicable results.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
A population filled with the constructed molecules.
- Return type
Examples
Construct a diverse Population of cage molecules from some precursors
    import stk

    amines = [
        stk.BuildingBlock('NCCCN', ['amine']),
        stk.BuildingBlock('NCCCCCN', ['amine']),
        stk.BuildingBlock('NCCOCCN', ['amine']),
    ]
    aldehydes = [
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
        stk.BuildingBlock('O=C(C=O)COCC=O', ['aldehyde']),
    ]
    # A total of 4 cages will be created.
    cages = stk.Population.init_diverse(
        building_blocks=[amines, aldehydes],
        topology_graphs=[stk.cage.FourPlusSix()],
        size=4
    )
Use the constructed cages and a new bunch of building blocks to create some diverse cage complexes.
    encapsulants = [
        stk.BuildingBlock('[Br][Br]'),
        stk.BuildingBlock('[F][F]'),
    ]
    # 4 combinations of cage and encapsulant.
    complexes = stk.Population.init_diverse(
        building_blocks=[cages, encapsulants],
        topology_graphs=[stk.host_guest_complex.Complex()],
        size=4
    )
-
classmethod
init_from_list(pop_list, use_cache=False)¶
Initialize a population from a list representation.
- Parameters
pop_list (list) –
A list which represents a Population, like the ones created by to_list(). For example, in
    pop_list = [{...}, [{...}], [{...}, {...}], {...}]
pop_list represents the Population, sublists represent its subpopulations and each dict {...} represents a member.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
The population represented by pop_list.
- Return type
-
classmethod
init_random(building_blocks, topology_graphs, size, random_seed=None, use_cache=False)¶
Construct molecules for a random Population.
All molecules are held in direct_members.
From the supplied building blocks a random Molecule is selected from each sublist to form a ConstructedMolecule. This is done until size ConstructedMolecule objects have been constructed.
- Parameters
building_blocks (list of Molecule) –
A list holding nested building blocks, for example
    bbs1 = [stk.BuildingBlock(...), stk.BuildingBlock(...), ...]
    bbs2 = [stk.ConstructedMolecule(...), stk.BuildingBlock(...), ...]
    bbs3 = [stk.BuildingBlock(...), stk.BuildingBlock(...), ...]
    building_blocks = [bbs1, bbs2, bbs3]
To construct a new ConstructedMolecule, a Molecule is picked from each of the sublists in building_blocks. The picked Molecule instances are then supplied to ConstructedMolecule
    # mol is a new ConstructedMolecule. bb1 is selected
    # from bbs1, bb2 is selected from bbs2 and bb3 is
    # selected from bbs3.
    mol = stk.ConstructedMolecule(
        building_blocks=[bb1, bb2, bb3],
        topology_graph=topology_pick
    )
The order in which a Molecule instance is given to the ConstructedMolecule is determined by the sublist of building_blocks it was picked from. Note that the number of sublists in building_blocks is not fixed. It merely has to be compatible with the topology_graphs.
topology_graphs (iterable of TopologyGraph) – An iterable holding topology graphs which should be randomly selected during initialization of ConstructedMolecule.
size (int) – The size of the population to be initialized.
random_seed (int, optional) – Seed for the random number generator to get replicable results.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
A population filled with random ConstructedMolecule instances.
- Return type
Examples
Construct 5 random cage molecules from some precursors:
import stk

amines = [
    stk.BuildingBlock('NCCCN', ['amine']),
    stk.BuildingBlock('NCCCCCN', ['amine']),
    stk.BuildingBlock('NCCOCCN', ['amine']),
]
aldehydes = [
    stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
    stk.BuildingBlock('O=CCC(C=O)CC=O', ['aldehyde']),
    stk.BuildingBlock('O=C(C=O)COCC=O', ['aldehyde']),
]

# A total of 5 cages will be created.
cages = stk.Population.init_random(
    building_blocks=[amines, aldehydes],
    topology_graphs=[stk.cage.FourPlusSix()],
    size=5
)
Use the constructed cages and a new bunch of building blocks to create some random cage complexes.
encapsulants = [
    stk.BuildingBlock('[Br][Br]'),
    stk.BuildingBlock('[F][F]'),
]

# Random combinations of cage and encapsulant.
complexes = stk.Population.init_random(
    building_blocks=[cages, encapsulants],
    topology_graphs=[stk.host_guest_complex.Complex()],
    size=5
)
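The selection rule described for init_random (one pick from each sublist, plus a randomly chosen topology graph, repeated size times) can be sketched without stk. This is an illustration only, not stk's implementation: pick_random_combinations is a hypothetical helper, plain strings stand in for Molecule and TopologyGraph instances, and duplicate handling and the molecular cache are ignored.

```python
import random


def pick_random_combinations(building_blocks, topology_graphs, size,
                             random_seed=None):
    """Return `size` (picked_blocks, topology) tuples.

    Hypothetical helper, not part of stk.
    """
    # A private generator keeps the seed from affecting global state.
    generator = random.Random(random_seed)
    topology_graphs = list(topology_graphs)
    combinations = []
    for _ in range(size):
        # One pick per sublist, kept in sublist order.
        picked = [generator.choice(sublist) for sublist in building_blocks]
        topology = generator.choice(topology_graphs)
        combinations.append((picked, topology))
    return combinations


# Strings stand in for Molecule and TopologyGraph instances.
amines = ['amine_a', 'amine_b', 'amine_c']
aldehydes = ['aldehyde_a', 'aldehyde_b']
first = pick_random_combinations(
    [amines, aldehydes], ['cage'], 5, random_seed=4
)
second = pick_random_combinations(
    [amines, aldehydes], ['cage'], 5, random_seed=4
)
```

Because both calls use the same seed, they produce the same sequence of picks, which is the kind of replicability the random_seed parameter is meant to provide.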
-
classmethod
load(path, use_cache=False)¶ Initialize a Population from one dumped to a file.
- Parameters
path (str) – The full path of the file holding the dumped population.
use_cache (bool, optional) – Toggles use of the molecular cache.
- Returns
The population stored in the dump file.
- Return type
-
open_process_pool(self, num_processes=None)¶ Open a process pool.
- Parameters
num_processes (int, optional) – The number of processes in the pool. If None, a process is created for each core on the computer.
- Returns
The population.
- Return type
- Raises
RuntimeError – If a process pool is already open.
-
optimize(self, optimizer, num_processes=None)¶ Optimize the structures of molecules in the population.
The molecules are optimized serially if num_processes is 1, and in parallel otherwise. The serial version may be faster when all molecules have already been optimized and the optimizer will skip them, because creating a parallel process pool then adds unnecessary overhead.
- Parameters
optimizer (Optimizer) – The optimizer used to carry out the optimizations.
num_processes (int, optional) – The number of parallel processes to create. Optimization will run serially if 1. If None, a process is created for each core on the computer. This parameter is ignored if the population has an open process pool.
- Returns
None
- Return type
NoneType
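The choice between the serial and parallel paths can be sketched with the standard library. This is a rough illustration under stated assumptions, not stk's implementation: optimize_all and fake_optimize are hypothetical names, the sketch returns results instead of updating molecules in place, and it does not model an already open process pool.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def fake_optimize(value):
    """Stand-in for an optimizer call (hypothetical)."""
    return value * 2


def optimize_all(items, optimize_one, num_processes=None):
    """Apply optimize_one to every item, serially or in parallel."""
    if num_processes is None:
        # One process per core, falling back to 1 if unknown.
        num_processes = os.cpu_count() or 1
    if num_processes == 1:
        # Serial path: no pool is created, so there is no pool overhead.
        return [optimize_one(item) for item in items]
    # Parallel path: optimize_one must be picklable, e.g. a
    # module-level function.
    with ProcessPoolExecutor(max_workers=num_processes) as pool:
        return list(pool.map(optimize_one, items))
```

Calling optimize_all(items, fake_optimize, num_processes=1) never starts a pool, which is why the serial path can win when each call returns almost immediately.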
-
remove_duplicates(self, across_subpopulations=True, key=<built-in function id>)¶ Remove duplicates from the population.
The question of which molecule is preserved when duplicates are removed is difficult to answer. The iteration through a population is depth-first, so a rule such as “the molecule in the topmost population is preserved” is not the case here. Rather, the first molecule found is preserved.
However, this question is only relevant if duplicates in different subpopulations are being removed. In this case it is assumed that it is more important to have a single instance than to worry about which subpopulation it is in.
If the duplicates are being removed from within subpopulations, each subpopulation will end up with a single instance of all molecules held before. There is no “choice”.
- Parameters
across_subpopulations (bool, optional) – When False, duplicates are only removed from within a given subpopulation. If True, all duplicates are removed, regardless of which subpopulation they are in.
key (callable, optional) – Two molecules are considered the same if the values returned by key(molecule) are the same.
- Returns
None
- Return type
NoneType
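The "first molecule found is preserved" rule can be sketched with a minimal stand-in class. This is not stk's code: Pop and this remove_duplicates function are hypothetical, the traversal order (direct members first, then each subpopulation in turn) is an assumption, and key is assumed to return hashable values.

```python
class Pop:
    """Minimal stand-in for a population (hypothetical, not stk's class)."""

    def __init__(self, members=(), subpopulations=()):
        self.direct_members = list(members)
        self.subpopulations = list(subpopulations)

    def __iter__(self):
        # Depth-first: own members first, then each subpopulation in turn.
        yield from self.direct_members
        for subpopulation in self.subpopulations:
            yield from subpopulation


def remove_duplicates(pop, across_subpopulations=True, key=id, _seen=None):
    if across_subpopulations and _seen is not None:
        # Shared set: the first occurrence anywhere is the one kept.
        seen = _seen
    else:
        # Fresh set: at the top level, or per population when
        # duplicates are only removed within each population.
        seen = set()
    kept = []
    for member in pop.direct_members:
        value = key(member)
        if value not in seen:
            seen.add(value)
            kept.append(member)
    pop.direct_members = kept
    for subpopulation in pop.subpopulations:
        remove_duplicates(subpopulation, across_subpopulations, key, seen)


inner = Pop(members=['a', 'b', 'a'])
outer = Pop(members=['b', 'c'], subpopulations=[inner])
remove_duplicates(outer, key=str.lower)
```

After this call, 'b' survives only in outer because it was found there first, while inner keeps a single 'a'.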
-
remove_members(self, key)¶ Remove all members where key(member) is True.
- Parameters
key (callable) – A callable which takes 1 argument. Each member of the population is passed as the argument to key in turn. If the result is True then the member is removed from the population.
- Returns
None
- Return type
NoneType
-
set_mol_ids(self, n, overwrite=False)¶ Give each member of the population an id starting from n.
This method adds an id attribute to each Molecule instance held by the population.
- Parameters
n (int) – A number. Members of this Population are given a unique number as an id, starting from n and incremented by one between members.
overwrite (bool, optional) – If True, existing ids are replaced.
- Returns
The value of the last id assigned, plus 1.
- Return type
int
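The numbering and the return value can be sketched as follows. This is an illustration, not stk's code: set_ids is a hypothetical helper, SimpleNamespace objects stand in for molecules, and it is an assumption that members with an existing id are skipped without consuming a number when overwrite is False.

```python
from types import SimpleNamespace


def set_ids(members, n, overwrite=False):
    """Assign consecutive ids starting from n; return the next free id."""
    for member in members:
        # Assumption: members that already have an id are left alone
        # unless overwrite is True.
        if overwrite or not hasattr(member, 'id'):
            member.id = n
            n += 1
    return n


# SimpleNamespace objects stand in for Molecule instances.
members = [SimpleNamespace(), SimpleNamespace(id=99), SimpleNamespace()]
next_id = set_ids(members, 10)
```

Returning the last assigned id plus 1 means the result can be passed as n to a later call, for example when numbering a second population.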
-
to_list(self, include_attrs=None, ignore_missing_attrs=False)¶ Convert the population to a list representation.
- Parameters
include_attrs (list of str, optional) – The names of attributes to be added to the molecular representations. Each attribute is saved as a string using repr().
ignore_missing_attrs (bool, optional) – If False and an attribute in include_attrs is not held by a Molecule, an error will be raised.
- Returns
A list representation of the Population.
- Return type
list
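How include_attrs and ignore_missing_attrs interact can be sketched with a small helper. This is not stk's code: attrs_to_dict is a hypothetical name, SimpleNamespace stands in for a Molecule, and the exception type raised for a missing attribute is an assumption.

```python
from types import SimpleNamespace


def attrs_to_dict(molecule, include_attrs=None, ignore_missing_attrs=False):
    """Collect the requested attributes as repr() strings."""
    result = {}
    for name in include_attrs or []:
        if hasattr(molecule, name):
            # Each attribute is stored as a string produced by repr().
            result[name] = repr(getattr(molecule, name))
        elif not ignore_missing_attrs:
            # Assumption: the real error type may differ.
            raise AttributeError(f'molecule has no attribute {name!r}')
    return result


# SimpleNamespace stands in for a Molecule instance.
mol = SimpleNamespace(fitness=1.5, label='cage')
saved = attrs_to_dict(mol, ['fitness', 'label'])
```

Because values are stored with repr(), an attribute only survives a round trip if its repr() string can be turned back into an equivalent object.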
-
write(self, path)¶ Write the .mol files of members to a directory.
- Parameters
path (str) – The full path of the directory into which the .mol files are written.
- Returns
None
- Return type
NoneType
-