Crossover

  1. GeneticRecombination

  2. Jumble

  3. If

  4. TryCatch

  5. Random

  6. RaisingCalculator

Crossover is implemented through Crosser objects. Crossers take a group of molecules and recombine them to produce offspring molecules. How crossers are used can be seen in the documentation of the various Crosser classes, for example GeneticRecombination or Jumble.

Making New Crossers

To add a new Crosser, make a new class which inherits from Crosser. Crosser is an abstract base class, so all of its virtual methods need to be implemented.
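As a minimal sketch of this pattern, the example below uses a stand-in abstract base class instead of stk itself, so that it runs without the library installed. The names CrosserSketch and PairwiseConcatenation are hypothetical, and each molecule is represented as a plain tuple of building block names. With stk, the subclass would inherit from Crosser and yield Molecule objects instead.

```python
from abc import ABC, abstractmethod
from itertools import combinations


class CrosserSketch(ABC):
    """Hypothetical stand-in for the Crosser abstract base class."""

    @abstractmethod
    def cross(self, *mols):
        """Yield offspring made from mols."""


class PairwiseConcatenation(CrosserSketch):
    """Toy crosser which yields one offspring per pair of parents."""

    def cross(self, *mols):
        # Each "molecule" here is just a tuple of building block names.
        for parent1, parent2 in combinations(mols, 2):
            yield parent1 + parent2


crosser = PairwiseConcatenation()
offspring = list(crosser.cross(('bb1', 'bb2'), ('bb3', 'bb4')))
```

Because cross() is a generator, offspring are only produced as they are requested, which matches the Yields entry in the method documentation.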

class Crosser

Bases: stk.calculators.base_calculators.EAOperation

Abstract base class for crossers.

Crossers take multiple molecules and recombine them to make new offspring molecules.

Methods

cross(self, *mols)

Cross mols.

set_cache_use(self, use_cache)

Set use of the molecular cache on or off.

__init__(self, /, *args, **kwargs)

Initialize self. See help(type(self)) for accurate signature.

cross(self, *mols)

Cross mols.

Parameters

*mols (Molecule) – The molecules on which a crossover operation is performed.

Yields

Molecule – The generated offspring.

set_cache_use(self, use_cache)

Set use of the molecular cache on or off.

Parameters

use_cache (bool) – True if the molecular cache is to be used.

Returns

The calculator.

Return type

EAOperation

Raises

NotImplementedError – This is a virtual method and needs to be implemented in a subclass.

class GeneticRecombination(key, random_yield_order=True, random_seed=None, use_cache=False)

Bases: stk.calculators.base_calculators._EAOperation, stk.calculators.ea.crossers.Crosser

Recombine building blocks using biological systems as a model.

Overall, this crosser mimics how animals and plants inherit DNA from their parents, except generalized to work with any number of parents.

First it is worth discussing some terminology. A gene is the smallest packet of genetic information. In animals, each gene can have multiple alleles. For example, there is a gene for hair color, and individual alleles for black, red, brown, etc. hair. This means that every person has a gene for hair color, but a person with black hair will have the black hair allele and a person with red hair will have the red hair allele.

When two parents produce an offspring, the offspring will have a hair color gene and will inherit the allele of one of the parents at random. Therefore, if you have two parents, one with black hair and one with red hair, the offspring will have either black or red hair, depending on which allele they inherit.

In stk molecules, each building block represents an allele. The question is, which gene is each building block an allele of? To answer that, let’s first construct a couple of building block molecules

bb1 = stk.BuildingBlock('NCC(N)CN', ['amine'])
bb2 = stk.BuildingBlock('O=CCC=O', ['aldehyde'])
bb3 = stk.BuildingBlock('O=CCNC(C=O)C=O', ['aldehyde'])
bb4 = stk.BuildingBlock('NCOCN', ['amine'])

We can define a function which analyzes a building block molecule and returns the gene it belongs to, for example

def determine_gene(building_block):
    return building_block.func_groups[0].fg_type.name

Here, the gene to which each building block molecule belongs is given by the name of its first functional group. Therefore, there is an 'amine' gene, which has two alleles, bb1 and bb4, and an 'aldehyde' gene, which has two alleles, bb2 and bb3.

Alternatively, we could have defined a function such as

def determine_gene(building_block):
    return len(building_block.func_groups)

Now we end up with a gene called 3, which has two alleles, bb1 and bb3, and a second gene called 2, which has the alleles bb2 and bb4.
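The two groupings described above can be reproduced without stk by using a stand-in record for each building block. This is only a sketch: the Block tuple and the group_into_genes function are hypothetical names, and the functional group counts are entered by hand to match the four building blocks defined earlier.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for a building block: its name, the name of
# its functional group type, and how many functional groups it has.
Block = namedtuple('Block', ['name', 'fg_name', 'num_fgs'])

blocks = [
    Block('bb1', 'amine', 3),
    Block('bb2', 'aldehyde', 2),
    Block('bb3', 'aldehyde', 3),
    Block('bb4', 'amine', 2),
]


def group_into_genes(blocks, key):
    """Map each gene returned by key to the names of its alleles."""
    genes = defaultdict(list)
    for block in blocks:
        genes[key(block)].append(block.name)
    return dict(genes)


# Gene given by the functional group name.
by_fg_name = group_into_genes(blocks, key=lambda b: b.fg_name)
# Gene given by the number of functional groups.
by_num_fgs = group_into_genes(blocks, key=lambda b: b.num_fgs)
```

The same building blocks end up in different genes depending only on the key, which is why the choice of key controls which building blocks can replace one another in an offspring.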

To produce offspring molecules, this class categorizes each building block of the parent molecules into genes using the key parameter. Then, to generate a single offspring, it picks a random building block for every gene. The picked building blocks are used to construct the offspring. The topology graph of the offspring is taken from one of the parents. Because the building blocks of all parents are pooled by gene before any are picked, this approach works with any number of parents.
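The procedure in the preceding paragraph can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the implementation used by GeneticRecombination: parents are lists of (gene, allele) pairs written out by hand, make_offspring is a hypothetical helper, and the choice of topology graph is left out.

```python
import random
from collections import defaultdict

# Each parent is a list of (gene, allele) pairs. The genes would
# normally be computed by the key callable.
parents = [
    [('amine', 'bb1'), ('aldehyde', 'bb2')],
    [('amine', 'bb3'), ('aldehyde', 'bb4')],
    [('amine', 'bb5'), ('aldehyde', 'bb6')],
]


def make_offspring(parents, generator):
    """Pick one allele at random for every gene found in parents."""
    genes = defaultdict(list)
    # Pool the alleles of all parents by gene.
    for parent in parents:
        for gene, allele in parent:
            genes[gene].append(allele)
    # The offspring gets exactly one allele per gene.
    return {
        gene: generator.choice(alleles)
        for gene, alleles in genes.items()
    }


# A seeded generator makes the random picks reproducible.
offspring = make_offspring(parents, random.Random(4))
```

Nothing in make_offspring depends on the number of parents, since each parent only contributes alleles to the shared pools.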

Examples

Note that any number of parents can be used for the crossover

import stk

# Create the molecules which will be crossed.
bb1 = stk.BuildingBlock('NCCN', ['amine'])
bb2 = stk.BuildingBlock('O=CCCCC=O', ['aldehyde'])
polymer1 = stk.ConstructedMolecule(
    building_blocks=[bb1, bb2],
    topology_graph=stk.polymer.Linear('AB', [0, 0], n=2)
)

bb3 = stk.BuildingBlock('NCCCN', ['amine'])
bb4 = stk.BuildingBlock('O=C[Si]CCC=O', ['aldehyde'])
polymer2 = stk.ConstructedMolecule(
    building_blocks=[bb3, bb4],
    topology_graph=stk.polymer.Linear('AB', [0, 0], n=2)
)

bb5 = stk.BuildingBlock('NC[Si]CN', ['amine'])
bb6 = stk.BuildingBlock('O=CCNNCCC=O', ['aldehyde'])
polymer3 = stk.ConstructedMolecule(
    building_blocks=[bb5, bb6],
    topology_graph=stk.polymer.Linear('AB', [0, 0], n=2)
)

# Create the crosser.
recombination = stk.GeneticRecombination(
    key=lambda mol: mol.func_groups[0].fg_type.name
)

# Get the offspring molecules.
cohort1 = list(
    recombination.cross(polymer1, polymer2, polymer3)
)

# Get a second set of offspring molecules.
cohort2 = list(
    recombination.cross(polymer1, polymer2, polymer3)
)

# Make a third set of offspring molecules by crossing two of
# the offspring molecules.
offspring1, offspring2, *rest = cohort1
cohort3 = list(
    recombination.cross(offspring1, offspring2)
)

Methods

cross(self, *mols)

Cross mols.

set_cache_use(self, use_cache)

Set use of the molecular cache on or off.

__init__(self, key, random_yield_order=True, random_seed=None, use_cache=False)

Initialize a GeneticRecombination instance.

Parameters
  • key (callable) – A callable, which takes a Molecule object and returns its gene or category. To produce an offspring, one of the building blocks from each category is picked at random.

  • random_seed (int, optional) – The random seed to use.

  • use_cache (bool, optional) – Toggles use of the molecular cache.

cross(self, *mols)

Cross mols.

Parameters

*mols (Molecule) – The molecules on which a crossover operation is performed.

Yields

Molecule – The generated offspring.

set_cache_use(self, use_cache)

Set use of the molecular cache on or off.

Parameters

use_cache (bool) – True if the molecular cache is to be used.

Returns

The calculator.

Return type

EAOperation

class Jumble(num_offspring_building_blocks, duplicate_building_blocks=False, random_yield_order=True, random_seed=None, use_cache=False)

Bases: stk.calculators.base_calculators._EAOperation, stk.calculators.ea.crossers.Crosser

Distributes all building blocks among offspring.

All the building blocks of the parents are put into one big pot, and building blocks are then drawn from the pot to generate the offspring. The offspring inherit the topology graph of one of the parents.
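The "one big pot" idea can be sketched in plain Python as shown here. This is an assumption-laden illustration rather than the code behind Jumble: molecules are tuples of building block names, jumble_sketch is a hypothetical function, every possible draw is enumerated in a fixed order, and the duplicates flag is only modeled on the duplicate_building_blocks parameter.

```python
from itertools import combinations, combinations_with_replacement


def jumble_sketch(parents, num_offspring_building_blocks, duplicates=False):
    """Yield groups of building blocks drawn from the pooled parents."""
    # Put the building blocks of every parent into one pot, keeping
    # a single copy of each and sorting for a reproducible order.
    pot = sorted({block for parent in parents for block in parent})
    # Optionally allow a building block to appear more than once
    # in the same offspring.
    draw = combinations_with_replacement if duplicates else combinations
    yield from draw(pot, num_offspring_building_blocks)


parents = [('bb1', 'bb2'), ('bb3', 'bb4')]
offspring = list(jumble_sketch(parents, 2))
with_duplicates = list(jumble_sketch(parents, 2, duplicates=True))
```

Unlike GeneticRecombination, nothing here keeps one building block per category, so an offspring such as ('bb1', 'bb3') can combine two building blocks that came from the same position in different parents.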

Examples

Note that any number of parents can be used for the crossover

import stk

# Create the molecules which will be crossed.
bb1 = stk.BuildingBlock('NCCN', ['amine'])
bb2 = stk.BuildingBlock('O=CCCCC=O', ['aldehyde'])
polymer1 = stk.ConstructedMolecule(
    building_blocks=[bb1, bb2],
    topology_graph=stk.polymer.Linear('AB', [0, 0], n=2)
)

bb3 = stk.BuildingBlock('NCCCN', ['amine'])
bb4 = stk.BuildingBlock('O=C[Si]CCC=O', ['aldehyde'])
polymer2 = stk.ConstructedMolecule(
    building_blocks=[bb3, bb4],
    topology_graph=stk.polymer.Linear('AB', [0, 0], n=2)
)

bb5 = stk.BuildingBlock('NC[Si]CN', ['amine'])
bb6 = stk.BuildingBlock('O=CCNNCCC=O', ['aldehyde'])
polymer3 = stk.ConstructedMolecule(
    building_blocks=[bb5, bb6],
    topology_graph=stk.polymer.Linear('AB', [0, 0], n=2)
)

# Create the crosser.
jumble = stk.Jumble(num_offspring_building_blocks=2)

# Get the offspring molecules.
cohort1 = list(jumble.cross(polymer1, polymer2, polymer3))

# Get a second set of offspring molecules.
cohort2 = list(jumble.cross(polymer1, polymer2, polymer3))

# Make a third set of offspring molecules by crossing two of
# the offspring molecules.
offspring1, offspring2, *rest = cohort1
cohort3 = list(jumble.cross(offspring1, offspring2))

Methods

cross(self, *mols)

Cross mols.

set_cache_use(self, use_cache)

Set use of the molecular cache on or off.

__init__(self, num_offspring_building_blocks, duplicate_building_blocks=False, random_yield_order=True, random_seed=None, use_cache=False)

Initialize a Jumble instance.

Parameters
  • num_offspring_building_blocks (int) – The number of building blocks each offspring is made from.

  • duplicate_building_blocks (bool, optional) – Indicates whether an offspring may contain the same building block more than once. If False, the building blocks used to construct the offspring must all be unique.

  • random_seed (int, optional) – The random seed to use.

  • use_cache (bool, optional) – Toggles use of the molecular cache.

cross(self, *mols)

Cross mols.

Parameters

*mols (Molecule) – The molecules on which a crossover operation is performed.

Yields

Molecule – The generated offspring.

set_cache_use(self, use_cache)

Set use of the molecular cache on or off.

Parameters

use_cache (bool) – True if the molecular cache is to be used.

Returns

The calculator.

Return type

EAOperation