A script to replace guests in complexes

Why?

Simply put, a co-worker asked if stk could swap one guest for another in a host-guest complex (regardless of chemistry), and I already had some code for doing it, but thought it is pretty useful for everyone. So, here we are!

How?

The entire script in the examples directory is shown below and is available, alongside Jupyter notebooks used in the video below, here. This process uses stk to read the host-guest and new guest molecules. But, NetworkX is the real hero here! Basically, we convert the stk.Molecule class to a NetworkX.graph based on the bonds and atoms in the molecule. Then, use the NetworkX.connected_components(graph) to get atoms that are not bonded (e.g. host and guest molecules). The rest is simple: a helper function to collect either the biggest (host) or smallest (guest) component, and then build a new stk.host_guest.Complex ConstructedMolecule from the host and the new guest. A reasonable conformer is produced using SpinDry (stk.Spinner). What I like about this is that stk does not care about what the host and guest actually are - so use your imagination about what structural replacements you can do!

import stk
import networkx as nx
import sys
import os


def get_disconnected_components(molecule):

    # Produce a graph from the molecule that does not include edges
    # where the bonds to be optimized are.
    mol_graph = nx.Graph()
    for atom in molecule.get_atoms():
        mol_graph.add_node(atom.get_id())

    # Add edges.
    for bond in molecule.get_bonds():
        pair_ids = (
            bond.get_atom1().get_id(), bond.get_atom2().get_id()
        )
        mol_graph.add_edge(*pair_ids)

    # Get atom ids in disconnected subgraphs.
    components = {}
    for c in nx.connected_components(mol_graph):
        c_ids = sorted(c)
        molecule.write('temp_mol.mol', atom_ids=c_ids)
        num_atoms = len(c_ids)
        newbb = stk.BuildingBlock.init_from_file('temp_mol.mol')
        os.system('rm temp_mol.mol')

        components[num_atoms] = newbb

    return components


def extract_host(molecule):
    components = get_disconnected_components(molecule)
    return components[max(components.keys())]


def extract_guest(molecule):
    components = get_disconnected_components(molecule)
    return components[min(components.keys())]


def main():
    if (not len(sys.argv) == 3):
        print(
            f'Usage: {__file__}\n'
            '   Expected 2 arguments: host_with_g_file new_guest_file'
        )
        sys.exit()
    else:
        host_with_g_file = sys.argv[1]
        new_guest_file = sys.argv[2]

    # Load in host.
    host_with_guest = stk.BuildingBlock.init_from_file(host_with_g_file)
    
    # Load in new guest.
    new_guest = stk.BuildingBlock.init_from_file(new_guest_file)
    
    # Split host and guest, assuming host has more atoms than guest.
    host = extract_host(host_with_guest)
    old_guest = extract_guest(host_with_guest)
    
    # Build new host-guest structure, with Spindry optimiser to 
    # do some conformer searching.
    new_host = stk.ConstructedMolecule(
        stk.host_guest.Complex(
            host=stk.BuildingBlock.init_from_molecule(host),
            guests=(stk.host_guest.Guest(new_guest), ),
            # There are options for the Spinner class, 
            # if the optimised conformer is crap.
            optimizer=stk.Spinner(),
        ),
    )
    
    # Write out new host guest.
    new_host.write('new_host_guest.mol')


if __name__ == '__main__':
    main()

Examples and limitations.

Currently, the provided script swaps out the smallest molecule in a complex for the new molecule (defined from .mol files). However, the tutorial below shows that we can swap the host or guest for any stk molecule. Additionally, we could easily extend this to systems with more than two distinct molecules.

Please, test it, use it, break it and send me feedback!