pysoftk.linear_polymer package
Submodules
pysoftk.linear_polymer.calculators module
- class pysoftk.linear_polymer.calculators.Opt(xyz_file)[source]
Bases: object
Provides tools for performing geometry optimizations of molecular structures. This class leverages external computational chemistry engines, specifically GFN2-xTB and PySCF (with its semiempirical implementations), to minimize the energy of a given molecular geometry.
The primary use case for this class is to find stable 3D conformations of molecules by iteratively adjusting atomic positions until a local energy minimum is reached.
Note
Successful execution of the optimization methods within this class requires the prior installation and correct configuration of the chosen computational engine:
- GFN-xTB: The xtb executable must be accessible in your system’s PATH. You can typically install it via conda or by compiling from source.
- PySCF Semiempirical: The pyscf Python package, including its semiempirical modules, must be installed in your Python environment. Installation is usually done via pip (pip install pyscf).
- pyscf_semi(steps)[source]
Performs a geometry optimization using the PySCF semiempirical implementation, specifically the MINDO/3 method.
This function sets up a PySCF molecular object from the provided XYZ file, configures a semiempirical calculation, and then applies a geometry optimization algorithm (Berny solver) to find the equilibrium structure. The optimized coordinates are then written to an XYZ file.
- Parameters:
steps (int) – The maximum number of steps the PySCF optimization routine will take while trying to converge the geometry. The optimization may terminate earlier if the convergence criteria are met.
- Returns:
This method does not return any value. Instead, it generates an output file named pyscf_final.xyz in the current working directory, which contains the optimized Cartesian coordinates of the molecule. A confirmation message will be printed to the console upon successful file creation.
- Return type:
None
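The role of the steps argument can be illustrated with a toy one-dimensional minimizer. This is a minimal sketch, not the Berny solver and not PySoftK code: the function name, the harmonic energy model, and the step size are all assumptions chosen for illustration. It shows the two ways such a loop ends: the gradient falls below a tolerance, or the step budget runs out first.

```python
def minimize_bond_length(r0, r_eq=1.0, k=1.0, step=0.1, max_steps=100, tol=1e-6):
    """Steepest descent on the toy energy E(r) = 0.5 * k * (r - r_eq)**2.

    Hypothetical helper for illustration only.
    Returns (final_r, steps_used, converged).
    """
    r = r0
    for n in range(1, max_steps + 1):
        gradient = k * (r - r_eq)
        if abs(gradient) < tol:
            # Converged before using the whole step budget.
            return r, n - 1, True
        r -= step * gradient  # move downhill along the negative gradient
    # Step budget exhausted; report whether the last point happens to be converged.
    return r, max_steps, abs(k * (r - r_eq)) < tol


# A generous budget converges; a tiny budget stops early and reports failure.
print(minimize_bond_length(1.5, max_steps=500))
print(minimize_bond_length(1.5, max_steps=5))
```

If a run stops because the budget is exhausted, the output geometry is only partially relaxed, so it is worth checking convergence before using the structure.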
- class pysoftk.linear_polymer.calculators.Pyscf_print[source]
Bases: object
A utility class designed to convert and save the results of a PySCF geometry optimization into a standard Cartesian coordinate (XYZ) file format.
This class acts as a bridge between PySCF’s internal molecular object representation and a widely recognized format for molecular structures, making the optimized geometries easily shareable and viewable with molecular visualization software.
- xyz(mol)[source]
Writes the optimized Cartesian coordinates from a PySCF molecular object to a new .xyz file named pyscf_final.xyz.
This function takes the atomic symbols and their optimized coordinates from a PySCF Mole object, converts the coordinates from Bohr to Angstroms, and formats them into the XYZ file standard.
- Parameters:
mol (pyscf.gto.mole.Mole) – A PySCF `gto.Mole` object containing the optimized molecular geometry, typically obtained from a PySCF geometry optimization calculation. This object holds information about atomic symbols and their Cartesian coordinates.
- Returns:
This method does not return any value. It creates a file named pyscf_final.xyz in the current working directory, which contains the optimized molecular structure. A confirmation message “File pyscf_final.xyz has been created.” is printed to standard output.
- Return type:
None
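The conversion described above can be sketched without PySCF. The example below is an illustration under stated assumptions, not the library's implementation: format_xyz is a hypothetical helper, and the column widths are an arbitrary choice. It takes atomic symbols and coordinates in Bohr, converts them to Ångstroms, and lays them out in the XYZ convention (atom count, comment line, then one atom per line).

```python
BOHR_TO_ANGSTROM = 0.529177210903  # CODATA 2018 Bohr radius in Angstroms


def format_xyz(symbols, coords_bohr, comment=""):
    """Return XYZ-formatted text for coordinates given in Bohr.

    Hypothetical helper for illustration; not part of PySoftK or PySCF.
    """
    if len(symbols) != len(coords_bohr):
        raise ValueError("symbols and coords_bohr must have the same length")
    # XYZ layout: line 1 is the atom count, line 2 is a free-form comment.
    lines = [str(len(symbols)), comment]
    for symbol, (x, y, z) in zip(symbols, coords_bohr):
        x, y, z = (value * BOHR_TO_ANGSTROM for value in (x, y, z))
        lines.append(f"{symbol:<2s} {x:14.8f} {y:14.8f} {z:14.8f}")
    return "\n".join(lines) + "\n"


# H2 with a bond length of 1.4 Bohr.
print(format_xyz(["H", "H"], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)], comment="H2"))
```

Writing the returned string to a file with open("pyscf_final.xyz", "w") would produce an output comparable to the one described above.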
pysoftk.linear_polymer.linear_polymer module
- class pysoftk.linear_polymer.linear_polymer.Lp(mol, atom, n_copies, shift)[source]
Bases: object
A class for constructing linear polymers from individual molecular units (monomers) using RDKit functionalities.
This class facilitates the creation of polymeric structures by taking a monomer RDKit molecule, replicating it a specified number of times, and then combining these copies into a linear chain. It handles the spatial arrangement and bonding of the monomer units, including the removal of placeholder atoms and subsequent geometry optimization using force fields.
Examples
>>> from rdkit import Chem
>>> from pysoftk.linear_polymer.linear_polymer import Lp
>>> # Example: create a simple monomer with a placeholder atom (e.g., [Pt])
>>> monomer_smiles = "C([Pt])C"
>>> monomer = Chem.MolFromSmiles(monomer_smiles)
>>> # Initialize Lp with the monomer, placeholder atom, number of copies, and shift
>>> linear_poly_builder = Lp(mol=monomer, atom="Pt", n_copies=3, shift=1.5)
>>> # Generate the linear polymer with the MMFF force field
>>> polymer_mol = linear_poly_builder.linear_polymer(force_field="MMFF")
>>> # linear_polymer returns an openbabel.pybel.Molecule, so use the pybel accessors
>>> print(f"Polymer atoms: {len(polymer_mol.atoms)}")
>>> print(f"Polymer bonds: {polymer_mol.OBMol.NumBonds()}")
Note
The RDKit and OpenBabel Python packages must be installed for this class to function correctly. Ensure that any placeholder atoms used in the monomer SMILES (e.g., “[Pt]”) are consistent with the atom parameter.
- atom
- bond_conn(outmol)[source]
Identifies and processes the bonding information related to the placeholder atom within the combined super-monomer structure (outmol).
This method extracts information about the atoms that are bonded to the placeholder atom, which is crucial for forming new bonds between monomers and subsequently removing the placeholder.
- Parameters:
outmol (rdkit.Chem.rdchem.Mol) – The RDKit Mol object representing the partially formed polymer (combined super-monomer), containing the placeholder atoms.
- Returns:
Tuple – A tuple containing two lists:
- The first list (all_conn) contains tuples of atom indices that represent the connections to be made between the monomers after the placeholder atoms are removed.
- The second list (erase_br) contains the indices of the placeholder atoms that need to be removed from the molecule.
- Return type:
(list[tuple], list[int])
- copy_mol()[source]
Generates a list of identical RDKit molecular objects, each representing a copy of the initial monomer.
Before replication, the conformer of the original molecule is canonicalized to ensure consistent orientation for all copies.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit Mol object that will be replicated. It is not passed as an argument; the method reads it from self.mol.
- Returns:
fragments – A list containing n_copies identical RDKit Mol objects, each ready to be incorporated into the polymer chain.
- Return type:
list[rdkit.Chem.rdchem.Mol]
- linear_polymer(force_field='MMFF', relax_iterations=350, rot_steps=125, no_att=True)[source]
Generates and optimizes the 3D structure of a linear polymer from the provided monomer unit.
This is the main method for creating the final polymer. It first constructs a “proto-polymer” by combining monomers and forming initial bonds. Then, it can optionally remove any remaining placeholder atoms. Finally, it applies a specified force field to relax the geometry and performs rotor optimization to refine the polymer’s conformation.
- Parameters:
force_field (str, optional) – The name of the force field to use for geometry optimization. Accepted values are “MMFF”, “UFF”, or “MMFF94”. “MMFF” will automatically default to “MMFF94”. Defaults to “MMFF”.
relax_iterations (int, optional) – The maximum number of iterations to use during the force field relaxation step. A higher number generally leads to better minimization but takes longer. Defaults to 350.
rot_steps (int, optional) – The number of steps for rotor optimization, which involves optimizing dihedral angles to find low-energy conformers. Defaults to 125.
no_att (bool, optional) – If True, any remaining placeholder atoms (as defined by self.atom) will be removed from the polymer structure before force field optimization. If False, placeholder atoms will be retained. Defaults to True.
- Returns:
newMol_H – An OpenBabel Molecule object representing the optimized 3D structure of the linear polymer.
- Return type:
openbabel.pybel.Molecule
- Raises:
ValueError – If an invalid force_field is provided or if relax_iterations or rot_steps are not integers.
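The argument rules above (accepted force field names, the “MMFF” alias, and the integer checks) can be sketched as a small validator. This is an assumption-laden illustration rather than the library's code: check_relaxation_args is a hypothetical name, and whether PySoftK matches force field names case-sensitively or rejects booleans is not stated in the documentation.

```python
# Accepted names taken from the parameter description; "MMFF" maps to "MMFF94".
_FORCE_FIELD_ALIASES = {"MMFF": "MMFF94", "MMFF94": "MMFF94", "UFF": "UFF"}


def check_relaxation_args(force_field="MMFF", relax_iterations=350, rot_steps=125):
    """Validate the arguments and return them with the force field resolved.

    Hypothetical helper for illustration; not part of PySoftK.
    """
    if force_field not in _FORCE_FIELD_ALIASES:
        allowed = ", ".join(sorted(_FORCE_FIELD_ALIASES))
        raise ValueError(f"force_field must be one of {allowed}, got {force_field!r}")
    for name, value in (("relax_iterations", relax_iterations), ("rot_steps", rot_steps)):
        # Assumption: bool is rejected even though it subclasses int in Python.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} must be an integer, got {value!r}")
    return _FORCE_FIELD_ALIASES[force_field], relax_iterations, rot_steps


print(check_relaxation_args())          # defaults, with "MMFF" resolved
print(check_relaxation_args("UFF", 500, 50))
```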
- max_dist_mol()[source]
Calculates and returns the maximum interatomic distance within the initial RDKit monomer molecule (self.mol).
This distance is crucial for determining the appropriate spacing between repeated monomer units to avoid steric clashes during the initial polymer assembly.
- Returns:
The largest distance (in Ångstroms) found between any two atoms in the monomer’s conformational representation.
- Return type:
float
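The quantity described above is the largest pairwise distance in a set of atomic positions. A minimal standard-library sketch follows; it assumes coordinates are given as plain (x, y, z) tuples rather than read from an RDKit conformer, and max_interatomic_distance is a hypothetical name.

```python
from itertools import combinations
from math import dist


def max_interatomic_distance(coords):
    """Return the largest Euclidean distance between any two points.

    Hypothetical helper for illustration; coords is a sequence of (x, y, z)
    tuples in Angstroms.
    """
    if len(coords) < 2:
        return 0.0  # a single atom has no interatomic distance
    return max(dist(a, b) for a, b in combinations(coords, 2))


monomer_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 4.0)]
print(max_interatomic_distance(monomer_coords))
```

The brute-force loop over all pairs scales quadratically with the number of atoms, which is acceptable for a single monomer.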
- mol
- n_copies
- polimerisation(fragments)[source]
Recursively combines a list of RDKit monomer fragments into a single linear polymer RDKit molecule.
Each subsequent fragment is combined with the growing polymer chain, with a calculated X-axis offset to maintain linearity and avoid overlap. The atoms in the resulting molecule are then renumbered for canonical ordering.
- Parameters:
fragments (list[rdkit.Chem.rdchem.Mol]) – A list of RDKit Mol objects, each representing a monomer unit to be joined to form the polymer.
- Returns:
outmol – A single RDKit Mol object representing the combined linear polymer with all monomer units spatially arranged and canonically ordered.
- Return type:
rdkit.Chem.rdchem.Mol
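The spatial part of this step, placing each copy at a growing offset along the X axis, can be sketched on bare coordinates. The example is iterative rather than recursive and ignores bonding and atom renumbering, so it is only an illustration of the offset idea; place_along_x and its arguments are hypothetical names, and the spacing is assumed to be a constant multiple of the copy index.

```python
def place_along_x(monomer_coords, n_copies, x_shift):
    """Return coordinates for n_copies of a monomer spaced along the X axis.

    Hypothetical helper for illustration; copy i is translated by i * x_shift.
    """
    chain = []
    for copy_index in range(n_copies):
        offset = copy_index * x_shift
        # Only X changes, so the copies line up in a straight row.
        chain.extend((x + offset, y, z) for x, y, z in monomer_coords)
    return chain


monomer_coords = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0)]
for position in place_along_x(monomer_coords, n_copies=3, x_shift=2.5):
    print(position)
```

Choosing x_shift larger than the monomer's maximum interatomic distance keeps neighbouring copies from overlapping in this simplified picture.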
- proto_polymer()[source]
Constructs the initial “proto-polymer” by combining monomer units, forming new bonds, and removing the placeholder atoms.
This method orchestrates the steps of copying monomers, spatially arranging them, identifying and creating new bonds between them, and finally removing the temporary placeholder atoms used for connection. Hydrogen atoms are also added to the resulting molecule.
- Returns:
newMol_H – A new RDKit Mol object representing the raw linear polymer structure, with placeholder atoms removed and new bonds formed, and explicit hydrogen atoms added with their coordinates.
- Return type:
rdkit.Chem.rdchem.Mol
- shift
- x_shift()[source]
Computes the refined X-axis translation value for positioning subsequent monomer units during polymer construction.
This method adjusts the initial shift value based on the maximum distance between atoms in the monomer, aiming to create a more realistic and physically sound initial arrangement of the polymer chain.
- Returns:
shift_final – The calculated X-axis translation value (in Ångstroms) that will be applied to each repeating unit to properly space them in the linear polymer.
- Return type:
float
- class pysoftk.linear_polymer.linear_polymer.Lpr(mol, replacements, max_repetitions=10, final_replacement='*')[source]
Bases: object
Facilitates the generation of complex chemical structures by recursively substituting placeholders within a SMILES string, with a defined stopping condition and a final replacement for any remaining placeholders.
This class is particularly useful for building polymeric or dendritic structures where a repeating unit needs to be expanded multiple times. It integrates with OpenBabel for 3D structure generation and force field optimization of the final molecule.
- final_replacement
- generate_recursive_smiles(force_field='MMFF', relax_iterations=350, rot_steps=125)[source]
Executes the recursive SMILES generation process, followed by 3D structure generation and optimization using OpenBabel.
The method iteratively substitutes placeholders in the SMILES string until either no more substitutions can be made or max_repetitions is reached. After generating the final SMILES string, it converts it to a 3D molecule, applies a force field for geometry relaxation, and performs rotor optimization.
- Parameters:
force_field (str, optional) – The name of the force field to use for geometry optimization. Accepted values are “MMFF”, “UFF”, or “MMFF94”. “MMFF” will automatically default to “MMFF94”. Defaults to “MMFF”.
relax_iterations (int, optional) – The maximum number of iterations to use during the force field relaxation step. A higher number generally leads to better minimization but takes longer. Defaults to 350.
rot_steps (int, optional) – The number of steps for rotor optimization, which involves optimizing dihedral angles to find low-energy conformers. Defaults to 125.
- Returns:
An OpenBabel Molecule object representing the optimized 3D structure of the generated molecule.
- Return type:
openbabel.pybel.Molecule
- Raises:
KeyError – If a placeholder specified in the SMILES string is not found in the replacements dictionary.
ValueError – If an invalid force_field is provided or if relax_iterations or rot_steps are not integers.
- max_repetitions
- mol
- replacements
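The substitution loop that Lpr describes (expand placeholders until none remain or max_repetitions is reached, then cap whatever is left with final_replacement) can be sketched as plain string processing. This is a sketch under assumptions: the {NAME} placeholder syntax, the expand_smiles name, and the use of regular expressions are all hypothetical, and the 3D generation and force field steps are left out.

```python
import re

# Assumed placeholder syntax for this sketch: a name wrapped in curly braces.
_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def expand_smiles(template, replacements, max_repetitions=10, final_replacement="*"):
    """Expand placeholders repeatedly, then cap any that remain.

    Hypothetical helper for illustration; not the PySoftK implementation.
    Raises KeyError if a placeholder name is missing from replacements.
    """
    smiles = template
    for _ in range(max_repetitions):
        if not _PLACEHOLDER.search(smiles):
            break  # nothing left to substitute
        # The dictionary lookup raises KeyError for an unknown placeholder.
        smiles = _PLACEHOLDER.sub(lambda match: replacements[match.group(1)], smiles)
    # Any placeholder still present after the loop gets the final replacement.
    return _PLACEHOLDER.sub(lambda match: final_replacement, smiles)


# A self-referencing replacement grows the chain once per repetition.
print(expand_smiles("CC{R}", {"R": "OCC{R}"}, max_repetitions=3, final_replacement="O"))
```

Because the replacement text can itself contain a placeholder, max_repetitions is what keeps the expansion finite.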
pysoftk.linear_polymer.super_monomer module
Super Monomer Module.
This module provides the Sm class, which handles the chemical linking of two RDKit molecules via a designated placeholder atom. It includes methods for full 3D monomer generation, as well as highly optimized 2D topological merging for ultra-large polymer assemblies.
- class pysoftk.linear_polymer.super_monomer.Sm(mol_1: Mol, mol_2: Mol, atom: str)[source]
Bases: object
Class to create a new combined molecule (dimer/monomer) from two provided RDKit Mol objects by linking them at specified placeholder atoms.
This class supports standard SMARTS-based reactions with 3D embedding, as well as ultra-fast direct 2D graph manipulation for large-scale polymerization algorithms.
- mol_1
The first RDKit molecule object, containing the placeholder atom.
- Type:
rdkit.Chem.rdchem.Mol
- mol_2
The second RDKit molecule object, also containing the placeholder atom.
- Type:
rdkit.Chem.rdchem.Mol
- atom
The atomic symbol of the placeholder atom (e.g., ‘Br’, ‘At’, ‘X’) indicating the connection points.
- Type:
str
- atom
- build_topology_only() Mol[source]
Highly optimized method to connect molecules via direct 2D graph manipulation.
Bypasses the RDKit reaction engine and SMILES conversion, allowing instantaneous merging of large polymer chains. It strictly skips 3D embedding and Hydrogen addition to maintain O(1) performance.
- Returns:
The cleanly connected 2D RDKit Mol object.
- Return type:
rdkit.Chem.rdchem.Mol
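The idea of linking two molecules by editing the graph directly, instead of running a reaction, can be sketched on plain Python lists. This is an analogy under assumptions, not the RDKit-based implementation: atoms are a list of element symbols, bonds are index pairs, link_graphs is a hypothetical name, and the first placeholder found in each molecule is the one consumed.

```python
def _placeholder_and_neighbour(atoms, bonds, placeholder):
    """Return (placeholder index, index of the single atom bonded to it)."""
    index = atoms.index(placeholder)  # raises ValueError if no placeholder exists
    neighbours = [j if i == index else i for i, j in bonds if index in (i, j)]
    if len(neighbours) != 1:
        raise ValueError("placeholder must have exactly one neighbour")
    return index, neighbours[0]


def link_graphs(atoms_1, bonds_1, atoms_2, bonds_2, placeholder):
    """Merge two molecular graphs, bonding the atoms next to each placeholder.

    Hypothetical helper for illustration; returns (atoms, bonds) with both
    placeholder atoms removed and indices renumbered.
    """
    p1, n1 = _placeholder_and_neighbour(atoms_1, bonds_1, placeholder)
    p2, n2 = _placeholder_and_neighbour(atoms_2, bonds_2, placeholder)
    offset = len(atoms_1)  # indices of the second molecule shift by this amount
    atoms = list(atoms_1) + list(atoms_2)
    bonds = list(bonds_1) + [(i + offset, j + offset) for i, j in bonds_2]
    removed = {p1, p2 + offset}
    # Drop bonds to the placeholders, then add the new inter-molecule bond.
    bonds = [bond for bond in bonds if not removed.intersection(bond)]
    bonds.append((n1, n2 + offset))
    # Renumber the surviving atoms so indices stay contiguous.
    new_index = {}
    for old in range(len(atoms)):
        if old not in removed:
            new_index[old] = len(new_index)
    merged_atoms = [symbol for old, symbol in enumerate(atoms) if old not in removed]
    merged_bonds = [(new_index[i], new_index[j]) for i, j in bonds]
    return merged_atoms, merged_bonds


# Ethyl bromide fragment joined to a bromo-methanol-like fragment at Br.
print(link_graphs(["C", "C", "Br"], [(0, 1), (1, 2)],
                  ["Br", "C", "O"], [(0, 1), (1, 2)], "Br"))
```

No coordinates or hydrogens are involved at any point, which mirrors the topology-only nature of the method described above.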
- constructor() Mol[source]
Combines the two input molecules using a SMARTS-defined chemical reaction.
The reaction replaces the placeholder atoms and joins the molecules. It converts the resulting products to SMILES and back to a Mol object to naturally sanitize and standardize the chemical graph.
- Returns:
The resulting combined molecule as an RDKit Mol object.
- Return type:
rdkit.Chem.rdchem.Mol
- Raises:
ValueError – If the reaction fails to yield valid products or SMILES strings.
- mol_1
- mol_2