pysoftk.tools package
Submodules
pysoftk.tools.utils_func module
- pysoftk.tools.utils_func.atom_neigh(mol, atom)[source]
Identifies all placeholder atoms of a given type and their direct neighbors.
For each atom in the molecule that matches the specified placeholder atom symbol, this function finds all its neighboring atoms. It returns a flat list of tuples, where each tuple contains the index of a placeholder atom and the index of one of its neighbors. If a placeholder has multiple neighbors, multiple tuples will be generated for that placeholder, one for each neighbor.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
atom (str) – The atomic symbol of the placeholder atom (e.g., “Br”, “*”).
- Returns:
- A flat list of (placeholder_atom_index, neighbor_atom_index) tuples. Example: If P1 is a placeholder bonded to N1a and N1b, and P2 is a placeholder bonded to N2a, the result could be [(idx_P1, idx_N1a), (idx_P1, idx_N1b), (idx_P2, idx_N2a)]. The order depends on RDKit’s internal atom iteration.
- Return type:
list[tuple[int, int]]
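The shape of the returned list can be illustrated without RDKit. The sketch below is a toy model under stated assumptions, not the library's implementation: `symbols` and `bonds` are hypothetical stand-ins for a molecule's atoms and bonds, and `atom_neigh_sketch` is an illustrative name.

```python
def atom_neigh_sketch(symbols, bonds, atom):
    """Toy model: return (placeholder_idx, neighbor_idx) pairs as a flat list."""
    # Build an adjacency list from the undirected bond list.
    neighbors = {i: [] for i in range(len(symbols))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    # Emit one tuple per (placeholder, neighbor) pair, in atom-index order.
    return [
        (i, n)
        for i, symbol in enumerate(symbols)
        if symbol == atom
        for n in neighbors[i]
    ]


# Hypothetical Br-C-C-Br chain: atoms 0 and 3 are the placeholders.
symbols = ["Br", "C", "C", "Br"]
bonds = [(0, 1), (1, 2), (2, 3)]
pairs = atom_neigh_sketch(symbols, bonds, "Br")
```

Here `pairs` holds one tuple per placeholder because each Br has a single neighbor; a placeholder with two neighbors would contribute two tuples.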
- pysoftk.tools.utils_func.count_plholder(mol, atom)[source]
Counts the occurrences of a specific placeholder atom type in an RDKit molecule.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object to inspect.
atom (str) – The atomic symbol of the placeholder atom to count (e.g., “Br”, “*”).
- Returns:
- The total number of placeholder atoms of the specified type found in the molecule.
- Return type:
int
- pysoftk.tools.utils_func.create_pol(mol, atom, tpb)[source]
Creates specified bonds in a molecule and removes internal placeholder atoms.
This function modifies the input RDKit molecule mol by:
1. Converting mol to an RWMol object to allow modifications.
2. Adding new single bonds between atom pairs specified in tpb.
3. Identifying all atoms matching the placeholder atom symbol (e.g., “Br”, “*”).
4. Removing all occurrences of these placeholder atoms except those that are effectively terminal (i.e., the ones with the smallest and largest atom indices among all identified placeholders, after sorting).
This is typically used in polymer construction to form bonds between monomer units (connected via placeholders) and then remove the (now internal) placeholder atoms. Terminal placeholders might be retained.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object. This is often a single Mol object containing disconnected fragments or monomers linked by placeholders. It will be modified.
atom (str) – The atomic symbol of the placeholder atom (e.g., “Br”, “*”). This symbol is used to identify atoms for removal.
tpb (list[tuple[int, int]]) – A list of tuples, where each tuple (idx1, idx2) specifies the atom indices to be connected by a new single bond.
- Returns:
- The modified RDKit molecule object (as an RWMol) with new bonds added and specified internal placeholders removed.
- Return type:
rdkit.Chem.rdchem.RWMol
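The rule for which placeholders are removed (all except the smallest and largest index) can be sketched on plain index lists. This is an illustration of the selection rule as described above, not the library's code; `internal_placeholders` is a hypothetical helper name.

```python
def internal_placeholders(placeholder_indices):
    """Return the placeholder indices that would be removed.

    The smallest and largest indices are treated as terminal and kept;
    everything in between is considered internal.
    """
    ordered = sorted(placeholder_indices)
    return ordered[1:-1]


# Hypothetical example: four placeholders found at these atom indices.
to_remove = internal_placeholders([11, 0, 6, 5])
```

With fewer than three placeholders nothing is selected for removal. When atoms are actually deleted from an editable molecule, the indices of the remaining atoms can change, so deleting in descending index order is a common precaution; whether create_pol does so is not stated here.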
- pysoftk.tools.utils_func.get_file_extension(path)[source]
Extracts the filename and extension from a given path.
- Parameters:
path (str) – The full path to the file.
- Returns:
- A tuple containing two strings: (1) the filename without the extension, and (2) the file extension (including the dot, e.g., “.txt”). If the path has no extension, the second element is an empty string.
- Return type:
tuple[str, str]
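The documented return value matches what the standard library's `os.path.splitext` produces. The sketch below assumes that behavior; `get_file_extension_sketch` is an illustrative name, and whether the real function also strips the directory part is not stated here.

```python
import os


def get_file_extension_sketch(path):
    """Split a path into (name_without_extension, extension_with_dot)."""
    name, ext = os.path.splitext(path)
    return name, ext


with_ext = get_file_extension_sketch("molecule.mol")
without_ext = get_file_extension_sketch("README")
```

Note that `os.path.splitext` keeps any directory prefix in the first element, so "data/molecule.mol" would give ("data/molecule", ".mol") in this sketch.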
- pysoftk.tools.utils_func.pattern_mol_seq(mols, pattern)[source]
Generates a sequence of molecule names based on a pattern string and a list of molecule names.
It first identifies unique characters in the pattern (e.g., “A”, “B” from “A+B+A”). Then, it creates replacement tuples by pairing these unique characters (sorted) with the molecule names provided in the mols list, in their respective orders. Finally, it uses pattern_repl to substitute these characters in the pattern with their corresponding molecule names and splits the result.
Example
mols = [“water”, “ethanol”]
pattern = “A+B+A”
Unique chars in pattern (sorted alphabetically, case-insensitively): [‘A’, ‘B’]
Replacement tuples: [(‘A’, “water”), (‘B’, “ethanol”)]
Resulting sequence: [“water”, “ethanol”, “water”]
- Parameters:
mols (list[str]) – A list of molecule names. The order should correspond to the alphabetically sorted unique characters from the pattern.
pattern (str) – A string defining the sequence pattern, using characters as placeholders for molecules (e.g., “A+B+A”).
- Returns:
- A list of molecule names arranged according to the pattern. Returns an empty list or partially resolved list if mols and unique characters in pattern do not align as expected.
- Return type:
list[str]
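The three steps described above (collect unique characters, pair them with names, substitute and split) can be sketched in plain Python. This is a minimal illustration, not the library's code. It assumes that only letters count as placeholder characters, since the documented example lists [‘A’, ‘B’] and not the “+” delimiter. `pattern_mol_seq_sketch` is a hypothetical name.

```python
def pattern_mol_seq_sketch(mols, pattern):
    """Expand a pattern such as "A+B+A" into a list of molecule names."""
    # Assumption: only letters are placeholders; "+" is the delimiter.
    unique = sorted({c for c in pattern if c.isalpha()}, key=str.upper)
    result = pattern
    # Pair each unique character with the molecule name at the same position.
    for char, name in zip(unique, mols):
        result = result.replace(char, name)
    # Split on the delimiter and drop empty pieces.
    return [part for part in result.split("+") if part]


sequence = pattern_mol_seq_sketch(["water", "ethanol"], "A+B+A")
```

Because `zip` stops at the shorter input, a pattern with more unique characters than names leaves the extra characters unreplaced, which is one way a partially resolved list could arise.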
- pysoftk.tools.utils_func.pattern_recon(pattern)[source]
Identifies unique characters in a string pattern and sorts them.
Uniqueness is case-sensitive, while sorting is case-insensitive alphabetical. For example, “aBcAb” contains five distinct characters (‘a’, ‘A’, ‘B’, ‘b’, ‘c’), which are sorted by their uppercase representation: the two forms of A first, then the two forms of B, then ‘c’. The relative order of characters that differ only in case is not guaranteed.
- Parameters:
pattern (str) – The input string pattern (e.g., “ABCA”).
- Returns:
- A list of the unique characters from the pattern, sorted alphabetically and case-insensitively. Because set is case-sensitive, upper- and lowercase forms of a letter are kept as separate entries: “bBaAcC” yields the set {‘b’, ‘B’, ‘a’, ‘A’, ‘c’, ‘C’}, which sorted(…, key=str.upper) orders as A/a, then B/b, then C/c (the order within each case pair is not guaranteed).
- Return type:
list[str]
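The behavior described above, a case-sensitive set followed by a sort keyed on `str.upper`, can be reproduced in one line. The sketch below only mirrors that description; `pattern_recon_sketch` is an illustrative name.

```python
def pattern_recon_sketch(pattern):
    """Unique characters of pattern, sorted case-insensitively."""
    # set() keeps "a" and "A" as separate entries; the sort key groups them.
    return sorted(set(pattern), key=str.upper)


simple = pattern_recon_sketch("ABCA")
mixed = pattern_recon_sketch("aBcAb")
# Compare on uppercase, since the order within a case pair is not fixed.
mixed_upper = [c.upper() for c in mixed]
```

For “ABCA” the result is the three letters in alphabetical order. For “aBcAb” all five distinct characters are returned, grouped by letter.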
- pysoftk.tools.utils_func.pattern_repl(pattern, tup_repl)[source]
Replaces characters in a pattern string based on a list of replacement tuples.
The function iterates through each tuple in tup_repl. Each tuple is expected to contain an old character (or substring) to be replaced and a new string (the replacement). After all replacements, the modified pattern string is split by the “+” delimiter, and any empty strings resulting from the split are removed.
- Parameters:
pattern (str) – The initial string pattern (e.g., “A+B+A”).
tup_repl (list[tuple[str, str]]) – A list of tuples, where each tuple contains two strings: (substring_to_replace, replacement_string). Example: [(‘A’, ‘mol1’), (‘B’, ‘mol2’)]
- Returns:
- A list of strings derived from the modified pattern after replacements and splitting by “+”. Empty strings are filtered out. For pattern=“A+B+A” and tup_repl=[(‘A’, ‘X’), (‘B’, ‘Y’)], it becomes “X+Y+X”, then returns [‘X’, ‘Y’, ‘X’].
- Return type:
list[str]
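The replace-then-split procedure can be sketched as follows. This assumes the replacements are applied one after another with `str.replace`, which is consistent with the description but not confirmed here; `pattern_repl_sketch` is an illustrative name.

```python
def pattern_repl_sketch(pattern, tup_repl):
    """Apply each (old, new) replacement in order, then split on "+"."""
    for old, new in tup_repl:
        pattern = pattern.replace(old, new)
    # Drop empty strings produced by leading, trailing, or doubled "+".
    return [part for part in pattern.split("+") if part]


basic = pattern_repl_sketch("A+B+A", [("A", "X"), ("B", "Y")])
doubled = pattern_repl_sketch("A++B", [("A", "X"), ("B", "Y")])
# A later replacement also rewrites text inserted by an earlier one.
chained = pattern_repl_sketch("A+B", [("A", "B1"), ("B", "Y")])
```

Under this assumption, a replacement string that contains a later placeholder character is itself rewritten, as `chained` shows, so replacement names are safest when they do not contain any placeholder characters.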
- pysoftk.tools.utils_func.tuple_bonds(lst_atm_neigh)[source]
Generates pairs of atom indices for creating new bonds from a list of placeholder-neighbor pairs.
This function processes an ordered list of (placeholder_atom_index, neighbor_atom_index) tuples. It extracts the neighbor indices from the “internal” tuples in this list (i.e., excluding the first and the last tuple from lst_atm_neigh when collecting neighbor indices) and then pairs these collected neighbor indices sequentially. This is typically used to define new bonds that will connect monomers in a polymer chain.
The function assumes lst_atm_neigh is appropriately ordered (e.g., by placeholder index along a chain) for the desired connectivity.
Example
If lst_atm_neigh = [(P0,N0), (P1,N1), (P2,N2), (P3,N3), (P4,N4)]:
- It considers internal elements for neighbor collection: (P1,N1), (P2,N2), (P3,N3).
- It extracts their neighbor indices: bond_idx becomes [N1, N2, N3].
- It then pairs them: [(N1, N2)]. The last element N3 is left unpaired because bond_idx has odd length.
- Parameters:
lst_atm_neigh (list[tuple[int, int]]) – An ordered list of tuples, where each tuple is (placeholder_atom_index, neighbor_atom_index). This list might be generated from atom_neigh and subsequently sorted/ordered.
- Returns:
- A list of tuples, where each tuple contains two neighbor_atom_indices that are intended to be bonded together. For example, [(neighbor_idx1, neighbor_idx2), (neighbor_idx3, neighbor_idx4)]. Returns an empty list if bond_idx has fewer than two elements.
- Return type:
list[tuple[int, int]]
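The pairing rule in the example above can be written as a short sketch. It follows the documented behavior (skip the first and last tuple, collect neighbor indices, pair them two at a time) and is not the library's source; `tuple_bonds_sketch` is an illustrative name and the atom indices are made up.

```python
def tuple_bonds_sketch(lst_atm_neigh):
    """Pair the neighbor indices of the internal (placeholder, neighbor) tuples."""
    # Skip the first and last tuple; keep only the neighbor index of the rest.
    bond_idx = [neighbor for _, neighbor in lst_atm_neigh[1:-1]]
    # Pair consecutive entries: (0, 1), (2, 3), ...; a trailing odd entry is dropped.
    return [
        (bond_idx[i], bond_idx[i + 1])
        for i in range(0, len(bond_idx) - 1, 2)
    ]


# Five tuples: three internal neighbors, so one pair and one leftover.
odd_case = tuple_bonds_sketch([(0, 1), (4, 3), (5, 6), (9, 8), (10, 11)])
# Six tuples: four internal neighbors, so two pairs.
even_case = tuple_bonds_sketch(
    [(0, 1), (4, 3), (5, 6), (9, 8), (10, 11), (14, 13)]
)
```

In a linear chain, each pair joins the neighbor of one monomer's tail placeholder to the neighbor of the next monomer's head placeholder, which is why the input ordering matters.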
pysoftk.tools.utils_ob module
- pysoftk.tools.utils_ob.check_bond_order(file_name)[source]
Checks and corrects bond orders for a molecule in a given file.
This function reads a molecule from a file, removes existing hydrogens, attempts to connect disconnected parts (if any), perceives bond orders, and then re-adds hydrogens. The modified molecule is written to a new file with “_new” appended to its original base name.
- Parameters:
file_name (str) – The path to the input molecular file. The file format is inferred from its extension and must be supported by Open Babel.
- Returns:
This function does not return a value but writes a new molecular file (e.g., if file_name is “molecule.mol”, output is “molecule_new.mol”).
- Return type:
None
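The output naming rule can be sketched with the standard library. This only illustrates how “_new” is inserted before the extension, as in the example above; `new_file_name` is a hypothetical helper and no Open Babel call is made.

```python
import os


def new_file_name(file_name):
    """Return the output path: base name + "_new" + original extension."""
    root, ext = os.path.splitext(file_name)
    return f"{root}_new{ext}"


output_name = new_file_name("molecule.mol")
```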
- pysoftk.tools.utils_ob.ff_ob_relaxation(mol, FF='MMFF94', relax_iterations=100, ff_thr=1e-06)[source]
Performs geometry optimization on an Open Babel molecule using a specified force field.
The molecule’s coordinates are updated in place.
- Parameters:
mol (pybel.Molecule) – The Open Babel molecule (as a Pybel Molecule object) to be optimized. The OBMol attribute of this object will be used.
FF (str, optional) – The force field to be used for optimization. Supported options include “MMFF94”, “UFF”, “GAFF”. Defaults to “MMFF94”.
relax_iterations (int, optional) – The maximum number of iterations for the conjugate gradients algorithm. Defaults to 100.
ff_thr (float, optional) – The convergence criterion (e.g., energy difference or RMS gradient) for the force field optimization. Defaults to 1.0e-6.
- Returns:
The input molecule with its geometry optimized. The modification is done in-place, and the same molecule object is returned.
- Return type:
pybel.Molecule
- pysoftk.tools.utils_ob.global_opt(mol, FF='MMFF94', relax_iterations=150, rot_steps=125, ff_thr=1e-06)[source]
Performs a global geometry optimization on an Open Babel molecule.
This typically involves a combination of conformational searching (rotor search) and local geometry minimizations (e.g., conjugate gradients) to find a low-energy structure. The molecule’s coordinates are updated in place.
- Parameters:
mol (pybel.Molecule) – The Open Babel molecule (as a Pybel Molecule object) to be optimized.
FF (str, optional) – The force field to be used. Options include “MMFF94”, “UFF”, “GAFF”. Defaults to “MMFF94”.
relax_iterations (int, optional) – The maximum number of iterations for conjugate gradients steps. Defaults to 150.
rot_steps (int, optional) – The number of iterations for the weighted rotor search. Defaults to 125.
ff_thr (float, optional) – The convergence criterion for the final conjugate gradients optimization. Defaults to 1.0e-6. A less strict criterion (ff_thr * 100) is used for the initial relaxation.
- Returns:
The input molecule with its geometry globally optimized. The modification is done in-place.
- Return type:
pybel.Molecule
- pysoftk.tools.utils_ob.rotor_opt(mol, FF='MMFF94', rot_steps=125)[source]
Performs a rotor search (conformational search) on an Open Babel molecule.
This function attempts to find low-energy conformations by rotating rotatable bonds. The molecule’s coordinates are updated in place to reflect the best conformation found.
- Parameters:
mol (pybel.Molecule) – The Open Babel molecule (as a Pybel Molecule object) for which to perform the rotor search.
FF (str, optional) – The force field to be used during the rotor search. Supported options include “MMFF94”, “UFF”, “GAFF”. Defaults to “MMFF94”.
rot_steps (int, optional) – The number of steps or iterations for the weighted rotor search. This influences the thoroughness of the search. Defaults to 125.
- Returns:
The input molecule with its conformation optimized by the rotor search. The modification is done in-place.
- Return type:
pybel.Molecule
pysoftk.tools.utils_rdkit module
- pysoftk.tools.utils_rdkit.MMFF_rel(mol, relax_iterations, vdw_par=0.001)[source]
Function to apply an MMFF molecular mechanics force field (FF).
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit Mol object
relax_iterations (int) – Number of iterations to perform a FF geometry optimization.
vdw_par (float) – Extension of the vdW interaction range.
- Returns:
mol – RDKit Mol object
- Return type:
rdkit.Chem.rdchem.Mol
- pysoftk.tools.utils_rdkit.UFF_rel(mol, relax_iterations, vdw_par=0.001)[source]
Function to apply a UFF molecular mechanics force field (FF).
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit Mol object
relax_iterations (int) – Number of iterations to perform a FF geometry optimization.
vdw_par (float) – Extension of the vdW interaction range.
- Returns:
mol – RDKit Mol object
- Return type:
rdkit.Chem.rdchem.Mol
- pysoftk.tools.utils_rdkit.etkdgv3_energies(mol, num_conf=1)[source]
Calculates molecular conformations using the RDKit ETKDGv3 method.
- Parameters:
mol (rdkit.Chem.Mol) – RDKit Mol object
num_conf (int) – The number of configurations requested to be computed.
- Returns:
datapoint – RDKit Mol object
- Return type:
rdkit.Chem.rdDistGeom.EmbedMultipleConfs
- pysoftk.tools.utils_rdkit.no_swap(mol, relax_iterations, force_field='MMFF')[source]
Function to sanitize a molecule with hydrogens and the user-defined atomic placeholder.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit Mol object
force_field (str) – Selected FF to perform a relaxation
relax_iterations (int) – Number of iterations to perform a FF geometry optimization.
- Returns:
newMol_H – RDKit Mol object
- Return type:
rdkit.Chem.rdchem.Mol
- pysoftk.tools.utils_rdkit.plc_holder(mol, atom)[source]
Function to search for a specific placeholder atom.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit Mol object
atom (str) – The placeholder atom
- Returns:
return – List of neighbors of the placeholder atom.
- Return type:
list
- pysoftk.tools.utils_rdkit.remove_plcholder(mol, atom)[source]
Function to remove placeholder atoms.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit Mol object
atom (str) – The placeholder atom
- Returns:
return – RDKit mol object.
- Return type:
rdkit.Chem.rdchem.Mol
- pysoftk.tools.utils_rdkit.swap_hyd(mol, relax_iterations, atom, force_field='MMFF')[source]
Function to swap atomic placeholders for hydrogen atoms.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit Mol object
relax_iterations (int) – Number of iterations to perform a FF geometry optimization.
atom (str) – The placeholder atom to combine the molecules and form a new monomer.
force_field (str) – Selected FF to perform a geometry optimization.
- Returns:
newMol_H – RDKit Mol object
- Return type:
rdkit.Chem.rdchem.Mol
Module contents
- pysoftk.tools.MMFF_rel(mol, relax_iterations, vdw_par=0.001)[source]
Function to apply an MMFF molecular mechanics force field (FF).
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit Mol object
relax_iterations (int) – Number of iterations to perform a FF geometry optimization.
vdw_par (float) – Extension of the vdW interaction range.
- Returns:
mol – RDKit Mol object
- Return type:
rdkit.Chem.rdchem.Mol
- pysoftk.tools.UFF_rel(mol, relax_iterations, vdw_par=0.001)[source]
Function to apply a UFF molecular mechanics force field (FF).
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit Mol object
relax_iterations (int) – Number of iterations to perform a FF geometry optimization.
vdw_par (float) – Extension of the vdW interaction range.
- Returns:
mol – RDKit Mol object
- Return type:
rdkit.Chem.rdchem.Mol
- pysoftk.tools.atom_neigh(mol, atom)[source]
Identifies all placeholder atoms of a given type and their direct neighbors.
For each atom in the molecule that matches the specified placeholder atom symbol, this function finds all its neighboring atoms. It returns a flat list of tuples, where each tuple contains the index of a placeholder atom and the index of one of its neighbors. If a placeholder has multiple neighbors, multiple tuples will be generated for that placeholder, one for each neighbor.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object.
atom (str) – The atomic symbol of the placeholder atom (e.g., “Br”, “*”).
- Returns:
- A flat list of (placeholder_atom_index, neighbor_atom_index) tuples. Example: If P1 is a placeholder bonded to N1a and N1b, and P2 is a placeholder bonded to N2a, the result could be [(idx_P1, idx_N1a), (idx_P1, idx_N1b), (idx_P2, idx_N2a)]. The order depends on RDKit’s internal atom iteration.
- Return type:
list[tuple[int, int]]
- pysoftk.tools.check_bond_order(file_name)[source]
Checks and corrects bond orders for a molecule in a given file.
This function reads a molecule from a file, removes existing hydrogens, attempts to connect disconnected parts (if any), perceives bond orders, and then re-adds hydrogens. The modified molecule is written to a new file with “_new” appended to its original base name.
- Parameters:
file_name (str) – The path to the input molecular file. The file format is inferred from its extension and must be supported by Open Babel.
- Returns:
This function does not return a value but writes a new molecular file (e.g., if file_name is “molecule.mol”, output is “molecule_new.mol”).
- Return type:
None
- pysoftk.tools.count_plholder(mol, atom)[source]
Counts the occurrences of a specific placeholder atom type in an RDKit molecule.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object to inspect.
atom (str) – The atomic symbol of the placeholder atom to count (e.g., “Br”, “*”).
- Returns:
- The total number of placeholder atoms of the specified type found in the molecule.
- Return type:
int
- pysoftk.tools.create_pol(mol, atom, tpb)[source]
Creates specified bonds in a molecule and removes internal placeholder atoms.
This function modifies the input RDKit molecule mol by:
1. Converting mol to an RWMol object to allow modifications.
2. Adding new single bonds between atom pairs specified in tpb.
3. Identifying all atoms matching the placeholder atom symbol (e.g., “Br”, “*”).
4. Removing all occurrences of these placeholder atoms except those that are effectively terminal (i.e., the ones with the smallest and largest atom indices among all identified placeholders, after sorting).
This is typically used in polymer construction to form bonds between monomer units (connected via placeholders) and then remove the (now internal) placeholder atoms. Terminal placeholders might be retained.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – The RDKit molecule object. This is often a single Mol object containing disconnected fragments or monomers linked by placeholders. It will be modified.
atom (str) – The atomic symbol of the placeholder atom (e.g., “Br”, “*”). This symbol is used to identify atoms for removal.
tpb (list[tuple[int, int]]) – A list of tuples, where each tuple (idx1, idx2) specifies the atom indices to be connected by a new single bond.
- Returns:
- The modified RDKit molecule object (as an RWMol) with new bonds added and specified internal placeholders removed.
- Return type:
rdkit.Chem.rdchem.RWMol
- pysoftk.tools.etkdgv3_energies(mol, num_conf=1)[source]
Calculates molecular conformations using the RDKit ETKDGv3 method.
- Parameters:
mol (rdkit.Chem.Mol) – RDKit Mol object
num_conf (int) – The number of configurations requested to be computed.
- Returns:
datapoint – RDKit Mol object
- Return type:
rdkit.Chem.rdDistGeom.EmbedMultipleConfs
- pysoftk.tools.ff_ob_relaxation(mol, FF='MMFF94', relax_iterations=100, ff_thr=1e-06)[source]
Performs geometry optimization on an Open Babel molecule using a specified force field.
The molecule’s coordinates are updated in place.
- Parameters:
mol (pybel.Molecule) – The Open Babel molecule (as a Pybel Molecule object) to be optimized. The OBMol attribute of this object will be used.
FF (str, optional) – The force field to be used for optimization. Supported options include “MMFF94”, “UFF”, “GAFF”. Defaults to “MMFF94”.
relax_iterations (int, optional) – The maximum number of iterations for the conjugate gradients algorithm. Defaults to 100.
ff_thr (float, optional) – The convergence criterion (e.g., energy difference or RMS gradient) for the force field optimization. Defaults to 1.0e-6.
- Returns:
The input molecule with its geometry optimized. The modification is done in-place, and the same molecule object is returned.
- Return type:
pybel.Molecule
- pysoftk.tools.get_file_extension(path)[source]
Extracts the filename and extension from a given path.
- Parameters:
path (str) – The full path to the file.
- Returns:
- A tuple containing two strings: (1) the filename without the extension, and (2) the file extension (including the dot, e.g., “.txt”). If the path has no extension, the second element is an empty string.
- Return type:
tuple[str, str]
- pysoftk.tools.global_opt(mol, FF='MMFF94', relax_iterations=150, rot_steps=125, ff_thr=1e-06)[source]
Performs a global geometry optimization on an Open Babel molecule.
This typically involves a combination of conformational searching (rotor search) and local geometry minimizations (e.g., conjugate gradients) to find a low-energy structure. The molecule’s coordinates are updated in place.
- Parameters:
mol (pybel.Molecule) – The Open Babel molecule (as a Pybel Molecule object) to be optimized.
FF (str, optional) – The force field to be used. Options include “MMFF94”, “UFF”, “GAFF”. Defaults to “MMFF94”.
relax_iterations (int, optional) – The maximum number of iterations for conjugate gradients steps. Defaults to 150.
rot_steps (int, optional) – The number of iterations for the weighted rotor search. Defaults to 125.
ff_thr (float, optional) – The convergence criterion for the final conjugate gradients optimization. Defaults to 1.0e-6. A less strict criterion (ff_thr * 100) is used for the initial relaxation.
- Returns:
The input molecule with its geometry globally optimized. The modification is done in-place.
- Return type:
pybel.Molecule
- pysoftk.tools.no_swap(mol, relax_iterations, force_field='MMFF')[source]
Function to sanitize a molecule with hydrogens and the user-defined atomic placeholder.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit Mol object
force_field (str) – Selected FF to perform a relaxation
relax_iterations (int) – Number of iterations to perform a FF geometry optimization.
- Returns:
newMol_H – RDKit Mol object
- Return type:
rdkit.Chem.rdchem.Mol
- pysoftk.tools.pattern_mol_seq(mols, pattern)[source]
Generates a sequence of molecule names based on a pattern string and a list of molecule names.
It first identifies unique characters in the pattern (e.g., “A”, “B” from “A+B+A”). Then, it creates replacement tuples by pairing these unique characters (sorted) with the molecule names provided in the mols list, in their respective orders. Finally, it uses pattern_repl to substitute these characters in the pattern with their corresponding molecule names and splits the result.
Example
mols = [“water”, “ethanol”]
pattern = “A+B+A”
Unique chars in pattern (sorted alphabetically, case-insensitively): [‘A’, ‘B’]
Replacement tuples: [(‘A’, “water”), (‘B’, “ethanol”)]
Resulting sequence: [“water”, “ethanol”, “water”]
- Parameters:
mols (list[str]) – A list of molecule names. The order should correspond to the alphabetically sorted unique characters from the pattern.
pattern (str) – A string defining the sequence pattern, using characters as placeholders for molecules (e.g., “A+B+A”).
- Returns:
- A list of molecule names arranged according to the pattern. Returns an empty list or partially resolved list if mols and unique characters in pattern do not align as expected.
- Return type:
list[str]
- pysoftk.tools.pattern_recon(pattern)[source]
Identifies unique characters in a string pattern and sorts them.
Uniqueness is case-sensitive, while sorting is case-insensitive alphabetical. For example, “aBcAb” contains five distinct characters (‘a’, ‘A’, ‘B’, ‘b’, ‘c’), which are sorted by their uppercase representation: the two forms of A first, then the two forms of B, then ‘c’. The relative order of characters that differ only in case is not guaranteed.
- Parameters:
pattern (str) – The input string pattern (e.g., “ABCA”).
- Returns:
- A list of the unique characters from the pattern, sorted alphabetically and case-insensitively. Because set is case-sensitive, upper- and lowercase forms of a letter are kept as separate entries: “bBaAcC” yields the set {‘b’, ‘B’, ‘a’, ‘A’, ‘c’, ‘C’}, which sorted(…, key=str.upper) orders as A/a, then B/b, then C/c (the order within each case pair is not guaranteed).
- Return type:
list[str]
- pysoftk.tools.pattern_repl(pattern, tup_repl)[source]
Replaces characters in a pattern string based on a list of replacement tuples.
The function iterates through each tuple in tup_repl. Each tuple is expected to contain an old character (or substring) to be replaced and a new string (the replacement). After all replacements, the modified pattern string is split by the “+” delimiter, and any empty strings resulting from the split are removed.
- Parameters:
pattern (str) – The initial string pattern (e.g., “A+B+A”).
tup_repl (list[tuple[str, str]]) – A list of tuples, where each tuple contains two strings: (substring_to_replace, replacement_string). Example: [(‘A’, ‘mol1’), (‘B’, ‘mol2’)]
- Returns:
- A list of strings derived from the modified pattern after replacements and splitting by “+”. Empty strings are filtered out. For pattern=“A+B+A” and tup_repl=[(‘A’, ‘X’), (‘B’, ‘Y’)], it becomes “X+Y+X”, then returns [‘X’, ‘Y’, ‘X’].
- Return type:
list[str]
- pysoftk.tools.plc_holder(mol, atom)[source]
Function to search for a specific placeholder atom.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit Mol object
atom (str) – The placeholder atom
- Returns:
return – List of neighbors of the placeholder atom.
- Return type:
list
- pysoftk.tools.remove_plcholder(mol, atom)[source]
Function to remove placeholder atoms.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit Mol object
atom (str) – The placeholder atom
- Returns:
return – RDKit mol object.
- Return type:
rdkit.Chem.rdchem.Mol
- pysoftk.tools.rotor_opt(mol, FF='MMFF94', rot_steps=125)[source]
Performs a rotor search (conformational search) on an Open Babel molecule.
This function attempts to find low-energy conformations by rotating rotatable bonds. The molecule’s coordinates are updated in place to reflect the best conformation found.
- Parameters:
mol (pybel.Molecule) – The Open Babel molecule (as a Pybel Molecule object) for which to perform the rotor search.
FF (str, optional) – The force field to be used during the rotor search. Supported options include “MMFF94”, “UFF”, “GAFF”. Defaults to “MMFF94”.
rot_steps (int, optional) – The number of steps or iterations for the weighted rotor search. This influences the thoroughness of the search. Defaults to 125.
- Returns:
The input molecule with its conformation optimized by the rotor search. The modification is done in-place.
- Return type:
pybel.Molecule
- pysoftk.tools.swap_hyd(mol, relax_iterations, atom, force_field='MMFF')[source]
Function to swap atomic placeholders for hydrogen atoms.
- Parameters:
mol (rdkit.Chem.rdchem.Mol) – RDKit Mol object
relax_iterations (int) – Number of iterations to perform a FF geometry optimization.
atom (str) – The placeholder atom to combine the molecules and form a new monomer.
force_field (str) – Selected FF to perform a geometry optimization.
- Returns:
newMol_H – RDKit Mol object
- Return type:
rdkit.Chem.rdchem.Mol
- pysoftk.tools.tuple_bonds(lst_atm_neigh)[source]
Generates pairs of atom indices for creating new bonds from a list of placeholder-neighbor pairs.
This function processes an ordered list of (placeholder_atom_index, neighbor_atom_index) tuples. It extracts the neighbor indices from the “internal” tuples in this list (i.e., excluding the first and the last tuple from lst_atm_neigh when collecting neighbor indices) and then pairs these collected neighbor indices sequentially. This is typically used to define new bonds that will connect monomers in a polymer chain.
The function assumes lst_atm_neigh is appropriately ordered (e.g., by placeholder index along a chain) for the desired connectivity.
Example
If lst_atm_neigh = [(P0,N0), (P1,N1), (P2,N2), (P3,N3), (P4,N4)]:
- It considers internal elements for neighbor collection: (P1,N1), (P2,N2), (P3,N3).
- It extracts their neighbor indices: bond_idx becomes [N1, N2, N3].
- It then pairs them: [(N1, N2)]. The last element N3 is left unpaired because bond_idx has odd length.
- Parameters:
lst_atm_neigh (list[tuple[int, int]]) – An ordered list of tuples, where each tuple is (placeholder_atom_index, neighbor_atom_index). This list might be generated from atom_neigh and subsequently sorted/ordered.
- Returns:
- A list of tuples, where each tuple contains two neighbor_atom_indices that are intended to be bonded together. For example, [(neighbor_idx1, neighbor_idx2), (neighbor_idx3, neighbor_idx4)]. Returns an empty list if bond_idx has fewer than two elements.
- Return type:
list[tuple[int, int]]