Code documentation¶
mai.ads_sites module¶
-
class
mai.ads_sites.
ads_pos_optimizer
(adsorbate_constructor, write_file=True, new_mofs_path=None, error_path=None, log_stats=True)¶ Bases:
object
This identifies ideal adsorption sites
- Args:
adsorbate_constructor (class): adsorbate_constructor class containing many relevant defaults
write_file (bool): if True, the new ASE atoms object should be written to a CIF file (defaults to True)
new_mofs_path (string): path to store the new CIF files if write_file is True (defaults to /new_mofs)
error_path (string): path to store any adsorbates flagged as problematic (defaults to /errors)
log_stats (bool): print stats about process
-
check_and_write
(new_mof, new_name)¶ Check for overlapping atoms and write CIF file
- Args:
new_mof (ASE Atoms object): the new MOF-adsorbate complex
new_name (string): the name of the new CIF file to write
- Returns:
- overlap (boolean): True or False for overlapping atoms
-
construct_mof
(mof, ads_pos, site_idx)¶ Construct the MOF-adsorbate complex
- Args:
mof (ASE Atoms object): starting ASE Atoms object of structure
ads_pos (numpy array): 1D numpy array for the proposed adsorption position
site_idx (int): ASE index of adsorption site
- Returns:
- mof (ASE Atoms object): ASE Atoms object with adsorbate
-
get_NNs
(ads_pos, site_idx)¶ Get the number of atoms within r_cut of the proposed adsorption site
- Args:
ads_pos (numpy array): 1D numpy array for the proposed adsorption position
site_idx (int): ASE index for adsorption site
- Returns:
NN (int): number of neighbors within r_cut
min_dist (float): distance from adsorbate to nearest atom
n_overlap (int): number of overlapping atoms
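The three return values can be illustrated with a minimal sketch. It is not the library's implementation: count_neighbors is a hypothetical helper, it uses plain Cartesian distances with no periodic images, and it assumes the adsorption site itself is excluded from the count.

```python
import math

def count_neighbors(ads_pos, positions, site_idx, r_cut=2.5, overlap_tol=0.75):
    """Return (NN, min_dist, n_overlap) for a proposed adsorbate position.

    Hypothetical helper: Cartesian distances only, no periodic boundary
    conditions, and the adsorption site itself is skipped (an assumption).
    """
    # Distance from the proposed position to every atom except the site
    dists = [math.dist(ads_pos, p) for i, p in enumerate(positions) if i != site_idx]
    NN = sum(d < r_cut for d in dists)              # atoms inside the cutoff sphere
    n_overlap = sum(d < overlap_tol for d in dists)  # atoms that are too close
    min_dist = min(dists)
    return NN, min_dist, n_overlap

positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 0.5, 2.0), (5.0, 5.0, 5.0)]
result = count_neighbors((0.0, 0.0, 2.0), positions, site_idx=0)
```

A low NN and a large min_dist suggest an uncrowded position, which is presumably what makes these values useful for ranking candidate sites.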
-
get_bi_ads_pos
(normal_vec, center_coord, site_idx)¶ Get adsorption site for a 2-coordinate site
- Args:
normal_vec (numpy array): 1D numpy array for the normal vector to the line
center_coord (numpy array): 1D numpy array for adsorption site
site_idx (int): ASE index of adsorption site
- Returns:
- ads_pos (numpy array): 1D numpy array for the proposed adsorption position
-
get_dist_planar
(normal_vec)¶ Get distance vector for planar adsorption site
- Args:
- normal_vec (numpy array): 1D numpy array for normal vector
- Returns:
- dist (float): distance vector scaled to d_MX1
-
get_new_atoms
(ads_pos, site_idx)¶ Get new ASE atoms object with adsorbate from pymatgen analysis
- Args:
ads_pos (numpy array): 1D numpy array for the proposed adsorption position
site_idx (int): ASE index of adsorption site
- Returns:
- new_mof (ASE Atoms object): new ASE Atoms object with adsorbate
-
get_new_atoms_grid
(site_pos, ads_pos)¶ Get new ASE atoms object with adsorbate from energy grid
- Args:
site_pos (numpy array): 1D numpy array for the adsorption site
ads_pos (numpy array): 1D numpy array for the proposed adsorption position
- Returns:
- new_mof (ASE Atoms object): new ASE Atoms object with adsorbate
-
get_nonplanar_ads_pos
(scaled_sum_dist, center_coord)¶ Get adsorption site for non-planar structure
- Args:
scaled_sum_dist (numpy array): 2D numpy array for the scaled Euclidean distance vectors between each coordinating atom and the central atom (i.e. the adsorption site)
center_coord (numpy array): 1D numpy array for adsorption site
- Returns:
- ads_pos (numpy array): 1D numpy array for the proposed adsorption position
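One plausible reading of the non-planar case is that the adsorbate is placed opposite the summed site-to-neighbor vectors, at a distance d_MX1 from the central atom. The sketch below assumes exactly that. The function name, the vector convention (pointing from the central atom to each coordinating atom) and the use of plain lists are illustrative choices, not taken from the library.

```python
import math

def nonplanar_ads_pos(neighbor_vecs, center_coord, d_MX1=2.0):
    """Place the adsorbate opposite the summed site-to-neighbor vectors.

    Hypothetical sketch; assumes each vector points from the central atom
    to a coordinating atom.
    """
    # Sum the vectors component by component
    total = [sum(v[k] for v in neighbor_vecs) for k in range(3)]
    norm = math.sqrt(sum(c * c for c in total))
    if norm == 0.0:
        raise ValueError("vectors cancel out; the site may be planar")
    # Step a distance d_MX1 away from the side occupied by the neighbors
    return [center_coord[k] - d_MX1 * total[k] / norm for k in range(3)]

# Trigonal-pyramidal site: three neighbors sitting below the central atom
vecs = [(1.0, 0.0, -1.0), (-0.5, 0.866, -1.0), (-0.5, -0.866, -1.0)]
ads_pos = nonplanar_ads_pos(vecs, (0.0, 0.0, 0.0))
```

Here the neighbors all lie below the central atom, so the proposed position ends up directly above it along +z.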
-
get_opt_ads_pos
(mic_coords, site_idx)¶ Get the optimal adsorption site
- Args:
mic_coords (numpy array): 2D numpy array for the coordinates of each coordinating atom using the central atom (i.e. adsorption site) as the origin
site_idx (int): ASE index of adsorption site
- Returns:
- ads_pos (numpy array): 1D numpy array for the proposed adsorption position
-
get_planar_ads_pos
(center_coord, dist, site_idx)¶ Get adsorption site for planar structure
- Args:
center_coord (numpy array): 1D numpy array for adsorption site (i.e. the central atom)
dist: distance vector scaled to d_MX1 (see get_dist_planar)
site_idx (int): ASE index for adsorption site
- Returns:
- ads_pos (numpy array): 1D numpy array for the proposed adsorption position
-
get_tri_ads_pos
(normal_vec, scaled_sum_dist, center_coord, site_idx)¶ Get adsorption site for a 3-coordinate site
- Args:
normal_vec (numpy array): 1D numpy array for the normal vector to the line
scaled_sum_dist (numpy array): 2D numpy array for the scaled Euclidean distance vectors between each coordinating atom and the central atom (i.e. the adsorption site)
center_coord (numpy array): 1D numpy array for adsorption site
site_idx (int): ASE index of adsorption site
- Returns:
- ads_pos (numpy array): 1D numpy array for the proposed adsorption position
mai.adsorbate_constructor module¶
-
class
mai.adsorbate_constructor.
adsorbate_constructor
(ads='X', d_MX1=2.0, eta=1, connect=1, d_X1X2=1.25, d_X2X3=None, ang_MX1X2=None, ang_triads=None, r_cut=2.5, sum_tol=0.5, rmse_tol=0.25, overlap_tol=0.75)¶ Bases:
object
This class constructs an ASE Atoms object with an adsorbate.
Initialized variables
- Args:
ads (string): string of element or molecule for adsorbate (defaults to ‘X’)
d_MX1 (float): distance between adsorbate and surface atom. If used with get_adsorbate_grid, it represents the maximum distance (defaults to 2.0)
eta (int): denticity of end-on (1) or side-on (2) (defaults to 1)
connect (int): the connecting atom in the species string (defaults to 1)
d_X1X2 (float): X1-X2 bond length (defaults to 1.25)
d_X2X3 (float): X2-X3 bond length for connect == 1 or X1-X3 bond length for connect == 2 (defaults to d_X1X2)
ang_MX1X2 (float): site-X1-X2 angle (for diatomics, defaults to 180 degrees except for side-on in which it defaults to 90 or end-on O2 in which it defaults to 120; for triatomics, defaults to 180 except for H2O in which it defaults to 104.5)
ang_triads (float): X3-X1-X2 angle (defaults to 180 degrees for connect == 1 and 90 degrees for connect == 2)
r_cut (float): cutoff distance for calculating nearby atoms when ranking adsorption sites
sum_tol (float): threshold to determine planarity. When the sum of the Euclidean distance vectors of coordinating atoms is less than sum_tol, planarity is assumed
rmse_tol (float): second threshold to determine planarity. When the root mean square error of the best-fit plane is less than rmse_tol, planarity is assumed
overlap_tol (float): distance below which atoms are assumed to be overlapping
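The sum_tol criterion can be sketched as follows. This covers only the first of the two planarity tests. The helper name looks_planar is hypothetical, and normalizing each vector to unit length before summing is an assumption about what "scaled" means here.

```python
import math

def looks_planar(neighbor_vecs, sum_tol=0.5):
    """First planarity test: do the site-to-neighbor vectors nearly cancel?

    Hypothetical sketch; each vector is scaled to unit length first
    (an assumption), then the length of the vector sum is compared
    against sum_tol.
    """
    unit = []
    for v in neighbor_vecs:
        n = math.sqrt(sum(c * c for c in v))
        unit.append([c / n for c in v])
    total = [sum(u[k] for u in unit) for k in range(3)]
    return math.sqrt(sum(c * c for c in total)) < sum_tol

square_planar = [(2, 0, 0), (-2, 0, 0), (0, 2, 0), (0, -2, 0)]
pyramidal = [(1, 0, -1), (-1, 0, -1), (0, 1, -1), (0, -1, -1)]
is_flat = looks_planar(square_planar)
is_pyramidal_flat = looks_planar(pyramidal)
```

Vectors can also cancel in arrangements that are not flat, so this test says nothing on its own about how well the neighbors fit a plane; the rmse_tol criterion addresses that separately.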
-
get_adsorbate
(atoms_path=None, site_idx=None, omd_path=None, NN_method='crystal', allowed_sites=None, write_file=True, new_mofs_path=None, error_path=None, NN_indices=None, atoms=None, new_atoms_name=None)¶ Add an adsorbate using PymatgenNN or OMD
- Args:
atoms_path (string): filepath to the CIF file
site_idx (int): ASE index for the adsorption site
omd_path (string): filepath to OMD results folder (defaults to ‘/oms_results’)
NN_method (string): string representing the desired Pymatgen nearest neighbor algorithm. Options include ‘crystal’, ‘vire’, ‘okeefe’, and others. See NN_algos.py (defaults to ‘crystal’)
allowed_sites (list of strings): list of allowed site species for use with automatic OMS detection
write_file (bool): if True, the new ASE atoms object should be written to a CIF file (defaults to True)
new_mofs_path (string): path to store the new CIF files if write_file is True (defaults to /new_mofs within the directory containing the starting CIF file)
error_path (string): path to store any adsorbates flagged as problematic (defaults to /errors within the directory containing the starting CIF file)
NN_indices (list of ints): list of indices for first coordination sphere (these are usually automatically detected via the default of None)
atoms (ASE Atoms object): the ASE Atoms object of the MOF to add the adsorbate to (only include if atoms_path is not specified)
new_atoms_name (string): the name of the MOF used for file I/O purposes (defaults to the basename of atoms_path if provided)
- Returns:
- new_atoms (Atoms object): ASE Atoms object of MOF with adsorbate
-
get_adsorbate_grid
(atoms_path=None, site_idx=None, grid_path=None, grid_format='ASCII', write_file=True, new_mofs_path=None, error_path=None, atoms=None, new_atoms_name=None)¶ This function adds a molecular adsorbate based on a potential energy grid
- Args:
atoms_path (string): filepath to the CIF file
site_idx (int): ASE index for the adsorption site
grid_path (string): path to the directory containing the PEG (defaults to /energy_grids)
grid_format (string): accepts either ‘ASCII’ or ‘cube’ and is the file format for the PEG (defaults to ‘ASCII’)
write_file (bool): if True, the new ASE atoms object should be written to a CIF file (defaults to True)
new_mofs_path (string): path to store the new CIF files if write_file is True (defaults to /new_mofs)
error_path (string): path to store any adsorbates flagged as problematic (defaults to /errors)
atoms (ASE Atoms object): the ASE Atoms object of the MOF to add the adsorbate to (only include if atoms_path is not specified)
new_atoms_name (string): the name of the MOF used for file I/O purposes (defaults to the basename of atoms_path if provided)
- Returns:
- new_atoms (Atoms object): ASE Atoms object of MOF with adsorbate
mai.grid_handler module¶
-
mai.grid_handler.
cube_to_xyzE
(cube_file)¶ Converts cube to ASCII file. Adopted from code by Julen Larrucea. Original source: https://github.com/julenl/molecular_modeling_scripts
- Args:
- cube_file (string): path to cube file
- Returns:
- pd_data (Pandas dataframe): dataframe of (x,y,z,E) grid
-
mai.grid_handler.
get_best_grid_pos
(atoms, max_dist, site_idx, grid_filepath)¶ Finds minimum energy position in grid dataframe
- Args:
atoms (ASE Atoms object): Atoms object of structure
max_dist (float): maximum distance from active site to consider
site_idx (int): ASE index of adsorption site
grid_filepath (string): path to energy grid
- Returns:
- ads_pos (array): 1D numpy array for the ideal adsorption position
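The selection step can be sketched with plain tuples in place of a dataframe. This is an illustration under stated assumptions, not the library's code: best_grid_pos is a hypothetical name, the grid is a list of (x, y, z, E) rows, and distances are Cartesian with no periodic images.

```python
import math

def best_grid_pos(grid, site_pos, max_dist):
    """Return the (x, y, z) of the lowest-energy grid point within max_dist.

    Hypothetical sketch; `grid` is an iterable of (x, y, z, E) rows and
    periodic boundary conditions are ignored.
    """
    nearby = [row for row in grid if math.dist(row[:3], site_pos) <= max_dist]
    if not nearby:
        return None  # no grid point close enough to the site
    x, y, z, _ = min(nearby, key=lambda row: row[3])
    return (x, y, z)

grid = [
    (0.0, 0.0, 1.5, -12.0),
    (0.0, 0.0, 2.0, -20.0),
    (0.0, 0.0, 9.0, -50.0),  # lowest energy overall, but too far from the site
]
best = best_grid_pos(grid, site_pos=(0.0, 0.0, 0.0), max_dist=3.0)
```

The global minimum at z = 9.0 is ignored because it lies outside max_dist, which is the point of restricting the search to the neighborhood of the site.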
-
mai.grid_handler.
grid_within_cutoff
(df, atoms, max_dist, site_pos, partition=1000000.0)¶ Reduces grid dataframe into data within max_dist of active site
- Args:
df (pandas df object): df containing energy grid details (x,y,z,E)
atoms (ASE Atoms object): Atoms object of structure
max_dist (float): maximum distance from active site to consider
site_pos (array): numpy array of the adsorption site
partition (float): number of data points to process per partition of the df. This is used to prevent memory overflow errors. Decrease it if memory errors arise.
- Returns:
- new_df (pandas df object): modified df only around max_dist from active site and also with a new distance (d) column
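The partitioning idea can be sketched without pandas. The helper below is hypothetical and works on a list of (x, y, z, E) rows. It processes the rows in slices of `partition` entries and appends the distance d to each kept row. In a dataframe version, the slice size would bound the temporary distance array; in pure Python the benefit is mostly illustrative.

```python
import math

def within_cutoff(rows, site_pos, max_dist, partition=1000000.0):
    """Keep rows within max_dist of site_pos and append the distance d.

    Hypothetical sketch; Cartesian distances only, rows are (x, y, z, E).
    """
    step = int(partition)
    kept = []
    # Walk through the grid one partition at a time
    for start in range(0, len(rows), step):
        for x, y, z, e in rows[start:start + step]:
            d = math.dist((x, y, z), site_pos)
            if d <= max_dist:
                kept.append((x, y, z, e, d))
    return kept

rows = [
    (0.0, 0.0, 1.0, -1.0),
    (0.0, 0.0, 5.0, -2.0),
    (3.0, 4.0, 0.0, -3.0),
    (1.0, 0.0, 0.0, -4.0),
]
kept = within_cutoff(rows, site_pos=(0.0, 0.0, 0.0), max_dist=4.0, partition=2)
```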
-
mai.grid_handler.
read_grid
(grid_filepath)¶ Convert energy grid to pandas dataframe
- Args:
- grid_filepath (string): path to energy grid (must be .cube or .grid)
- Returns:
- df (pandas df object): df containing energy grid details (x,y,z,E)
mai.NN_algos module¶
-
mai.NN_algos.
get_NNs_pm
(atoms, site_idx, NN_method)¶ Get coordinating atoms to the adsorption site
- Args:
atoms (Atoms object): atoms object of MOF
site_idx (int): ASE index of adsorption site
NN_method (string): string representing the desired Pymatgen nearest neighbor algorithm: refer to http://pymatgen.org/_modules/pymatgen/analysis/local_env.html
- Returns:
- neighbors_idx (list of ints): ASE indices of coordinating atoms
mai.oms_handler module¶
-
mai.oms_handler.
get_ase_NN_idx
(atoms, coords)¶ Get the ASE indices for the coordinating atoms
- Args:
atoms (Atoms object): ASE Atoms object for the MOF
coords (numpy array): coordinates of the coordinating atoms
- Returns:
- ase_NN_idx (list of ints): ASE indices of the coordinating atoms
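Mapping coordinates back to atom indices can be sketched as a nearest-position lookup. The helper name and the tolerance `tol` are hypothetical; how the library actually matches coordinates (and whether it accounts for periodic images) is not stated here.

```python
import math

def match_coords_to_indices(positions, coords, tol=0.01):
    """Return, for each coordinate, the index of the closest atom position.

    Hypothetical sketch; raises if no atom lies within `tol` of a coordinate.
    """
    indices = []
    for c in coords:
        # Closest atom by Cartesian distance
        idx, dist = min(
            ((i, math.dist(p, c)) for i, p in enumerate(positions)),
            key=lambda pair: pair[1],
        )
        if dist > tol:
            raise ValueError(f"no atom within {tol} of {c}")
        indices.append(idx)
    return indices

positions = [(0.0, 0.0, 0.0), (1.9, 0.0, 0.0), (0.0, 2.1, 0.0)]
nn_idx = match_coords_to_indices(positions, [(0.0, 2.1, 0.0), (1.9, 0.0, 0.001)])
```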
-
mai.oms_handler.
get_ase_oms_idx
(atoms, coords)¶ Get the ASE index of the OMS
- Args:
atoms (Atoms object): ASE Atoms object for the MOF
coords (numpy array): coordinates of the OMS
- Returns:
- ase_oms_idx (int): ASE index of OMS
-
mai.oms_handler.
get_omd_data
(oms_data_path, name, atoms)¶ Get info about the open metal site from OpenMetalDetector results files
- Args:
oms_data_path (string): path to the OpenMetalDetector results
name (string): name of the MOF
atoms (ASE Atoms object): Atoms object for the MOF
- Returns:
- omsex_dict (dict): dictionary of data from the OpenMetalDetector results
mai.regression module¶
-
mai.regression.
OLS_fit
(xyz)¶ Make ordinary least squares fit to z=a+bx+cy and return the normal vector
- Args:
- xyz (numpy array): 2D numpy array of XYZ values (N rows, 3 cols)
- Returns:
- normal_vec (numpy array): 1D numpy array for the normal vector
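The fit z=a+bx+cy can be sketched by solving the 3x3 normal equations directly. This is a dependency-free illustration, not the library's code: the function names are hypothetical, Cramer's rule stands in for a linear-algebra routine, and the sign of the returned normal (positive z component) is an arbitrary choice.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def ols_plane_normal(xyz):
    """Fit z = a + b*x + c*y by least squares and return the unit normal."""
    n = float(len(xyz))
    sx = sum(p[0] for p in xyz)
    sy = sum(p[1] for p in xyz)
    sz = sum(p[2] for p in xyz)
    sxx = sum(p[0] * p[0] for p in xyz)
    sxy = sum(p[0] * p[1] for p in xyz)
    syy = sum(p[1] * p[1] for p in xyz)
    sxz = sum(p[0] * p[2] for p in xyz)
    syz = sum(p[1] * p[2] for p in xyz)
    ata = [[n, sx, sy], [sx, sxx, sxy], [sy, sxy, syy]]  # A^T A
    atz = [sz, sxz, syz]                                 # A^T z
    d = det3(ata)
    if abs(d) < 1e-12:
        raise ValueError("points do not determine a plane of the form z = a + bx + cy")
    coeffs = []
    for col in range(3):  # Cramer's rule: swap one column for A^T z
        m = [row[:] for row in ata]
        for r in range(3):
            m[r][col] = atz[r]
        coeffs.append(det3(m) / d)
    a, b, c = coeffs
    # Plane b*x + c*y - z + a = 0 has a normal proportional to (-b, -c, 1)
    norm = math.sqrt(b * b + c * c + 1.0)
    return [-b / norm, -c / norm, 1.0 / norm]

normal = ols_plane_normal([(0, 0, 1), (1, 0, 3), (0, 1, 1), (1, 1, 3)])
```

Because z is treated as the dependent variable, this form cannot represent a plane parallel to the z axis.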
-
mai.regression.
TLS_fit
(xyz)¶ Make total least squares fit to ax+by+cz+d=0 and return the normal vector
- Args:
- xyz (numpy array): 2D numpy array of XYZ values (N rows, 3 cols)
- Returns:
rmse (float): root mean square error of fit
normal_vec (numpy array): 1D numpy array for the normal vector
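A total least squares plane passes through the centroid, and its normal is the direction of least variance of the centered points. The sketch below finds that direction with shifted power iteration so that it needs no external packages. That is a swapped-in technique for illustration; a production version would more likely use an SVD or eigendecomposition routine. The function name, the iteration count and the start vector are all assumptions.

```python
import math

def tls_plane_fit(xyz, n_iter=200):
    """Return (rmse, normal_vec) for the best-fit plane ax+by+cz+d=0.

    Hypothetical sketch using power iteration on (trace*I - S), where S is
    the scatter matrix of the centered points. The dominant eigenvector of
    the shifted matrix is the eigenvector of S with the smallest eigenvalue.
    """
    n = len(xyz)
    centroid = [sum(p[k] for p in xyz) / n for k in range(3)]
    centered = [[p[k] - centroid[k] for k in range(3)] for p in xyz]
    scatter = [[sum(q[i] * q[j] for q in centered) for j in range(3)] for i in range(3)]
    trace = scatter[0][0] + scatter[1][1] + scatter[2][2]
    if trace == 0.0:
        raise ValueError("all points coincide")
    shifted = [[(trace if i == j else 0.0) - scatter[i][j] for j in range(3)]
               for i in range(3)]
    # Arbitrary start vector; it must not be orthogonal to the true normal
    vec = [0.3, 0.5, 0.8]
    for _ in range(n_iter):
        vec = [sum(shifted[i][j] * vec[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(c * c for c in vec))
        vec = [c / norm for c in vec]
    # Root mean square of the orthogonal point-to-plane distances
    sq = sum(sum(q[k] * vec[k] for k in range(3)) ** 2 for q in centered)
    return math.sqrt(sq / n), vec

rmse, normal_vec = tls_plane_fit([(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)])
```

Unlike the OLS form z=a+bx+cy, this fit treats all three coordinates symmetrically, so it also handles planes parallel to the z axis.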
mai.species_rules module¶
-
mai.species_rules.
add_CH4_SS
(mof, site_idx, ads_pos)¶ Add CH4 to the structure from single-site model
- Args:
mof (ASE Atoms object): starting ASE Atoms object of structure
site_idx (int): ASE index of site based on single-site model
ads_pos (array): 1D numpy array for the best adsorbate position
- Returns:
- mof (ASE Atoms object): ASE Atoms object with adsorbate
-
mai.species_rules.
add_diatomic
(mof, ads_species, ads_pos, site_idx, d_X1X2=1.25, ang_MX1X2=None, eta=1, connect=1, r_cut=2.5, overlap_tol=0.75)¶ Add diatomic to the structure
- Args:
mof (ASE Atoms object): starting ASE Atoms object of structure
ads_species (string): adsorbate species
ads_pos (array): 1D numpy array for the best adsorbate position
site_idx (int): ASE index of site
d_X1X2 (float): X1-X2 bond length (defaults to 1.25)
ang_MX1X2 (float): site-X1-X2 angle (defaults to 180 degrees except for side-on in which it defaults to 90 degrees prior to the centering)
eta (int): denticity of end-on (1) or side-on (2) (defaults to 1)
connect (int): the connecting atom in the species string (defaults to 1)
r_cut (float): cutoff distance for calculating nearby atoms when ranking adsorption sites (defaults to 2.5)
overlap_tol (float): distance below which atoms are assumed to be overlapping (defaults to 0.75)
- Returns:
- mof (ASE Atoms object): ASE Atoms object with adsorbate
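For the simplest case (end-on binding with a site-X1-X2 angle of 180 degrees), the geometry can be sketched as follows. The helper is hypothetical and only computes coordinates: X1 sits at ads_pos and X2 is placed d_X1X2 further along the site-to-X1 direction. Side-on binding, bent angles and overlap checks are not covered.

```python
import math

def linear_diatomic_positions(site_pos, ads_pos, d_X1X2=1.25):
    """Return (X1, X2) coordinates for an end-on, linear diatomic.

    Hypothetical sketch; assumes a site-X1-X2 angle of 180 degrees.
    """
    direction = [a - s for a, s in zip(ads_pos, site_pos)]
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0.0:
        raise ValueError("ads_pos coincides with the adsorption site")
    # Continue past X1 along the site -> X1 direction by one bond length
    x2 = [a + d_X1X2 * c / norm for a, c in zip(ads_pos, direction)]
    return list(ads_pos), x2

x1, x2 = linear_diatomic_positions((0.0, 0.0, 0.0), (0.0, 0.0, 2.0))
```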
-
mai.species_rules.
add_monoatomic
(mof, ads_species, ads_pos)¶ Add adsorbate to the ASE atoms object
- Args:
mof (ASE Atoms object): starting ASE Atoms object of structure
ads_species (string): adsorbate species
ads_pos (numpy array): 1D numpy array for the proposed adsorption position
- Returns:
- mof (ASE Atoms object): ASE Atoms object with adsorbate
-
mai.species_rules.
add_triatomic
(mof, ads_species, ads_pos, site_idx, d_X1X2=1.25, d_X2X3=None, ang_MX1X2=None, ang_triads=None, connect=1, r_cut=2.5, overlap_tol=0.75)¶ Add triatomic to the structure
- Args:
mof (ASE Atoms object): starting ASE Atoms object of structure
ads_species (string): adsorbate species
ads_pos (array): 1D numpy array for the best adsorbate position
site_idx (int): ASE index of site
d_X1X2 (float): X1-X2 bond length (defaults to 1.25)
d_X2X3 (float): X2-X3 bond length for connect == 1 or X1-X3 bond length for connect == 2 (defaults to d_X1X2)
ang_MX1X2 (float): site-X1-X2 angle (defaults to 180 degrees)
ang_triads (float): triatomic angle (defaults to 180 degrees for connect == 1 and ang_MX1X2 for connect == 2)
connect (int): the connecting atom in the species string (defaults to 1)
r_cut (float): cutoff distance for calculating nearby atoms when ranking adsorption sites (defaults to 2.5)
overlap_tol (float): distance below which atoms are assumed to be overlapping (defaults to 0.75)
- Returns:
- mof (ASE Atoms object): ASE Atoms object with adsorbate
mai.tools module¶
-
mai.tools.
get_refcode
(atoms_filename)¶ Get the name of the MOF
- Args:
- atoms_filename (string): filename of the ASE Atoms object (accepts CIFs, POSCARs, and CONTCARs)
- Returns:
- refcode (string): name of MOF (defaults to ‘mof’ if the original filename is just named CONTCAR or POSCAR)
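The documented fallback can be sketched as below. The helper name is hypothetical. Stripping a .cif extension and leaving any other filename unchanged are assumptions; only the ‘mof’ fallback for a bare POSCAR or CONTCAR comes from the description above.

```python
import os

def refcode_from_filename(atoms_filename):
    """Guess a MOF name from a structure filename (hypothetical sketch)."""
    base = os.path.basename(atoms_filename)
    if base in ("POSCAR", "CONTCAR"):
        return "mof"  # documented fallback for unnamed VASP files
    if base.lower().endswith(".cif"):
        return base[:-4]  # drop the .cif extension (assumed behavior)
    return base  # anything else is left unchanged in this sketch

name_from_cif = refcode_from_filename("/data/ABAVIJ.cif")
name_from_contcar = refcode_from_filename("runs/CONTCAR")
```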
-
mai.tools.
string_to_formula
(species_string)¶ Convert a species string to a chemical formula
- Args:
- species_string (string): string of atomic/molecular species
- Returns:
- formula (string): stoichiometric chemical formula