Code documentation

mai.ads_sites module

class mai.ads_sites.ads_pos_optimizer(adsorbate_constructor, write_file=True, new_mofs_path=None, error_path=None, log_stats=True)

Bases: object

This class identifies ideal adsorption sites

Args:

adsorbate_constructor (class): adsorbate_constructor class containing many relevant defaults

write_file (bool): if True, the new ASE atoms object should be written to a CIF file (defaults to True)

new_mofs_path (string): path to store the new CIF files if write_file is True (defaults to /new_mofs)

error_path (string): path to store any adsorbates flagged as problematic (defaults to /errors)

log_stats (bool): if True, print statistics about the process (defaults to True)

check_and_write(new_mof, new_name)

Check for overlapping atoms and write CIF file

Args:

new_mof (ASE Atoms object): the new MOF-adsorbate complex

new_name (string): the name of the new CIF file to write

Returns:
overlap (boolean): True or False for overlapping atoms

construct_mof(mof, ads_pos, site_idx)

Construct the MOF-adsorbate complex

Args:

mof (ASE Atoms object): starting ASE Atoms object of the MOF

ads_pos (numpy array): 1D numpy array for the proposed adsorption position

site_idx (int): ASE index of adsorption site

Returns:
mof (ASE Atoms object): ASE Atoms object with adsorbate

get_NNs(ads_pos, site_idx)

Get the number of atoms within r_cut of the proposed adsorption site

Args:

ads_pos (numpy array): 1D numpy array for the proposed adsorption position

site_idx (int): ASE index for adsorption site

Returns:

NN (int): number of neighbors within r_cut

min_dist (float): distance from adsorbate to nearest atom

n_overlap (int): number of overlapping atoms
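
As a rough illustration of the three quantities returned by get_NNs, the sketch below counts atoms around a proposed position with plain NumPy. It is not the MAI implementation: the helper name count_neighbors is hypothetical, periodic images are ignored, and excluding the adsorption site itself from the count is an assumption.

```python
import numpy as np

def count_neighbors(positions, ads_pos, site_idx, r_cut=2.5, overlap_tol=0.75):
    """Hypothetical helper: count atoms near ads_pos (no periodic images)."""
    positions = np.asarray(positions, dtype=float)
    ads_pos = np.asarray(ads_pos, dtype=float)
    # Distance from the proposed adsorbate position to every atom
    dists = np.linalg.norm(positions - ads_pos, axis=1)
    # Assumption: the adsorption site itself is not counted as a neighbor
    mask = np.ones(len(positions), dtype=bool)
    mask[site_idx] = False
    dists = dists[mask]
    n_neighbors = int(np.sum(dists <= r_cut))
    min_dist = float(dists.min())
    n_overlap = int(np.sum(dists <= overlap_tol))
    return n_neighbors, min_dist, n_overlap

positions = [[0.0, 0.0, 0.0],   # adsorption site (index 0)
             [2.0, 0.0, 0.0],
             [0.0, 0.0, 2.5],
             [5.0, 5.0, 5.0]]
nn, min_dist, n_overlap = count_neighbors(positions, [0.0, 0.0, 2.0], site_idx=0)
```

Here only the atom at (0, 0, 2.5) lies within r_cut of the proposed position, and because it is closer than overlap_tol it is also counted as an overlap.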

get_bi_ads_pos(normal_vec, center_coord, site_idx)

Get adsorption site for a 2-coordinate site

Args:

normal_vec (numpy array): 1D numpy array for the normal vector to the line

center_coord (numpy array): 1D numpy array for adsorption site

site_idx (int): ASE index of adsorption site

Returns:
ads_pos (numpy array): 1D numpy array for the proposed adsorption position

get_dist_planar(normal_vec)

Get distance vector for planar adsorption site

Args:
normal_vec (numpy array): 1D numpy array for normal vector
Returns:
dist (numpy array): 1D numpy array for the distance vector, scaled to length d_MX1
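
A minimal sketch of the scaling step, assuming "scaled to d_MX1" means the normal vector is rescaled to have length d_MX1. The helper name scale_to_length is hypothetical and not part of MAI.

```python
import numpy as np

def scale_to_length(normal_vec, d_MX1=2.0):
    """Hypothetical helper: rescale normal_vec so its length equals d_MX1."""
    normal_vec = np.asarray(normal_vec, dtype=float)
    norm = np.linalg.norm(normal_vec)
    if norm == 0:
        raise ValueError("normal_vec must be non-zero")
    return d_MX1 * normal_vec / norm

dist = scale_to_length([0.0, 3.0, 4.0], d_MX1=2.0)
```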
get_new_atoms(ads_pos, site_idx)

Get new ASE atoms object with adsorbate from pymatgen analysis

Args:

ads_pos (numpy array): 1D numpy array for the proposed adsorption position

site_idx (int): ASE index of adsorption site

Returns:
new_mof (ASE Atoms object): new ASE Atoms object with adsorbate

get_new_atoms_grid(site_pos, ads_pos)

Get new ASE atoms object with adsorbate from energy grid

Args:

site_pos (numpy array): 1D numpy array for the adsorption site

ads_pos (numpy array): 1D numpy array for the proposed adsorption position

Returns:
new_mof (ASE Atoms object): new ASE Atoms object with adsorbate

get_nonplanar_ads_pos(scaled_sum_dist, center_coord)

Get adsorption site for non-planar structure

Args:

scaled_sum_dist (numpy array): 2D numpy array for the scaled Euclidean distance vectors between each coordinating atom and the central atom (i.e. the adsorption site)

center_coord (numpy array): 1D numpy array for adsorption site

Returns:
ads_pos (numpy array): 1D numpy array for the proposed adsorption position
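
One plausible geometric reading of the non-planar case is sketched below: the adsorbate is placed at a distance d_MX1 from the central atom, pointing away from the summed direction of the coordinating atoms. This is an assumption for illustration only; the documentation above does not spell out the formula, and the helper name nonplanar_ads_pos is hypothetical.

```python
import numpy as np

def nonplanar_ads_pos(mic_coords, center_coord, d_MX1=2.0):
    """Hypothetical helper: place the adsorbate opposite the ligand 'cone'."""
    mic_coords = np.asarray(mic_coords, dtype=float)
    center_coord = np.asarray(center_coord, dtype=float)
    # Unit vectors from the central atom to each coordinating atom
    unit_vecs = mic_coords / np.linalg.norm(mic_coords, axis=1, keepdims=True)
    summed = unit_vecs.sum(axis=0)
    norm = np.linalg.norm(summed)
    if norm < 1e-8:
        raise ValueError("ligand vectors cancel; the site looks planar or symmetric")
    # Assumption: the open direction is opposite to the summed ligand direction
    return center_coord - d_MX1 * summed / norm

# Three ligands sitting below the central atom (trigonal pyramid)
h = np.sqrt(3.0) / 2.0
mic_coords = [[1.0, 0.0, -1.0], [-0.5, h, -1.0], [-0.5, -h, -1.0]]
ads_pos = nonplanar_ads_pos(mic_coords, center_coord=[5.0, 5.0, 5.0])
```

With all three ligands below the central atom, the proposed position lands directly above it at a distance of d_MX1.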
get_opt_ads_pos(mic_coords, site_idx)

Get the optimal adsorption site

Args:

mic_coords (numpy array): 2D numpy array for the coordinates of each coordinating atom using the central atom (i.e. adsorption site) as the origin

site_idx (int): ASE index of adsorption site

Returns:
ads_pos (numpy array): 1D numpy array for the proposed adsorption position

get_planar_ads_pos(center_coord, dist, site_idx)

Get adsorption site for planar structure

Args:

center_coord (numpy array): 1D numpy array for adsorption site (i.e. the central atom)

dist (numpy array): distance vector from the central atom to the adsorbate (see get_dist_planar)

site_idx (int): ASE index for adsorption site

Returns:
ads_pos (numpy array): 1D numpy array for the proposed adsorption position

get_tri_ads_pos(normal_vec, scaled_sum_dist, center_coord, site_idx)

Get adsorption site for a 3-coordinate site

Args:

normal_vec (numpy array): 1D numpy array for the normal vector to the line

scaled_sum_dist (numpy array): 2D numpy array for the scaled Euclidean distance vectors between each coordinating atom and the central atom (i.e. the adsorption site)

center_coord (numpy array): 1D numpy array for adsorption site

site_idx (int): ASE index of adsorption site

Returns:
ads_pos (numpy array): 1D numpy array for the proposed adsorption position

mai.adsorbate_constructor module

class mai.adsorbate_constructor.adsorbate_constructor(ads='X', d_MX1=2.0, eta=1, connect=1, d_X1X2=1.25, d_X2X3=None, ang_MX1X2=None, ang_triads=None, r_cut=2.5, sum_tol=0.5, rmse_tol=0.25, overlap_tol=0.75)

Bases: object

This class constructs an ASE Atoms object with an adsorbate.

Initialized variables

Args:

ads (string): string of element or molecule for adsorbate (defaults to ‘X’)

d_MX1 (float): distance between adsorbate and surface atom. If used with get_adsorbate_grid, it represents the maximum distance (defaults to 2.0)

eta (int): denticity of end-on (1) or side-on (2) (defaults to 1)

connect (int): the connecting atom in the species string (defaults to 1)

d_X1X2 (float): X1-X2 bond length (defaults to 1.25)

d_X2X3 (float): X2-X3 bond length for connect == 1 or X1-X3 bond length for connect == 2 (defaults to d_X1X2)

ang_MX1X2 (float): site-X1-X2 angle (for diatomics, defaults to 180 degrees except for side-on in which it defaults to 90 or end-on O2 in which it defaults to 120; for triatomics, defaults to 180 except for H2O in which it defaults to 104.5)

ang_triads (float): X3-X1-X2 angle (defaults to 180 degrees for connect == 1 and 90 degrees for connect == 2)

r_cut (float): cutoff distance for calculating nearby atoms when ranking adsorption sites

sum_tol (float): threshold to determine planarity. When the sum of the Euclidean distance vectors of coordinating atoms is less than sum_tol, planarity is assumed

rmse_tol (float): second threshold to determine planarity. When the root mean square error of the best-fit plane is less than rmse_tol, planarity is assumed

overlap_tol (float): distance below which atoms are assumed to be overlapping
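
The sum_tol criterion can be pictured with the sketch below. It assumes the vectors from the central atom to each coordinating atom are normalized to unit length before summing and that the magnitude of the sum is compared with sum_tol; both points are assumptions, and the helper name looks_planar_by_sum is hypothetical.

```python
import numpy as np

def looks_planar_by_sum(mic_coords, sum_tol=0.5):
    """Hypothetical helper: True if the ligand unit vectors nearly cancel."""
    mic_coords = np.asarray(mic_coords, dtype=float)
    # Assumption: vectors are scaled to unit length before summing
    unit_vecs = mic_coords / np.linalg.norm(mic_coords, axis=1, keepdims=True)
    return bool(np.linalg.norm(unit_vecs.sum(axis=0)) < sum_tol)

# Coordinates are relative to the central atom at the origin
square_planar = [[2, 0, 0], [-2, 0, 0], [0, 2, 0], [0, -2, 0]]
pyramidal = [[2, 0, -1], [-2, 0, -1], [0, 2, -1], [0, -2, -1]]
planar_flag = looks_planar_by_sum(square_planar)
pyramidal_flag = looks_planar_by_sum(pyramidal)
```

For the square-planar environment the vectors cancel exactly, so the site is flagged as planar. In the pyramidal case the shared downward component survives the sum and the test fails.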

get_adsorbate(atoms_path=None, site_idx=None, omd_path=None, NN_method='crystal', allowed_sites=None, write_file=True, new_mofs_path=None, error_path=None, NN_indices=None, atoms=None, new_atoms_name=None)

Add an adsorbate using PymatgenNN or OMD

Args:

atoms_path (string): filepath to the CIF file

site_idx (int): ASE index for the adsorption site

omd_path (string): filepath to OMD results folder (defaults to ‘/oms_results’)

NN_method (string): string representing the desired Pymatgen nearest neighbor algorithm. Options include ‘crystal’, ‘vire’, ‘okeefe’, and others. See NN_algos.py (defaults to ‘crystal’)

allowed_sites (list of strings): list of allowed site species for use with automatic OMS detection

write_file (bool): if True, the new ASE atoms object should be written to a CIF file (defaults to True)

new_mofs_path (string): path to store the new CIF files if write_file is True (defaults to /new_mofs within the directory containing the starting CIF file)

error_path (string): path to store any adsorbates flagged as problematic (defaults to /errors within the directory containing the starting CIF file)

NN_indices (list of ints): list of indices for first coordination sphere (these are usually automatically detected via the default of None)

atoms (ASE Atoms object): the ASE Atoms object of the MOF to add the adsorbate to (only include if atoms_path is not specified)

new_atoms_name (string): the name of the MOF used for file I/O purposes (defaults to the basename of atoms_path if provided)

Returns:
new_atoms (Atoms object): ASE Atoms object of MOF with adsorbate

get_adsorbate_grid(atoms_path=None, site_idx=None, grid_path=None, grid_format='ASCII', write_file=True, new_mofs_path=None, error_path=None, atoms=None, new_atoms_name=None)

This function adds a molecular adsorbate based on a potential energy grid (PEG)

Args:

atoms_path (string): filepath to the CIF file

site_idx (int): ASE index for the adsorption site

grid_path (string): path to the directory containing the PEG (defaults to /energy_grids)

grid_format (string): accepts either ‘ASCII’ or ‘cube’ and is the file format for the PEG (defaults to ‘ASCII’)

write_file (bool): if True, the new ASE atoms object should be written to a CIF file (defaults to True)

new_mofs_path (string): path to store the new CIF files if write_file is True (defaults to /new_mofs)

error_path (string): path to store any adsorbates flagged as problematic (defaults to /errors)

atoms (ASE Atoms object): the ASE Atoms object of the MOF to add the adsorbate to (only include if atoms_path is not specified)

new_atoms_name (string): the name of the MOF used for file I/O purposes (defaults to the basename of atoms_path if provided)

Returns:
new_atoms (Atoms object): ASE Atoms object of MOF with adsorbate

mai.grid_handler module

mai.grid_handler.cube_to_xyzE(cube_file)

Converts a cube file to an ASCII file. Adapted from code by Julen Larrucea. Original source: https://github.com/julenl/molecular_modeling_scripts

Args:
cube_file (string): path to cube file
Returns:
pd_data (Pandas dataframe): dataframe of (x,y,z,E) grid

mai.grid_handler.get_best_grid_pos(atoms, max_dist, site_idx, grid_filepath)

Finds minimum energy position in grid dataframe

Args:

atoms (ASE Atoms object): Atoms object of structure

max_dist (float): maximum distance from active site to consider

site_idx (int): ASE index of adsorption site

grid_filepath (string): path to energy grid

Returns:
ads_pos (array): 1D numpy array for the ideal adsorption position

mai.grid_handler.grid_within_cutoff(df, atoms, max_dist, site_pos, partition=1000000.0)

Reduces grid dataframe into data within max_dist of active site

Args:

df (pandas df object): df containing energy grid details (x,y,z,E)

atoms (ASE Atoms object): Atoms object of structure

max_dist (float): maximum distance from active site to consider

site_pos (array): numpy array of the adsorption site

partition (float): number of data points per partition when splitting the df (defaults to 1000000.0). Partitioning prevents memory overflow errors; decrease this value if memory errors arise.

Returns:
new_df (pandas df object): df restricted to points within max_dist of the active site, with an added distance (d) column
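
The filter-then-minimize idea behind grid_within_cutoff and get_best_grid_pos can be sketched with a plain NumPy array of (x, y, z, E) rows. This is an illustration under simplifying assumptions: it uses an array instead of a pandas dataframe, ignores periodic images of the cell, and does no partitioning. The helper name best_grid_pos is hypothetical.

```python
import numpy as np

def best_grid_pos(grid, site_pos, max_dist):
    """Hypothetical helper: lowest-energy grid point within max_dist of the site."""
    grid = np.asarray(grid, dtype=float)
    site_pos = np.asarray(site_pos, dtype=float)
    # Distance of every grid point from the adsorption site (the 'd' column)
    d = np.linalg.norm(grid[:, :3] - site_pos, axis=1)
    nearby = grid[d <= max_dist]
    if len(nearby) == 0:
        raise ValueError("no grid points within max_dist of the site")
    # Pick the row with the minimum energy and return its coordinates
    return nearby[np.argmin(nearby[:, 3]), :3]

grid = [[0.0, 0.0, 1.0, -5.0],
        [0.0, 0.0, 2.0, -12.0],
        [0.0, 0.0, 6.0, -30.0],   # lowest energy overall, but too far away
        [1.0, 1.0, 1.0, -8.0]]
ads_pos = best_grid_pos(grid, site_pos=[0.0, 0.0, 0.0], max_dist=3.0)
```

The global minimum at z = 6 is discarded because it lies outside max_dist, so the point at (0, 0, 2) is selected.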
mai.grid_handler.read_grid(grid_filepath)

Convert energy grid to pandas dataframe

Args:
grid_filepath (string): path to energy grid (must be .cube or .grid)
Returns:
df (pandas df object): df containing energy grid details (x,y,z,E)

mai.NN_algos module

mai.NN_algos.get_NNs_pm(atoms, site_idx, NN_method)

Get coordinating atoms to the adsorption site

Args:

atoms (Atoms object): atoms object of MOF

site_idx (int): ASE index of adsorption site

NN_method (string): string representing the desired Pymatgen nearest neighbor algorithm: refer to http://pymatgen.org/_modules/pymatgen/analysis/local_env.html

Returns:
neighbors_idx (list of ints): ASE indices of coordinating atoms

mai.oms_handler module

mai.oms_handler.get_ase_NN_idx(atoms, coords)

Get the ASE indices for the coordinating atoms

Args:

atoms (Atoms object): ASE Atoms object for the MOF

coords (numpy array): coordinates of the coordinating atoms

Returns:
ase_NN_idx (list of ints): ASE indices of the coordinating atoms

mai.oms_handler.get_ase_oms_idx(atoms, coords)

Get the ASE index of the OMS

Args:

atoms (Atoms object): ASE Atoms object for the MOF

coords (numpy array): coordinates of the OMS

Returns:
ase_oms_idx (int): ASE index of OMS
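
Mapping a set of Cartesian coordinates back to an atom index can be done by a nearest-position lookup, as in the sketch below. Whether MAI matches coordinates this way is an assumption; the helper name nearest_atom_index is hypothetical, and periodic wrapping is ignored.

```python
import numpy as np

def nearest_atom_index(positions, coords):
    """Hypothetical helper: index of the atom closest to coords."""
    positions = np.asarray(positions, dtype=float)
    coords = np.asarray(coords, dtype=float)
    return int(np.argmin(np.linalg.norm(positions - coords, axis=1)))

positions = [[0.0, 0.0, 0.0], [1.5, 1.5, 1.5], [3.0, 0.0, 0.0]]
# Coordinates from an external tool may differ slightly from the stored ones
oms_idx = nearest_atom_index(positions, [1.49, 1.51, 1.50])
```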
mai.oms_handler.get_omd_data(oms_data_path, name, atoms)

Get info about the open metal site from OpenMetalDetector results files

Args:

oms_data_path (string): path to the OpenMetalDetector results

name (string): name of the MOF

atoms (ASE Atoms object): Atoms object for the MOF

Returns:
omsex_dict (dict): dictionary of data from the OpenMetalDetector results

mai.regression module

mai.regression.OLS_fit(xyz)

Make ordinary least squares fit to z=a+bx+cy and return the normal vector

Args:
xyz (numpy array): 2D numpy array of XYZ values (N rows, 3 cols)
Returns:
normal_vec (numpy array): 1D numpy array for the normal vector
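
A minimal sketch of an ordinary least squares plane fit with NumPy follows. Rewriting z = a + bx + cy as bx + cy - z + a = 0 gives a normal vector proportional to (b, c, -1). The sign and normalization used by MAI are not documented here, so the unnormalized form returned below is an assumption, and the helper name ols_plane_normal is hypothetical.

```python
import numpy as np

def ols_plane_normal(xyz):
    """Hypothetical helper: fit z = a + b*x + c*y and return (b, c, -1)."""
    xyz = np.asarray(xyz, dtype=float)
    # Design matrix with columns [1, x, y]
    A = np.column_stack([np.ones(len(xyz)), xyz[:, 0], xyz[:, 1]])
    (a, b, c), *_ = np.linalg.lstsq(A, xyz[:, 2], rcond=None)
    return np.array([b, c, -1.0])

# Points lying exactly on z = 1 + 2x + 3y
xyz = [[0, 0, 1], [1, 0, 3], [0, 1, 4], [1, 1, 6]]
normal_vec = ols_plane_normal(xyz)
```

Because this fit only minimizes errors in z, it degrades for planes that are nearly vertical; a total least squares fit does not have that limitation.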
mai.regression.TLS_fit(xyz)

Make total least squares fit to ax+by+cz+d=0 and return the root mean square error and the normal vector

Args:
xyz (numpy array): 2D numpy array of XYZ values (N rows, 3 cols)
Returns:

rmse (float): root mean square error of fit

normal_vec (numpy array): 1D numpy array for the normal vector
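
A common way to carry out a total least squares plane fit is a singular value decomposition of the centered coordinates, sketched below. This may differ from the MAI implementation in details such as the sign of the normal vector; the helper name tls_plane_fit is hypothetical.

```python
import numpy as np

def tls_plane_fit(xyz):
    """Hypothetical helper: best-fit plane via SVD, returns (rmse, normal_vec)."""
    xyz = np.asarray(xyz, dtype=float)
    centroid = xyz.mean(axis=0)
    # The right singular vector with the smallest singular value is the normal
    _, _, vt = np.linalg.svd(xyz - centroid)
    normal_vec = vt[-1]
    # Signed orthogonal distances of each point from the plane
    residuals = (xyz - centroid) @ normal_vec
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    return rmse, normal_vec

# Four points in the plane z = 1
xyz = [[0, 0, 1], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
rmse, normal_vec = tls_plane_fit(xyz)
```

For coplanar points the RMSE is zero to within rounding, which is the kind of value that would be compared against rmse_tol.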

mai.species_rules module

mai.species_rules.add_CH4_SS(mof, site_idx, ads_pos)

Add CH4 to the structure from single-site model

Args:

mof (ASE Atoms object): starting ASE Atoms object of structure

site_idx (int): ASE index of site based on single-site model

ads_pos (array): 1D numpy array for the best adsorbate position

Returns:
mof (ASE Atoms object): ASE Atoms object with adsorbate

mai.species_rules.add_diatomic(mof, ads_species, ads_pos, site_idx, d_X1X2=1.25, ang_MX1X2=None, eta=1, connect=1, r_cut=2.5, overlap_tol=0.75)

Add diatomic to the structure

Args:

mof (ASE Atoms object): starting ASE Atoms object of structure

ads_species (string): adsorbate species

ads_pos (array): 1D numpy array for the best adsorbate position

site_idx (int): ASE index of site

d_X1X2 (float): X1-X2 bond length (defaults to 1.25)

ang_MX1X2 (float): site-X1-X2 angle (defaults to 180 degrees except for side-on in which it defaults to 90 degrees prior to the centering)

eta (int): denticity of end-on (1) or side-on (2) (defaults to 1)

connect (int): the connecting atom in the species string (defaults to 1)

r_cut (float): cutoff distance for calculating nearby atoms when ranking adsorption sites (defaults to 2.5)

overlap_tol (float): distance below which atoms are assumed to be overlapping (defaults to 0.75)

Returns:
mof (ASE Atoms object): ASE Atoms object with adsorbate
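
The geometry implied by d_X1X2 and ang_MX1X2 for an end-on diatomic can be sketched as follows: the second atom X2 is placed at a distance d_X1X2 from X1 so that the site-X1-X2 angle equals ang_MX1X2. The rotation of X2 about the site-X1 axis is chosen arbitrarily here, and nothing is done to avoid nearby atoms; the helper name second_atom_position is hypothetical and this is not the MAI routine.

```python
import numpy as np

def second_atom_position(site_pos, ads_pos, d_X1X2=1.25, ang_MX1X2=180.0):
    """Hypothetical helper: position of X2 given the site (M) and X1."""
    site_pos = np.asarray(site_pos, dtype=float)
    ads_pos = np.asarray(ads_pos, dtype=float)
    # Unit vector pointing from X1 back toward the site M
    u = (site_pos - ads_pos) / np.linalg.norm(site_pos - ads_pos)
    # Any unit vector perpendicular to u (arbitrary rotation about M-X1)
    helper = np.eye(3)[np.argmin(np.abs(u))]
    p = np.cross(u, helper)
    p /= np.linalg.norm(p)
    theta = np.radians(ang_MX1X2)
    # The angle between (M - X1) and (X2 - X1) equals ang_MX1X2
    return ads_pos + d_X1X2 * (np.cos(theta) * u + np.sin(theta) * p)

site = [0.0, 0.0, 0.0]
x1 = [0.0, 0.0, 2.0]
x2_linear = second_atom_position(site, x1)                  # 180 degrees
x2_bent = second_atom_position(site, x1, ang_MX1X2=120.0)   # bent, e.g. end-on O2
```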
mai.species_rules.add_monoatomic(mof, ads_species, ads_pos)

Add adsorbate to the ASE atoms object

Args:

mof (ASE Atoms object): starting ASE Atoms object of structure

ads_species (string): adsorbate species

ads_pos (numpy array): 1D numpy array for the proposed adsorption position

Returns:
mof (ASE Atoms object): ASE Atoms object with adsorbate

mai.species_rules.add_triatomic(mof, ads_species, ads_pos, site_idx, d_X1X2=1.25, d_X2X3=None, ang_MX1X2=None, ang_triads=None, connect=1, r_cut=2.5, overlap_tol=0.75)

Add triatomic to the structure

Args:

mof (ASE Atoms object): starting ASE Atoms object of structure

ads_species (string): adsorbate species

ads_pos (array): 1D numpy array for the best adsorbate position

site_idx (int): ASE index of site

d_X1X2 (float): X1-X2 bond length (defaults to 1.25)

d_X2X3 (float): X2-X3 bond length for connect == 1 or X1-X3 bond length for connect == 2 (defaults to d_X1X2)

ang_MX1X2 (float): site-X1-X2 angle (defaults to 180 degrees)

ang_triads (float): triatomic angle (defaults to 180 degrees for connect == 1 and ang_MX1X2 for connect == 2)

connect (int): the connecting atom in the species string (defaults to 1)

r_cut (float): cutoff distance for calculating nearby atoms when ranking adsorption sites (defaults to 2.5)

overlap_tol (float): distance below which atoms are assumed to be overlapping (defaults to 0.75)

Returns:
mof (ASE Atoms object): ASE Atoms object with adsorbate

mai.tools module

mai.tools.get_refcode(atoms_filename)

Get the name of the MOF

Args:
atoms_filename (string): filename of the ASE Atoms object (accepts CIFs, POSCARs, and CONTCARs)
Returns:
refcode (string): name of MOF (defaults to ‘mof’ if the original filename is just named CONTCAR or POSCAR)
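
The documented behavior can be approximated with the standard library as shown below. Only two rules come from the description above: CIF filenames yield the MOF name, and a bare POSCAR or CONTCAR yields ‘mof’. Everything else, including how other filenames are handled, is an assumption, and the helper name refcode_from_filename is hypothetical.

```python
import os

def refcode_from_filename(atoms_filename):
    """Hypothetical helper: derive a MOF name from a structure filename."""
    basename = os.path.basename(atoms_filename)
    # A bare POSCAR/CONTCAR carries no name, so fall back to 'mof'
    if basename in ("POSCAR", "CONTCAR"):
        return "mof"
    root, ext = os.path.splitext(basename)
    if ext.lower() == ".cif":
        return root
    # Assumption: any other filename is returned unchanged
    return basename

name_from_cif = refcode_from_filename("structures/HKUST-1.cif")
name_from_poscar = refcode_from_filename("POSCAR")
```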
mai.tools.string_to_formula(species_string)

Convert a species string to a chemical formula

Args:
species_string (string): string of atomic/molecular species
Returns:
formula (string): stoichiometric chemical formula