2.2. Molecular Hamiltonians¶
A molecular Hamiltonian is defined in three steps:
Set up a molecular geometry (see Specifying the molecular geometry)
Define a Gaussian basis set (see Specifying the atomic basis set)
Calculate the matrix representation of all contributions separately that are defined in the molecular Hamiltonian (see Computing the matrix representation of the Hamiltonian)
Before an HF or post-HF calculation can be performed, the overlap matrix of the atomic basis set as well as the molecular orbitals need to be specified (see Computing the overlap matrix and defining the orbitals)
2.2.1. Specifying the molecular geometry¶
The basis set reader in PyBEST requires the molecular coordinates to be defined in an xyz file. If an xyz file is present, only the (relative) path to the file (in most cases just the file name) needs to be specified. If the molecular geometry is not present as an xyz file, PyBEST will internally convert it to a file in the xyz format and store it in a temporary directory. Note that PyBEST can load/dump the molecular geometry from/to different file formats. The file format is determined automatically using the extension or prefix of the filename. For more information on supported file formats, refer to Input/Output Operations: the IOData Container.
In addition to providing the filename, the molecular geometry can be specified as described below.
2.2.1.1. Reading the molecular geometry from file¶
The molecular geometry can be read from file using the method
from_file()
of the
IOData
class,
mol = IOData.from_file(filename)
For instance, the molecular coordinates can be provided in the conventional xyz file format,
mol = IOData.from_file('molecule.xyz')
If just the filename is given, PyBEST assumes that the file is present in the working directory. If the coordinate file is lying elsewhere, the absolute or relative path have to be provided, for instance,
mol = IOData.from_file('./coords/molecule.xyz')
2.2.1.2. Constructing a molecular geometry from scratch¶
A molecular geometry can also be generated using Python code. The following example constructs the atomic coordinates for a water molecule and writes the corresponding geometry information to an xyz file. Note that the coordinates have to be defined in bohr (atomic units). The xyz file is represented in Angstrom.
import numpy as np
from pybest.io import IOData
from pybest.utils import get_com
# define the molecule
natom = 3
alpha = 109.0 / 2.0 * np.pi / 180.0 # bond angle in radian
radius = 1.8 # distance between two neighboring atoms in bohr
# define the coordinates and elements
coordinates = np.zeros((natom, 3))
atom = np.array(["O", "H", "H"], dtype=object) # must be strings
# update coordinates of the hydrogens (the oxygen atoms remains at the origin)
coordinates[1, 1] = -radius * np.sin(alpha)
coordinates[1, 2] = radius * np.cos(alpha)
coordinates[2, 1] = radius * np.sin(alpha)
coordinates[2, 2] = radius * np.cos(alpha)
# assign coordinates to container
mol = IOData(coordinates=coordinates, atom=atom, title="Water")
# move center of mass to the origin
com = get_com(mol)
mol.coordinates = mol.coordinates - com
# write the molecule to an XYZ file (optional)
mol.to_file("water.xyz")
2.2.2. Specifying the atomic basis set¶
PyBEST uses the Libint2 basis set reader. Thus, the supported basis set format is the g94 format. Note that Libint2 needs to be compiled to support up to a maximum angular momentum quantum number of seven, that is, K functions.
PyBEST is distributed with its own basis set library. An atomic basis set can
be read in from the PyBEST basis set library using the function call
get_gobasis()
.
2.2.2.1. Supported atomic basis sets in the PyBEST basis set library¶
All supported basis sets are located in src/pybest/data/basis
and include
Minimal basis sets
STO-3G, STO-6G
Pople basis sets
3-21g
6-31g, 6-31g*, 6-31g**, 6-31++g**
Correlation consistent basis sets
cc-pVDZ, cc-pVTZ, cc-pVQZ, cc-pV5Z
cc-pcVDZ, cc-pcVTZ, cc-pcVQZ
aug-cc-pVDZ, aug-cc-pVTZ, aug-cc-pVQZ, aug-cc-pV5Z
ANO basis sets
ANO-RCC (large)
ANO-RCC-VDZ, ANO-RCC-VDZP, ANO-RCC-VTZ, ANO-RCC-VTZP, ANO-RCC-VQZP
2.2.2.2. The same basis set for different atoms¶
By default, all atoms in the molecule are assigned the same atomic basis set.
To define an atomic basis set, the molecular geometry has to be provided to
the get_gobasis()
(see Reading the molecular geometry from file).
The atomic basis set can then be constructed by only one function call,
gobasis = get_gobasis(basisname, coordinate_filename)
where basisname
is the name of the basis set (a string) and coordinate_filename
is the file containing the coordinates. If a cc-pVDZ
basis is used and the
coordinates are stored in the mol.xyz
file, the basis set can be defined
as follows
gobasis = get_gobasis('cc-pvdz', 'mol.xyz')
2.2.2.3. Loading user-defined basis sets from file¶
If an atomic basis set is not present in the PyBEST basis set library, it can
be read from file. Note that the basis set file has to be available in the
g94 file format.
To read in a user-defined basis set file, the basisname
argument of the
get_gobasis()
function has to be substituted
by the path to the basis set file.
gobasis = get_gobasis(path_to_basis, coordinate_filename)
path_to_basis
can be either the absolute or relative path. If the user-specified
basis set is stored in the my_own_dz.g94
file, the above example becomes
gobasis = get_gobasis('./my_own_dz.g94', 'mol.xyz')
Note
The file extension .g94
has to be explicitely provided.
Only with the file extension, PyBEST will read a user-defined basis set
instead of searching for the basis set in its basis set library.
2.2.2.4. Loading geometry and basis set information from another file or external program¶
The molecular geometry and basis set information can also be read in from
another file generated by external software like (Molpro, ORCA, PSI4, etc).
Supported file formats are .mkl
and .molden
(in addition to
PyBEST’s internal .h5
format).
As for molecular geometries, the from_file()
method has to be used to load the file.
# Load the geometry, orbital basis, and wave function from a molden file:
mol = IOData.from_file('water.molden')
# Print the number of basis functions
print(mol.gobasis.nbasis)
The corresponding information on the basis set or molecular orbitals is stored
in the corresponding attributes. For instance, the basis set object is available
in the gobasis
attribute as the return value.
See also Input/Output Operations: the IOData Container for more details on how to handle I/O operations.
2.2.3. Loading point charges and embedding potentials from another file¶
PyBEST can read user-specified point charges and embedding potentials obtained from external programs.
2.2.3.1. Point charges¶
The point charge file should be similar to an .xyz
coordinate file, with a .pc
extension. The first row records the number of point charges specified in the file. The
second row can be left empty or used for comments. The remaining rows must have the x, y, and z
coordinates (in Angstrom) of the point charges and the value of the charges (last column), in that order.
An example is shown here.
6
Charges (xyz coord in Angstrom, charge value)
1.00000000 0.00000000 0.00000000 1.0
-1.00000000 0.00000000 0.00000000 1.0
0.00000000 1.00000000 0.00000000 1.0
0.00000000 -1.00000000 0.00000000 1.0
0.00000000 0.00000000 1.00000000 1.0
0.00000000 0.00000000 -1.00000000 1.0
To read in a point charge file, the get_charges()
function is used, and the file name of the point charge file
(str
) is provided as an argument for this function.
from pybest.gbasis import get_charges
charges = get_charges("filename.pc")
2.2.3.2. Embedding potential¶
PyBEST can also read embedding potentials calculated using density functional approximations, from external
software packages, like ADF. The external embedding potential must be with the .emb
extension.
Its structure is similar to that of a point charge file. The first row shows the number of grid points.
The second row is reserved for comments or can be left empty. For the remaining rows, the first
three columns show the x, y, and z coordinates of the grid points, while the fourth and fifth columns have the
electron density at that particular grid point and the associated weight, respectively [gomes2008embedding].
A snippet of an embedding potential file is shown here for example.
9660
9.786939509959390548e+00 9.786939509959385219e+00 -1.559883756011516098e+01 7.883610481834972461e+01 -1.737991753428243899e-11
1.103004381738485584e+01 1.103004381738485051e+01 -1.317031110192973031e+01 1.247435090968685643e+02 -1.367767580206739005e-14
8.644496266296856746e+00 8.644496266296851417e+00 -1.313731163432065685e+01 3.500304032634051055e+01 -8.124997547931548861e-14
9.289482143189065511e+00 9.289482143189061958e+00 -1.172747263669079132e+01 5.860986764063856214e+01 -2.395554471674016197e-13
4.037276242018354111e+00 1.506732005940319041e+01 -1.317031110192973031e+01 1.247435090968685643e+02 -1.367767580206739005e-14
The embedding potential can be read using the get_embedding()
function.
from pybest.gbasis import get_embedding
embedding_potential = get_embedding("filename.emb")
2.2.4. Specifying dummy or ghost atoms and active fragments¶
In PyBEST, dummy or ghost atoms and active molecular fragments are defined
using the get_gobasis()
function, that is,
during the construction of the atomic basis set.
2.2.4.1. Dummy or ghost atoms¶
All dummy or ghost atoms are passed as a list to the get_gobasis()
function that contains the indices of all dummy/ghost atoms as its elements.
The elements are sorted according to their order in the xyz coordinate file.
# set first (0) and second (1) element in mol.xyz as dummy/ghost atom
gobasis = get_gobasis('cc-pvdz', 'mol.xyz', dummy=[0,1])
Note
In PyBEST, all indexing is performed using the Python convention, that is, beginning with 0.
2.2.4.2. Active fragments¶
PyBEST allows us to divide one xyz coordinate file in an active and inactive fragment. The inactive fragment will be neglected during the construction of the basis set and molecular Hamiltonian.
# define molecule containing atoms 3, 4, and 5 as active fragment contained
# in the mol.xyz file
gobasis = get_gobasis('cc-pvdz', 'mol.xyz', active_fragment=[3,4,5])
2.2.4.3. Combining ghost atoms and active fragments¶
The dummy
option and the active_fragment
option can be combined so that additional ghost or
dummy atoms can be defined in the active fragment.
# define molecule containing atoms 3, 4, and 5 as active fragment contained
# in the mol.xyz file, set atom index 3 to be a ghost atom
gobasis = get_gobasis('cc-pvdz', 'mol.xyz', active_fragment=[3,4,5], dummy=[3])
Note
The dummy index has to be included in the active_fragment
list. If this is not the case,
an exception will be raised.
2.2.5. Computing the matrix representation of the Hamiltonian¶
As soon as the molecular geometry and atomic basis sets are defined, the matrix representation of the molecular Hamiltonian can be calculated. In the current version of PyBEST, each term has to be evaluated separately. The molecular Hamiltonian contains only one- and two-electron integrals as well as a constant term due to the nuclei-nuclei repulsion. Various one- and two-electron integrals are implemented in PyBEST (see also List of Supported One- and Two-Electron Integrals for a list of supported integrals). Most integrals are calculated using the LibInt package [valeev2019] using the modern C++ API, except the Cholesky-decomposed two-electron repulsion integrals, which are evaluated using the LibChol Library, the pVp integrals, which are determined by interfacing specific features from the LibInt package [valeev2019], and the DKH2 Hamiltonian, which is calculated using Python code.
All one-electron integrals as well as the overlap matrix of the atomic basis
set are stored as two-index objects (that is, in its full or dense matrix
representation), which are implemented in the DenseLinalgFactory
class.
The two-electron integrals are either stored as a dense four-index object
(DenseLinalgFactory
class, memory intensive) or Cholesky decomposed
(CholeskyLinalgFactory
class, more memory efficient).
See PyBEST objects: the LinalgFactory for more information on the LinalgFactory
and its features.
The evaluation of all one- and two-electron integrals is performed by
various PyBEST function calls.
2.2.5.1. The Schrödinger Hamiltonian¶
The electronic Schrödigner Hamiltonian,
which contains the kinetic energy of the electrons, the electron-nucleus attraction, the electron-electron repulsion, and the nucleus-nucleus repulsion terms. In PyBEST, each term has to be evaluated explicitly.
Before computing all one- and two-electron integrals, the Linalg factory
has to be specified.
Currently, two Linalg factories are supported: DenseLinalgFactory
and
CholeskyLinalgFactory
, which differ only in the representation of the
two-electron repulsion integrals. The one-electron integrals are represented
in their dense representation using the DenseLinalgFactory
class,
irrespective of the choice between DenseLinalgFactory
or
CholeskyLinalgFactory
.
For both cases, the number of atomic basis functions has to be passed as an
argument. It is stored in the nbasis
attribute of the PyBasis
instance
(here gobasis
).
# use DenseLinalgFactory
lf = DenseLinalgFactory(gobasis.nbasis)
# use CholeskyLinalgFactory
lf = CholeskyLinalgFactory(gobasis.nbasis)
Note
The DenseLinalgFactory
and CholeskyLinalgFactory
differ only in their
representation of the two-electron integrals. All smaller dimensional tensors are
represented using the DenseLinalgFactory.
The kinetic energy can be calculated using the
compute_kinetic()
function, which takes an
instance of PyBasis
as argument,
# evaluate kinetic energy of electrons and store in kin
kin = compute_kinetic(gobasis)
Similarly, the electron repulsion integrals are determined
# evaluate electron repulsion integrals and store in eri
eri = compute_eri(gobasis)
# or evaluate Cholesky-decomposed two-electron integals, the default threshold is
# set to 1e-3
eri = compute_cholesky_eri(gobasis, threshold=1e-8)
The potential energy associated with the nuclei is determined in a similar fashion,
# Nuclear-electron attraction term stored in na
na = compute_nuclear(gobasis)
# Nuclear-nuclear repulsion term stored as dictionary in nuc
nuc = compute_nuclear_repulsion(gobasis)
A complete example can be found here:
from pybest import context
from pybest.gbasis import (
compute_dipole,
compute_eri,
compute_kinetic,
compute_nuclear,
compute_nuclear_repulsion,
compute_overlap,
compute_quadrupole,
get_gobasis,
)
from pybest.linalg import DenseLinalgFactory
from pybest.utils import get_com
# Molecular Schroedinger Hamiltonian for the water molecule
# ---------------------------------------------------------
# Define coordinate file
# ----------------------------------------------------
coord = context.get_fn("test/water.xyz")
# Create a Gaussian basis set
# ---------------------------
factory = get_gobasis("cc-pvdz", coord)
# Create a linalg factory
# -----------------------
lf = DenseLinalgFactory(factory.nbasis)
# Compute integrals in the atom-centered Gaussian basis set
# ---------------------------------------------------------
kin = compute_kinetic(factory)
ne = compute_nuclear(factory)
eri = compute_eri(factory)
nuc = compute_nuclear_repulsion(factory)
2.2.5.2. The DKH Hamiltonian¶
The one-electron terms in eq. (2.1) can be substituted by the
DKH Hamiltonian to account for (scalar) relativistic effects. PyBEST currently
supports DKH up to second order (DKH2).
The DKH2 Hamiltonian can be calculated using the
compute_dkh()
function, which takes an
instance of PyBasis
as argument,
# evaluate the DKH2 Hamiltonian and store in dkh
dkh = compute_dkh(gobasis)
A complete example can be found here:
from pybest import context
from pybest.gbasis import (
compute_dkh,
compute_nuclear_repulsion,
compute_overlap,
get_gobasis,
)
from pybest.linalg import DenseLinalgFactory
# DKH2 Hamiltonian for the U dimer (the lowest closed-shell state)
# --------------------------------
# Define coordinate file
# ----------------------
coord = context.get_fn("test/u2.xyz")
# Create a Gaussian basis set
# ---------------------------
factory = get_gobasis("ano-rcc-vdz", coord)
# Create a linalg factory
# -----------------------
lf = DenseLinalgFactory(factory.nbasis)
# Compute integrals in the atom-centered Gaussian basis set
# ---------------------------------------------------------
dkh = compute_dkh(factory)
nuc = compute_nuclear_repulsion(factory)
# ERI are not computed in this example
# Compute overlap matrix of atom-centered basis set
# -------------------------------------------------
olp = compute_overlap(factory)
# Create orbitals that store the AO/MO coefficients
# -------------------------------------------------
orb = lf.create_orbital()
2.2.6. Computing the overlap matrix and defining the orbitals¶
To perform a HF or post-HF calculation, the overlap matrix of the atomic
orbital basis set has to be calculated and a set of (molecular) orbitals has
to be defined.
The overlap matrix can be calculated using the compute_overlap()
method, which requires an instance of PyBasis
as argument. The return value is
a dense two-index object of DenseLinalgFactory
(as all one-electron integrals).
# overlap matrix of the atomic basis set
olp = compute_overlap(gobasis)
The (molecular) orbitals are initialized using the create_orbital()
method of the DenseLinalgFactory
class,
# create molecular orbitals
orb = lf.create_orbital()
The orbitals are an instance of the DenseOrbital
class. The instance
orb
contains the AO/MO coefficients (orb.coeffs
), the occupation numbers
(orb.occupations
), and orbital energies (orb.energies
),
which are stored as two- (coefficients) or one-dimensional (occupations, energies)
numpy arrays. See also Working with Orbitals in PyBEST for more details on
orbitals and how to work with orbitals in PyBEST.
2.2.7. Computing the multipole moment integrals¶
PyBEST supports the calculation of the total (electric + nuclear) dipole and quadrupole moment
integrals.
The dipole integrals can be calculated using the
compute_dipole()
method, which requires an
instance of PyBasis
as argument. The return value is a list containing the
overlap matrix and each component {x, y, z} of the (Cartesian) electric dipole operator,
each being a dense two-index object of DenseLinalgFactory
(as all one-electron integrals).
# electric dipole moment integrals of the atomic basis set
olp, mux, muy, muz = compute_dipole(gobasis)
# or store as tuple
dipole = compute_dipole(gobasis)
The dipole origin is set to the center of charge (COM, in our terminology).
You first determine the COM and then pass it explicitly to the
compute_dipole()
method,
# determine COM
x, y, z = get_com(gobasis)
# electric dipole moment integrals of the atomic basis set wrt COM
olp, mux, muy, muz, origin_coord = compute_dipole(gobasis, x=x, y=y, z=z)
# or store as tuple
dipole = compute_dipole(gobasis, x=x, y=y, z=z)
The origin of the dipole moment can be changed by specifying values (in bohr) of
x
, y
, and z
while calling the compute_dipole()
function.
It is also returned as a dictionary {"origin_coord": [x, y, z]}
stored in the origin_coord
return value in the example above.
The electric quadrupole moment integrals can be determined in a similar way
using the compute_quadrupole()
method.
In addition to the overlap matrix and each component of the (Cartesian) electric
dipole operator, all 6 components, {xx, xy, xz, yy, yz, zz}, of the Cartesian
electric quadrupole operator are returned (thus, 10 objects in total),
# electric quadrupole moment integrals of the atomic basis set
# ------------------------------------------------------------
olp, mux, muy, muz, muxx, muxy, muxz, muyy, muyz, muzz = compute_quadrupole(gobasis)
# or store as tuple
quadrupole = compute_quadrupole(gobasis)
The quadrupole moment integrals
are always determined with respect to the origin of the Cartesian coordinate system,
{0,0,0}
.
# electric quadrupole moment integrals of the atomic basis set wrt origin (0,0,0)
# -------------------------------------------------------------------------------
olp, mux, muy, muz, muxx, muxy, muxz, muyy, muyz, muzz = compute_quadrupole(gobasis, x=0, y=0, z=0)
# or store as tuple
quadrupole = compute_quadrupole(gobasis, x=0, y=0, z=0)
PyBEST provides additional wrappers that can be used to determine the dipole and quadrupole moments for various wave functions (see Computing the dipole moment and Computing the quadrupole moment for more details).
2.2.8. Example Python script¶
2.2.8.1. The molecular Hamiltonian for the water molecule¶
This example shows how to set up the molecular Hamiltonian, orbitals, and overlap matrix for the water molecule.
from pybest import context
from pybest.gbasis import (
compute_dipole,
compute_eri,
compute_kinetic,
compute_nuclear,
compute_nuclear_repulsion,
compute_overlap,
compute_quadrupole,
get_gobasis,
)
from pybest.linalg import DenseLinalgFactory
from pybest.utils import get_com
# Molecular Schroedinger Hamiltonian for the water molecule
# ---------------------------------------------------------
# Define coordinate file
# ----------------------------------------------------
coord = context.get_fn("test/water.xyz")
# Create a Gaussian basis set
# ---------------------------
factory = get_gobasis("cc-pvdz", coord)
# Create a linalg factory
# -----------------------
lf = DenseLinalgFactory(factory.nbasis)
# Compute integrals in the atom-centered Gaussian basis set
# ---------------------------------------------------------
kin = compute_kinetic(factory)
ne = compute_nuclear(factory)
eri = compute_eri(factory)
nuc = compute_nuclear_repulsion(factory)
# Compute overlap matrix of atom-centered basis set
# -------------------------------------------------
olp = compute_overlap(factory)
# Create orbitals that store the AO/MO coefficients
# -------------------------------------------------
orb = lf.create_orbital()
# Get center of mass
# ------------------
x, y, z = get_com(factory)
# electric dipole moment integrals of the atomic basis set wrt COM
# ----------------------------------------------------------------
olp, mux, muy, muz, origin = compute_dipole(factory, x=x, y=y, z=z)
# electric quadrupole moment integrals of the atomic basis set wrt COM
# --------------------------------------------------------------------
olp, mux, muy, muz, muxx, muxy, muxz, muyy, muyz, muzz = compute_quadrupole(
factory, x=x, y=y, z=z
)