5.4. Reading data from file
The current version of PyBEST can read wave function information from disk in the following file formats,
| File format | Description |
|---|---|
| .h5 | PyBEST’s internal format. This format allows you to read all PyBEST objects from a binary file. All internal checkpoint files that are dumped to disk use this format. |
| .xyz | Read molecular coordinates from an xyz file. By default, all coordinates are transformed from Angstrom to bohr (atomic units). |
| .molden | Read orbitals, coordinates, and basis set information from a molden file. This works for basis sets that include up to g functions. |
| molekel | Read orbitals, coordinates, and basis set information from a molekel file. |
| FCIDUMP | Read a Hamiltonian (including the external terms) in the Molpro FCIDUMP format. All one-electron integrals are contracted to one single term. PyBEST also returns the molecular orbitals and the overlap matrix, assuming that the molecular orbitals form an orthonormal set. |
When reading data from one of the above-mentioned file formats, PyBEST assigns
it to an IOData container. The wave function or molecular information is thus
stored as attributes of the container, using the default attribute names
defined in Naming conventions in PyBEST.
Note
If you use the internal format to store your own checkpoint files and you choose different variable names, PyBEST stores the corresponding objects under the user-defined attribute names. If such a checkpoint file is read in, those attributes are accessible under the user-defined names. Note, however, that some operations might not be fully supported if you decide to break PyBEST’s naming convention.
Similar to the dumping procedure (Dumping data to file), PyBEST automatically
recognizes the (supported) file format: the from_file() method stores the
corresponding data in an instance of the IOData container,
# Read data from an internal (checkpoint) file and store it in the IOData
# container named data
# ---------------------------------------------
data = IOData.from_file('checkpoint.h5')
The file extension determines how PyBEST reads the file: passing a file name with any of the supported extensions listed above selects the corresponding reader.
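To illustrate what selecting a reader from the file name can look like, the sketch below maps a file name to one of the formats named in this section. It is not PyBEST's implementation: the function guess_format, the lookup table, and the returned labels are hypothetical, and recognizing FCIDUMP files by the substring FCIDUMP in the file name is an assumption made only for this example.

```python
from pathlib import Path

# Hypothetical lookup table; the extensions are those named in this section.
_READERS = {
    ".h5": "internal",
    ".xyz": "xyz",
    ".molden": "molden",
}


def guess_format(filename):
    """Return a format label for ``filename`` or raise ``ValueError``."""
    path = Path(filename)
    # Assumption of this sketch: FCIDUMP files carry "FCIDUMP" in their name.
    if "FCIDUMP" in path.name.upper():
        return "fcidump"
    try:
        return _READERS[path.suffix.lower()]
    except KeyError:
        raise ValueError(f"Unsupported file format: {filename}") from None


print(guess_format("checkpoint.h5"))
print(guess_format("hamiltonian_mo.FCIDUMP"))
```

A table-driven lookup like this keeps the calling code identical for every format, which matches the single from_file() entry point used throughout this section.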
5.4.1. Accessing the IOData container
When reading data from disk using the from_file() method, an instance of the
IOData container is created and all data contained in the file is stored as
attributes of the container. Once constructed, you can access and modify the
corresponding attributes on the fly, in the same manner as explained in
Dumping data to file. The code snippet below shows how to inspect and delete
attributes (all objects are defined in Naming conventions in PyBEST),
# Read internal checkpoint file
# -----------------------------
data = IOData.from_file("checkpoint.h5")
# Print all attributes that are contained in checkpoint file
# ----------------------------------------------------------
print("\ninternal file:")
print(data.__dict__)
# Modify data as you please
# -------------------------
del data.eri
print(data.__dict__)
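Assigning and updating attributes use the same plain attribute syntax as deleting them. The sketch below assumes that the container behaves like an ordinary Python object in this respect (as the del data.eri call suggests). It uses types.SimpleNamespace as a stand-in for an IOData instance so that it runs without PyBEST or a checkpoint file; the attribute my_note and all values are placeholders chosen for illustration.

```python
from types import SimpleNamespace

# Stand-in for an IOData instance (assumption: plain attribute semantics).
data = SimpleNamespace(e_core=0.0, eri="placeholder for two-electron integrals")

# Assign a new attribute
data.my_note = "created for illustration"

# Update an existing attribute
data.e_core = 9.5

# Delete an attribute, guarding against names that are not present
if hasattr(data, "eri"):
    del data.eri

# List the names of the remaining attributes
print(sorted(vars(data)))
```

The hasattr guard avoids an AttributeError when a file does not contain the attribute you intend to remove.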
5.4.2. Reading the internal h5 format
The example below shows how to read an internal checkpoint file (see also the
previous section), which ends with the file extension .h5,
# Read internal checkpoint file
# -----------------------------
data = IOData.from_file("checkpoint.h5")
# Print all attributes that are contained in checkpoint file
# ----------------------------------------------------------
print("\ninternal file:")
print(data.__dict__)
5.4.3. Reading an xyz file
The example below summarizes all steps to read molecular coordinates from an
xyz file, which is recognized by the file extension .xyz. The corresponding
IOData container stores the coordinates under the attribute coordinates (a
np.array), while the atoms are stored as a list (of either str or int) under
the attribute atom,
# Read xyz file (atoms and coordinates only)
# ------------------------------------------
data = IOData.from_file("mol.xyz")
# Print all attributes that are read in from xyz file
# ---------------------------------------------------
print("\nxyz file:")
print(data.__dict__)
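The unit conversion mentioned for xyz files can be made concrete with a small hand-written parser. The sketch below is not PyBEST's reader: the function parse_xyz and the three-atom geometry are made up for illustration, and the coordinates are returned as nested lists rather than a np.array to keep the example free of dependencies. It assumes the common xyz layout (atom count, comment line, then one atom per line) and uses the CODATA 2018 value of the Bohr radius.

```python
# CODATA 2018 Bohr radius in Angstrom
BOHR_IN_ANGSTROM = 0.529177210903

# Arbitrary three-atom geometry, for illustration only
XYZ_TEXT = """\
3
illustrative geometry (coordinates in Angstrom)
O  0.000  0.000  0.000
H  0.000  0.000  1.000
H  1.000  0.000  0.000
"""


def parse_xyz(text):
    """Return (atoms, coordinates in bohr) from the text of an xyz file."""
    lines = text.strip().splitlines()
    natom = int(lines[0])
    atoms, coordinates = [], []
    # Line 0 holds the atom count, line 1 is a comment, atoms start at line 2.
    for line in lines[2 : 2 + natom]:
        symbol, x, y, z = line.split()[:4]
        atoms.append(symbol)
        coordinates.append([float(c) / BOHR_IN_ANGSTROM for c in (x, y, z)])
    return atoms, coordinates


atoms, coordinates = parse_xyz(XYZ_TEXT)
print(atoms)
print(coordinates[1])
```

A coordinate of 1 Angstrom thus becomes roughly 1.89 bohr after reading.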
5.4.4. Reading a molden file
Detailed instructions on how to export orbitals to the molden format can be found in Generating molden files. The example below briefly summarizes how to read in a molden file and how to access its corresponding attributes,
# Read molden file
# ----------------
data = IOData.from_file("water-scf.molden")
# Print all attributes that are read in from molden file
# ------------------------------------------------------
print("\nmolden file:")
print(data.__dict__)
# Access attributes separately
coord = data.coordinates # np.array
factory = data.gobasis # Basis instance
atom = data.atom # list of str
orb_a = data.orb_a # orbitals
Once a molden file has been read in, you can, for instance, use the gobasis attribute to calculate some Hamiltonian matrix elements (see Computing the matrix representation of the Hamiltonian).
5.4.5. Reading a Hamiltonian in the FCIDUMP format
Detailed instructions on how to export a Hamiltonian into the FCIDUMP format can be found in FCIDUMP format. The example below briefly summarizes how to read in an external Hamiltonian from an FCIDUMP file and how to access its corresponding attributes,
# Read FCIDUMP file
# -----------------
data = IOData.from_file("hamiltonian_mo.FCIDUMP")
# Print all attributes that are read in from FCIDUMP file
# -------------------------------------------------------
print("\nFCIDUMP file:")
print(data.__dict__)
# Access attributes separately
one = data.one # one-electron integrals
two = data.two # two-electron integrals
e_core = data.e_core # core energy
orb_a = data.orb_a # orbitals (assuming orthonormal orbitals)
olp = data.olp # overlap matrix (assuming orthonormal orbitals)
lf = data.lf # an instance of DenseLinalgFactory
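To show where the attributes one, two, and e_core come from, recall that the body of an FCIDUMP file lists one integral per line as a value followed by four orbital indices. By the usual convention, four non-zero indices denote a two-electron integral, two non-zero indices followed by two zeros denote a one-electron integral, and four zeros mark the core energy. The sketch below sorts such lines accordingly. It is a simplified illustration and not PyBEST's reader: it skips the header, ignores permutational symmetry, drops any other index pattern, and the function name split_fcidump_entries and the sample numbers are made up.

```python
# Made-up integral lines in the layout "value i j k l" (1-based indices)
FCIDUMP_BODY = """\
 0.75  1 1 1 1
 0.60  2 2 1 1
-1.25  1 1 0 0
-0.50  2 2 0 0
 0.70  0 0 0 0
"""


def split_fcidump_entries(body):
    """Sort integral lines into one-electron, two-electron and core parts."""
    one, two, e_core = {}, {}, 0.0
    for line in body.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        value = float(fields[0])
        i, j, k, l = (int(f) for f in fields[1:])
        if (i, j, k, l) == (0, 0, 0, 0):
            e_core = value  # all indices zero: core energy
        elif k == 0 and l == 0 and i > 0 and j > 0:
            one[(i, j)] = value  # last two indices zero: one-electron term
        elif min(i, j, k, l) > 0:
            two[(i, j, k, l)] = value  # four non-zero indices: two-electron term
        # Any other index pattern is ignored in this sketch.
    return one, two, e_core


one, two, e_core = split_fcidump_entries(FCIDUMP_BODY)
print(one)
print(two)
print(e_core)
```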
5.4.6. Example Python scripts
Several complete examples can be found in the directory data/examples/iodata.
5.4.6.1. Summary of all supported reading options
This is a basic example that summarizes all steps mentioned above, namely, how
to read and access data from the internal .h5, the .xyz, the .molden, and the
FCIDUMP format.
Note
This example only works if you execute the dumping example first.
from pybest.io import IOData
# Read internal checkpoint file
# -----------------------------
data = IOData.from_file("checkpoint.h5")
# Print all attributes that are contained in checkpoint file
# ----------------------------------------------------------
print("\ninternal file:")
print(data.__dict__)
# Modify data as you please
# -------------------------
del data.eri
print(data.__dict__)
# Read xyz file (atoms and coordinates only)
# ------------------------------------------
data = IOData.from_file("mol.xyz")
# Print all attributes that are read in from xyz file
# ---------------------------------------------------
print("\nxyz file:")
print(data.__dict__)
# Read molden file
# ----------------
data = IOData.from_file("water-scf.molden")
# Print all attributes that are read in from molden file
# ------------------------------------------------------
print("\nmolden file:")
print(data.__dict__)
# Access attributes separately
coord = data.coordinates # np.array
factory = data.gobasis # Basis instance
atom = data.atom # list of str
orb_a = data.orb_a # orbitals
# Read FCIDUMP file
# -----------------
data = IOData.from_file("hamiltonian_mo.FCIDUMP")
# Print all attributes that are read in from FCIDUMP file
# -------------------------------------------------------
print("\nFCIDUMP file:")
print(data.__dict__)
# Access attributes separately
one = data.one # one-electron integrals
two = data.two # two-electron integrals
e_core = data.e_core # core energy
orb_a = data.orb_a # orbitals (assuming orthonormal orbitals)
olp = data.olp # overlap matrix (assuming orthonormal orbitals)
lf = data.lf # an instance of DenseLinalgFactory