Running TCHInt calculations

TCHInt can be used in two modes:

  • Calculating a complete set of integrals for a given system.

  • Calculating specific integrals on the fly using partially evaluated integrals and resolution of identity, passing them to a calling code.

The first mode can also be used for standalone calculation of 2-body integrals of given potentials.

Input files

Both the standalone program and the library use the following input files at runtime; they have to be present in the working directory:

  • input.molden: The molden file containing the geometry, basis set and MO coefficients. Symmetry is taken from the molden file.

  • parameters.*: parameter file for the Jastrow factor. Depending on the factor, it has a .casl (CASINO), .QMCJF (NTChem) or .at (intrinsic) extension. It is not needed for 2-body potential integrals.

  • tchint.json: optional JSON file for configuring TCHInt. It has the following entries:

    • “mode”: defines how the integrals shall be evaluated. The available modes are:

      • “direct” (default): Precompute all integrals and store them in memory. The integrals are written to disk (see “storage” below) and read back at repeated calls.

      • “calc”: Always use on-the-fly integration when passing integrals to calling code. Partially evaluated precomputed components are written to disk and reused if available (see “storage” below).

      • “ri”: Use resolution of identity (RI) or on-the-fly integration when evaluating integrals for calling code, depending on the integral. Partially evaluated precomputed components as well as resolution of identity components are written to disk and reused if available (see “storage” below).

    • “grid_lvl”: PySCF’s grid density parameter, on a scale from 0 (coarse, default) to 9 (fine). A value of 2 ought to be accurate for most purposes.

    • “grid_prune”: PySCF’s grid pruning type, which can be set to “none” (default, recommended), “nwchem”, “sg1”, or “treutler” for different pruning algorithms.

    • “jastrow”: Jastrow factor implementation, can be set to:

      • “casino” (default): use the CASINO Jastrow factor included in TCHInt. Requires the parameters in the parameters.casl file.

      • “r4qmc”: use the NTChem/R4QMC factor. This is not included in TCHInt and requires linking with an installed NTChem library. The parameters are read from the parameters.QMCJF file.

      • “atomic”: use the internal TCHInt Jastrow factor implementation. Requires the parameters.at parameter file; see “The intrinsic Jastrow implementation” for details.

      • “none”: use a dummy Jastrow factor implementation returning J=0, equivalent to having no Jastrow factor; this is mainly useful for debugging purposes.

    • “lmat_sparse”: whether to use sparse (true) or dense (false, default) storage for the L matrix. The sparse storage implementation uses twice as much memory per element as the dense storage, so the L matrix should be <50% dense for the sparse storage to provide a reduced memory footprint.

    • “lmat_threshold”: value below which L matrix elements are flushed to zero (1e-10 by default). NB, increasing this value will make the L matrix sparser. This keyword has no effect on L matrices loaded from HDF5 files - the sparsity is fixed by the value of “lmat_threshold” when the file was generated.

    • “ukmat_threshold”: value below which U+K matrix elements are flushed to zero (1e-10 by default).

    • “exclude_6index”: whether to exclude integrals in the TCDUMP where all 6 indices are distinct (false by default; i.e., these integrals are included by default). Note that all integrals connecting the reference determinant with its triple excitations are always included in the TCDUMP regardless of the value of this flag - in other words, integrals with 6 distinct indices 3 of which have values less than or equal to the number of occupied spatial orbitals are never omitted.

    • “nelec”: number of electrons in the system; set to 0 to autodetect from molden file (0 by default).

    • “spin”: number of up- minus number of down-spin electrons (0 by default if nelec is even, 1 if nelec is odd). Note that if nelec is 0 then spin is also autodetected from the molden file and the input value is ignored.

    • “norb_freeze”: number of low-energy spatial orbitals to freeze and electrons per spin to remove (0 by default).

    • “norb_delete”: number of high-energy spatial orbitals to delete (0 by default).

    • “pbc”: flags the use of periodic boundary conditions and contains the following fields:

      • “lattice”: the lattice vectors, given as a set of periodicity vectors, each of size dimensionality (both numbers are typically three)

      • “units”: can be set to “bohr” (default) or “angstrom”, defining the units used for “lattice”

      • “mesh”: a set of 3 integers determining the number of Monkhorst-Pack FFT-mesh points to use in each direction

      • “grid_type”: selects which integration grid to request from PySCF - this can take the values “becke”, “uniform” or “molecular”

    • “integrals”: calculate the integrals of a 2-body potential instead of the transcorrelated Hamiltonian. Has two fields:

      • “type”: Type of the potential; available values are “yukawa” and “erf”. Has to be present for the potential integration to be performed.

      • “tau”: Parameter of the potential, defaults to 1.0

    • “storage”: Set the format in which the full integrals are stored. If not specified, the format is decided depending on which files are present, or using the default value if no relevant files are present. Available modes are:

      • “hdf5”: HDF5 format (requires building with HDF5 support, in which case this is the default value of “storage”); precomputed integrals are stored in tcdump.h5 and intermediate integrals and RI components are stored in tcfactors.h5.

      • “ascii”: ASCII format; precomputed integrals are stored in TCDUMP and intermediate integrals and RI components are stored in tcfactors.

    • “write_bare_fcidump”: Toggles writing of FCIDUMP.bare. Defaults to false.

    • “fcidump_storage”: Set the format in which the 0/1/2-electron integrals are stored. If not specified, the format is decided depending on which files are present, or using the default value if no relevant files are present. Available modes are:

      • “ascii”: ASCII format (default); bare integrals are read from FCIDUMP.bare and transcorrelated integrals are read from/written to FCIDUMP.

      • “trexio.ascii”: TREXIO in ASCII format (requires building with TREXIO support); bare integrals are read from fcidump.bare.trexio, and transcorrelated integrals are read from/written to fcidump.trexio.

      • “trexio.hdf5”: TREXIO in HDF5 format (requires building with TREXIO support, and having built TREXIO with HDF5 support); bare integrals are read from fcidump.bare.trexio.h5, and transcorrelated integrals are read from/written to fcidump.trexio.h5.

    • “xy_perf_level”: integer determining which implementation of the “integrals of the Jastrow factor” routine to use - can be either 0 for the unoptimized version or 1 for the optimized version (default). In tests, level 1 appears to be the optimal choice for code compiled with the Intel compiler against the Intel MKL library; note however that level 0 has been found to be advantageous if the GNU compiler and a generic BLAS library are used.

    • “kmat_perf_level”: integer determining which implementation of the K matrix evaluation routine to use - can be either 0 for the unoptimized version or 1 for the optimized version (default).

    • “lmat_perf_level”: integer determining which implementation of the L matrix evaluation routine to use - can be 0 for the unoptimized version, 1 for an optimized version which uses memory conservatively (default), or 2 for a fully optimized version with significant memory requirements.

    • “lmat_mean_field”: Toggles use of the mean-field approximation for the L matrix. Defaults to false. If true, mean-field contributions are added to the 0-, 1- and 2-body integrals before FCIDUMP is written. The L matrix is never explicitly evaluated and consequently no TCDUMP will be generated.

    • “spin_averaged_mean_field” : Toggles spin-averaged approximation in mean-field approximation of the L matrix, i.e. use of closed shell code even for open shell systems. Defaults to false.

    • “lmat_mean_field_force_open_shell”: Forces use of the open-shell code in the mean-field approximation to the L matrix even for closed-shell systems. Defaults to false.

    • “lmat_mean_field_dens_file” : File from which density for use in mean-field approximation to the L matrix is to be read. Uses Molpro matrop format. If not given, HF density is used.

    • “lmat_mean_field_perf_level”: integer determining which implementation of the mean-field contraction routines to use - can be 0 for the unoptimized version or 1 for the optimized version (default).

    • “mean_field_storage”: Set the format in which the 0/1/2-electron integrals after application of mean-field approximation to the L matrix are stored. Defaults to ASCII format. Available modes are:

      • “ascii”: ASCII format (default); bare integrals are read from FCIDUMP.bare and transcorrelated integrals are read from/written to FCIDUMP.

      • “npy”: NPY format; the header is written to FCIDUMP and the integrals are written to the .npy files specified in the header.

      • “npy_and_ascii”: NPY format and ASCII format; FCIDUMP includes a header specifying the .npy files containing the integrals as well as the integrals themselves.

    • “norb_mean_field_freeze”: number of low-energy spatial orbitals to freeze and electrons per spin to remove (0 by default) after application of mean-field approximation to L matrix. Incompatible with norb_freeze!=0 and norb_delete!=0.

    • “kmat_save_memory”: Boolean variable deciding whether to use a memory-saving strategy for the evaluation of two-body integrals which however doubles the number of Jastrow factor evaluations. The default is true. This only has an effect on the unoptimized routine, i.e., only when “xy_perf_level”: 0.

    • “auxbasis”: Auxiliary basis set for resolution of identity (RI) mode. Default is “weigend”.

    • “lmat_4ind_buffer”: Boolean variable determining whether to buffer all 4-distinct-index 6-index matrices in dense storage in on-the-fly operation modes. The default is true.

    • “lmat_6ind_buffer_size”: size in MiB per process of the sparse hash-table buffer of 5- and 6-index integrals for on-the-fly operation modes. Values <= 0 disable this buffer. The default is 100.

    • “lmat_CS_rel_thres”: magnitude of the smallest Cauchy-Schwarz upper bound of a matrix element (relative to the largest) to be calculated in on-the-fly operation modes. Values <= 0 disable CS bound checks. The default is 1e-7.

    • “lmat_CS_dump”: whether to dump the Cauchy-Schwarz upper bounds of the L matrix elements to a file called lmat_CS_dump.dat. The default is false.

    • “lmat_hist”: Boolean variable determining whether to produce a file “lmat.hist” with histogram information of 6-index matrix elements requested during an on-the-fly calculation. The default is false.

    • “memory_logfile”: name of file to log memory allocations to. The default is an empty string, which disables memory logging.

    • “split_nodes”: integer variable specifying how many “virtual nodes” to split each physical node into. This is intended for debugging, enabling multi-node codepaths in single-node systems. The default is 1, which disables the functionality.
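The documented value ranges above lend themselves to a quick sanity check before a run. The following is a minimal sketch in Python and is not part of TCHInt: the names ALLOWED_VALUES and check_tchint_config are hypothetical, only a subset of the keys is covered, and the sketch assumes that “norb_mean_field_freeze” conflicts with either “norb_freeze” or “norb_delete” being nonzero.

```python
# Hypothetical helper, not shipped with TCHInt: checks a tchint.json-style
# dictionary against a subset of the values documented above.
ALLOWED_VALUES = {
    "mode": ("direct", "calc", "ri"),
    "grid_prune": ("none", "nwchem", "sg1", "treutler"),
    "jastrow": ("casino", "r4qmc", "atomic", "none"),
    "storage": ("hdf5", "ascii"),
    "fcidump_storage": ("ascii", "trexio.ascii", "trexio.hdf5"),
    "mean_field_storage": ("ascii", "npy", "npy_and_ascii"),
}


def check_tchint_config(config):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []

    # Keys with an enumerated set of documented values.
    for key, allowed in ALLOWED_VALUES.items():
        if key in config and config[key] not in allowed:
            problems.append(f"{key}: {config[key]!r} is not one of {allowed}")

    # grid_lvl is documented on a scale from 0 (default) to 9.
    grid_lvl = config.get("grid_lvl", 0)
    if not isinstance(grid_lvl, int) or not 0 <= grid_lvl <= 9:
        problems.append(f"grid_lvl: {grid_lvl!r} is outside the range 0..9")

    # Assumption: the documented incompatibility applies as soon as either
    # norb_freeze or norb_delete is nonzero.
    if config.get("norb_mean_field_freeze", 0) != 0 and (
        config.get("norb_freeze", 0) != 0 or config.get("norb_delete", 0) != 0
    ):
        problems.append(
            "norb_mean_field_freeze cannot be combined with norb_freeze/norb_delete"
        )

    return problems


if __name__ == "__main__":
    for message in check_tchint_config({"mode": "ri", "grid_lvl": 12}):
        print(message)
```

A check of this kind only catches values that contradict the descriptions above; TCHInt itself remains the authority on what it accepts.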

An example file to do resolution of identity with a cc-pVDZ auxiliary basis set and an intrinsic Jastrow on a medium grid could look like this:

{
  "jastrow" : "atomic" ,
  "auxbasis": "cc-pVDZ",
  "mode"    : "ri"     ,
  "grid_lvl": 4
}
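When many calculations are set up from a script, the same file can be generated programmatically. The sketch below uses only the Python standard library; the function name write_tchint_config is hypothetical, and the temporary directory merely stands in for the working directory of an actual run.

```python
import json
import tempfile
from pathlib import Path


def write_tchint_config(config, directory):
    """Write `config` as tchint.json into `directory` and return the file path."""
    path = Path(directory) / "tchint.json"
    with path.open("w") as handle:
        json.dump(config, handle, indent=2)
        handle.write("\n")
    return path


# Same settings as the example file above.
ri_config = {
    "jastrow": "atomic",
    "auxbasis": "cc-pVDZ",
    "mode": "ri",
    "grid_lvl": 4,
}

# A temporary directory is used here so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as workdir:
    written = write_tchint_config(ri_config, workdir)
    reloaded = json.loads(written.read_text())
```

For the 2-body potential integrals described in the list of entries, the dictionary would instead carry an “integrals” entry with the fields “type” and “tau”, for example {"integrals": {"type": "yukawa", "tau": 1.0}}.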

The intrinsic Jastrow implementation

TCHInt comes with its own implementation of a Jastrow factor, following a general polynomial form including electron-electron, electron-nuclear and electron-electron-nuclear terms.

The corresponding parameter files have the following form:

en_str
r_1 r_2 ...
ee_str
e_1 e_2 ...
n_1 m_1 l_1 c_1 at_1_1 at_2_1
...
n_N m_N l_N c_N at_1_N at_2_N

where en_str is a string determining the form of the electron-nuclear terms, ee_str is a string determining the form of the electron-electron terms, r_i are the parameters for the effective electron-nuclear distances and e_i are the parameters for the effective electron-electron distances. At the moment, the following forms are supported:

Alternatively, the first four lines may be replaced by a single line r e. In this case, both forms default to the exponential form. Values of 0 lead to a polynomial effective distance instead of the exponential form.

The exponents n_i and m_i are the electron-nuclear exponents, the exponents l_i are those of the electron-electron factors and the coefficients c_i are the coefficients of the i-th term. The electron-nuclear distances of term i refer to atoms at_1_i and at_2_i, in the numbering of the input.molden file.
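To make the layout concrete, the following Python sketch reads the four-line header variant of such a file into a small data structure. It is an illustration under stated assumptions rather than TCHInt's own reader: the names JastrowTerm, JastrowParameters and parse_parameters_at are hypothetical, fields are assumed to be whitespace-separated, the exponents and atom indices are assumed to be integers, and the single-line r e header is not handled. The en_str and ee_str values in the sample are placeholders, not real keywords.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JastrowTerm:
    n: int      # electron-nuclear exponent
    m: int      # electron-nuclear exponent
    l: int      # electron-electron exponent
    c: float    # coefficient of the term
    at_1: int   # atom index, numbering of input.molden
    at_2: int   # atom index, numbering of input.molden


@dataclass
class JastrowParameters:
    en_str: str
    r: List[float]
    ee_str: str
    e: List[float]
    terms: List[JastrowTerm]


def parse_parameters_at(text):
    """Parse the four-line-header form of a parameters.at file (sketch)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 4:
        raise ValueError("expected four header lines before the term lines")
    en_str, r_line, ee_str, e_line = lines[:4]

    terms = []
    for line in lines[4:]:
        fields = line.split()
        if len(fields) != 6:
            raise ValueError(f"expected 6 fields per term, got {len(fields)}: {line!r}")
        n, m, l = (int(value) for value in fields[:3])
        terms.append(
            JastrowTerm(n, m, l, float(fields[3]), int(fields[4]), int(fields[5]))
        )

    return JastrowParameters(
        en_str=en_str,
        r=[float(value) for value in r_line.split()],
        ee_str=ee_str,
        e=[float(value) for value in e_line.split()],
        terms=terms,
    )


# Placeholder header strings and made-up numbers, for illustration only.
sample = """\
EN_FORM_PLACEHOLDER
1.0 1.5
EE_FORM_PLACEHOLDER
2.0
1 0 0 0.25 1 1
0 1 2 -0.10 1 2
"""
params = parse_parameters_at(sample)
```

Each entry of params.terms then corresponds to one line n_i m_i l_i c_i at_1_i at_2_i of the file.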