When information becomes duplication in software, eventually the people that knew about the duplication will forget. And then the information will be changed — but at least one of the duplicates won’t be. This introduces bugs that are extremely difficult to track down.
The Don’t Repeat Yourself (DRY) principle is one that say a developer should systematically, and always, avoid duplication of information.
NECI is a terrible example of this, but it has been improved over time with a lot of effort.
Information is a very broad term. There are many types of duplication that can occur. A non exhaustive list of some types of duplication (and what can be done about them) follows.
Algorithm duplication across data types
There are many algorithms that are either the same, or similar,
across many different data types. The logic involved in these should
be written once.
Major examples are the sorting routines in sort_mod
, general
utilities in util_mod
, shared memory in shared_alloc
and MPI
routines in Parallel_neci
. Prior to implementing a generalised
quicksort there were 37 different sort routines in NECI, using
different sort methods, and containing different bugs.
This duplication should be controlled using templating, as described in section 1.7.
Logic duplication across source files
If the same chain of decision making is recurring in different
regions of the code, these should be abstracted into their own
subroutine which is called from each location. This prevents the
logic being duplicated, and then diverging.
Numerous bad examples of this still persist in NECI.
Duplication of data
Compile time constants should only be specified in one place. A
large proportion of these are found in the lib/cons_neci.F90
source file. Other examples include the layout of the bit
representations (BitReps.F90
). A significant proportion of these
vary depending on the compile configuration, and prior to collecting
them here the code was extremely fragile.
Ongoing cases which are problematic include the use of the literal
constant to specify output to stdout in statements such as
write(6,*)
, which doesn’t interact well with molpro
.
Duplication of representations in memory
It is important to have a well defined canonical representation of
data in memory. The same data should not be allowed to become
duplicated in multiple places.
Temporary arrays, with working data copied into them, should be clearly temporary and discarded as soon as not necessary. If the primary data shifts to a new location, the old storage should be deallocated (if possible), or damaged so that attempts to use it fail loudly (such as putting a value of into a variable that would normally hold an index).
Avoid situations where code might work by accident.
An ongoing situation of this type is the array variable nBasisMax
.
It shadowed a large number of global control variables, and there
are still locations in the code where its value is used in
preference to the global control value as these values diverge, and
its value is the one that works in some obsolete code.
There is one, major, exception to this rule. Code that only exists for
testing purposes (such as the contents of ASSERT
statements, or unit
tests) may be as explicitly duplicated as desired. Their purpose is to
explicitly flag up when anything elsewhere changes - so the risk of them
getting out of sync with the code base is their purpose.
One issue that becomes immediately obvious when repeated logic has been abstracted into specific functions is that conditional logic is executed every time certain actions are taken.
In many cases this is not very important, as it occurs high up the call hierarchy, and the controlled code consumes the vast majority of the execution time. However, the closer we get to the inner most tight loops, the more expensive repeated conditional logic becomes. This is particularly frustrating if the decision making is based on global control parameters, and thus always results in the same code path being taken in a simulation. If the decision lies against the branch prediction metrics, then this is especially bad.
The canonical example of this is accessing the 4-index integrals, which is performed very frequently.
In these case, it is a good idea to separate the decision making logic from the execution, so that the conditional logic is only executed at runtime. This can be done using procedure pointers (called function pointers, or similarly functors in other programming languages.
These require defining the “shape” of a function call (i.e. its
arguments, and return values) in an abstract interface
. A variable can
then be set to point at which of a range of functions with this
signature should be executed.
In NECI the global controlling procedure pointers are located in the
module procedure_pointer
, which contains both the abstract interface
definitions, and the actual pointer variables. These variables can then
be used as functions throughout the code.
These procedure pointers are largely initialised in the routine
init_fcimc_fn_pointers
, where decisions are made between types of
excitation generator, matrix element evaluation, etc. The procedure
pointers involved in integral evaluation are set in
init_getumatel_fn_pointers
.
The use of procedure pointers generates a strict Fortran 2003 dependency for NECI. We used to make use of a hacky abuse of the linker and templating system to implement function pointers without language support, but this was deprecated once compiler support for procedure pointers was reasonably widespread.