Implementation of adaptive free energy estimates in the ORAC program: the BAR-SGE method

We now describe how the machinery introduced in Section 6.3.1 can be employed in SGE simulation programs, such as ORAC. Suppose we deal with $ N$ ensembles of a generic $ \Lambda$-space, be it a temperature space, a $ \lambda $-space, or even a multiple-parameter space. Without loss of generality, we order the ensembles as $ \Lambda_1 < \Lambda_2 < \dots < \Lambda_N$. Thus, $ N-1$ optimal weights, $ \Delta f_{1 \rightarrow 2}, \Delta f_{2 \rightarrow 3}, \dots, \Delta f_{N-1 \rightarrow N}$, have to be estimated adaptively.

(1) At the beginning of the simulation we assign the system, i.e. the replica, to a randomly chosen ensemble and start the phase space sampling with the established simulation protocol (Monte Carlo or molecular dynamics). Note that several simulations may run in the generalized-ensemble space, each yielding an independent trajectory. Analogously to REM, a single simulated system will be termed ``replica''. In the ORAC program, we have arbitrarily adopted the following criteria to distribute the replicas among the ensembles at the beginning of an SGE simulation. In Hamiltonian tempering simulations, if we deal with $ M$ replicas, we assign them to different ensembles in increasing order, from $ \Lambda_1$ to $ \Lambda_N$. If $ M > N$, then the $ (N+1)$th replica is assigned to $ \Lambda_1$ (as the first replica), the $ (N+2)$th replica to $ \Lambda_2$ (as the second replica), and so on. In SGE simulations performed in the $ \lambda $-space, all replicas are assigned to $ \Lambda_1$ (see Section 10.2.11 for the definition of the $ \Lambda$ sequence). For the sake of simplicity, in the following presentation of the method we will consider a single replica. A discussion of multiple-replica simulations is reported in the final part of this section.
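Purely as an illustration of this initial distribution, the following minimal Python sketch reproduces the two assignment rules just described; the function name and the 0-based indexing are ours, not ORAC's.

```python
def initial_ensemble_indices(n_replicas, n_ensembles, space="hamiltonian"):
    """Return the starting ensemble index (0-based) of each replica.

    'hamiltonian' tempering: replicas are spread over the ensembles in
    increasing order, wrapping around when n_replicas > n_ensembles.
    'lambda' space: all replicas start from the first ensemble (Lambda_1).
    """
    if space == "lambda":
        return [0] * n_replicas
    # round-robin assignment: replica k goes to ensemble (k mod N)
    return [k % n_ensembles for k in range(n_replicas)]

# Example: 5 replicas over 3 ensembles -> [0, 1, 2, 0, 1]
print(initial_ensemble_indices(5, 3))
```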

(2) Every $ L_a$ steps and for each ensemble $ n$, we store into memory the quantities $ W[n \rightarrow n+1]$ and $ W[n \rightarrow n-1]$, computed as described in Sec. 6.3.1. There is no well-established recipe for choosing $ L_a$, apart from the requirement that it should ensure the largest possible decorrelation between work values. During the simulation we must also record the number of stored $ W$ elements, $ N_{n \rightarrow n+1}$ and $ N_{n \rightarrow n-1}$.
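A minimal sketch of this bookkeeping, assuming a simple dictionary keyed by ensemble pairs (the class and its method names are illustrative, not ORAC data structures):

```python
from collections import defaultdict

class WorkStore:
    """Schematic bookkeeping of dimensionless works (point 2).

    In the ensemble n currently visited, every L_a steps we append
    W[n -> n+1] and W[n -> n-1]; the counters N_{n -> m} are simply
    the lengths of the stored lists.
    """
    def __init__(self):
        self.W = defaultdict(list)        # key: (n, m), value: list of works

    def record(self, n, w_up=None, w_down=None):
        if w_up is not None:              # W[n -> n+1], undefined for n = N
            self.W[(n, n + 1)].append(w_up)
        if w_down is not None:            # W[n -> n-1], undefined for n = 1
            self.W[(n, n - 1)].append(w_down)

    def count(self, n, m):
        return len(self.W[(n, m)])        # N_{n -> m}

    def reset(self, n, m):
        self.W[(n, m)].clear()            # free memory after a Delta f update
```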

(3) Every $ L_b$ steps, with $ L_b \gg L_a$ (by at least three orders of magnitude), we attempt a free energy update on the basis of Eq. 6.25 or Eq. 6.27. The scheme we propose for $ \Delta f_{n \rightarrow n+1}$ follows.

(a) First of all we check whether the conditions $ N_{n \rightarrow n+1} > N^\prime$ and $ N_{n + 1 \rightarrow n} > N^\prime$ are met. If so, Eq. 6.25 is applied (setting $ m = n+1$) using the stored dimensionless works (see point 2). The threshold $ N^\prime$ is used as a control parameter for the accuracy of the calculation. In the ORAC program we have set $ N^\prime = {\rm int}(L_b/L_a)$. Once $ \Delta f_{n \rightarrow n+1}$ is known, its squared uncertainty is computed according to Eq. 6.26. Then we set $ N_{n \rightarrow n+1} = 0$ and $ N_{n + 1 \rightarrow n} = 0$ and remove $ W[n \rightarrow n+1]$ and $ W[n+1 \rightarrow n]$ from computer memory. Whenever a free energy estimate and its associated uncertainty are computed, the optimal weight to be used in the acceptance ratio (Eq. 6.22) is determined by applying standard formulas from maximum likelihood considerations (see Sec. 6.3.3). This step is carried out for $ n = 1, 2, \dots, N-1$.
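As an illustration of this bidirectional update, here is a minimal self-consistent BAR solver for a single pair of neighboring ensembles. It assumes that Eq. 6.25 has the standard Bennett acceptance ratio form for dimensionless forward and reverse works (the uncertainty of Eq. 6.26 is not computed here); the bracketing strategy and the function names are ours.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import expit   # expit(x) = 1/(1 + exp(-x)), overflow-safe

def bar_delta_f(w_f, w_r):
    """Self-consistent Bennett acceptance ratio estimate of Delta f.

    w_f : dimensionless works W[n -> n+1] accumulated in ensemble n
    w_r : dimensionless works W[n+1 -> n] accumulated in ensemble n+1
    The textbook BAR form is assumed; Eq. 6.25 of the manual is taken
    to be equivalent to it.
    """
    w_f, w_r = np.asarray(w_f, float), np.asarray(w_r, float)
    c = np.log(len(w_f) / len(w_r))

    def residual(df):
        fwd = np.sum(expit(df - w_f - c))    # sum of [1 + exp(c + W_f - df)]^-1
        rev = np.sum(expit(-df - w_r + c))   # sum of [1 + exp(-c + W_r + df)]^-1
        return fwd - rev

    # residual(df) grows monotonically from -N_r to +N_f: expand a bracket
    lo, hi = -1.0, 1.0
    while residual(lo) > 0.0:
        lo -= 10.0
    while residual(hi) < 0.0:
        hi += 10.0
    return brentq(residual, lo, hi)
```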

(b) If the criteria needed to apply Eq. 6.25 are not met and no $ \Delta f_{n \rightarrow n+1}$ estimate is yet available from point 3a, then we try to apply Eq. 6.27. In particular, two independent estimates of $ \Delta f_{n \rightarrow n+1}$ are attempted. One comes from Eq. 6.27 by setting $ m = n+1$, whereas the other comes from Eq. 6.27 applied in the reverse direction (replace $ n$ with $ n+1$ and $ m$ with $ n$ in Eq. 6.27). The two estimates will be invoked in the acceptance ratios of the $ n \rightarrow n+1$ and $ n+1 \rightarrow n$ ensemble transitions, respectively (see point 4 below). In the former case we need to resort to additional arrays (denoted $ N_{n \rightarrow n+1}^{\rm up}$ and $ W^{\rm up}[n \rightarrow n+1]$) to store $ N_{n \rightarrow n+1}$ and $ W[n \rightarrow n+1]$. Separate arrays are necessary because they are subject to different manipulations during the simulation. Specifically, if the condition $ N_{n \rightarrow n+1}^{\rm up} > N^\prime$ is satisfied, then we calculate $ \Delta f_{n \rightarrow n+1}$ via Eq. 6.27. This estimate is employed as such in the acceptance ratio. Then we set $ N_{n \rightarrow n+1}^{\rm up} = 0$ and remove $ W^{\rm up}[n \rightarrow n+1]$ from computer memory. The same protocol is used to calculate $ \Delta f_{n+1 \rightarrow n}$ from the quantities $ N_{n+1 \rightarrow n}^{\rm down}$ and $ W^{\rm down}[n+1 \rightarrow n]$. The additional arrays introduced here are updated as described at point 2. Note that in this procedure the arrays of step 3a are neither used nor changed. Note also that this procedure corresponds to the way finite free energy differences are calculated in the free energy perturbation method[119].
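For completeness, a minimal sketch of such a unidirectional estimate, assuming Eq. 6.27 has the usual exponential-averaging (free energy perturbation) form; the log-sum-exp rearrangement for numerical stability is our addition.

```python
import numpy as np

def exp_averaging_delta_f(w_fwd):
    """Unidirectional (free energy perturbation-like) estimate of Delta f.

    Assumes Eq. 6.27 has the usual exponential-averaging form
        Delta f_{n->m} = -ln( (1/N) sum_i exp(-W_i[n->m]) )
    applied to the works stored in the auxiliary "up"/"down" arrays.
    """
    w = np.asarray(w_fwd, float)
    # -log( mean(exp(-w)) ), computed stably by shifting with the smallest work
    wmin = w.min()
    return wmin - np.log(np.mean(np.exp(-(w - wmin))))
```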

(c) If none of the above criteria is met, then optimal weights are not updated and conventional sampling continues. Storage of dimensionless works as described at point 2 continues as well.

We point out that, if equilibrium is reached slowly (as for large viscous systems or systems with a very complex free energy landscape), the replicas may tend to get trapped in limited regions of the ensemble space at the early stages of the simulation. This is basically due to an initially inaccurate determination of $ \Delta f_{n \rightarrow n+1}$ from Eq. 6.25 (point 3a). If such an event occurs, subsequent free energy estimates from Eq. 6.25 may become very rare or even impossible. However, we can prevent this unwanted situation by switching to the updating criteria of point 3b when the criteria of point 3a are not met for a given (pre-established) number of consecutive times (10 times in ORAC). As equilibrium is approached, the criteria of point 3b will favor transitions of the replicas between neighboring ensembles and, eventually, restore the conditions to apply the criteria of point 3a again. A schematic of this switching logic is sketched below.
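A simplified dispatch for one ensemble pair, folding together points 3a, 3b and the trapped-replica fallback just described. The threshold of 10 consecutive misses is the ORAC default quoted above; the clearing of the work arrays after a successful update is omitted for brevity, and the helpers are the ones sketched earlier in this section.

```python
MAX_FAILURES = 10  # ORAC default: fall back to point 3b after 10 missed updates

def try_update(w_fwd, w_rev, w_up, n_prime, n_failures):
    """Simplified update attempt for one pair n -> n+1 (points 3a/3b/3c).

    w_fwd, w_rev : works of point 2 (W[n -> n+1], W[n+1 -> n])
    w_up         : auxiliary works of point 3b (W^up[n -> n+1])
    Returns (new Delta f or None, updated failure counter).
    bar_delta_f and exp_averaging_delta_f are the helpers sketched above.
    """
    if len(w_fwd) > n_prime and len(w_rev) > n_prime:
        return bar_delta_f(w_fwd, w_rev), 0              # point 3a (Eq. 6.25)
    n_failures += 1
    if n_failures >= MAX_FAILURES and len(w_up) > n_prime:
        return exp_averaging_delta_f(w_up), n_failures   # point 3b (Eq. 6.27)
    return None, n_failures                              # point 3c: keep weights
```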

(4) Every $ L_c$ steps a transition $ (x,p)_n \rightarrow (x, p)_{n \pm 1}$ is attempted on the basis of the acceptance ratio of Eq. 6.22 and of the current value of $ \Delta f_{n \rightarrow n \pm 1}$ (properly reweighted according to the equations reported in Sec. 6.3.3). If no estimate of $ \Delta f_{n \rightarrow n \pm 1}$ is yet available from the methods described at points 3a and 3b, then the transition is not attempted. The upward and downward transitions are chosen with equal probability.
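A minimal sketch of this transition attempt, assuming the acceptance ratio of Eq. 6.22 takes the Metropolis-like form $ \min\{1, \exp(\Delta f_{n \rightarrow m} - W[n \rightarrow m])\}$; the callables delta_f and work are hypothetical hooks standing in for the stored estimates and for the work evaluation of Sec. 6.3.1.

```python
import math
import random

def attempt_transition(n, delta_f, work, n_ensembles):
    """Schematic ensemble-transition attempt (point 4), ensembles numbered 1..N.

    delta_f(n, m) returns the current Delta f_{n->m} estimate or None if
    no estimate is available yet; work(n, m) evaluates the dimensionless
    work W[n -> m] for the present configuration.
    """
    m = n + random.choice((-1, +1))       # upward or downward with equal probability
    if m < 1 or m > n_ensembles:
        return n                          # no neighboring ensemble on this side
    df = delta_f(n, m)
    if df is None:
        return n                          # no optimal weight yet: do not attempt
    if random.random() < min(1.0, math.exp(df - work(n, m))):
        return m                          # accepted: replica moves to ensemble m
    return n                              # rejected: replica stays in ensemble n
```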

It is worth stressing again that the procedures of point 3b are only aimed at furnishing a reliable evaluation of the optimal weights when such factors are not yet available from the bidirectional algorithm (point 3a) or when the system gets trapped in one or a few ensembles (point 3c). Moreover, we remark that the free energy differences estimated via Eq. 6.27 tend to give larger acceptance rates than the exact free energy differences, thus favoring transitions toward the ensemble that has not yet been visited. This is a well-known (biasing) effect of exponential averaging[128], which leads to an artificially low mean dissipated (dimensionless) work. As a matter of fact, this is a positive effect, since it eases ensemble transitions during the equilibration phase of the simulation.

In the above discussion, we have not mentioned the number $ M$ of (independent) replicas that may run in the space of the $ N$ ensembles. In principle, $ M$ can vary from one to infinity depending on the available computing resources. The best performance is obtained if a one-to-one correspondence exists between replicas and computing processors. A rough parallelization could be obtained by performing $ M$ independent simulations and then pooling the data from all replicas at the end of the simulation to obtain augmented statistics. However, the calculation of the optimal weights is much improved if they are periodically updated on the fly on the basis of the data drawn from all replicas. This is precisely what ORAC does. In this respect we note that our version of the multiple-replica SGE algorithm is also well suited to distributed computing environments. The phase of the simulation where information is exchanged is the one described at point 3 (free energy calculation). It should be noted that, when a free energy estimate is performed, the work arrays stored for each replica/processor (see point 2) do not need to be communicated to all other replicas/processors. Only the sums $ \sum_{i=1}^{N_{n \rightarrow m}} [\cdot]^{-1} - \sum_{j=1}^{N_{m \rightarrow n}} [\cdot]^{-1}$ (case of Eq. 6.25), $ \sum_{i=1}^{N_{n \rightarrow m}} [\cdot]^{-1} + \sum_{j=1}^{N_{m \rightarrow n}} [\cdot]^{-1}$ (case of Eq. 6.26) and $ \sum_{i=1}^{N_{n \rightarrow m}} \exp(- W_i[n \rightarrow m])$ (case of Eq. 6.27), together with $ N_{n \rightarrow m}$ and $ N_{m \rightarrow n}$, must be exchanged for all $ N-1$ ensemble transitions. Each replica/processor can then ``think by itself'' and reassemble the global sums. Exchanging one such set of data implies sending $ M (M-1) (N-1)$ real/integer numbers through the network ($ \sim 60$ kB of information using 20 replicas and slightly less than 1 MB using 50 replicas). Only in the case of the iterative procedure of Eq. 6.25 does the information have to be sent several times per free energy calculation (i.e., as many times as the number of iterations needed to solve the equation). The computational cost arising from inter-processor communication can, however, be reduced by updating the free energies infrequently. Furthermore, in order to improve the first free energy estimate and hence speed up convergence, the $ M$ simulations should be started by distributing the replicas among neighboring ensembles, namely replica 1 to $ \Lambda_1$, replica 2 to $ \Lambda_2$, and so on (see also the discussion at the beginning of the current section).
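ORAC's inter-replica communication is implemented in its own code; purely to illustrate the small message size discussed above, here is how the per-replica partial sums could be pooled in Python with mpi4py. The packing of the sums and counters into one array per ensemble pair is our choice, and for the iterative Eq. 6.25 this pooling would simply be repeated at each iteration.

```python
import numpy as np
from mpi4py import MPI

def pool_partial_sums(local):
    """Pool the per-replica partial sums needed for Eqs. 6.25-6.27.

    `local` is an array of shape (N-1, 5) holding, for each ensemble pair,
    the two BAR partial sums, the exponential-averaging sum, and the two
    counters N_{n->m}, N_{m->n} accumulated by this replica alone.
    An allreduce gives every replica the global sums, so each processor
    can recombine them by itself without exchanging the full work arrays.
    """
    comm = MPI.COMM_WORLD
    total = np.empty_like(local)
    comm.Allreduce(local, total, op=MPI.SUM)
    return total
```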
