Calculating Ensemble Averages Using Configurations from All Ensembles (MBAR estimator)

As recently shown by Shirts and Chodera[107], all the configurations produced by a REM simulation of replicas, each characterized by a distribution function , can be effectively used to obtain equilibrium averages for any target distribution , using the so-called Multistate Bennett Acceptance Ratio (MBAR) estimator, which is illustrated in the following.

In the ORAC REM implementation, the most general distribution function for replica is given by Eq. 5.15. Given that for each replica one has saved configurations of the kind $\{x_1^m,...x_k^m,...\}$ , it can be easily shown that

$\displaystyle Z_n=\frac{\sum_{m=1}^M Z_mN_m^{-1}\sum_{k=1}^{N_m}\alpha_{nm}(x_k^m)P_n(x_k^m)}{N_n^{-1}\sum_{m=1}^M\sum_{k=1}^{N_n}\alpha_{nm}(x_k^n)P_m(x_k^n)}.$

(5.24)

Eq. 5.24 holds for any arbitrary bridge function $\alpha_{nm}(X)$ , In particular, choosing[107]

$\displaystyle \alpha_{nm}(X) = \frac{N_m Z_m^{-1} }{\sum_l^M N_l Z_l^{-1} P_l(X)}$

(5.25)

Eq. 5.24 transforms as

$\displaystyle Z_n=\sum_{k=1}^N\frac{P_n(x_k)}{\sum_{l=1}^MZ_l^{-1}N_lP_l(x_k)}$

(5.26)

where we have collapsed the two indices

, for the configurations, and

, for the replicas in one single index

running on all the configurations $N=\sum_m N_m$ produced by the REMD. Except for an arbitrary multiplicative factor, Eq. 5.26 can be solved iteratively for the partition function

. At the beginning of the process, one sets

for all

. . At the iteration

we have that

$\displaystyle Z_n(i+1) = \sum_k^{N} W_n(x_k,i)$

(5.27)

where the weights depend on the

calculated at the previous iteration

$\displaystyle W_n(x_k,i) = \frac{e^{-\beta {\bf c}_n\cdot {\bf v}(x_k)}} {\sum_l N_l e^{-\ln Z_l(i)-\beta {\bf c}_l\cdot {\bf v}(x_k)}}$

(5.28)

and we have used the definition of 5.15 for the replica distribution in ORAC . Once the

have been determined, $Z_{*}$ for an arbitrary distribution $P_{*}(X)$ can be calculated using the configurations sampled in the REMD simulation:

$\displaystyle Z_{*} = \sum_k^{N} \frac {P_{*}(x_k)}{\sum_l Z_l^{-1}N_l P_l(x_k)} = \sum_k^{N} W_{*}(x_k)$

(5.29)

Setting for example $P_{*} = P_{1}(X)*A(X)$ , where

is the target distribution $e^{-\beta V(x)}/Z$ (i.e. that of the replica 1) and

is an arbitrary configurational property, we obtain

$\displaystyle <A>_1 = \frac{\int P_{1}(X)A(X) dX}{Z_1} = \frac{\int P_{*}(X) dX}{Z_1} = \frac{ Z_{*}}{Z_1}$

(5.30)

From Eq. 5.29, we have that $Z_{*} = \sum_k^{N} W_1(x_k) A(x_k )$ and that $Z_{1} = \sum_k^{N} W_1 (x_k)$ . Substituting these results into Eq. 5.30, we obtain

$\displaystyle <A>_1 = \frac{\sum_k^{N} W_1(x_k) A(x_k)}{\sum_k^{N} W_1(x_k)}$

(5.31)

where the weights

for all sampled points in the REMD simulation are given by

$\displaystyle W_1(x_k,i) = \frac{e^{-\beta V(x_k) }} {\sum_l N_l e^{-\ln Z_l(i)-\beta {\bf c}_l\cdot {\bf v}(x_k)}}$

(5.32)

In summary, using the configurational energies from all replicas, we first solve iteratively the system 5.26 for all (except for a multiplicative factor), with $1\le n \ne M$ . In doing this, the weights (including ) are also determined. Finally configurational averages at, e.g., the target distribution can be determined using all the REMD configurations by means of Eq. 5.31.

procacci 2021-12-29