Apply dTRAM to REMD data #48

daviddaileo · 2015-08-21T23:00:52Z

I tested out dtram.py for long MD data at different thermodynamic states and it works very well. May I ask how to apply dtram.py to REMD data since the input simulation files have no way to tell the code that the samples consist of many short trajectories? Your help will be greatly appreciated.

franknoe · 2015-08-22T06:43:09Z

I think Christoph should answer how to do that technically. I just have three general comments:

1. For REMD, or multi-temperature simulations in general, the problem is that the bias energies (reweighting factors) depend on the instantaneous potential energy. Doing this with discretization-based schemes such as dTRAM or WHAM requires you to discretize the potential energy scale of the system finer than kT, which is often practically infeasible: combined with the discretization used in configuration space, this would create a ridiculous number of states. If you are working under the global equilibrium assumption, the solution is to replace WHAM by bin-less WHAM or MBAR and thus weight each sampled configuration individually instead of working with histograms. Currently the only TRAM solution in the published code is xTRAM, but I have recently run into a problem with its initialization, which I hope to look at soon, so use it with care. We have a general and statistically TRAM method coming up, and I hope that manuscript + code will be available within one or two months.

2. If you are using dTRAM through the command-line interface, you can indicate the thermodynamic state and the configuration state of each sampled configuration. If you read the data by temperature, you have to split them into short contiguous pieces, each getting the thermodynamic state index of the corresponding temperature (e.g. 0000 0000 0000 for pieces from the lowest temperature), in order to avoid counting unphysical transitions between different replicas. If you read the data by replica, this is taken care of automatically even if you give the input as long trajectories, because switches in the thermodynamic state index mark the unphysical transitions (e.g. 0000 1111 0000).

3. There is a case that you cannot solve by TRAM, namely if you save data less frequently than you exchange. In this case you don't know which transitions are physical transitions at the same thermodynamic state and which are not, because this information is usually not stored (you could be in thermodynamic state 0 at two subsequently stored time points, but you might actually have passed through thermodynamic state 1 in between). We have developed a hybrid TRAM/MBAR method for dealing with this situation, but again code and manuscript are still coming up.
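
To illustrate the by-replica case described above, here is a minimal sketch in plain Python. It is not the pytram implementation; the function names `split_by_thermodynamic_state` and `count_transitions` and the toy data are assumptions made up for this example. It cuts one replica trajectory wherever the thermodynamic state index switches and counts transitions only inside each piece, so the jump across an exchange (e.g. between `0000` and `1111`) is never counted.

```python
from itertools import groupby


def split_by_thermodynamic_state(therm_states, conf_states):
    """Cut one replica trajectory into contiguous pieces of constant
    thermodynamic state index. Returns a list of (k, conf_piece)."""
    segments = []
    start = 0
    for k, group in groupby(therm_states):
        length = len(list(group))
        segments.append((k, conf_states[start:start + length]))
        start += length
    return segments


def count_transitions(segments, n_therm, n_conf, lag=1):
    """Count transitions i -> j per thermodynamic state k, using only
    frame pairs that lie inside the same contiguous piece."""
    counts = [[[0] * n_conf for _ in range(n_conf)] for _ in range(n_therm)]
    for k, conf in segments:
        for t in range(len(conf) - lag):
            counts[k][conf[t]][conf[t + lag]] += 1
    return counts


# Hypothetical replica trajectory with the index pattern 0000 1111 0000.
therm = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
conf = [0, 0, 1, 1, 1, 2, 2, 1, 1, 0, 0, 0]

segments = split_by_thermodynamic_state(therm, conf)
counts = count_transitions(segments, n_therm=2, n_conf=3)
```

With 12 frames and 3 pieces, 9 transitions are counted at lag 1; the two frame pairs that straddle an exchange are dropped. Data read by temperature would need the piece boundaries supplied explicitly, because there the thermodynamic state index never changes.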

Prof. Dr. Frank Noe
Head of Computational Molecular Biology group
Freie Universitaet Berlin

Phone: (+49) (0)30 838 75354
Web: research.franknoe.de

Mail: Arnimallee 6, 14195 Berlin, Germany

daviddaileo · 2015-08-25T21:02:57Z

Thank you very much, Frank! Your comments are very helpful, and I managed to make the dTRAM Python code work for the REMD data. I then compared the results with a dTRAM C++ code that I wrote. They are fairly close, but the difference is noticeable. I guess it could be due to implementation details, such as the choice of prior counts (pseudocounts) or how NaN or inf values are handled during the iteration.
In line with your comment #1, I did use UWHAM for the REMD data before. But UWHAM does not use the kinetic information at all. So I hope that dTRAM can tell me something more about the kinetics of the system, even though I pay the price of less accurate thermodynamic information, since the bias is treated as the same for all samples in the same bin.
I do look forward to your general TRAM method! Thank you again.

fabian-paul · 2015-08-26T08:10:04Z

Hi @daviddaileo, thanks for testing our code!
I guess we didn't clarify in our publication how to deal with zero counts and zero Lagrange multipliers?
@daviddaileo: If you are getting infinities or NaN, there is something wrong. dTRAM is similar to estimating an MSM with a fixed stationary vector. We have discussed some numerical issues that have to be taken into account in our paper http://arxiv.org/pdf/1507.05990v1.pdf, Section III.C.3.
Is this covered by your C++ implementation?
I also think the set of MSM states must be restricted to the largest strongly connected component of the projected count matrix \sum_k C^{k}_{ij}, where k runs over the thermodynamic states. Is this ensured in your application?
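
As a rough illustration of that restriction, the sketch below sums the count matrices over k and picks the largest strongly connected set of configuration states. It is plain Python written for this example, not code from pytram; the helper names and the toy count matrices are assumptions, and the mutual-reachability search is chosen for readability, not speed.

```python
def project_counts(count_matrices):
    """Sum the per-thermodynamic-state count matrices C^k over k."""
    n = len(count_matrices[0])
    return [[sum(c[i][j] for c in count_matrices) for j in range(n)]
            for i in range(n)]


def reachable(adjacency, start):
    """Return all states reachable from `start` by depth-first search."""
    seen = {start}
    stack = [start]
    while stack:
        i = stack.pop()
        for j in adjacency[i]:
            if j not in seen:
                seen.add(j)
                stack.append(j)
    return seen


def largest_connected_set(count_matrices):
    """Largest strongly connected component of the projected counts.
    Two states belong together if each can reach the other."""
    total = project_counts(count_matrices)
    n = len(total)
    forward = [[j for j in range(n) if total[i][j] > 0] for i in range(n)]
    backward = [[j for j in range(n) if total[j][i] > 0] for i in range(n)]
    best = []
    assigned = set()
    for i in range(n):
        if i in assigned:
            continue
        component = reachable(forward, i) & reachable(backward, i)
        assigned |= component
        if len(component) > len(best):
            best = sorted(component)
    return best


# Hypothetical counts: state 0 links 0 <-> 1, state 1 links 1 <-> 2,
# and configuration state 3 is entered from 2 but never left.
c0 = [[5, 2, 0, 0], [1, 4, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
c1 = [[0, 0, 0, 0], [0, 3, 1, 0], [0, 2, 6, 1], [0, 0, 0, 2]]
active_set = largest_connected_set([c0, c1])
```

In this toy case neither thermodynamic state alone connects states 0, 1 and 2, but the projected count matrix does, while state 3 is excluded because nothing leads back out of it.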
