When you run multiple molecular dynamic simulations at once, not always do the numbers remain stable or even align. It can be completely random. But what convergence really means is that over time the simulation, as it starts to organize, starts into a well-defined shape of what that system is.
So in my previous posts I talked about the
and looking at the probability density functions. Since we lump all the trials into one list we lose the convergence factor but look at hollistic features of the data.
Once we believe something is happening we want to check that it happened in each molecular dynamic trial. But these trials can run for nanoseconds and up to microseconds and depending on the timestep we could have a sea of data and it can be difficult to check convergence (Case and Point):
So, what do we do? Let’s just have some fun with something called a “Rolling Mean”. Which means every, let’s say 3 numbers, we take an average of that set.
numbers = [ 3, 4, 5, 7, 1, 2, 4, 3, 6 ]
rolling_mean = [ 4, 3.33, 4.33 ]
let’s now 5 numbers
rolling_mean = [4, 3.7ish]
So how do we employ this in with python?
Revisit some of my previous blog posts on getting this data (values, and timestamps) from MDAnalysis
from your simulation.
timestamps = []end_to_end_distance_values_1 = [3....2]
end_to_end_distance_values_2 = [3....5]
end_to_end_distance_values_3 = [3....2]iteration = no_efield.trajectory
for timestep in iteration[starting_frame:ending_frame:skip]:
ts = int(iteration.time) / 1000
timestamps.append(ts)
The window will be used at how many numbers you want to include in each average.
import pandas as pdwindow = 1000
df = pd.DataFrame()
df['no_electric_field_1'] = end_to_end_distance_values_1
df['no_electric_field_2'] = end_to_end_distance_values_2
df['no_electric_field_3'] = end_to_end_distance_values_3df['no_electric_field_mean_1'] =
df['no_electric_field_1'].rolling(window = window).mean() df['no_electric_field_std_1'] =
df['no_electric_field_1'].rolling(window = window).std()df['no_electric_field_mean_2'] =
df['no_electric_field_2'].rolling(window = window).mean() df['no_electric_field_std_2'] =
df['no_electric_field_2'].rolling(window = window).std()df['no_electric_field_mean_2'] =
df['no_electric_field_2'].rolling(window = window).mean() df['no_electric_field_std_2'] =
df['no_electric_field_2'].rolling(window = window).std()
Okay now that we have our data lets plot it. Assuming you will have more, like myself, let’s leave it as a subplot. They all share the same timeseries so let’s just put that on the x-axis, loop over by the window size and add the last value.
from plotly.subplots import make_subplots
import plotly.graph_objects as gocolors = ["#A800FF", "#0079FF", "#00F11D", "#FF7F00", "#FF0900"]fig = make_subplots(
rows=1, cols=1,
shared_xaxes=True,
shared_yaxes=True,
vertical_spacing=0.1,
)x1 = timestamps[::window] + [total_timestamps[-1]]
Y-Coordinates will be a little tricky in the sense. Because we want to check the rolling mean and the standard deviation but visualize it in a way it’s easy for us. So I came up with this design.
- First get all 3 coordinates
- Get all standard deviations
y1 = df['no_electric_field_1'].to_list()[::window] +[df['no_electric_field_mean_1' ].to_list()[-1]]
y2 = df['no_electric_field_2'].to_list()[::window] +[df['no_electric_field_mean_2' ].to_list()[-1]]
y3 = df['no_electric_field_2'].to_list()[::window] +[df['no_electric_field_mean_3' ].to_list()[-1]]error_y1 = df['no_electric_field_1'].to_list()[::window] + [df['no_electric_field_std_1'].to_list()[-1]]
error_y2 = df['no_electric_field_2'].to_list()[::window] + [df['no_electric_field_std_2'].to_list()[-1]]
error_y3 = df['no_electric_field_3'].to_list()[::window] + [df['no_electric_field_std_3'].to_list()[-1]]legend = False fig.add_trace(
go.Scattergl(
x=x1[1:],
y=y1[1:],
name='X',
mode='lines+markers',
line=dict(color=colors[0], width=3),
marker=dict(
size=20,
),
showlegend=legend,
error_y=dict(
type='data',
array=error_y1[1:]
)
),
row=1, col=1
) fig.add_trace(
go.Scattergl(
x=x1[1:],
y=y2[1:],
name='X',
mode='lines+markers',
line=dict(color=colors[0], width=3),
marker=dict(
size=20,
),
showlegend=legend,
error_y=dict(
type='data',
array=error_y2[1:]
)
),
row=1, col=1
)fig.add_trace(
go.Scattergl(
x=x1[1:],
y=y3[1:],
name='X',
mode='lines+markers',
line=dict(color=colors[0], width=3),
marker=dict(
size=20,
),
showlegend=legend,
error_y=dict(
type='data',
array=error_y3[1:]
)
),
row=1, col=1
)
And Voila!
As you can probably tell each dot is the rolling mean value and it’s std deviation. Each color an individual trial. The more these line up the better I can convince myself the system is converged. When you apply these plots on mass scale. You can start to see the bigger picture of your system.
Have fun analyzing!