from phasic import Graph, with_ipv # ALWAYS import phasic first to set jax backend correctlyimport sysimport numpy as npimport pandas as pdimport matplotlib.pyplot as pltfrom matplotlib.colors import LogNormimport seaborn as snsfrom vscodenb import set_vscode_themenp.random.seed(42)set_vscode_theme()sns.set_palette('tab10')
Overriding theme from NOTEBOOK_THEME environment variable. <phasic._DeviceListFilter object at 0x146c14f10>
Consider this simple graph representing an exponential distribution. If you must, you can make it the coalescence time of two lineages in very small population:
Discrete phase-type distributions represent the the number of jumps in a Markov Chain before absorption. We can discretize our graph it to get a geometric distribution:
Overriding theme from NOTEBOOK_THEME environment variable. <phasic._DeviceListFilter object at 0x146c14f10>
The discretize method changes to graph in-place, by connecting each non-absorbing vertex in the graph to an auxiliary vertex with a specified rate. Each AUX vertex connects to its parent with rate 1. Once AUX vertices are added, the graph is normalized so the sum of edge weights sum to 1 for each vertex. discretize returns the graph indices of the added AUX vertices.
You can treat the discrete graph by passing discrete=False (omit the argument), in which case the returned distribution is over continuous time, rather than discrete jumps:
You can use many of the graph methods on both continuous and discrete graphs.
The methods moments, expectation, variance, covariance, pdf, cdf, distribution_context, sample, state_probability, and accumulated_occupancy will treat the graph as continuous unless the discrete=True keyword argument is passed. In this case moments and probabilities are computed across discrete jumps rather than across continuous time.
It is sometimes useful to interpret a discrete graph as a continuous distribution. However, passing discrete=True to a method called on a continuous will raise a ValueError, since continuous graph has not discrete interpretation.
Methods without the discrete keyword argument, either apply to both kinds of graphs (expected_waiting_time), or automatically redirect to functions handling each kind (normalize, reward_transform).
Tip
If you make the distribution discrete using discretize, it changes the is_discrete graph attribute to True.
NB: if you make a discrete graph manually yourself, you need to either set graph.is_discrete = True, or call the appropriate “*_discrete” methods directly.
To model discrete mutations with rate 0.1 along branches of the coalescent tree. We could of course just make a new state-space creation function, but we can also manipulate an existing continuous one.
E.g. if you want to get the distribution of accumulated mutations over time rather than jumps.
Let’s model discrete mutations with rate 0.1 along branches of the coalescent tree. We could of course just make a new state-space creation function, but we can also manipulate an existing continuous one.
nr_samples =4@with_ipv([nr_samples]+[0]*(nr_samples-1))def coalescent(state): transitions = []for i inrange(state.size):for j inrange(i, state.size): same =int(i == j)if same and state[i] <2:continueifnot same and (state[i] <1or state[j] <1):continue new = state.copy() new[i] -=1 new[j] -=1 new[i+j+1] +=1 transitions.append((new, state[i]*(state[j]-same)/(1+same)))return transitionsmutation_graph = Graph(coalescent)mutation_graph.plot()
Overriding theme from NOTEBOOK_THEME environment variable. <phasic._DeviceListFilter object at 0x146c14f10>
This process is automated by the discretize method, which accepts either a single rate of discrete events for all state (E.g. graph.discretize(0.7)), or a callable mapping each state vector to a rate. The following example produces the mutation graph above:
The auxiliary nodes added have dummy states with all zeros and are labelled as “AUX” vertices when the graph is plotted:
mutation_graph.plot()
Overriding theme from NOTEBOOK_THEME environment variable. <phasic._DeviceListFilter object at 0x146c14f10>
If you want, you can assign nodes to subgraphs either by index or by state, by passing a callable as the by_index or by_state argument. For example, we can group the auxiliary nodes we added above into a separate subgraph:
def fun(index):if rewards[index]:return"Auxiliary states\ncollecting a\nreward (carrot)\neach time visited"else:return"Standard states"mutation_graph.plot(by_index=fun)
Overriding theme from NOTEBOOK_THEME environment variable. <phasic._DeviceListFilter object at 0x146c14f10>
Figure 5: Grouping states into auxiliary and standard states.
To find compute the expected number of mutations, we apply rewards such that the mutation vertex get a reward of 1 and all other states get reward of 0. Since this is already encoded in the mutation_vertices array we made above, we can convert it to integers and pass it as rewards:
mutation_graph.expectation(rewards)
0.3666666666666667
We can verify that the expected number of mutations correspond to the total branch length scaled with the mutation rate:
Overriding theme from NOTEBOOK_THEME environment variable. <phasic._DeviceListFilter object at 0x146c14f10>
Notice that with this reward transformation the graph is no longer sparse, as all paths through the graph are represented.
mutations = np.arange(10)pmf = rev_trans_mutation_graph.pdf(mutations)cdf = rev_trans_mutation_graph.cdf(mutations)fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3), sharey=True)x = np.arange(0, 10, 1)ax1.bar(x, pmf)ax1.set_title("PDF")ax1.set_xlabel('Total number of carrots found')left, right = x, np.arange(1, x.size+1, 1)ax2.hlines(y=cdf, xmin=left, xmax=right, zorder=1)ax2.scatter(left, cdf, s=18, zorder=2)ax2.set_title("CDF")ax2.set_xlabel('Total number of carrots found')ax2.axhline(y=mutation_graph.defect(), linestyle='dotted') ;
(a) Blah blah blah blah…
(b)
Figure 6
Multivariate phase-type distributions (marginal)
We can compute marginal expectations using rewards. Instead of a univariate reward, we can have the distribution earn a vector of rewards for every time unit spent at each vertex. We can use the columns in our state matrix as reward vectors. Singleton rewards are: