CH391L/S14/StochasticGeneExpression

Introduction
Stochastic gene expression refers to the random, unpredictable fluctuations that occur during gene transcription and translation. This randomness arises because the molecules involved, such as DNA and mRNA, are often present at only a few copies per cell, so gene expression proceeds as a series of discrete, random biochemical reactions. As a result, expression levels can vary from one cell to another, and even genetically identical cells or organisms (for example, identical twins) can display the same trait to different degrees. These stochastic fluctuations, or “noise”, play an important part in gene expression and regulation, and many experimental methods have been used to quantify them. Earlier studies distinguished two kinds of noise: intrinsic noise and extrinsic noise Michael2002. Extrinsic noise is caused by fluctuations in the concentration or location of cellular components that regulate or facilitate gene expression, such as polymerases and ribosomes. These factors vary from one cell to another, so cells with identical genomes can have different expression profiles. Intrinsic noise, on the other hand, is the variation that remains even when two cells have both the same genome and the same state of cellular components: the expression rate of a given gene can still differ because biochemical reactions are inherently probabilistic at the microscopic level. By studying and comparing expression distribution profiles, scientists can analyze gene regulatory mechanisms and understand how two genes interact with each other.

Experimental identification of stochastic gene expression


Stochastic gene expression was first observed in 1957, when Novick and Weiner studied the induction of beta-galactosidase in E. coli Aaron1957. Instead of an increased expression level in all cells, they found that only a proportion of the cells had an increased expression level. In 2002, the groups of Elowitz and Ozbudak used fluorescent reporter proteins to experimentally quantify cell-to-cell variation in gene expression in bacteria Michael2002, Ertugrul2002. Elowitz et al. expressed CFP and YFP from identical promoters in the same E. coli cell and introduced the concepts of intrinsic and extrinsic noise by analyzing the correlation between the expression levels of the two fluorescent proteins: fluctuations shared by both reporters are extrinsic, while differences between the two reporters are intrinsic. These studies found that different promoters can have different effects on intrinsic and extrinsic noise. Intrinsic noise decreases as the mRNA copy number increases, whereas extrinsic noise has a more complicated relationship with the mRNA translation level.
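The dual-reporter idea can be illustrated with a short calculation. The sketch below uses the normalized definitions commonly attributed to the two-reporter analysis of Elowitz et al. (intrinsic noise from the squared difference between reporters, extrinsic noise from their covariance). The function names and the synthetic data generator are assumptions made for illustration only; no measured data are used.

```python
import random

def noise_decomposition(cfp, yfp):
    """Return (intrinsic^2, extrinsic^2, total^2) from paired per-cell levels."""
    n = len(cfp)
    mean_c = sum(cfp) / n
    mean_y = sum(yfp) / n
    norm = mean_c * mean_y
    mean_sq_diff = sum((c - y) ** 2 for c, y in zip(cfp, yfp)) / n
    mean_prod = sum(c * y for c, y in zip(cfp, yfp)) / n
    mean_sq_sum = sum(c * c + y * y for c, y in zip(cfp, yfp)) / n
    intrinsic_sq = mean_sq_diff / (2 * norm)         # reporters disagree
    extrinsic_sq = (mean_prod - norm) / norm         # reporters co-vary
    total_sq = (mean_sq_sum - 2 * norm) / (2 * norm)
    return intrinsic_sq, extrinsic_sq, total_sq

def synthetic_cells(n_cells, seed=0):
    """Hypothetical data: one shared (extrinsic) factor per cell,
    plus independent (intrinsic) scatter for each reporter."""
    rng = random.Random(seed)
    cfp, yfp = [], []
    for _ in range(n_cells):
        shared = rng.gauss(1.0, 0.2)                 # cell-wide state
        cfp.append(100 * shared * rng.gauss(1.0, 0.1))
        yfp.append(100 * shared * rng.gauss(1.0, 0.1))
    return cfp, yfp

if __name__ == "__main__":
    cfp, yfp = synthetic_cells(5000)
    intrinsic_sq, extrinsic_sq, total_sq = noise_decomposition(cfp, yfp)
    print(intrinsic_sq, extrinsic_sq, total_sq)
```

With these definitions the intrinsic and extrinsic terms add up to the total, so the decomposition can be sanity-checked on any paired data set.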

Mathematical model of stochastic gene expression
Two models have been proposed to describe the rate of gene expression:

Poisson expression statistics
The simplest model of single gene expression is a first-order reaction:

dm⁄dt = kR - γR*m

Here m is the concentration of the expressed mRNA, kR is the transcription rate constant, and γR is the degradation rate constant. However, this continuous approximation does not apply when the number of molecules m is small. In that case, expression must be described probabilistically, and the steady-state condition can be written as:

kR*P[m] = γR*(m+1)*P[m+1]

Here P[m] is the probability of having m transcripts in a cell, kR*P[m] is the rate at which cells with m transcripts gain an additional copy, and γR*(m+1)*P[m+1] is the rate at which cells with m+1 transcripts lose one by degradation. The two rates are equal at steady state. The equation holds only when m follows a Poisson distribution with mean kR⁄γR, which can be tested with single-molecule assays that count the individual mRNA molecules in each cell.
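The link between the balance equation and the Poisson distribution can be checked numerically. Rearranging the equation gives P[m+1] = P[m]*kR⁄(γR*(m+1)), which can be iterated from m = 0 and normalized. The sketch below does this and compares the result with the Poisson formula. The function names, the truncation at m_max, and the parameter values are illustrative assumptions.

```python
import math

def steady_state_distribution(k_r, gamma_r, m_max):
    """Iterate P[m+1] = P[m] * k_r / (gamma_r * (m + 1)) and normalize.
    The distribution is truncated at m_max, which must be well above
    k_r / gamma_r for the neglected tail to be negligible."""
    weights = [1.0]
    for m in range(m_max):
        weights.append(weights[-1] * k_r / (gamma_r * (m + 1)))
    total = sum(weights)
    return [w / total for w in weights]

def poisson_pmf(lam, m):
    """Poisson probability of observing m events with mean lam."""
    return math.exp(-lam) * lam ** m / math.factorial(m)

def mean_and_fano(dist):
    """Mean and variance-to-mean ratio of a distribution over 0..len-1."""
    mean = sum(m * p for m, p in enumerate(dist))
    variance = sum((m - mean) ** 2 * p for m, p in enumerate(dist))
    return mean, variance / mean

if __name__ == "__main__":
    dist = steady_state_distribution(k_r=10.0, gamma_r=2.0, m_max=60)
    worst = max(abs(p - poisson_pmf(5.0, m)) for m, p in enumerate(dist))
    print("largest deviation from Poisson:", worst)
    print("mean, Fano factor:", mean_and_fano(dist))
```

The variance-to-mean ratio returned by mean_and_fano comes out at one for this model, which is the Poisson signature.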



Two state model of gene regulation
The Poisson model describes constitutive genes well but not regulated genes. Deviations from the Poisson distribution can be attributed to the regulatory mechanism of the gene. The Fano factor, which is the ratio of the variance (σ^2) to the mean transcript copy number (M), is used to quantify the deviation from a Poisson distribution. The Fano factor equals one for a Poisson distribution.

F = σ^2 ⁄ M

To model a two-state regulatory system, two rate constants, Kon and Koff, are introduced as the transition rates between the two states: Kon is the rate of switching from the off-state to the on-state, and Koff is the rate of switching back. In the on-state the gene is available for transcription, while in the off-state the binding site for RNA polymerase or a transcription factor is blocked. At the same mean RNA copy number, the Fano factor is close to 1 when Kon is much larger than Koff, and the distribution is close to Poisson. When Koff is much larger than Kon, the gene is unavailable most of the time, so transcription occurs in bursts (many transcripts are produced in a short period) and the Fano factor is large.
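The two regimes described above can be reproduced with a small stochastic simulation. The following is a minimal sketch of the two-state model using a Gillespie-style algorithm, not the procedure of any cited study. The function name, the parameter values, and the choice to start from an empty cell with the gene off are assumptions. Both parameter sets are chosen to give a mean of about 10 transcripts.

```python
import random

def simulate_two_state(k_on, k_off, k_r, gamma_r, t_end, seed=0):
    """Simulate gene switching, transcription and mRNA degradation.
    Returns the time-averaged (mean copy number, Fano factor)."""
    rng = random.Random(seed)
    t, gene_on, m = 0.0, False, 0
    sum_m = sum_m2 = 0.0                      # time-weighted sums
    while t < t_end:
        rates = [
            k_off if gene_on else k_on,       # switch gene state
            k_r if gene_on else 0.0,          # make one transcript
            gamma_r * m,                      # degrade one transcript
        ]
        total = sum(rates)
        dt = min(rng.expovariate(total), t_end - t)
        sum_m += m * dt                       # state held for dt
        sum_m2 += m * m * dt
        t += dt
        if t >= t_end:
            break
        pick = rng.random() * total
        if pick < rates[0]:
            gene_on = not gene_on
        elif pick < rates[0] + rates[1] or m == 0:
            m += 1                            # m == 0 guards rounding
        else:
            m -= 1
    mean = sum_m / t_end
    variance = sum_m2 / t_end - mean ** 2
    return mean, variance / mean

if __name__ == "__main__":
    # Kon >> Koff: gene almost always on, close to Poisson.
    print(simulate_two_state(10.0, 0.1, 10.1, 1.0, 2000.0, seed=1))
    # Koff >> Kon: rare, short on-periods produce bursts.
    print(simulate_two_state(0.1, 0.9, 100.0, 1.0, 2000.0, seed=1))
```

Time-weighted averages are used because the simulation spends unequal amounts of time in each state. The short initial transient from the empty cell is ignored here, which is acceptable for a long run but would matter for short ones.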

Understanding gene regulatory mechanisms with gene expression distribution profiles


The pattern of a gene expression distribution can provide insight into the gene regulatory mechanism. To quantify expression at the single-cell level, single-molecule fluorescence in situ hybridization (smFISH) is used: individual mRNA molecules are counted in intact cells, and the resulting distribution can be compared with the predictions of different models. For example, in 2008 Zenklusen et al. found that three housekeeping genes have expression profiles that match the constitutive gene expression model Daniel2008. In 2006, Raj et al. Arjun2006 studied gene expression profiles in mammalian cells and found that the distributions match the second regime of the two-state model, bursty expression.
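Once per-cell transcript counts are available, a first comparison between the two models is to compute the Fano factor of the counts. The sketch below shows that calculation. The two count lists are invented for illustration and are not data from the cited studies.

```python
def fano_factor(counts):
    """Variance-to-mean ratio of per-cell transcript counts."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / n
    return variance / mean

# Hypothetical per-cell mRNA counts, both with a mean of 10.
poisson_like_counts = [10, 7, 13, 9, 14, 6, 11, 12, 5, 13]
bursty_counts = [0, 0, 1, 0, 42, 0, 3, 0, 54, 0]

if __name__ == "__main__":
    print("Poisson-like:", fano_factor(poisson_like_counts))
    print("bursty:", fano_factor(bursty_counts))
```

A value near one is consistent with constitutive expression, while a value well above one points toward bursty, two-state behavior. Real analyses use many more cells and fit the full distribution rather than a single summary number.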



Studying the correlation of noise patterns between two genes can help reveal gene regulatory mechanisms. With smFISH, the expression of two genes can be monitored at the same time in the same cell. In 2011, Gandhi et al. analyzed the correlation of expression patterns between genes regulated by the same or different promoters at steady state Saumil2011. They found that many genes were highly correlated with each other after induction, whereas constitutive genes were less correlated than regulated genes. In another study, the same concept was used to determine the causal relationship between two genes. In 2008, Dunlop et al. Dunlop2008 tested the dynamic correlation between two proteins, X and Y, for different regulatory motifs. They measured how fluctuations in Y at different time points relate to fluctuations in protein X, which helps determine which molecule is the regulator, whether the regulation is activating or repressive, or whether both molecules are regulated by a third molecule.
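The idea of reading regulatory direction from time-lagged correlations can be sketched with synthetic data. In the example below, Y simply follows X after a fixed delay, with a sign that stands for activation (+1) or repression (-1). This is a toy model for illustration, not the analysis used by Dunlop et al., and all function names and parameter values are assumptions.

```python
import random

def cross_correlation(x, y, lag):
    """Pearson correlation between x(t) and y(t + lag)."""
    if lag >= 0:
        xs, ys = x[:len(x) - lag], y[lag:]
    else:
        xs, ys = x[-lag:], y[:len(y) + lag]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / n
    std_x = (sum((a - mean_x) ** 2 for a in xs) / n) ** 0.5
    std_y = (sum((b - mean_y) ** 2 for b in ys) / n) ** 0.5
    return cov / (std_x * std_y)

def simulate_pair(n_steps, delay, sign, seed=0):
    """Toy traces: X is slowly varying noise, Y copies X after `delay`
    steps with the given sign, plus independent measurement noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n_steps - 1):
        x.append(0.9 * x[-1] + rng.gauss(0.0, 1.0))
    y = [sign * x[max(t - delay, 0)] + rng.gauss(0.0, 0.3)
         for t in range(n_steps)]
    return x, y

def best_lag(x, y, max_lag):
    """Lag with the largest absolute cross-correlation."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: abs(cross_correlation(x, y, lag)))

if __name__ == "__main__":
    x, y = simulate_pair(2000, delay=5, sign=-1)
    lag = best_lag(x, y, max_lag=20)
    print("peak lag:", lag, "correlation:", cross_correlation(x, y, lag))
```

A peak at a positive lag means that Y trails X, which is consistent with X acting upstream of Y, and the sign of the peak separates activation from repression. A peak at zero lag would instead be compatible with both being driven by a common regulator.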

iGEM project
Many iGEM projects have incorporated stochastic gene expression models to build reliable models of gene expression or gene networks. For example, in 2012 the Columbia team used a stochastic model to design an engineered bacterium that can detect plant pathogens. In 2013 the BIT team also used such a model to build the network in their quorum-sensing bacteria and RNA-thermometer system.