Junk in – Junk out!
“A crucial prerequisite for a successful RNA-seq study is that the data generated have the potential to answer the biological questions of interest. This is achieved by first defining a good experimental design […] and second by planning an adequate execution of the sequencing experiment itself, ensuring that data acquisition does not become contaminated with unnecessary biases.” – Conesa et al., 2016
After making the decision to carry out gene expression profiling using RNA-seq, a researcher is faced with the complex but manageable task of setting up the experiment to ensure success. Any biological experiment requires careful planning to ensure quality input for a useful output, but RNA-seq requires a few unique considerations. We will look at this process in four broad steps.
1. What kind of RNA do you need?
The first consideration is what type of RNA you need for the experiment. RNA comes in many forms, and the sample handling varies depending on the target. In almost every eukaryotic experiment, it is important to eliminate ribosomal RNA from the sample. Ribosomal RNA is a functional molecule that constitutes part of the translational machinery. As such, it does not provide any information about which genes are actually expressed. However, it is of course still RNA, and so will show up in sequencing runs if not removed. Cofactor molecular biologist JT Forys notes that roughly 95% of the RNA in an unprocessed sample can be rRNA, swamping out the mRNA of interest.
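To put the 95% figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes, as a simplification, that reads are distributed in proportion to the mass fraction of each RNA class; the function name, the sequencing depth and the post-depletion rRNA fraction are hypothetical values chosen only for illustration.

```python
def usable_reads(total_reads: int, rrna_fraction: float) -> int:
    """Estimate reads left for non-rRNA transcripts.

    Simplifying assumption: reads are spread across RNA classes in
    proportion to their share of the sample, so the rRNA share of the
    RNA equals the rRNA share of the reads.
    """
    if not 0.0 <= rrna_fraction <= 1.0:
        raise ValueError("rrna_fraction must be between 0 and 1")
    return round(total_reads * (1.0 - rrna_fraction))


total = 20_000_000  # hypothetical sequencing depth

# Unprocessed sample: roughly 95% rRNA, as noted above.
print("undepleted:", usable_reads(total, 0.95))

# Hypothetical residual rRNA fraction after depletion (illustrative only).
print("depleted:  ", usable_reads(total, 0.05))
```

Under these assumptions, only about one read in twenty from an undepleted sample would come from anything other than rRNA, which is why depletion or poly(A) selection is worth the extra handling.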
Ribosomal RNA contamination can be reduced to negligible levels either by pulling the rRNA out of the sample or by degrading it. In the first approach, biotinylated probes bind rRNA transcripts, and the probe-rRNA complexes are then removed from the sample using streptavidin beads. Other kits use complementary DNA sequences to bind rRNA, which is then degraded by RNase H.
Conversely, messenger RNA can be selected from a pool of total RNA. In these cases, oligo(dT) probes attached to beads bind the poly(A) tails of mRNA. These hybridized complexes are then pulled out by isolating the beads. Poly(A) RNA can also be amplified to swamp out the rRNA. This is an efficient and specific way to amplify mRNA, but it carries the risk of creating a 3’ bias in the resulting sample. To some extent, random priming can reduce this bias, so some researchers now use a two-pronged approach: random priming to amplify mRNA, combined with one of the above methods to deplete rRNA. (For more on Poly(A) enrichment vs Ribosomal removal, see this post by Cofactor CSO Jon Armstrong.)
With the emergence over the past decade of non-coding RNAs as key biological regulators, many researchers are interested in profiling these molecules. Because many of them are not polyadenylated and may be present at low relative levels within a sample, whole-transcriptome amplification may be useful. Whole-transcriptome analysis returns information about both coding and noncoding RNAs, and can be used to identify novel transcripts as well as quantify transcript abundance.
2. How much RNA do you need?
The other side of having the right kind of RNA is having enough of it. How much? Once again, the answer depends on the experimental goals. Quantities ranging from 100 pg to 500 ng can be used for RNA-seq. Any good RNA-seq experiment (and every one carried out by Cofactor) will include a quality control step. Sufficient starting material is necessary to include proper QC on top of the sequencing itself.
Standard sequencing of poly(A)-mRNA at Cofactor requires 500 ng or more of starting RNA, with an absolute minimum of 100 ng. The same is true for whole-transcriptome analysis of total RNA, which includes random priming for a comprehensive look at the RNA profile of a sample. While these assays are useful for obtaining the average gene expression profile of a sample, many researchers are investigating changes at the level of sub-populations of cells or even single cells. Fortunately, RNA-seq has evolved along with cell sorting technology, and it is possible to carry out single-cell RNA-seq. For researchers with minimal starting material, Cofactor’s picoRNA offering requires only 100 pg of input RNA.
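The thresholds above can be written down as a small lookup, sketched here in Python. The three cut-offs come from the figures quoted in this post; the function name and the tier labels are hypothetical, and treating everything between 100 pg and 100 ng as picoRNA territory is an assumption, so check the sample submission guidelines for the actual requirements.

```python
RECOMMENDED_NG = 500.0   # preferred input for standard poly(A) or total RNA
MINIMUM_NG = 100.0       # stated absolute minimum for the standard assays
PICO_MINIMUM_NG = 0.1    # 100 pg, the stated picoRNA input


def classify_input(mass_ng: float) -> str:
    """Map a total RNA mass (in ng) to an illustrative input tier."""
    if mass_ng < 0:
        raise ValueError("mass_ng cannot be negative")
    if mass_ng >= RECOMMENDED_NG:
        return "standard: recommended amount"
    if mass_ng >= MINIMUM_NG:
        return "standard: above absolute minimum"
    if mass_ng >= PICO_MINIMUM_NG:
        # Assumption: low-input samples in this range go to picoRNA.
        return "low input: picoRNA range"
    return "below all stated minimums"


for mass in (750.0, 250.0, 2.0, 0.05):
    print(f"{mass:>7} ng -> {classify_input(mass)}")
```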
Another consideration when thinking about the amount of starting material is how the tissue was preserved before the RNA was obtained. Formalin cross-links RNA and can cause other damage, so less of the RNA recovered from formalin-fixed paraffin-embedded (FFPE) samples is usable. As a result, higher quantities of RNA are required from FFPE samples.
It is important to accurately measure the concentration of RNA prior to submitting it for sequencing. This may seem obvious, but different platforms measure the material in different ways and can give different answers. Forys notes that many researchers use the spectrophotometric NanoDrop system, which is a good starting point. Cofactor, though, uses Qubit, a fluorometric system that, unlike NanoDrop, can distinguish RNA from DNA.
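As a minimal sketch of the arithmetic involved, the Python below converts a concentration reading into total mass and compares two readings of the same sample. The function names and example numbers are hypothetical. Reading a gap between the two instruments as a sign of DNA or other absorbing contaminants is an assumption for illustration, not a diagnostic rule.

```python
def total_rna_ng(conc_ng_per_ul: float, volume_ul: float) -> float:
    """Total RNA mass in ng from concentration (ng/uL) and volume (uL)."""
    if conc_ng_per_ul < 0 or volume_ul < 0:
        raise ValueError("concentration and volume must be non-negative")
    return conc_ng_per_ul * volume_ul


def unexplained_fraction(nanodrop_ng_per_ul: float, qubit_ng_per_ul: float) -> float:
    """Share of the spectrophotometric reading not seen by the RNA-specific assay.

    Assumption: a large positive value may point to DNA or other
    absorbing material, but it is only a prompt to look closer.
    """
    if nanodrop_ng_per_ul <= 0:
        raise ValueError("nanodrop reading must be positive")
    return (nanodrop_ng_per_ul - qubit_ng_per_ul) / nanodrop_ng_per_ul


# Hypothetical readings for one sample in 20 uL of eluate.
nanodrop, qubit, volume = 50.0, 40.0, 20.0
print("mass by NanoDrop (ng):", total_rna_ng(nanodrop, volume))
print("mass by Qubit (ng):   ", total_rna_ng(qubit, volume))
print("unexplained fraction: ", unexplained_fraction(nanodrop, qubit))
```

The point of the comparison is that the mass you report depends on the instrument, so it helps to state which one was used when submitting a sample.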
3. How do you get it?
As noted above, the processing method used for the tissue sample will affect how the RNA is extracted and its quality. In general, though, the number one priority is to get the RNA out and stabilized as quickly as possible. Once that has happened, downstream storage and processing are more manageable.
Extracting RNA from FFPE tissue is possible, but not ideal. Formalin fixation damages RNA and causes modifications that can prevent PCR amplification and/or reverse transcription. Reversing these modifications reduces the quantity and quality of input material. Indeed, up until just a couple of years ago there was significant concern that genetic material from samples preserved in formalin would not provide consistent, accurate and quantitative results. Formalin cross-linking can alter the structure of nucleic acids, and the yield from these samples was low due to inefficient extraction protocols. However, in 2015, “Quality control analyses of RNA sequencing data from FFPE tumor samples indicate[d] results that are comparable with results from patient-matched FF [fresh-frozen] tumor samples.” This result was broadly consistent with a study the year before showing decent consistency between sequencing data from FF and FFPE samples: “RNA-Seq data showed high correlation of expression profiles in FF/FFPE pairs (Pearson Correlations of 0.90 +/- 0.05), irrespective of storage time (up to 244 months) and tissue type.” Additionally, the ability to use FFPE is important for clinical and discovery projects simply because formalin fixation followed by paraffin embedding is the most common method of preserving tissue samples.
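For readers who want to run the same kind of comparison on their own matched samples, here is a minimal Python sketch of a Pearson correlation between two expression profiles. The counts are made-up toy values, not data from either study, and the log2(count + 1) transform is an assumption; the quoted studies may have processed their data differently.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (spread_x * spread_y)


def log2_plus_one(counts: list[float]) -> list[float]:
    """log2(count + 1), so zero counts stay defined (assumed transform)."""
    return [math.log2(c + 1) for c in counts]


# Toy per-gene counts for one hypothetical matched pair (illustrative only).
ff_counts = [0, 12, 150, 980, 43, 5]
ffpe_counts = [1, 9, 120, 1100, 30, 8]

r = pearson(log2_plus_one(ff_counts), log2_plus_one(ffpe_counts))
print(f"Pearson r (log2 scale): {r:.3f}")
```

Correlating on a log scale keeps a handful of very highly expressed genes from dominating the result, which is one reason the transform is a common choice.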
However, there are better options when designing an experiment exclusively for RNA-seq. Stabilizing RNA by flash-freezing the tissue in liquid nitrogen immediately after the tissue is collected is one. Another option is to place the tissue in RNAlater or an equivalent product. In this case, the sample should be small enough that the reagent can rapidly penetrate it. Homogenizing the sample in RNAzol or TRIzol with RNase inhibitors is another option to deactivate RNases, denature the RNA and stabilize it for later use. Once the sample is stabilized, the RNA is stable in nuclease-free water or sodium citrate buffer.
Additionally, it is extremely important to include a DNase treatment in any extraction protocol. Because RNA-seq is actually sequencing of cDNA, any contaminating DNA is sequenced as well, which can lower read alignment rates in the final data.
Note: Special considerations apply to RNA extraction from flow-sorted cells, so contact Cofactor for further information about those experiments.
4. Tools to get it
There are many high-quality kits and reagents available for RNA extraction and stabilization. Qiagen RNA extraction kits are highly recommended by the Cofactor team for general use. As noted above, RNAlater from Thermo Fisher works well to stabilize RNA for future processing. When dealing with FFPE samples, Ambion offers the RecoverAll Total Nucleic Acid Isolation kit. In general, though, the key is reproducibility and reliability. Forys says, “find [a kit] that’s user friendly and gives you reproducible RNA quantity and quality.”
Ultimately, the goal is to provide as much material of the highest quality possible. For specific details on the various requirements needed to submit a sample to Cofactor, take a look at the sample submission guidelines. And of course, please contact a Cofactor project scientist with any questions.