Friday, 6th of September (2019)

Introduction

NanoString™ nCounter® & R packages

NanoString™ nCounter®

The nCounter® platform provides a simple and cost-effective solution for multiplex analysis of up to 800 RNA, DNA, or protein targets from your precious samples.

  • Save Time
    • Expertly curated pre-formatted panels for human, mouse and non-human primate
    • ~15-minutes total hands-on time with no amplification, cDNA conversion or library prep required
  • Save Sample
    • Combine RNA, DNA, and protein panels for a comprehensive 3D Biology™ view of each sample
    • Optimized performance on difficult sample types including FFPE, tissue, lysates and biofluid samples

  • Save Resources
    • Advanced analysis tools included with system reduce the need for Bioinformatics support

NanoString™ nCounter® nSolver™

nSolver™ alternatives?

Our alternative: NACHO

  • NACHO on CRAN (Since 2019-04-28; Next 2019-10)

(NAnostring quality Control dasHbOard)

Overview

NACHO (NAnostring quality Control dasHbOard) is developed for NanoString nCounter data.

NACHO is able to load, visualise and normalise the exported NanoString nCounter data and helps the user perform quality control.

NACHO does this by visualising in an interactive web application:

  • quality control metrics
  • expression of control genes
  • principal components
  • sample specific size factors

The functions

RCC files are summarised and visualised using two functions:

  • The summarise() function is used to preprocess the data.
  • The visualise() function initiates a Shiny-based dashboard that visualises all relevant QC plots.

NACHO also includes a function normalise(), which (re)calculates sample specific size factors and normalises the data.

  • The normalise() function creates a list in which your settings, the raw counts and normalised counts are stored.
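
To make the idea of a sample specific size factor concrete, here is a minimal base R sketch on a toy count matrix. One common way to derive such a factor is the geometric mean of a set of reference genes. This is an illustration under assumptions, not NACHO's internal code: the matrix hk_counts, the helper geometric_mean() and the choice to scale every sample to the average geometric mean are all hypothetical.

```r
# Toy housekeeping counts: rows are genes, columns are samples (made-up values).
hk_counts <- matrix(
  c(100, 200, 400,
    50, 100, 200,
    200, 400, 800),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("sample", 1:3))
)

# Geometric mean computed on the log scale (hypothetical helper).
geometric_mean <- function(x) exp(mean(log(x)))

# One geometric mean per sample (column).
sample_geo <- apply(hk_counts, 2, geometric_mean)

# Size factor: scale every sample to the average geometric mean.
size_factors <- mean(sample_geo) / sample_geo

# Normalised counts: multiply each sample (column) by its size factor.
normalised <- sweep(hk_counts, 2, size_factors, `*`)
```

In this toy example the three samples differ only by a constant factor, so their normalised columns become identical.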

Let’s play a bit with NACHO

Get some RCC files from GEO

In this example we use a miRNA dataset from the study of Bruce et al. (2015) with the GEO accession number: GSE70970

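library(GEOquery)
# Download the series from GEO
gse <- getGEO("GSE70970")
# Extract the sample annotations (phenotypes)
targets <- pData(phenoData(gse[[1]]))
# Download and unpack the supplementary archive containing the RCC files
getGEOSuppFiles(GEO = "GSE70970", baseDir = ".")
untar(
  tarfile = "./GSE70970/GSE70970_RAW.tar",
  exdir = "./GSE70970/Data"
)
# Add the RCC file names as sample IDs (assumes file order matches the rows of targets)
targets$IDFILE <- list.files("./GSE70970/Data")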
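
The IDFILE assignment above relies on list.files() returning the files in the same order as the rows of targets. A more defensive sketch matches each file to its sample by accession. It assumes that every file name starts with the sample's GSM accession followed by an underscore, which should be checked against the actual archive; the helper match_rcc_files() and the toy names are hypothetical.

```r
# Hypothetical helper: return, for each accession, the file whose name
# starts with "<accession>_"; stop if a match is missing or ambiguous.
match_rcc_files <- function(accessions, files) {
  vapply(accessions, function(acc) {
    hit <- files[startsWith(basename(files), paste0(acc, "_"))]
    if (length(hit) != 1L) {
      stop("Expected exactly one file for ", acc, ", found ", length(hit))
    }
    hit
  }, character(1), USE.NAMES = FALSE)
}

# Toy data standing in for the sample accessions and for list.files().
toy_accessions <- c("GSM0000002", "GSM0000001")
toy_files <- c("GSM0000001_a.RCC", "GSM0000002_b.RCC")

# Files are returned in the order of the accessions, not of the directory.
toy_idfile <- match_rcc_files(toy_accessions, toy_files)
```

With the real data, and assuming targets has a geo_accession column, the same helper could be called as match_rcc_files(as.character(targets$geo_accession), list.files("./GSE70970/Data")).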

Load RCC files with summarise()

The housekeeping_genes and normalisation_method arguments respectively indicate which housekeeping genes and normalisation method should be used.

library(NACHO)
GSE70970_sum <- summarise(
  data_directory = "./GSE70970/Data", 
  ssheet_csv = targets, 
  id_colname = "IDFILE",
  housekeeping_genes = NULL,
  housekeeping_predict = TRUE,
  normalisation_method = "GEO",
  n_comp = 5
)
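
The n_comp argument is not described on this slide; the sketch below assumes it is the number of principal components to compute. It shows, with base R only and made-up counts, what keeping five components of a PCA looks like. It is not NACHO's internal code, and the log2 transformation is an arbitrary choice for the example.

```r
set.seed(1)
# Toy count matrix: 20 samples (rows) by 50 genes (columns), made-up data.
toy_counts <- matrix(rpois(20 * 50, lambda = 100), nrow = 20)

# PCA on log2-transformed counts; samples are the observations.
pca <- prcomp(log2(toy_counts + 1), center = TRUE, scale. = FALSE)

n_comp <- 5
# Sample coordinates on the first n_comp components.
scores <- pca$x[, seq_len(n_comp)]
# Proportion of variance explained by each of these components.
explained <- (pca$sdev^2 / sum(pca$sdev^2))[seq_len(n_comp)]
```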

GSE70970_sum
#> List of 13
#>  $ access              : chr "IDFILE"
#>  $ housekeeping_genes  : chr [1:5] "hsa-miR-103" "hsa-let-7e" "hsa-miR-1260" "hsa-miR-500+hsa-miR-501-5p" ...
#>  $ housekeeping_predict: logi TRUE
#>  $ housekeeping_norm   : logi TRUE
#>  $ normalisation_method: chr "GEO"
#>  $ remove_outliers     : logi FALSE
#>  $ n_comp              : num 5
#>  $ data_directory      : chr "/home/travis/build/mcanouil/NACHO_slides/GSE70970/Data"
#>  $ pc_sum              :'data.frame':    5 obs. of  4 variables:
#>  $ nacho               :'data.frame':    198170 obs. of  112 variables:
#>  $ outliers_thresholds :List of 6
#>  $ raw_counts          :'data.frame':    792 obs. of  265 variables:
#>  $ normalised_counts   :'data.frame':    792 obs. of  265 variables:
#>  - attr(*, "RCC_type")= chr "n1"
#>  - attr(*, "class")= chr "nacho"

(re)Normalise with normalise()

NACHO allows the discovery of housekeeping genes within your own dataset.

NACHO finds the five most suitable housekeeping genes; however, one of these five genes might still be unsuitable.

The discovered housekeeping genes are saved in a global variable named predicted_housekeeping and are also stored in the housekeeping_genes element of the returned object.

GSE70970_sum[["housekeeping_genes"]]
#> [1] "hsa-miR-103"                "hsa-let-7e"                
#> [3] "hsa-miR-1260"               "hsa-miR-500+hsa-miR-501-5p"
#> [5] "hsa-miR-1274b"
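
If one of the predicted genes looks unsuitable, it can be dropped and the data normalised again with the remaining ones. A hedged sketch: the gene names are copied from the output above, the gene to drop is an arbitrary choice, and the normalise() call assumes that its housekeeping_genes, housekeeping_predict and housekeeping_norm arguments mirror the elements of the same names in the object. The call only runs when NACHO and GSE70970_sum are available.

```r
# Housekeeping genes as printed above.
predicted <- c(
  "hsa-miR-103", "hsa-let-7e", "hsa-miR-1260",
  "hsa-miR-500+hsa-miR-501-5p", "hsa-miR-1274b"
)

# Arbitrary choice for illustration: drop the first gene.
my_housekeeping <- setdiff(predicted, "hsa-miR-103")

# Re-normalise only if the package and the object from summarise() exist.
if (requireNamespace("NACHO", quietly = TRUE) && exists("GSE70970_sum")) {
  GSE70970_custom <- NACHO::normalise(
    nacho_object = GSE70970_sum,
    housekeeping_genes = my_housekeeping,
    housekeeping_predict = FALSE, # use the supplied genes, do not predict
    housekeeping_norm = TRUE,
    normalisation_method = "GEO",
    remove_outliers = FALSE
  )
}
```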

Let’s say "GEO" is not the best normalisation for our dataset and we want to use "GLM" instead.

GSE70970_norm <- normalise(
  nacho_object = GSE70970_sum,
  normalisation_method = "GLM", 
  remove_outliers = TRUE
)

GSE70970_norm
#> List of 13
#>  $ access              : chr "IDFILE"
#>  $ housekeeping_genes  : chr [1:5] "hsa-let-7e" "hsa-miR-1260" "hsa-miR-1274b" "hsa-miR-103" ...
#>  $ housekeeping_predict: logi TRUE
#>  $ housekeeping_norm   : logi TRUE
#>  $ normalisation_method: chr "GLM"
#>  $ remove_outliers     : logi TRUE
#>  $ n_comp              : num 5
#>  $ data_directory      : chr "/home/travis/build/mcanouil/NACHO_slides/GSE70970/Data"
#>  $ pc_sum              :'data.frame':    5 obs. of  4 variables:
#>  $ nacho               :'data.frame':    115271 obs. of  112 variables:
#>  $ raw_counts          :'data.frame':    792 obs. of  155 variables:
#>  $ normalised_counts   :'data.frame':    792 obs. of  155 variables:
#>  $ outliers_thresholds :List of 6
#>  - attr(*, "RCC_type")= chr "n1"
#>  - attr(*, "class")= chr "nacho"

It’s Shiny time (visualise())

visualise(GSE70970_sum)
#> [NACHO] Custom "outliers_thresholds" can be loaded for later use with:
#>   outliers_thresholds <- readRDS("outliers_thresholds.rds")
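
The message above refers to thresholds exported from the dashboard. The sketch below shows how they might be reused. The round trip through saveRDS() and readRDS() is plain base R; the threshold names and values in toy_thresholds are placeholders rather than NACHO's actual list, and passing the reloaded list through an outliers_thresholds argument of normalise() is an assumption based on the element of the same name in the object.

```r
# Placeholder thresholds (hypothetical names and values).
toy_thresholds <- list(BD = c(0.1, 2.25), FoV = 75)

# Save and reload, as the message suggests (temporary file for the example).
rds_file <- file.path(tempdir(), "outliers_thresholds.rds")
saveRDS(toy_thresholds, rds_file)
reloaded <- readRDS(rds_file)

# With a real export in the working directory, reuse it when re-normalising.
if (
  requireNamespace("NACHO", quietly = TRUE) &&
    exists("GSE70970_sum") &&
    file.exists("outliers_thresholds.rds")
) {
  GSE70970_custom_qc <- NACHO::normalise(
    nacho_object = GSE70970_sum,
    remove_outliers = TRUE,
    outliers_thresholds = readRDS("outliers_thresholds.rds")
  )
}
```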

What’s next?

v0.6.0 & v1.0.0

v0.6.0 (Release: October 2019)

NACHO includes two additional functions (three, counting print()):

  • The render() function renders a full quality-control report (HTML) based on the results of a call to summarise() or normalise() (using print() in an R Markdown chunk).

  • The autoplot() function draws any of the quality-control plots available in visualise() and render().

The autoplot() function

The autoplot() function provides an easy way to draw any quality-control plot from the visualise() function. The plot is selected with the x argument:

  • "BD" (Binding Density)
  • "FoV" (Imaging)
  • "PC" (Positive Control Linearity)
  • "LoD" (Limit of Detection)
  • "Positive" (Positive Controls)
  • "Negative" (Negative Controls)
  • "Housekeeping" (Housekeeping Genes)
  • "PN" (Positive Controls vs. Negative Controls)
  • "ACBD" (Average Counts vs. Binding Density)
  • "ACMC" (Average Counts vs. Median Counts)
  • "PCA12" (Principal Component 1 vs. 2)
  • "PCAi" (Principal Component scree plot)
  • "PCA" (Principal Components planes)
  • "PFNF" (Positive Factor vs. Negative Factor)
  • "HF" (Housekeeping Factor)
  • "NORM" (Normalisation Factor)
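
As a rough illustration of what the first metrics in this list are used for, the sketch below flags samples that fall outside a set of thresholds. The toy data frame, its column names and the cut-offs (binding density within 0.1 to 2.25, at least 75% of fields of view counted, positive control linearity of at least 0.95) are assumptions chosen for the example; the thresholds NACHO actually applies are stored in the outliers_thresholds element of the object.

```r
# Toy per-sample QC metrics (made-up values and column names).
qc <- data.frame(
  sample = c("s1", "s2", "s3", "s4"),
  BD = c(0.50, 2.50, 1.20, 0.80),   # binding density
  FoV = c(99, 95, 60, 98),          # % of fields of view counted
  PCL = c(0.99, 0.98, 0.97, 0.90),  # positive control linearity (R^2)
  stringsAsFactors = FALSE
)

# Assumed thresholds for the illustration.
thresholds <- list(BD = c(0.1, 2.25), FoV = 75, PCL = 0.95)

# A sample is flagged as soon as one metric is out of range.
qc$flag_BD <- qc$BD < thresholds$BD[1] | qc$BD > thresholds$BD[2]
qc$flag_FoV <- qc$FoV < thresholds$FoV
qc$flag_PCL <- qc$PCL < thresholds$PCL
qc$is_outlier <- qc$flag_BD | qc$flag_FoV | qc$flag_PCL

flagged <- qc$sample[qc$is_outlier]
```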

autoplot(GSE70970_sum, x = "BD")

autoplot(GSE70970_sum, x = "Positive")

autoplot(GSE70970_sum, x = "Housekeeping")

autoplot(GSE70970_sum, x = "NORM")

The render() function

The render() function renders (using print(..., echo = TRUE)) a comprehensive HTML report which includes all quality-control metrics and a description of those metrics.

render(
  nacho_object = GSE70970_sum,
  colour = "CartridgeID",
  output_file = "NACHO_QC.html",
  output_dir = "./GSE70970/",
  size = 0.5,
  show_legend = TRUE,
  clean = TRUE
)

The print() function

The underlying print() function can be used directly within any R Markdown chunk by setting the parameter echo = TRUE.

print(
  x = GSE70970_sum, 
  colour = "CartridgeID", 
  size = 0.5, 
  show_legend = TRUE, 
  echo = TRUE, 
  title_level = 3
)

v1.0.0

  • Code rewrite and optimisation

  • Normalisation method template

(Centralised and Automated Reporting Tools)

What is CARoT?

CARoT (Centralised and Automated Reporting Tools) is a set of quality-control reporting tools and other functions, currently under development.

Currently CARoT includes the following main functions:

  • estimate_ethnicity(): compute the genomic component (ethnicity)
  • pca_report(): compute a principal component analysis
  • qc_idats(): QC of methylation arrays
  • qc_plink(): QC of genotyping arrays
  • qc_impute(): QC of imputed genotyping arrays

References

Bruce, J. P., Hui, A. B. Y., Shi, W., Perez-Ordonez, B., Weinreb, I., Xu, W., … Liu, F.-F. (2015). Identification of a microRNA signature associated with risk of distant metastasis in nasopharyngeal carcinoma. Oncotarget, 6(6), 4537–4550. https://doi.org/10.18632/oncotarget.3005