NACHO

Friday, 6th of September 2019

Introduction

NanoString™ nCounter® & R packages

NanoString™ nCounter®

The nCounter® platform provides a simple and cost-effective solution for multiplex analysis of up to 800 RNA, DNA, or protein targets from your precious samples.

Save Time
- Expertly curated pre-formatted panels for human, mouse and non-human primate
- ~15 minutes of total hands-on time with no amplification, cDNA conversion or library prep required
Save Sample
- Combine RNA, DNA, and protein panels for a comprehensive 3D Biology™ view of each sample
- Optimized performance on difficult sample types including FFPE, tissue, lysates and biofluid samples

Save Resources
- Advanced analysis tools included with the system reduce the need for bioinformatics support

NanoString™ nCounter® nSolver™

nSolver 4.0 resources:

nSolver™ alternatives?

NanoStringNorm
- On CRAN
- Since 2011-09-16
- Last 2017-11-10
nanostringr
- On CRAN
- Since 2019-03-15
- Last 2019-04-24

NanoStringDiff
- On Bioconductor
- Since 2016-04-23
- Last 2018-01-24
NanoStringQCPro
- On Bioconductor
- Since 2015-08-13
- Last 2018-04-10

Our alternative: NACHO

NACHO on CRAN (Since 2019-04-28; Next 2019-10)

(NAnostring quality Control dasHbOard)

Overview

NACHO (NAnostring quality Control dasHbOard) is developed for NanoString nCounter data.

NACHO loads, visualises and normalises exported NanoString nCounter data, and helps the user perform quality control.

NACHO does this by visualising in an interactive web application:

quality control metrics
expression of control genes
principal components
sample specific size factors

The functions

RCC files are summarised and visualised using two functions:

The summarise() function is used to preprocess the data.
The visualise() function initiates a Shiny-based dashboard that visualises all relevant QC plots.

NACHO also includes a function normalise(), which (re)calculates sample specific size factors and normalises the data.

The normalise() function creates a list in which your settings, the raw counts and normalised counts are stored.
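To make "sample specific size factors" more concrete, here is a minimal base-R sketch of a geometric-mean scaling, assuming that the "GEO" method refers to geometric means. The toy counts and the helper name geometric_mean() are made up for illustration; this is not NACHO's internal code.

```r
# Toy control-probe counts: rows are probes, columns are samples (made-up values).
counts <- matrix(
  c(100, 200, 400,
    200, 400, 800),
  nrow = 3,
  dimnames = list(paste0("probe", 1:3), c("sample_A", "sample_B"))
)

# Hypothetical helper: geometric mean of strictly positive values.
geometric_mean <- function(x) exp(mean(log(x)))

# One geometric mean per sample, then a size factor that rescales each
# sample towards the average across samples.
sample_means <- apply(counts, 2, geometric_mean)
size_factors <- mean(sample_means) / sample_means

# Multiply each column (sample) by its size factor.
normalised <- sweep(counts, 2, size_factors, `*`)
```

In this toy case sample_B has exactly twice the signal of sample_A, so the two columns become identical after scaling.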

Let’s play a bit with NACHO

Get some RCC files from GEO

In this example we use a microRNA (miRNA) dataset from the study of Bruce et al. (2015), available from GEO under the accession number GSE70970.

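library(GEOquery)

# Download the series metadata and extract the sample annotation table.
gse <- getGEO("GSE70970")
targets <- pData(phenoData(gse[[1]]))

# Download the supplementary archive and extract the RCC files.
getGEOSuppFiles(GEO = "GSE70970", baseDir = ".")
untar(
  tarfile = "./GSE70970/GSE70970_RAW.tar",
  exdir = "./GSE70970/Data"
)

# Add the RCC file names; this assumes the files sort in the same order as the rows of targets.
targets$IDFILE <- list.files("./GSE70970/Data")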
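Assigning list.files() directly relies on the files sorting in the same order as the samples. A more defensive variant could match each file to its sample explicitly. The sketch below assumes that file names start with the sample accession followed by an underscore, and that the sample table has a geo_accession column; the helper match_rcc_files() and the toy accessions and file names are hypothetical.

```r
# Hypothetical helper: match RCC file names to samples by accession prefix.
match_rcc_files <- function(accessions, files) {
  accessions <- as.character(accessions)
  # For each accession, keep the files named "<accession>_...".
  hits <- lapply(accessions, function(id) files[startsWith(files, paste0(id, "_"))])
  n_hits <- lengths(hits)
  if (any(n_hits != 1L)) {
    stop(
      "Expected exactly one RCC file per sample; check: ",
      paste(accessions[n_hits != 1L], collapse = ", ")
    )
  }
  unlist(hits)
}

# Toy example with made-up accessions and file names, deliberately unsorted.
toy_targets <- data.frame(
  geo_accession = c("GSM0000002", "GSM0000001"),
  stringsAsFactors = FALSE
)
toy_files <- c("GSM0000001_a.RCC.gz", "GSM0000002_b.RCC.gz")
toy_targets$IDFILE <- match_rcc_files(toy_targets$geo_accession, toy_files)
```

The helper stops with an error when a sample has no file or several files, rather than silently pairing the wrong file with a sample.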

Load RCC files with `summarise()`

The housekeeping_genes and normalisation_method arguments respectively indicate which housekeeping genes and normalisation method should be used.

library(NACHO)
GSE70970_sum <- summarise(
  data_directory = "./GSE70970/Data", 
  ssheet_csv = targets, 
  id_colname = "IDFILE",
  housekeeping_genes = NULL,
  housekeeping_predict = TRUE,
  normalisation_method = "GEO",
  n_comp = 5
)

GSE70970_sum

#> List of 13
#>  $ access              : chr "IDFILE"
#>  $ housekeeping_genes  : chr [1:5] "hsa-miR-103" "hsa-let-7e" "hsa-miR-1260" "hsa-miR-500+hsa-miR-501-5p" ...
#>  $ housekeeping_predict: logi TRUE
#>  $ housekeeping_norm   : logi TRUE
#>  $ normalisation_method: chr "GEO"
#>  $ remove_outliers     : logi FALSE
#>  $ n_comp              : num 5
#>  $ data_directory      : chr "/home/travis/build/mcanouil/NACHO_slides/GSE70970/Data"
#>  $ pc_sum              :'data.frame':    5 obs. of  4 variables:
#>  $ nacho               :'data.frame':    198170 obs. of  112 variables:
#>  $ outliers_thresholds :List of 6
#>  $ raw_counts          :'data.frame':    792 obs. of  265 variables:
#>  $ normalised_counts   :'data.frame':    792 obs. of  265 variables:
#>  - attr(*, "RCC_type")= chr "n1"
#>  - attr(*, "class")= chr "nacho"

(re)Normalise with `normalise()`

NACHO allows the discovery of housekeeping genes within your own dataset.

NACHO finds the five most suitable housekeeping genes; however, one of these five genes might still not be suitable.

The discovered housekeeping genes are saved in a global variable named predicted_housekeeping; they are also available in the housekeeping_genes element of the object returned by summarise().

GSE70970_sum[["housekeeping_genes"]]

#> [1] "hsa-miR-103"                "hsa-let-7e"                
#> [3] "hsa-miR-1260"               "hsa-miR-500+hsa-miR-501-5p"
#> [5] "hsa-miR-1274b"
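If one of the predicted genes turns out to be unsuitable, a custom set can be built by dropping it before re-normalising. The sketch below is only an illustration: the choice of gene to drop is arbitrary, and the normalise() arguments housekeeping_genes, housekeeping_predict and housekeeping_norm are assumed to mirror the fields printed in the object above. The call is guarded so that the snippet also runs when NACHO or the GSE70970_sum object is not available.

```r
# Housekeeping genes as printed above.
predicted <- c(
  "hsa-miR-103", "hsa-let-7e", "hsa-miR-1260",
  "hsa-miR-500+hsa-miR-501-5p", "hsa-miR-1274b"
)

# Drop one gene (arbitrarily, the first one) to build a custom set.
my_housekeeping <- setdiff(predicted, "hsa-miR-103")

# Re-normalise with the custom set; only runs if NACHO and the object exist.
if (requireNamespace("NACHO", quietly = TRUE) && exists("GSE70970_sum")) {
  GSE70970_custom <- NACHO::normalise(
    nacho_object = GSE70970_sum,
    housekeeping_genes = my_housekeeping,
    housekeeping_predict = FALSE, # use the supplied genes, do not predict again
    housekeeping_norm = TRUE,
    normalisation_method = "GEO",
    remove_outliers = TRUE
  )
}
```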

Let’s say "GEO" is not the best normalisation for our dataset and we want to use "GLM" instead.

GSE70970_norm <- normalise(
  nacho_object = GSE70970_sum,
  normalisation_method = "GLM", 
  remove_outliers = TRUE
)

GSE70970_norm

#> List of 13
#>  $ access              : chr "IDFILE"
#>  $ housekeeping_genes  : chr [1:5] "hsa-let-7e" "hsa-miR-1260" "hsa-miR-1274b" "hsa-miR-103" ...
#>  $ housekeeping_predict: logi TRUE
#>  $ housekeeping_norm   : logi TRUE
#>  $ normalisation_method: chr "GLM"
#>  $ remove_outliers     : logi TRUE
#>  $ n_comp              : num 5
#>  $ data_directory      : chr "/home/travis/build/mcanouil/NACHO_slides/GSE70970/Data"
#>  $ pc_sum              :'data.frame':    5 obs. of  4 variables:
#>  $ nacho               :'data.frame':    115271 obs. of  112 variables:
#>  $ raw_counts          :'data.frame':    792 obs. of  155 variables:
#>  $ normalised_counts   :'data.frame':    792 obs. of  155 variables:
#>  $ outliers_thresholds :List of 6
#>  - attr(*, "RCC_type")= chr "n1"
#>  - attr(*, "class")= chr "nacho"

It’s Shiny time (`visualise()`)

visualise(GSE70970_sum)

#> [NACHO] Custom "outliers_thresholds" can be loaded for later use with:
#>   outliers_thresholds <- readRDS("outliers_thresholds.rds")
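The message above refers to a file written with R's standard serialisation. As a small sketch of that round trip, the snippet below saves a list and reloads it with readRDS(). The threshold names and values are illustrative assumptions, not NACHO output, and a temporary directory is used instead of the working directory.

```r
# Illustrative thresholds; names and values are assumptions, not NACHO output.
outliers_thresholds <- list(
  BD = c(0.1, 2.25),
  FoV = 75,
  LoD = 2
)

# Save to a temporary file, then reload it as the message above suggests.
rds_file <- file.path(tempdir(), "outliers_thresholds.rds")
saveRDS(outliers_thresholds, file = rds_file)
reloaded <- readRDS(rds_file)
```

The reloaded list could then presumably be passed back to normalise() through an outliers_thresholds argument, assuming that argument matches the field of the same name shown in the object.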

What’s next?

v0.6.0 & v1.0.0

v0.6.0 (Release: October 2019)

NACHO includes two additional functions (three, counting the print() method):

The render() function renders a full quality-control report (HTML) based on the results of a call to summarise() or normalise() (using print() in an R Markdown chunk).
The autoplot() function draws any of the quality-control plots available in visualise() and render().

The `autoplot()` function

The autoplot() function provides an easy way to draw any quality-control plot from the visualise() function. The x argument selects the plot:

"BD" (Binding Density)
"FoV" (Imaging)
"PC" (Positive Control Linearity)
"LoD" (Limit of Detection)
"Positive" (Positive Controls)
"Negative" (Negative Controls)
"Housekeeping" (Housekeeping Genes)
"PN" (Positive Controls vs. Negative Controls)
"ACBD" (Average Counts vs. Binding Density)
"ACMC" (Average Counts vs. Median Counts)
"PCA12" (Principal Component 1 vs. 2)
"PCAi" (Principal Component scree plot)
"PCA" (Principal Components planes)
"PFNF" (Positive Factor vs. Negative Factor)
"HF" (Housekeeping Factor)
"NORM" (Normalisation Factor)

autoplot(GSE70970_sum, x = "BD")

autoplot(GSE70970_sum, x = "Positive")

autoplot(GSE70970_sum, x = "Housekeeping")

autoplot(GSE70970_sum, x = "NORM")

The `render()` function

The render() function renders, using print(..., echo = TRUE), a comprehensive HTML report that includes all quality-control metrics and a description of each metric.

render(
  nacho_object = GSE70970_sum,
  colour = "CartridgeID",
  output_file = "NACHO_QC.html",
  output_dir = "./GSE70970/",
  size = 0.5,
  show_legend = TRUE,
  clean = TRUE
)

The `print()` function

The underlying print() function can be used directly within any R Markdown chunk by setting the parameter echo = TRUE.

print(
  x = GSE70970_sum, 
  colour = "CartridgeID", 
  size = 0.5, 
  show_legend = TRUE, 
  echo = TRUE, 
  title_level = 3
)

v1.0.0

Code rewrite and optimisation
Normalisation method template

(Centralised and Automated Reporting Tools)

What is CARoT?

CARoT (Centralised and Automated Reporting Tools) is a set of quality-control reporting tools and other functions, currently under development.

Currently CARoT includes the following main functions:

estimate_ethnicity(): compute the genomic component (ethnicity)
pca_report(): compute a principal component analysis
qc_idats(): QC of methylation arrays
qc_plink(): QC of genotyping arrays
qc_impute(): QC of imputed genotyping arrays

+33 (0) 374 00 81 29

mickael.canouil@cnrs.fr

References

Bruce, J. P., Hui, A. B. Y., Shi, W., Perez-Ordonez, B., Weinreb, I., Xu, W., … Liu, F.-F. (2015). Identification of a microRNA signature associated with risk of distant metastasis in nasopharyngeal carcinoma. Oncotarget, 6(6), 4537–4550. https://doi.org/10.18632/oncotarget.3005
