- Introduction to the R programming environment.
- Basic programming in R: introduction to variables, data frames, lists, loops, conditional operators, functions.
- Visualization of one- and two-dimensional data using the base graphics package and ggplot2.
- First steps in data analysis: loading, filtering, subsetting, looking at correlations. Statistical tests for differences (t-test, Wilcoxon, KS) and how they are done in R. Application to the RNA-seq data.
- Control flow, conditionals, loops (for, apply, lapply, tapply, sapply, mapply), and functions.
- Text and regular expressions, grep, basic sequence analysis (seqLogo package).
- Merging datasets, e.g. how to combine peaks from ChIP-seq with RNA-seq data in various ways.
- Multi-dimensional data: normalization, clustering (hierarchical/biclustering), PCA, and visualization.
- Bioconductor: introduction, some sample packages (seqLogo? edgeR?). Models for significance of differential expression of RNA-seq data.
- Building interactive interfaces using Shiny.
- Machine learning: what it means, what classifiers are. ROC curves. Feature selection. What to be aware of. How to train a simple SVM. Application to the RNA-seq data. Cross-validation, etc. Visualization of ROC curves.
- Modeling biological data.