# CRG PhD Course 2017: Introduction to Statistics in R



### Course Description

This introductory course on exploratory data analysis and R is offered as three consecutive two-hour modules (see the Course Syllabus below). Each module is a hands-on practicum in a computer classroom using RStudio.

### Course Objectives

The course introduces, or refreshes, the basic concepts of descriptive statistics and shows how to apply them to real-life datasets using R. Students will write their first scripts, which they can reuse when they start analyzing their own data. No prior knowledge of statistics or R is required. However, students who do not take the modules in sequence should be familiar with the material from the earlier modules.

### Course Instructors

- Sarah Bonnin (Module I, II) sarah.bonnin@crg.eu
- Julia Ponomarenko (organizer, Module II, III) julia.ponomarenko@crg.eu

### Time and Location

- Oct 3, 4, 5, 2017, 11:00 - 13:00. PRBB Building, Bioinformatics classroom 468, 4th floor, hotel wing.

### Course Syllabus, Schedule, and Materials

#### MODULE I. Introduction to R. Oct 3, 2017.

- Introduction to R:
  - What is R?
  - Why use R?

- RStudio:
  - Local installation
  - Understanding and exploring the panels

- Basics of the R language:
  - Simple arithmetic in the R console
  - Syntax
  - Objects

- Functions in R:
  - Using functions
  - Getting help
  - Arguments in functions

- Data types and data structures in R:
  - Types: numeric, character, logical.
  - Structures: vectors, data frames, matrices.

- Slides for Module I: Open the pdf file.
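The Module I topics listed above can be previewed with a short script. This is only a minimal sketch, not the material from the course slides; all object names (`n_samples`, `values`, `df`, and so on) and the toy values are made up for illustration.

```r
# Simple arithmetic, as typed in the R console
2 + 3 * 4                 # multiplication is evaluated before addition
10 / 4                    # division returns a numeric value

# Objects: store a value under a name with the assignment operator <-
n_samples <- 12           # hypothetical object name
half <- n_samples / 2

# Functions and arguments
values <- c(4.2, 5.1, NA, 6.3)      # c() combines values into a vector
avg <- mean(values, na.rm = TRUE)   # na.rm = TRUE ignores missing values (NA)
rounded <- round(avg, digits = 1)   # arguments can be passed by name

# Getting help (uncomment to open the help pages)
# ?mean
# help.search("standard deviation")

# Data types
class(3.14)               # numeric
class("gene")             # character
class(TRUE)               # logical

# Data structures
v <- c(1, 2, 3)                              # vector: values of one type
m <- matrix(1:6, nrow = 2, ncol = 3)         # matrix: two dimensions, one type
df <- data.frame(                            # data frame: columns may differ in type
  sample = c("a", "b", "c"),
  expression = c(1.5, 2.0, 3.2),
  treated = c(TRUE, FALSE, TRUE)
)
str(df)                   # compact overview of the structure of an object
```

Running the lines one at a time in the RStudio console, and inspecting the new objects in the Environment panel, is one way to follow how each object is created.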

#### MODULE II. Descriptive Statistics & Plots in R. Oct 4, 2017.

- Part 1:
  - Packages in R: find, install, load, explore functions and documentation, get help on functions.
  - Basic plots in R: bar plots, histograms, box plots, scatter plots.

- Part 2:
  - Exploratory data analysis and descriptive statistical functions: summary, mean, sd, min, max, quantile.

- Materials:
  - Module II, Part 1: Open the pdf file.
  - Module II, Part 2: Download the zipped html file for the practicum.
  - Slides: Lecture on the topic given at CRG in June 2017.
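As a rough illustration of the Module II topics, the sketch below loads a package, draws the four basic plot types, and calls the descriptive functions named above. It assumes the built-in `iris` dataset as example data, which is not necessarily the dataset used in the practicum; `ggplot2` appears only as an example package name in a commented-out line.

```r
# Packages: install once, then load in every new R session
# install.packages("ggplot2")   # example package name; requires internet access
library(datasets)               # ships with R, so no installation is needed
# help(package = "datasets")    # browse the contents and documentation of a package

# Example data (an assumption, not the course dataset)
sepal <- iris$Sepal.Length

# Basic plots
species_counts <- table(iris$Species)        # number of rows per species
barplot(species_counts, main = "Flowers per species", ylab = "Count")
hist(sepal, main = "Sepal length", xlab = "Length (cm)")
boxplot(Sepal.Length ~ Species, data = iris, ylab = "Sepal length (cm)")
plot(iris$Sepal.Length, iris$Petal.Length,
     xlab = "Sepal length (cm)", ylab = "Petal length (cm)")

# Descriptive statistics
summary(sepal)                               # min, quartiles, median, mean, max
sepal_mean <- mean(sepal)
sepal_sd <- sd(sepal)
sepal_range <- c(min(sepal), max(sepal))
sepal_quartiles <- quantile(sepal, probs = c(0.25, 0.5, 0.75))
```

In RStudio the figures appear in the Plots panel; when the script is run non-interactively, R writes them to a file instead.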

#### MODULE III. Introduction to Statistical Inference. Oct 5, 2017.

- Part 1:
  - Input / Output: reading data from a file and writing data to a file.

- Part 2:
  - Exploratory data analysis, continued: plots.

- Materials:
  - Part 1: Download the slides for Input/Output.
  - Part 1: Download the Davis_car.txt file.
  - Part 2: Use the zipped file from Module II.
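The input/output step of Module III can be sketched as a write-then-read round trip. The layout of Davis_car.txt is not described on this page, so the example below uses a small made-up table and a hypothetical file name in a temporary directory; the separator and header settings are assumptions that would need to match the real file.

```r
# Made-up example table (not the contents of Davis_car.txt)
heights <- data.frame(
  id = 1:4,
  sex = c("F", "M", "F", "M"),
  height_cm = c(162, 178, 170, 181)
)

# Hypothetical output file in R's temporary directory
out_file <- file.path(tempdir(), "heights_example.txt")

# Writing: tab-separated text, no quotes, no row names
write.table(heights, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Reading: header = TRUE takes column names from the first line
heights_in <- read.table(out_file, header = TRUE, sep = "\t",
                         stringsAsFactors = FALSE)

# Check what was read, then continue with exploratory plots
str(heights_in)
summary(heights_in$height_cm)
boxplot(height_cm ~ sex, data = heights_in, ylab = "Height (cm)")
```

Calling `str()` right after `read.table()` is a quick way to confirm that each column was read with the expected type before plotting or summarizing it.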

### External Resources

- Nature Web-collection "Statistics for Biologists"
- "100 Statistical Tests" (PDF on ResearchGate; search online for the link)
- Book "Basics of Statistics" by Jarko Isotalo
- "Introduction to Probability and Statistics using R" by G. Jay Kerns
- R Tutorials by William B. King
- Tutorials "R for basic statistics"
- Blog "R-bloggers"
- StatsBlogs
- Blog "Learning R"
- Blog "R you ready?"
- "R-statistics blog"
- Guide and tool for the design and analysis of biological experiments from the UK's National Centre for the Replacement, Refinement and Reduction of Animals in Research (NC3Rs), covering control for confounding variables, sample size, effect size, standardised effect size, power of statistical tests, and multiple testing.
- Sample/effect size online calculators for designing biomedical experiments from UC San Francisco
- Self-paced online courses from UC Berkeley: Descriptive Statistics. Probability. Inference.
- Online book recommended for the UC Berkeley courses
- Self-paced online course "Explore Statistics with R"
- Online course from Stanford "An Introduction to Statistical Learning with Applications in R"
- Self-paced online course from Microsoft "Intro to R programming"
- Self-paced online course from Harvard "Statistics and R"
- Self-paced online course from Harvard "Statistical Inference and Modeling for High-throughput Experiments"
- The Seeing Theory website, which uses D3.js to visualize the fundamental concepts covered in an introductory college statistics course