Difference between revisions of "BIST Introduction to Statistics 2016"

From Bioinformatics Core Wiki

Revision as of 10:24, 28 April 2016

BIST "Introduction to Biostatistics" Course

Course Description

This introductory course in statistics and probability theory is modeled after the traditional university course Statistics 101 and will be taught by CRG staff and PhD students. The material is offered in 5 consecutive modules (see the Course Syllabus below), each consisting of a morning lecture and an afternoon practicum in a computer classroom. The practical exercises use the R programming language and RStudio. However, the course focuses on statistics rather than R; each practicum is therefore designed to demonstrate and reinforce the concepts introduced in the lecture rather than to provide training in R.

Course Objectives

To introduce the basic concepts of statistics and probability and to demonstrate how they can be applied to real-life biological problems using R. No prior knowledge of statistics or R is required. However, if the modules are not taken in sequence, familiarity with the material of the preceding modules is recommended.

Course Instructors

  • Dmitri Pervouchine (lectures) pervouchine@gmail.com
  • German Demidov (practicums III, V) german.demidov@crg.eu
  • Andre Gohr (practicum II) Andre.Gohr@crg.eu
  • Sarah Bonnin (lecture on R, practicum I) sarah.bonnin@crg.eu
  • Julia Ponomarenko (organizer, practicum III) julia.ponomarenko@crg.eu

Course Syllabus, Schedule and Materials

Module I. Descriptive statistics. May 6, 2016

  • Lecture I. Exploratory data analysis: bar-plot, histogram, CDF, box-plot, scatter-plot, pie charts etc. Samples, measures of center and spread, percentiles, odds ratio. Outliers and robustness. Experiment versus observational study, confounding factors, simple random sample, other types of sampling, biases in sampling techniques.
  • Lecture II. Introduction to the R programming language and RStudio: data types, variables, packages, functions, handling files/scripts/projects.
  • Practicum: Basic plots in R using the ggplot2 package.
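
As a rough illustration of the Module I topics (measures of center and spread, percentiles, outliers and robustness), the sketch below uses a small made-up sample; the numbers are hypothetical and are not course data. The box-plot at the end assumes the ggplot2 package is installed and is skipped otherwise.

```r
# Hypothetical sample: nine typical measurements plus one extreme value.
x <- c(4.1, 4.8, 5.0, 5.2, 5.3, 5.5, 5.9, 6.1, 6.4, 25.0)

# Measures of center: the mean is pulled toward the extreme value,
# while the median is robust to it.
center <- c(mean = mean(x), median = median(x))

# Measures of spread: standard deviation versus the more robust IQR.
spread <- c(sd = sd(x), iqr = IQR(x))

# Quartiles and the common 1.5 * IQR box-plot rule for flagging outliers.
q <- quantile(x, probs = c(0.25, 0.5, 0.75))
lower_fence <- q[["25%"]] - 1.5 * IQR(x)
upper_fence <- q[["75%"]] + 1.5 * IQR(x)
outliers <- x[x < lower_fence | x > upper_fence]

print(center)
print(spread)
print(outliers)

# Optional box-plot, drawn only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data.frame(value = x), aes(x = "", y = value)) +
    geom_boxplot() +
    labs(x = NULL, y = "Measurement")
  print(p)
}
```

Here the single extreme value moves the mean well above the median and is the only point flagged by the 1.5 * IQR rule, which is one way to see why the median and IQR are described as robust.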

Module II. Introduction to Probability. May 9, 2016

  • Lecture. Independence, conditional probability, Bayes' formula. Distributions, population mean and population variance, Binomial, Poisson, and Normal distributions. Central limit theorem and the law of large numbers. Continuity correction. Sampling with and without replacement. Correction for finite population size.
  • Practicum. Elementary probability problems in R, pdf and cdf functions, simulation illustrating the law of large numbers.
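
The Module II practicum topics could be sketched roughly as follows, using only base R. The choice of a Binomial(10, 0.5) example, the seed, and the number of simulated coin flips are arbitrary assumptions made for illustration, not values taken from the course.

```r
set.seed(1)  # arbitrary seed, used only to make the simulation reproducible

# Binomial(n = 10, p = 0.5): probability mass function and cumulative distribution.
p_exactly_3 <- dbinom(3, size = 10, prob = 0.5)  # P(X = 3)
p_at_most_3 <- pbinom(3, size = 10, prob = 0.5)  # P(X <= 3)

# Normal approximation to P(X <= 3) with continuity correction:
# evaluate the Normal cdf at 3.5 instead of 3.
mu <- 10 * 0.5
sigma <- sqrt(10 * 0.5 * 0.5)
p_approx <- pnorm(3.5, mean = mu, sd = sigma)

# Law of large numbers: the running mean of fair coin flips
# settles near the population mean 0.5 as the sample grows.
flips <- rbinom(10000, size = 1, prob = 0.5)
running_mean <- cumsum(flips) / seq_along(flips)

print(c(exact = p_at_most_3, approx = p_approx))
print(running_mean[c(10, 100, 1000, 10000)])
```

In R, the `d` prefix gives the density or mass function and the `p` prefix gives the cumulative distribution function; the same pattern applies to `dpois`/`ppois` and `dnorm`/`pnorm`.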


External Resources

  • Nature Web-collection "Statistics for Biologists": http://www.nature.com/collections/qghhqm
  • 100 Statistical Tests.pdf (ResearchGate); search Google to get a link

Bioinformatics Core Facility @ CRG — 2011-2024