## Exercise 8 - Basic graphs

# Exercise 8a – scatter plot

# Create the script exercise8.R (in RStudio: File -> )
# and save it to the "Rintro/day4" directory:
# you will save all the commands of exercise 8 in that script.
# Remember that you can comment the code using #.
setwd("/nfs/users/bi/sbonnin/Rintro/day4")

# 1. Read the file "/users/bi/sbonnin/gene_counts.txt" into object genes.
# Note: this file contains a header.
genes <- read.table("/users/bi/sbonnin/gene_counts.txt", header=TRUE)

# 2. Create a scatter plot showing sample1 (x-axis) vs sample2 (y-axis) of genes.
plot(genes$sample1, genes$sample2)

# 3. Change the point type and color.
# Note: see options pch and col.
plot(genes$sample1, genes$sample2, col="blue", pch=3)
# For col, you can use:
# - integers (1, 2, 3 etc.), but these are limited to 8 different colors!
# - the name of a color listed by colors(); for a randomly picked color,
#   you can use sample(colors(), 1).

# 4. Change the axis labels to "Sample 1" and "Sample 2".
# Note: see options xlab and ylab.
plot(genes$sample1, genes$sample2, col="blue", pch=3,
     xlab="Sample 1", ylab="Sample 2")

# 5. Add a title to the plot.
# Note: see option main.
plot(genes$sample1, genes$sample2, col="blue", pch=3,
     xlab="Sample 1", ylab="Sample 2", main="scatter plot")

# 6. Add a red vertical line at the median expression value of sample 1.
# Do it in two steps:
# a. Calculate the median expression of genes in sample 1.
s1med <- median(genes$sample1)
# b. Plot a vertical line using abline().
plot(genes$sample1, genes$sample2, col="blue", pch=3,
     xlab="Sample 1", ylab="Sample 2", main="scatter plot")
abline(v=s1med, col="red")
# Note: plot(...) must be called before abline(), because abline() draws on
# the existing plot. abline is not an argument of plot() but a function!
# See the abline help page.

# Exercise 8b – bar plot + pie chart

# 1. Read the file "/users/bi/sbonnin/gene_counts_significance.txt" into object de.
# Note: this file contains a header.
de <- read.table("/users/bi/sbonnin/gene_counts_significance.txt", header=TRUE)

# 2. The column updown describes whether a gene is up-regulated (enriched),
# down-regulated (depleted), or not regulated (none).
# Produce a barplot that displays this information: how many genes are
# enriched, depleted, or not regulated.
barplot(table(de$updown))

# 3. Color the bars of the barplot, each in a different color (3 colors of your choice).
barplot(table(de$updown), col=c("blue", "red", "grey"))

# 4. Now use option names.arg in barplot() to rename the bars:
# change DEPLETED to "Down", ENRICHED to "Up", NONE to "Not significant".
barplot(table(de$updown), col=1:3,
        names.arg=c("Down", "Up", "Not significant"))

# 5. The las argument allows you to rotate the labels for better visibility.
# Try value 2 for las: what happens?
barplot(table(de$updown), col=1:3,
        names.arg=c("Down", "Up", "Not significant"), las=2)

# 6. Create a pie chart of the same information (Enriched, Depleted, None).
pie(table(de$updown))
# Note: try arguments col, main and labels.
pie(table(de$updown), col=1:3, main="pie chart",
    labels=c("Down", "Up", "Not significant"))

# Exercise 8c – histogram

# 1. Use the genes object from exercise 8a to create a histogram of the
# gene expression distribution of sample 1.
hist(genes$sample1)

# 2. Repeat the histogram but change argument breaks to 50.
hist(genes$sample1, breaks=50)
# What is the difference?

# 3. Color this histogram in light blue.
# Note: there is a color called "lightblue".
hist(genes$sample1, breaks=50, col="lightblue")

# 5. "Zoom" in on the histogram: show only the distribution of expression
# values from 7 to 12 (x-axis).
# Note: use the xlim option. Also adjust ylim if necessary for better visibility.
hist(genes$sample1, breaks=50, col="lightblue", xlim=c(7, 12), ylim=c(0, 300))

# 6. Save the plot in a PDF file.
# a. Try with the RStudio Plots window (Export).
# b. Try in the console:
pdf("myhistogram.pdf")
hist(genes$sample1, breaks=50, col="lightblue", xlim=c(7, 12))
dev.off()
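
# Sketch: rehearsing the same calls without the course files.
# This block is not part of the original exercise. The input files above are
# read from server-specific paths, so the sketch below runs the same kind of
# plotting calls on simulated data instead. Everything named here is an
# assumption made up for illustration: the object sim, its simulated values,
# the object sim_med, and the file name "sketch_plots.pdf".

set.seed(1)  # make the simulated values reproducible

# Hypothetical stand-in for the genes object: two columns of simulated values.
sim <- data.frame(sample1 = rnorm(1000, mean = 9, sd = 1.5),
                  sample2 = rnorm(1000, mean = 9, sd = 1.5))

# Median of the first simulated sample, used for the vertical line.
sim_med <- median(sim$sample1)

# Write to a temporary directory so the working directory is left untouched.
out_file <- file.path(tempdir(), "sketch_plots.pdf")

# Open the PDF device; width and height are given in inches.
pdf(out_file, width = 7, height = 5)

# Page 1: scatter plot, then a red vertical line drawn on top of it.
plot(sim$sample1, sim$sample2, col = "blue", pch = 3,
     xlab = "Sample 1", ylab = "Sample 2", main = "scatter plot (simulated)")
abline(v = sim_med, col = "red")

# Page 2: each new high-level plot starts a new page in the same PDF file.
hist(sim$sample1, breaks = 50, col = "lightblue", xlim = c(7, 12))

# Close the device: the file is only complete after dev.off().
dev.off()

# Check that the file was written.
file.exists(out_file)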