sdemiriz/yeast-transcriptomics

Transcriptomics in yeast

Yeast (Saccharomyces cerevisiae) is used to produce some of our diet's most cherished foods, such as bread, not to mention wine and beer. It also has many other biotechnology applications, including the production of complex pharmaceuticals. Yeast is a great model organism because of its small, simple genome of approximately 6000 genes. Being a single-celled organism also makes it well suited to transcriptome analyses, as gene expression is homogeneous.

As part of our Vancouver-based hackathon (hackseq19) project (October 18 to 20, 2019), we will be examining a yeast transcriptome dataset. In this dataset, a number of yeast strains have been treated with various stimuli, and changes in gene expression are the readout. Expression is normalized to Transcripts Per Million (TPM). As a team, we will be cleaning, analyzing, and communicating the results of our explorations through clear documentation of procedures and an interactive application.

Data

The data in this project include gene expression values for 92 yeast strains treated with various stimuli. RNA expression levels are normalized to TPM (transcripts per million), following a default normalization procedure. The data are stored in the data folder.

  • The SC_expression.csv file contains gene expression of yeast strains in the experiments.
  • The labels.csv files describe the validation status of the genes and their molecular function (MF), cellular component (CC), and biological process (BP) annotations.
  • The conditions_annotation.csv file describes the yeast strains and experimental conditions.
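
For readers unfamiliar with TPM, the sketch below shows how the unit is generally defined: read counts are divided by transcript length, then rescaled so that each sample sums to one million. It is an illustration only, not the procedure used to produce SC_expression.csv. The function name, gene names, counts, and lengths are all hypothetical toy values.

```python
def counts_to_tpm(counts, lengths_kb):
    """Convert raw read counts for one sample to TPM.

    counts:     dict mapping gene name -> raw read count
    lengths_kb: dict mapping gene name -> transcript length in kilobases
    """
    # Step 1: reads per kilobase, which corrects for transcript length.
    rpk = {gene: counts[gene] / lengths_kb[gene] for gene in counts}

    # Step 2: per-sample scaling factor, so values sum to one million.
    scale = sum(rpk.values()) / 1_000_000
    if scale == 0:
        raise ValueError("sample has no reads; TPM is undefined")

    return {gene: value / scale for gene, value in rpk.items()}


# Toy values (hypothetical, not taken from the project data).
example_counts = {"gene_a": 100, "gene_b": 300, "gene_c": 600}
example_lengths_kb = {"gene_a": 1.0, "gene_b": 3.0, "gene_c": 2.0}

tpm = counts_to_tpm(example_counts, example_lengths_kb)
for gene, value in tpm.items():
    print(f"{gene}: {value:.1f}")
```

Because every sample sums to the same total, TPM values can be compared across samples as relative abundances, which is why length correction is applied before the per-sample scaling.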

Data Source

This project is inspired by the yeast-omics dataset shared as a Kaggle competition, and the original data can be found here.

Goal

The goal of this project is to unravel the genetic mechanisms involved in yeast stress adaptation through building a visualization platform (e.g. shiny application) that allows scientists to explore the data interactively.

The tasks we will attempt in the hackathon include:

  • Perform unsupervised and supervised clustering of expression data
  • Perform gene set enrichment analyses (GSEA, ReactomePA)
  • Create visualizations of results (heatmaps, barplots, etc.)
  • Build phylogenetic/taxonomic trees for yeast strains
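
To give a feel for what an enrichment analysis computes, here is a minimal sketch of a simple over-representation test based on the hypergeometric distribution. This is a simpler relative of GSEA, which instead uses a ranked-list statistic, and it does not reproduce the GSEA or ReactomePA implementations. All function names and the toy gene sets are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, hits_size, overlap):
    """P(X >= overlap) when hits_size genes are drawn without replacement
    from universe_size genes, set_size of which carry the annotation."""
    max_overlap = min(set_size, hits_size)
    total = comb(universe_size, hits_size)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def enrichment_from_sets(universe, gene_set, hits):
    """Convenience wrapper that works on Python sets of gene names."""
    gene_set = gene_set & universe  # ignore genes outside the universe
    hits = hits & universe
    return hypergeom_enrichment_p(
        len(universe), len(gene_set), len(hits), len(gene_set & hits)
    )


# Toy example (hypothetical genes): 10 genes measured, 3 annotated with a
# term, 4 called differentially expressed, 2 of which carry the term.
universe = {f"gene_{i}" for i in range(10)}
annotated = {"gene_0", "gene_1", "gene_2"}
differentially_expressed = {"gene_0", "gene_1", "gene_5", "gene_6"}

p_value = enrichment_from_sets(universe, annotated, differentially_expressed)
print(f"enrichment p-value: {p_value:.4f}")
```

In a real analysis the universe would be the set of genes measured in the experiment, the annotated sets could come from the MF, CC, and BP labels, and the resulting p-values would need correction for multiple testing.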

Skills

You will need an introductory knowledge of R, RNA expression analysis workflows, and design thinking, along with some coding skills. Experience coding in R (or Python) and a basic knowledge of Git/GitHub are assets. Team communications will happen in person and on Slack.

Software

Please have the latest versions of R and Bioconductor installed on your laptop prior to the workshop. RStudio (the free version) is also highly recommended, since we will be teaching in this environment.

Suggested preparatory lessons

Hackathon Schedule

Friday, October 18, 2019

| Time | Event | Location |
|---|---|---|
| 8:30 AM | Coffee and snacks | LSI, UBC |
| 9:00 AM | hackseq kickoff | LSI, UBC |
| 9:30 AM | Team meet + hacking | LSI, UBC |
| 11:30 AM | Intro to Git/GitHub Workshop (Optional) | room 1330 |
| 1:00 PM | Lunch | LSI, UBC |
| 2:00 PM | Continue hacking | LSI, UBC |
| 5:00 PM | Team Scrum + wrap-up | LSI, UBC |
| 5:30 PM | Unofficial Evening Social | The Gallery, UBC |

Saturday, October 19, 2019

| Time | Event | Location |
|---|---|---|
| 8:30 AM | Coffee and snacks | LSI, UBC |
| 9:00 AM | Hacking! | LSI, UBC |
| 12:00 PM | Lunch | LSI, UBC |
| 2:00 PM | Hacking… | LSI, UBC |
| 5:00 PM | Team Scrum + wrap-up | LSI, UBC |
| 5:30 PM | hackseq Evening Social | BierCraft, UBC |

Sunday, October 20, 2019

| Time | Event | Location |
|---|---|---|
| 8:30 AM | Coffee and snacks | LSI, UBC |
| 9:00 AM | Red-eye hacking | LSI, UBC |
| 12:00 PM | Lunch | LSI, UBC |
| 2:00 PM | Desperate bug-fix time | LSI, UBC |
| 3:30 PM | Team Project Presentations | LSI, UBC |
| 4:30 PM | Coffee Break | LSI, UBC |
| 4:30 PM | AbCellera talk | LSI, UBC |
| 5:00 PM | hackseq19 wrap-up | LSI, UBC |

Team Members

Noushin Nabavi, Matthew Emery, Alex Morin, Zuhaib Ahmed, Casey Engstrom, Sedat Demiriz, Shinta Thio, Saelin Bjornson

About

Hackseq Hackathon Project, UBC, October 2019
