diff --git a/.github/action-release-drafter.yml b/.github/action-release-drafter.yml
index 7de5b1ff7..0d0979115 100644
--- a/.github/action-release-drafter.yml
+++ b/.github/action-release-drafter.yml
@@ -10,12 +10,7 @@ categories:
change-template: '- $TITLE (#$NUMBER)'
change-title-escapes: '\<*_&' # You can add # and @ to disable mentions, and add ` to disable code blocks.
include-labels:
- - Oct-2020
- - Dec-2020
- - Mar-2021
- - Jun-2021
- - Sept-2021
- - Dec-2021
+ - Training-Release
template: |
## Updated
diff --git a/TrainingPlanV3.md b/TrainingPlanV3.md
new file mode 100644
index 000000000..d80d9877c
--- /dev/null
+++ b/TrainingPlanV3.md
@@ -0,0 +1,259 @@
+## Training and Evaluation Plan v3.0 / CFDE / December 2020
+
+## Introduction
+
+This Common Fund Data Ecosystem Coordination Center (CFDE-CC) training plan lays out our plan of action for 2021 as well as our overarching goals. Over the coming year, we will issue periodic reports that provide progress updates on our efforts, report assessment and evaluation results, and detail next steps for training.
+
+The goals of the CFDE training effort are threefold. First, we want to work with specific CFDE Data Coordinating Centers (DCCs) to develop and run DCC-specific and targeted cross-DCC data set training programs to help their users make improved use of their data. Second, we want to provide broad-based training on data analysis in the cloud, to help the entire CFDE user base shift to a more sustainable long-term approach. And third, we expect that broad and deep engagement with a wide range of users will help us identify new use cases for data reuse that can be brought back to the CF DCCs and the CFDE. **Collectively, our training program will train users in basic bioinformatics and cloud computing, help the DCCs lower their support burden, improve user experience, and identify new use cases for data reuse and data integration within and across DCCs.**
+
+In this training plan, we have no specific plans to interface with training efforts outside the Common Fund. However, we are aware of a number of training efforts with similar goals, including Broad’s Terra training program and AnVIL’s training focus. The underlying technologies and approaches we are using in our trainings and materials (see below) are entirely compatible with these programs and are designed to allow access and re-use across efforts and teams.
+
+All training materials produced by the CFDE-CC will be made available through the central nih-cfde.org web site, under CC0 or CC-BY licenses, which will allow them to be used and remixed by any other stakeholders without limitations. Assessment and iteration on the materials will be carried out by the CFDE-CC’s training team during the pilot period, which we expect to be the first 1-2 months of development for any given lesson; we will engage with external assessment and evaluation as our efforts expand.
+
+The CFDE-CC’s training component is led by Dr. Titus Brown and Dr. Amanda Charbonneau, supported by three training postdocs and two staff training coordinators. The training component is closely integrated with the engagement plan, and we expect training to interface with user experience evaluation and iteration across the entire CF and CFDE, as well as use case creation and refinement.
+
+## The training plan for 2021
+
+#### In-person training vs online training
+
+Our initial plan was to run a series of in-person workshops during 2020. However, we have pivoted to an online strategy because of the COVID-19 pandemic; in particular, we expect there to be no in-person meetings for the foreseeable future. While we believe we can leverage online training effectively, setting up the system has required a great deal of experimentation with formats, technologies, and teaching styles. As of late 2020, we have started running pilot workshops to test our training materials in an online setting, and will continue to host larger and more frequent workshops as we refine our lessons and teaching strategies.
+
+Online training is very different from in-person training. In our experience, in-person training offers a natural focus for many learners and can support extended (~4-6 hrs/day) engagement with materials. Moreover, technology problems on the learner’s side can often be fixed by in-person helpers who have direct access to the learner’s computer. Finally, the intensity of in-person workshops justifies the higher cost of travel: in the past we have successfully run many in-person workshops, lasting between 2 days and 2 weeks, where either the instructors or the students traveled significant distances to attend.
+
+Online training requires different affordances. Learner attention span in the absence of interpersonal interaction is much shorter. Remote debugging is possible but much less effective than in-person debugging. And both instructors and learners must manage more technology, including teleconferencing software and chat, often within the same screen space as the lesson itself. These challenges, among others, have limited the effectiveness of online training efforts, including MOOCs (Massive Open Online Courses); several studies of MOOCs have shown that most learners drop out quickly, and that the benefits have accrued mostly to those who already had experience with the material.
+
+In exchange for these downsides, online training offers some opportunities. By using asynchronous delivery of material, different schedules can be accommodated among the learners, and there is much more time for offline experimentation and challenge experiments. Moreover, online training can offer somewhat more scalability and can potentially be offered more cheaply, since it involves no travel or local facilities.
+
+#### Online lesson development approach
+
+We have transitioned our initial materials for in-person workshops to lessons that can be delivered online. For many lessons, we accomplished this by breaking lessons up into 5-10 minute video chunks, or “vidlets”, that showcase concepts and technical activities. These chunks can be viewed in “flipped classroom” or offline mode, and will be interspersed with opportunities for virtual attendees to seek technical help, explore their own interests, and ask questions in an individual or group setting. In some lessons, we opted for an entirely written approach, with a number of interactive text elements and screenshots. All training materials used for workshops are available online (https://training.nih-cfde.org/en/latest/) as written step-by-step tutorials, providing learners multiple ways to approach the material.
+
+In contrast to in-person materials, which require instructor notes but rely in large part on the presenter, both videos and screenshot-based walkthroughs are laborious to produce. A ~1 hour in-person lesson might take 2-3 days to develop and write out, while the same lesson as an online walkthrough will likely take a week or more. An online lesson will likely require dozens of formatted screenshots as well as more detailed explanations and teaching tips to help users advance through the lesson. Vidlets may reduce the need for detailed documentation, but require a great deal of time-consuming planning and editing.
+
+These materials also require much more upkeep than in-person lessons. With in-person materials, changes to the Kids First interface, for example, would only matter if they changed the functionality of the portal, and lesson content for new features could generally be developed and added to a lesson without overhauling the materials. However, even minor color and placement changes to the Kids First interface can render our materials useless. Vidlets need to be completely re-recorded with nearly every update, and the screenshots from walkthroughs generally all need to be re-taken. For 2021, we are evaluating the pros and cons of each of these methods, as well as continually exploring new ways to deliver online content more efficiently.
+
+After our initial materials revamp, we have started to offer online 'in person' lessons via Zoom. We deliver each lesson within the training team first, and then expand to groups outside our team. Each delivery is a walkthrough of an entire lesson with users, followed by an iteration on the materials to reflect discussion during the walkthrough. After 2-3 iterations have been delivered to beta users and CF program members, we will set up a formal registration system and encourage adventurous biomedical scientists to attend sessions.
+
+As of late 2020, we have tested two lessons, offered to a larger audience as pilot workshops. Here too, we are experimenting with the exact approach we will use. Online learning, especially for people with slow internet connections or limited screen size, can be extremely difficult. We expect to combine Zoom teleconferences, live streaming, and helpdesk sessions via our CFDE training helpdesk, but will continue to assess how well these are working for our learners and update accordingly. With each session, we conduct assessments and evaluate our overall approach, as well as next steps for specific lesson development.
+
+This lesson development approach is slow and cautious, and provides plenty of opportunity to improve the materials in response to the lived experience of both instructors and learners. During the lesson development and delivery period, we will work closely with each partner DCC to make sure our lessons align with their best practices, as well as convey any technical challenges with user experience back to the DCCs in order to identify potential improvements in DCC portals. We expect to be able to develop 2-3 new lessons per website release, as well as updating existing content. Our [training website release plan](https://github.com/nih-cfde/training-and-engagement/blob/stable/docs/TrainingRepoReleasePlan/TrainingRepo-Release-Plan.md) provides a timeline for posting new tutorials. In addition to release timelines, the release plan describes the stages between releases, the internal CFDE training material review process, and the format and tags for release documentation on both the public-facing website and the GitHub repo.
+
+#### Assessment approach
+
+Our assessments for the coming year will focus on improving our impact by better understanding the needs of our learners, identifying areas where our materials can be improved, and developing techniques for better online delivery of our materials. We will do extensive curriculum review following each training with the goal of improving both our materials and instruction. For each lesson, we will evaluate the training by applying a variety of formative and summative assessment techniques, including within-training check-points, pre- and post-training surveys, live observation evaluations of lessons, and, in late 2021, remote interviews with learners and trainers both before and after training. We will also work with DCCs to measure continued use by learners as one of our longer-term metrics.
+
+Throughout 2021, we will issue periodic training reports that describe the lessons learned from each training, as well as summaries of anonymized survey results. The results from these reports will also be used to develop larger-scale instruments that we can use to standardize summative assessment.
+
+## Specific Activities
+
+#### 1. Collaborate with DCCs to build program-specific training materials
+
+We will work with DCCs to build training materials that help their current and future users make use of their data sets. Our primary goals here are to (a) create and expand materials for users, (b) offer regular trainings in collaboration with the DCCs, (c) provide expanded help documentation for users to lower the DCC support burden, and (d) work with the DCCs over time to refine the user experience and further lower the DCC support burden.
+
+In addition to developing DCC-specific training material, we will also test and provide bug reports for new DCC technology/infrastructure/tools and accompanying documentation for GTEx, Ex-RNA, IDG, LINCS, HuBMAP, SPARC, Kids First, and Metabolomics. For example, we plan to help LINCS by testing and reviewing tutorials, documentation, and manuals. The materials will focus on access and use of LINCS resources in the cloud, with use cases focused on combining data from other CF programs. We will also be testing use cases and generating associated tutorials for DCC-developed resources.
+
+#### Tutorial development:
+
+We have begun a Whole Genome Sequencing (WGS) and RNAseq tutorial using data from Kids First and have worked with Kids First and Cavatica (the Kids First data analysis platform) to improve their interface so that it can be used for training. We have added multiple lessons on setup and use of the Kids First data portal, linking to Cavatica analysis platform, as well as uploading data to Cavatica. With Kids First, we have conducted two pilot trainings, and plan to both re-offer these and expand our trainings for 2021.
+
+As part of GTEx's 2021 workplan, we will be developing specific workshop materials, testing the materials, helping to host the workshops, and compiling workshop assessments for GTEx. One workshop will be aimed at using their gene expression datasets to conduct analyses, and the second workshop will focus on Java tutorials for visualization. We plan to develop and make public RNAseq tutorials for the Kids First/Cavatica platform as well as the GTEx/AnVIL/Terra platform.
+
+The exact timelines for these lessons, and others, will depend on the schedules of the host DCCs, how we recruit participants, and how quickly our lesson development proceeds. We will be incorporating different pieces (user-led walk-throughs, video lessons, live virtual sessions) and assessing the materials and configurations to deploy the best possible lesson implementations.
+
+#### Tutorial Requirements:
+
++ Persistent, user-led walkthrough documents
++ Accompanying short videos of difficult sections
++ Materials are graduate level, research scientist-focused
++ Materials available at https://training.nih-cfde.org web site, under CC0 or CC-BY licenses
++ Lessons align with DCC best practices
++ Self-Guided Materials Assessment
+ + Elicited user feedback in the webpage interface
+ + Analysis of web analytics to determine user engagement
++ Instructor Guided Materials Assessment
+ + Materials contain breaks for checking understanding/formative assessment
+ + Pre-training surveys on prior knowledge on data sets and techniques, specific learning goals, and self-confidence;
+ + Post-training surveys on improved knowledge, learning goals, tutorial format and content, and use case gaps in the training materials.
+ + Conduct remote interviews with learners both before and after training
+ + Secure any approvals for human data collection
+ + Collect contact information from learners
+
+#### Vidlet Requirements:
+
++ Persistent video lessons
++ Videos are accessible
+ + Include written transcripts
+ + Include closed-captioning
++ Materials are graduate level, research scientist-focused
++ Materials available at https://training.nih-cfde.org web site, under CC0 or CC-BY licenses
++ Lessons align with DCC best practices
++ Assessment
+ + Elicited user feedback in the webpage interface
+ + Analysis of web analytics to determine user engagement
+
+#### 2. Develop general purpose bioinformatics training materials for the cloud
+
+We will develop online training materials for biomedical scientists who want to analyze data in the cloud. Many future NIH Common Fund plans for large scale data analysis rely on analyzing the data on remotely hosted cloud platforms, be they commercial clouds such as Amazon Web Services and Google Cloud Platform (GTEx, Kids First) or on-premise hosting systems like the Pittsburgh Supercomputing Center (HuBMAP). Working in these systems involves several different technologies around data upload, automated workflows, and statistical analysis/visualization on remote platforms.
+
+Since most biomedical scientists have little or no training in these areas, they will need substantial support to take advantage of cloud computing platforms to do large scale data analysis.
+
+#### Tutorial development:
+
+We have run a pilot workshop on connecting to AWS and running a BLAST analysis there, and anticipate providing materials for several more workshops on cloud bioinformatics in 2021. These workshops consist of a number of different pieces, including user-led tutorials, custom-made video lessons, virtual forums, and live teaching sessions. The exact timeline and number of workshops will depend on how we recruit participants and how quickly our lesson development proceeds.
+
+On our training website, we have tutorials on setting up and connecting to a virtual computer using Amazon Web Services (AWS) and conducting basic bioinformatic analyses on AWS (genome-wide association study, BLAST sequence similarity analysis). We also have a tutorial on workflow management with Snakemake that runs in a Binder compute environment hosted on the Google Cloud Platform. We plan to add tutorials for setting up and connecting to the Google Cloud Platform in 2021.
+
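+To give a flavor of what the workflow management lesson covers, a minimal Snakemake rule can be sketched as follows; the file names and BLAST options here are illustrative placeholders, not the tutorial's actual content:
+
+```
+# Snakefile sketch: a single rule that runs a BLAST search
+rule blast_search:
+    input: "query.fasta"          # hypothetical query sequences
+    output: "blast_results.tsv"   # tab-separated BLAST hits
+    shell:
+        "blastn -query {input} -db nt -out {output} -outfmt 6"
+```
+
+Running `snakemake blast_results.tsv` would then execute the shell command only when the output is missing or older than its input, which is the core idea the lesson teaches.
+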
+For workflows, there are two primary workflow systems in use: WDL (used by Terra) and CWL (used by Cavatica). At least one of these (and sometimes both) is supported by every CF program that uses cloud workflow systems. Based on our existing workflow materials, we will develop initial data-analysis training materials that help biomedical scientists make use of these workflow systems.
+
+For statistics/visualization, two analysis systems, R/RStudio and Python/Jupyter, are used by almost all of the CF programs. We already have in-person training material for these systems, and will adapt it for online delivery.
+
+#### Tutorial Requirements:
+
++ Persistent, user-led walkthrough documents
++ Accompanying short videos of difficult sections
++ Materials are graduate level, research scientist-focused
++ Materials available at https://training.nih-cfde.org web site, under CC0 or CC-BY licenses
++ Lessons align with DCC best practices
++ Self-Guided Materials Assessment
+ + Elicited user feedback in the webpage interface
+ + Analysis of web analytics to determine user engagement
++ Instructor Guided Materials Assessment
+ + Materials contain breaks for checking understanding/formative assessment
+ + Pre-training surveys on prior knowledge on data sets and techniques, specific learning goals, and self-confidence;
+ + Post-training surveys on improved knowledge, learning goals, tutorial format and content, and use case gaps in the training materials.
+ + Conduct remote interviews with learners both before and after training
+ + Secure any approvals for human data collection
+ + Collect contact information from learners
+
+#### Vidlet Requirements:
+
++ Persistent video lessons
++ Videos are accessible
+ + Include written transcripts
+ + Include closed-captioning
++ Materials are graduate level, research scientist-focused
++ Materials available at https://training.nih-cfde.org web site, under CC0 or CC-BY licenses
++ Lessons align with community best practices
++ Assessment
+ + Elicited user feedback in the webpage interface
+ + Analysis of web analytics to determine user engagement
+
+#### 3. Develop CFDE internal training material
+
+As the CFDE community grows, there will be an increased need for training materials to guide members on how to work within the Ecosystem. These resources will cover a broad array of topics that relate to the CFDE project management infrastructure, including: GitHub, ZenHub, Google, groups.io, Slack, and onboarding. We also need to offer tutorials that guide new members on how to update and improve the Ecosystem, such as how to make changes in CFDE-owned websites, how to create and join working groups, and how to interact with the CFDE search portal as a DCC collaborator and upload new data to the CFDE portal.
+
+#### Tutorial development:
+
+Currently, we have a variety of materials for internal training, housed variously on the public training site or in member-only portions of GitHub and our Google Drive. Our public training site covers topics such as how to use GitHub branches, edit CFDE websites from GitHub, and contribute to the training website, as well as training on how to format data for inclusion in the CFDE Portal. The site also hosts a [style guide](https://cfde-training-and-engagement.readthedocs-hosted.com/en/latest/CFDE-Internal-Training/Website-Style-Guide/0index/) to ensure consistency across lessons on the training website. The style guide includes documentation for required lesson components and optional resources (vidlets, Binder compute environments, demo GitHub repos), as well as format guidelines and template files. Internal resources such as how to create and report on working groups, how to interact with our project management system, and how to complete NIH reporting requirements are stored in non-public-facing spaces, but are readily available to onboarded members of the CFDE. All of these resources are continually updated, and new tutorials are created whenever a member of the Ecosystem needs them.
+
+#### Tutorial Requirements:
+
++ Persistent, user-led walkthrough documents
++ Lessons align with CFDE best practices
++ Materials are written at a layperson level
+
+#### 4. Develop training materials for public CFDE resources
+
+The CFDE portal is still in alpha release and not yet available to the general public; however, in 2021 we expect that the portal will be more highly publicized and go into more widespread use. As the portal accumulates more users, there will be a greater need for trainings that are specific to this CFDE resource. We also anticipate that as usage increases and new use cases are discovered, the functionality of the portal will grow and change, requiring additional training resources.
+
+#### Tutorial development:
+
+We have created two lessons to demonstrate how to extract a manifest containing subsets of DCC metadata from multiple CF programs in the CFDE data portal, which can be used to subsequently search for the data at the originating CF data portal.
+
+
+#### Tutorial Requirements:
+
++ Persistent, user-led walkthrough documents
++ Accompanying short videos of difficult sections
++ Materials are graduate level, research scientist-focused
++ Materials available at https://training.nih-cfde.org web site, under CC0 or CC-BY licenses
++ Lessons align with DCC best practices
++ Self-Guided Materials Assessment
+ + Elicited user feedback in the webpage interface
+ + Analysis of web analytics to determine user engagement
++ Instructor Guided Materials Assessment
+ + Materials contain breaks for checking understanding/formative assessment
+ + Pre-training surveys on prior knowledge on data sets and techniques, specific learning goals, and self-confidence;
+ + Post-training surveys on improved knowledge, learning goals, tutorial format and content, and use case gaps in the training materials.
+ + Conduct remote interviews with learners both before and after training
+ + Secure any approvals for human data collection
+ + Collect contact information from learners
+
+#### Vidlet Requirements:
++ Persistent video lessons
++ Videos are accessible
+ + Include written transcripts
+ + Include closed-captioning
++ Materials are graduate level, research scientist-focused
++ Materials available at https://training.nih-cfde.org web site, under CC0 or CC-BY licenses
++ Lessons align with CFDE best practices
++ Assessment
+ + Elicited user feedback in the webpage interface
+ + Analysis of web analytics to determine user engagement
+
+
+#### 5. Engage CFDE members and other researchers for training and use case development
+
+In tandem with the specific workshops above, we will engage with biomedical scientists who are interested in reusing CF data. We will include members of the CF communities, biomedical scientists who attend our training sessions, and biomedical scientists recruited via social media for targeted discussions as well as to build an online forum. Discussions will be used to inform future use case development for data analysis and integration, as well as for continuing training engagement. GTEx in particular is in close contact with their end user community, and has suggested that their user base would be available for kickstarting this engagement.
+
+Although Common Fund programs have many high-level goals in common, they each have distinct user bases, specific mandates, and areas of expertise. As such, many of the people who work at these programs know little or nothing about the other programs. As one goal of the CFDE is to foster cross-DCC collaboration on scientific projects, we are working to engage these programs with each other, and to help them find common goals and shared interests.
+
+#### Community development:
+
+In September 2020, we began hosting weekly Cross Pollination events with DCCs to introduce the CFDE portal, discuss data harmonization, and allow conversation between Common Fund programs. These cross-pollination events are continuing on a monthly basis from December 2020 through 2021, with most talks given by DCCs.
+
+In addition to Cross Pollination events, we have also created a framework for DCCs to create interest based working groups. These working groups will allow member DCCs to guide important CFDE decisions such as what terms should be included in the CFDE portal, and how to harmonize them across groups. To facilitate broad consensus building, we have also created a Request For Comments (RFC) system that allows any working group to write a short description of a standard technology or tool that they would like to be used broadly within the CFDE, so it can be distributed for consideration to all members.
+
+#### Community Requirements:
++ Online community space for learner engagement
+ + Formal registration system
+ + Code of Conduct
+ + Moderator Group
+ + DCC and other expert volunteers to answer questions
++ Promotion of materials
++ Promotion of online community
+
+#### Cross pollination Requirements:
++ Open to all Common Fund programs
++ Formal registration system
++ Code of Conduct
++ Promotion of materials
++ Promotion of online community
+
+#### 6. Develop an overarching assessment program
+
+The goals of our assessment program are to improve our training and outreach offerings while simultaneously improving the teaching techniques of our instructors. We will accomplish this by iteratively trying, adopting, and assessing new training technologies and methods to improve specific trainings as well as the overall training program and technology platforms. In addition to a variety of surveys, we plan to conduct open-ended interviews with learners in late 2021. These interviews will leave space for open-ended conversation to discover new challenges, unmet training needs, and positives and negatives about current training efforts not covered by surveys.
+
+We will explore a number of technologies to measure within-lesson engagement and do formative assessment. While asynchronous online training challenges traditional “stop-and-quiz” approaches, low-stakes multiple-choice quizzes can be incorporated into online lessons easily and provide valuable feedback to learners and trainers. Faded examples that learners can fill in on their own time and submit via a common interface can be used to provide feedback asynchronously. More dynamic documentation, supporting both quizzes and executable code, could be used to provide engaging exercises. However, all of these require experimentation and evaluation in order to determine which choices work best within the context of the platforms we choose to host videos and tutorials. We will also assess overall confidence metrics for both “is this training potentially relevant/useful based on its description” and self-confidence in actualizing bioinformatics analyses. See https://carpentries.org/assessment/ for some examples. This experimentation is an ongoing part of our training work, and will be reviewed in the periodic 2021 training assessments.
+
+#### Assessment development:
+
+In our first few pilot workshops, we have used within-training check-points, pre- and post-training surveys, and live observation evaluations of lessons to gather information from our learners. While our results are currently still limited, we have used this information to make changes for re-running the pilots in 2021.
+
+#### _Metrics we are measuring and enhancing for collection and analysis of training data:_
+
++ For online workshops (now in pilot phase)
+ + Number of people that show initial interest in training (in progress)
+ + Number of return trainees within a lesson
+ + Number of return trainees across lessons
+ + Number of trainees that indicate interest in additional as-yet-undeveloped training events
++ For web sites/documentation
+ + Site visit metrics
+ + Page visit metrics
++ For forums
+ + Number of registrations
+ + Number of logins
+ + Number of posts
+ + Number of repeat engagements (e.g. follow ups to posts)
++ For videos
+ + Video watch statistics
+ + Video completion statistics
+ + Web site hosting stats
diff --git a/custom/assets/homepg-images/8.png b/custom/assets/homepg-images/8.png
new file mode 100644
index 000000000..f4d35ce09
Binary files /dev/null and b/custom/assets/homepg-images/8.png differ
diff --git a/custom/assets/homepg-images/9.png b/custom/assets/homepg-images/9.png
new file mode 100644
index 000000000..eb026e542
Binary files /dev/null and b/custom/assets/homepg-images/9.png differ
diff --git a/custom/assets/stylesheets/overrides.min.css b/custom/assets/stylesheets/overrides.min.css
index 06049e5b1..e12997f69 100644
--- a/custom/assets/stylesheets/overrides.min.css
+++ b/custom/assets/stylesheets/overrides.min.css
@@ -65,8 +65,13 @@
break-inside:avoid
}
+.md-announce {
+ background-color: #204060
+}
+
.md-announce a, .md-announce a:focus, .md-announce a:hover {
- color:currentColor;
+ color: currentColor;
+
}
.md-announce strong {
@@ -75,7 +80,7 @@
.md-announce .email {
margin-left: .6em;
- color:#C3E1E6
+ color:#ffffff
}
.md-announce .twitter {
margin-left: .6em;
@@ -94,8 +99,10 @@
[data-md-color-scheme=slate] .tx-container {
background: url("data:image/svg+xml;utf8,") repeat bottom, linear-gradient(to bottom, #336699, #33cccc 99%)
+
}
+
}
.content {
@@ -121,7 +128,7 @@
}
.content__bottom {
- align-items: flex-start;
+ align-items: stretch;
display: flex;
display: -webkit-flex;
flex-direction: row;
@@ -137,6 +144,16 @@
width: 100%;
}
+.data_content_left {
+ flex: 1 1 auto;
+ overflow-y: auto;
+ height: 400px;
+ padding-top: 10px;
+ padding-left: 15px;
+ padding-right: 15px;
+ padding-bottom: 5px;
+ }
@media (min-width:768px) {
.content__bottom {
@@ -199,8 +216,8 @@
.tx-hero__image {
margin-top: 1rem;
order: 1;
- max-width: 32rem;
- transform:translateX(6rem)
+ max-width: 25rem;
+ /*transform:translateX(1rem)*/
}
@@ -223,13 +240,14 @@
@media screen and (min-width: 76.25em) {
.tx-hero__image {
- transform:translateX(6rem)
+ transform:translateX(1rem)
}
}
html {
box-sizing: inherit;
+
}
*,
@@ -264,8 +282,8 @@ body {
.slider {
position: relative;
- width: 700px;
- height: 350px;
+ width: 50vw;
+ height: 80%;
max-width: 100vw;
margin: auto;
box-shadow: 0 2px 2px 0 rgba(0, 0, 0, 0.14), 0 1px 5px 0 rgba(0, 0, 0, 0.12),
@@ -288,7 +306,7 @@ body {
}
.slider input[type="radio"] {
- position: absolute;
+ position:absolute;
top: 0;
left: 0;
opacity: 0;
@@ -351,7 +369,7 @@ body {
display: flex;
justify-content: space-between;
- padding: 20px;
+ padding: 10px;
width: 100%;
height: 100%;
@@ -363,7 +381,10 @@ body {
.slide-content {
- width: 350px;
+ width: 30vw;
+ padding-left: 20px;
+ padding-right: 15px;
+
}
.slide-title {
@@ -396,9 +417,9 @@ body {
}
.slide-image img {
+ max-height: 80%;
max-width: 100%;
- max-height: 100%;
- padding-right:10%;
+
}
/* Slide animations */
@@ -437,6 +458,11 @@ body {
opacity: 1;
}
+#btn-8:checked ~ .slides .slide:nth-child(8) {
+ transform: translatex(0);
+ opacity: 1;
+}
+
#btn-1:not(:checked) ~ .slides .slide:nth-child(1) {
animation-name: swap-out;
animation-duration: 300ms;
@@ -483,6 +509,14 @@ body {
}
+#btn-8:not(:checked) ~ .slides .slide:nth-child(8) {
+ animation-name: swap-out;
+ animation-duration: 300ms;
+ animation-timing-function: linear;
+
+}
+
+
@keyframes swap-out {
0% {
diff --git a/custom/overrides/home.html b/custom/overrides/home.html
index 0059a1468..8b7dc5605 100644
--- a/custom/overrides/home.html
+++ b/custom/overrides/home.html
@@ -1,4 +1,5 @@
{% extends "overrides/main.html" %}
+
{% block tabs %}
{{ super() }}
@@ -26,74 +27,84 @@
+
+
+
New at CFDE Training
-
Explore our latest tutorial additions and discover lessons across a broad spectrum of technical topics.
+
Explore our latest tutorial additions and discover lessons across a broad spectrum of technical topics.
Introduction to the Kids First Data Resource Portal
Dec 17, 2020 • Virtual Workshop
Walk-through of the Kids First Portal to search, filter, and visualize data, followed by a discussion session on next steps for using Kids First data in one's own research. Workshop Resources: Setup • Explore • Queries
-
-
-
Introduction to Amazon Web Services / Elastic Cloud Computing
Dec 1, 2020 • Virtual Workshop
A hands-on workshop to introduce cloud computing concepts, create a virtual machine, and run a small job via Amazon Web Services (AWS). Workshop Resources: AWS • BLAST
-
Introduction to Amazon Web Services / Elastic Cloud Computing
Feb 25, 2021 • Virtual Workshop
A hands-on workshop to introduce cloud computing concepts, create a virtual machine, and run a small job via Amazon Web Services (AWS). Workshop Resources: AWS • BLAST
+
+
+
Introduction to Amazon Web Services / Elastic Cloud Computing
Jan 29, 2021 • Virtual Workshop
A hands-on workshop to introduce cloud computing concepts, create a virtual machine, and run a small job via Amazon Web Services (AWS). Workshop Resources: AWS • BLAST
+
+
+
Introduction to the Kids First Data Resource Portal
Dec 17, 2020 • Virtual Workshop
Walk-through of the Kids First Portal to search, filter, and visualize data, followed by a discussion session on next steps for using Kids First data in one's own research. Workshop Resources: Setup • Explore • Queries
+
+
+
Introduction to Amazon Web Services / Elastic Cloud Computing
Dec 1, 2020 • Virtual Workshop
A hands-on workshop to introduce cloud computing concepts, create a virtual machine, and run a small job via Amazon Web Services (AWS). Workshop Resources: AWS • BLAST
+
+
diff --git a/custom/overrides/main.html b/custom/overrides/main.html
index c2b2d54fc..4f1ac403b 100644
--- a/custom/overrides/main.html
+++ b/custom/overrides/main.html
@@ -1,5 +1,4 @@
{% extends "base.html" %}
-
{% block announce %}
For suggestions and questions, contact us!
@@ -18,7 +17,6 @@
{% endblock %}
-
{% block footer %}
{% import "partials/language.html" as lang with context %}
{% endblock %}
+{% block analytics %}
+ {{ super() }}
+
+
+
+{% endblock %}
diff --git a/custom/partials/nav.html b/custom/partials/nav.html
deleted file mode 100644
index d3e23f63f..000000000
--- a/custom/partials/nav.html
+++ /dev/null
@@ -1,27 +0,0 @@
-{#-
- This file was automatically generated - do not edit
--#}
-{% set site_url = config.site_url | default(nav.homepage.url, true) | url %}
-{% if not config.use_directory_urls and site_url == "." %}
- {% set site_url = site_url ~ "/index.html" %}
-{% endif %}
-
diff --git a/docs/Bioinformatics-Skills/.pages b/docs/Bioinformatics-Skills/.pages
index a54431279..a42bad900 100644
--- a/docs/Bioinformatics-Skills/.pages
+++ b/docs/Bioinformatics-Skills/.pages
@@ -4,7 +4,9 @@ nav:
- CFDE Portal Tutorials: CFDE-Portal
- install_conda_tutorial.md
- Introduction_to_Amazon_Web_Services
+ - Introduction-to-GCP
- Command-Line-BLAST
- GWAS in the Cloud: GWAS-in-the-cloud
- Snakemake Workflow Management: Snakemake
- - Simulate_Illumina_Reads.md
\ No newline at end of file
+ - Simulate_Illumina_Reads.md
+ - RNAseq-on-Cavatica
diff --git a/docs/Bioinformatics-Skills/BLAST-Command-Line/BLAST1.md b/docs/Bioinformatics-Skills/BLAST-Command-Line/BLAST1.md
index 441aab312..259a5cb44 100644
--- a/docs/Bioinformatics-Skills/BLAST-Command-Line/BLAST1.md
+++ b/docs/Bioinformatics-Skills/BLAST-Command-Line/BLAST1.md
@@ -1,9 +1,14 @@
---
layout: page
title: BLAST Overview
+hide:
+ - toc
---
-# Running Command-Line BLAST
+
+Running Command-Line BLAST
+=============================
+
BLAST is the **B**asic **L**ocal **A**lignment **S**earch **T**ool, used to search large sequence databases. It starts by finding small matches between the two sequences and extending those matches. For in-depth information on how BLAST works and the different BLAST functionality, check out the [resources page](https://blast.ncbi.nlm.nih.gov/Blast.cgi).
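The seed-and-extend idea can be sketched with a toy shell loop (purely illustrative, not real BLAST - the real tool scores and extends matches statistically, and `blastn` defaults to an 11-letter word):

```shell
# Toy illustration of BLAST's "seed" step: slide a short word along the
# query and look for exact matches in the subject sequence.
QUERY="ACGTACGTGGCC"
SUBJECT="TTTTACGTACGTGGCCAAAA"
W=8   # word size for this toy example
i=0
while [ $((i + W)) -le ${#QUERY} ]; do
  seed=$(printf '%s' "$QUERY" | cut -c$((i + 1))-$((i + W)))
  case "$SUBJECT" in
    *"$seed"*) echo "seed hit at query offset $i: $seed" ;;
  esac
  i=$((i + 1))
done
```

Each reported hit is a candidate that BLAST would then try to extend into a longer alignment.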
diff --git a/docs/Bioinformatics-Skills/CFDE-Portal/.pages b/docs/Bioinformatics-Skills/CFDE-Portal/.pages
index 80f258c60..f1d9b6e4a 100644
--- a/docs/Bioinformatics-Skills/CFDE-Portal/.pages
+++ b/docs/Bioinformatics-Skills/CFDE-Portal/.pages
@@ -1,3 +1,4 @@
nav:
+ - index.md
- Use Case 1: Blood-Cancer
- - Use Case 2: Movement-Related-Disorders
\ No newline at end of file
+ - Use Case 2: Movement-Related-Disorders
diff --git a/docs/Bioinformatics-Skills/CFDE-Portal/index.md b/docs/Bioinformatics-Skills/CFDE-Portal/index.md
index 13963de91..c6d3b793e 100644
--- a/docs/Bioinformatics-Skills/CFDE-Portal/index.md
+++ b/docs/Bioinformatics-Skills/CFDE-Portal/index.md
@@ -1,7 +1,13 @@
---
layout: page
+title: CFDE Portal Overview
+hide:
+ - toc
+---
+
+CFDE Portal Use Cases
+====================================================================
-**CFDE Portal Use Cases**
The [NIH Common Fund (CF)](https://commonfund.nih.gov) has funded a wide variety of data types and studies that are of interest to clinical and biomedical researchers; however, those datasets are hosted on an equally large number of websites, with varying query systems. The [Common Fund Data Ecosystem (CFDE) Portal](https://app.nih-cfde.org) is a unified system for searching across the entire CF portfolio in a single search, and is the first step in addressing the goal of making CF data more [Findable, Accessible, Interoperable and Reusable (FAIR)](https://www.nih-cfde.org/product/fair-cookbook/). The wide range of data types, models and formats used by Common Fund programs is being harmonized using well-defined metadata and common controlled vocabularies from the [Crosscut Metadata Model](https://www.nih-cfde.org/product/cfde-c2m2/).
diff --git a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/aws_instance_setup.md b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/aws_instance_setup.md
index f6ca949d3..7840d1a14 100644
--- a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/aws_instance_setup.md
+++ b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/aws_instance_setup.md
@@ -6,46 +6,45 @@ title: Set up an AWS Instance
Setting up an AWS Instance
==========================
-Amazon offers a cloud computing platform called Amazon Web Services (AWS). AWS is not free, however, you receive the benefits of the Free Tier automatically for 12 months after you sign up for an AWS account. If you are no longer eligible for the Free Tier, you're charged at the [standard billing rate](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/free-tier-eligibility.html) for your AWS usage.
+Amazon offers a cloud computing platform called Amazon Web Services (AWS). Please check out our [full lessons on AWS](../Introduction_to_Amazon_Web_Services/introtoaws1.md) for more details!
+
+AWS is not free, however, you can use the Free Tier for 12 months after you sign up for an AWS account. If you are no longer eligible for the Free Tier, you're charged at the [standard billing rate](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/free-tier-eligibility.html) for your AWS usage.
Let's set up a Free Tier AWS Ubuntu instance!
!!! Important
AWS periodically updates its user experience. We strive to keep our tutorials up to date with AWS's constantly changing user interface. However, these updates may take some time to become incorporated into our tutorial. **Please watch this space for updates**.
- To follow along, please ensure that the "New EC2 User Experience" toggle located on the [EC2 instances list](https://us-east-2.console.aws.amazon.com/ec2/v2/home?region=us-east-2#Instances:sort=instanceId) page is set to "off".
- ![](./images-gwas/GWAS_General_AWS_Updates.png "AWS New EC2 Experience switch")
-
## Step 1: Create AWS account
-* Go to and click on the "create an AWS account" button located on the top right. If you have an existing AWS account, click the "Sign in to an existing AWS account" option below the "continue" option on the sign-up page and log in to your account as a root user.
+* Go to and click on the create an AWS account button located on the top right. If you have an existing AWS account, click the Sign in to an existing AWS account option below the continue option on the sign-up page. Log in to your account as a root user.
!!! note "New Account"
- * To create a new account, fill in your email, (create a) password and choose a name for your AWS account. Click "Continue".
+ * To create a new account, fill in your email, (create a) password and choose a name for your AWS account. Click Continue.
* On the next page, fill in your name, phone number and address. Check the AWS customer agreement box.
- * Once you click "Create Account and Continue", you will be redirected to a payment info page. Fill in your credit card info. Account approval/creation requires two factor authentication and may take a few mins (to hour or days). When you receive the code, enter it and click "Verify Code"
+ * Once you click Create Account and Continue, you will be redirected to a payment info page. Fill in your credit card info. Account approval/creation requires two-factor authentication and may take a few minutes (up to hours or days). When you receive the code, enter it and click Verify Code.
* You can now log in and launch an instance!
## Step 2: Configure and launch the virtual machine
-* Next, click on the "Launch a virtual machine" option as shown in the image:
+* Next, click on the Launch a virtual machine option as shown in the image:
![](./images-gwas/GWAS_General_Launch.png "Launch virtual machine")
### Step 2A: Select the right geographical region
-* For this tutorial, it is important to select the "Ohio" amazon machine image. The geographical region of your remote machine is displayed on the top right of this page (shown in image below)."
+* For this tutorial, it is important to select the Ohio region. The geographical region of your remote machine is displayed on the top right of this page (shown in the image below).
![](./images-gwas/GWAS_General_aws_ohio.png "Machine location Ohio")
-* If it does not say "Ohio", click on the drop down arrow and select: `US East (Ohio)`.
+* If it does not say "Ohio", click on the drop down arrow and select: US East (Ohio).
![](./images-gwas/GWAS_General_aws_ohio_selection.png "Machine location dropdown menu")
### Step 2B: Select the right Ubuntu image
@@ -64,24 +63,31 @@ Let's set up a Free Tier AWS Ubuntu instance!
![](./images-gwas/GWAS_General_AWS_Free_Tier.png "t2micro instance type")
-Then click "Review and launch" --> "Launch". You should see a pop-up window like this:
+Then click Review and launch --> Launch. You should see a pop-up window like this:
![](./images-gwas/GWAS_General_KeyPair.png "AWS key pair")
* Key pair for AWS:
- - If this is your first time using AWS or creating a key pair: Choose the "Create a new key pair" option from the drop down menu. Under key pair name, type "amazon" and click "save". The default location for saving files on a Mac is the "Downloads" folder -- that's where your key pair can be found. **Next time you launch an instance, you can reuse the key pair you just generated.**
- - If you have a previously generated key pair, you can reuse it to launch an instance. For this tutorial, it may be helpful to rename the key pair "amazon.pem".
+ - If this is your first time using AWS or creating a key pair: Choose the Create a new key pair option from the drop down menu.
+ - For this tutorial, under key pair name, type "amazon" and click save.
+ - The default location for saving files on a Mac is the "Downloads" folder -- that's where you'll find your key pair "amazon.pem".
+ - **Next time you launch an instance, you can reuse the key pair you just generated by selecting Choose an existing key pair.**
+ - If you have a previously generated key pair, you can reuse it to launch an instance.
-* Then check the acknowledgement box and click "Launch Instance". You should see this:
+* Then check the acknowledgement box and click Launch Instance. You should see this:
![](./images-gwas/GWAS_General_launching.png "Launch status page")
-* Click on this first hyperlink: `i-038c58bfbe9612c57`. Your page should look like this:
+* Click on the Instance ID link (in the screenshot above, `i-038c58bfbe9612c57`) in the green box and on the next page, select your instance by checking the box. Your page should look like this:
![](./images-gwas/GWAS_General_aws_instances_list.png "Instance dashboard")
-* This page shows you a list of all your active instances. Users may launch as many instances as they wish. Just remember that every instance costs money if you don't quality for Free Tier.
+* This page shows you a list of all your active instances. You can launch multiple instances. Just remember that every instance costs money if you don't qualify for Free Tier.
+
+You have now successfully launched your AWS instance!
+
+!!! Important
-You have now successfully launched your AWS instance! You will need some information from this amazon webpage to access your AWS computer, so do not close the page yet. If you happen to close the webpage on accident, click on this link:
+ You will need some information from this amazon webpage to access your AWS computer, so **keep this page open**! If you happen to close the webpage by accident, click on this link:
diff --git a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/download_accessAWS.md b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/download_accessAWS.md
index 681d34fe5..1f2e27c55 100644
--- a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/download_accessAWS.md
+++ b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/download_accessAWS.md
@@ -62,11 +62,11 @@ OK, so you've created a [running computer on the cloud](aws_instance_setup.md).
The information you will need lives on the [AWS page that lists your active instances](https://us-east-2.console.aws.amazon.com/ec2/v2/home?region=us-east-2#Instances:).
-* On this webpage, select your instance of interest and click the "Connect" button on the top of the page.
+* On this webpage, select your instance of interest and click Connect on the top of the page.
![](./images-gwas/GWAS_General_publicDNS.png "Connect to instance button")
-* A pop up window will appear. Copy the line of code under "Example:", starting with the `ssh` command.
+* On the new page, select the SSH tab (as shown in the image) and copy the line of code under "Example:", starting with the `ssh` command.
![](./images-gwas/GWAS_General_aws_connect_your_instance.png "ssh command")
@@ -95,7 +95,7 @@ The information you will need lives on the [AWS page that lists your active inst
![](./images-gwas/GWAS_General_AWS_Connected.png "instance terminal")
!!! Note
- My terminal window is black, but yours may not be! Users can [customize their terminal](https://www.maketecheasier.com/customize-mac-terminal/) by right clicking on the terminal window and selecting "Inspector". I've chosen the "Pro" theme.
+ My terminal window is black, but yours may not be! Users can [customize their terminal](https://www.maketecheasier.com/customize-mac-terminal/) by right clicking on the terminal window and selecting Inspector. I've chosen the Pro theme.
* You have now successfully logged in as user "ubuntu" to the machine "ec2-18-216-20-166.us-east-2.compute.amazonaws.com" using the "amazon.pem" authentication key.
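Put together, the connection steps look roughly like this (a sketch: the key path assumes the Downloads folder from the setup lesson, and the DNS name is this tutorial's example, so substitute the one copied from your own Connect tab):

```
chmod 400 ~/Downloads/amazon.pem   # ssh refuses keys that other users can read
ssh -i ~/Downloads/amazon.pem ubuntu@ec2-18-216-20-166.us-east-2.compute.amazonaws.com
```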
@@ -180,7 +180,7 @@ You need lots of other helper utilities to run today's pipeline so now is a good
```
sudo apt-get install autoconf autogen g++ gcc make automake pkg-config zlib1g-dev curl gdebi-core -y ghostscript-x
```
-
+
!!! Important
Installing helper utilities is VERY important. All sorts of errors in installations/plotting happen if it's not run! For example, you will install "vcftools" in later parts of this tutorial which absolutely needs "autoconf", "autogen", and "make" to be preinstalled. zlib is a library implementing the deflate compression method found in gzip and PKZIP. gdebi lets you install local deb packages resolving and installing its dependencies. And to run plotting functions in R, you will need Ghostscript, an interpreter of the PDF format.
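As a quick sanity check after the install above, you can confirm the helper utilities landed on your PATH (a sketch; `gs` is Ghostscript's executable):

```shell
# report each required tool as found or missing
for tool in autoconf gcc make curl gs; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: MISSING - rerun the install command above"
  fi
done
```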
diff --git a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_Terminate_AWS.png b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_Terminate_AWS.png
index f81c6c85a..c4de89e12 100644
Binary files a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_Terminate_AWS.png and b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_Terminate_AWS.png differ
diff --git a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_aws_connect_your_instance.png b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_aws_connect_your_instance.png
index cee9ca3a5..e723dbddb 100644
Binary files a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_aws_connect_your_instance.png and b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_aws_connect_your_instance.png differ
diff --git a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_aws_instances_list.png b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_aws_instances_list.png
index c863417bd..01017b07b 100644
Binary files a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_aws_instances_list.png and b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_aws_instances_list.png differ
diff --git a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_publicDNS.png b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_publicDNS.png
index 5dd732908..f03698902 100644
Binary files a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_publicDNS.png and b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/images-gwas/GWAS_General_publicDNS.png differ
diff --git a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/index.md b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/index.md
index 831a034b0..753d82ad5 100644
--- a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/index.md
+++ b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/index.md
@@ -1,10 +1,12 @@
---
layout: page
title: GWAS Tutorial Overview
+hide:
+ - toc
---
How to do GWAS in the cloud using Amazon Web Services
-=====================================================
+============================================
**Genome-wide association studies (GWAS)** offer a way to rapidly scan entire genomes and find genetic variation associated with a particular disease condition.
diff --git a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/terminate_aws.md b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/terminate_aws.md
index 34af6a29d..5dd30ac0d 100644
--- a/docs/Bioinformatics-Skills/GWAS-in-the-cloud/terminate_aws.md
+++ b/docs/Bioinformatics-Skills/GWAS-in-the-cloud/terminate_aws.md
@@ -12,9 +12,8 @@ When you are done with all the analyses, be sure to terminate the AWS instance.
Upon termination, you will lose all installations and data. Be sure to download all useful data before you terminate the instance!
- Log in to AWS and navigate to the instances page. Then select the instance and click
+ Log in to AWS and navigate to the instances page. Select the instance, click Instance State, and choose Terminate instance. Confirm by clicking Terminate.
- "Actions" --> "Instance State" --> "Terminate"
![](./images-gwas/GWAS_General_Terminate_AWS.png "Terminate instance")
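If you prefer the command line, the same thing can be done with the AWS CLI (a sketch, not part of this tutorial's steps; it assumes the AWS CLI is installed and configured, and uses the example instance ID from the launch lesson):

```
# terminate by instance ID - irreversible; data on the instance is lost
aws ec2 terminate-instances --instance-ids i-038c58bfbe9612c57
```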
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/.pages b/docs/Bioinformatics-Skills/Introduction-to-GCP/.pages
new file mode 100644
index 000000000..5c1e4c3ce
--- /dev/null
+++ b/docs/Bioinformatics-Skills/Introduction-to-GCP/.pages
@@ -0,0 +1,6 @@
+nav:
+ - index.md
+ - gcp1.md
+ - gcp2.md
+ - gcp3.md
+ - gcp4.md
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp1.md b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp1.md
new file mode 100644
index 000000000..be330926e
--- /dev/null
+++ b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp1.md
@@ -0,0 +1,93 @@
+# Setting up GCP account
+
+In this tutorial, you will learn how to create your own Google Cloud Platform (GCP) billing account using your Google account and personal credit card.
+
+!!! important "Chrome Web-browser"
+
+ Please use the [**Chrome web browser**](https://www.google.com/chrome/) for setting up and connecting to a GCP virtual machine. Some GCP features do not work on other web browsers.
+
+## Step 1: Sign in to GCP
+
+- Open a Chrome web browser and go to [https://cloud.google.com/](https://cloud.google.com/)
+
+- Click Sign in on the top right corner and sign in with a Google account
+
+![](./gcp_images/gcp_login.png "GCP sign in button")
+
+- After signing in successfully, click Console in the top right corner.
+
+![](./gcp_images/gcp_console.png "GCP console button")
+
+## Step 2: Create a GCP billing account
+
+- Click on ACTIVATE to start setting up the billing account.
+
+![](./gcp_images/gcp_activatefreetrial.png "GCP activate free trial button")
+
+This is a 2-step process:
+
+- On the first page, you must agree to the Terms of Service, then click Continue.
+
+- On the second page, you'll create your payments profile. For this tutorial, we selected the individual account type (versus business account) and entered address and billing information. Be sure to click Start free trial.
+
+- A message will confirm that you set up the free trial - click Got it.
+
+!!! info "Free Credits"
+
+ For new users, Google offers a free 3-month $300 trial account. While you still have to enter a valid credit card to set up the billing account, you will not be charged during the trial period, nor will you be automatically charged when the trial ends, unless you turn on automatic billing.
+
+## Step 3: Check billing account information
+
+When you sign up for the GCP free trial, you should get an account confirmation email to the email address you used to sign in.
+
+- To check the billing account, click on the three horizontal lines to open the navigation menu and scroll down to Billing.
+
+![](./gcp_images/gcp_billingtab.png "GCP billing tab")
+
+- Select MANAGE BILLING ACCOUNTS
+
+![](./gcp_images/gcp_billingsetup.png "GCP billing setup")
+
+- Explore the **Billing** page
+
+![](./gcp_images/gcp_billingaccountmember.png "GCP billing account information")
+
+a) Check the box next to your billing account to show member information on right-hand panel.
+
+b) You can add members by clicking on ADD MEMBER.
+
+!!! note "Member Roles"
+
+ You can add members with roles such as "Billing Account User" if, for example, you want others to have access to your GCP billing account. This role is good for sharing your account with team/lab members or other platforms that use the Google cloud, such as the [Terra platform](https://app.terra.bio/).
+
+c) By default, as the owner of the billing account, you are designated the **Billing Account Administrator** role. [Read more about the different member and role options on the GCP](https://cloud.google.com/billing/docs/how-to/billing-access).
+
+Now that the billing account is set up, you can use GCP resources!
+
+!!! note "Optional: Rename billing account"
+
+ - Click on the current billing account name to navigate to the billing overview page:
+
+ ![](./gcp_images/gcp_billingoverview.png "GCP billing overview")
+
+ - Click Account management tab on the left panel and subsequently click on the pencil icon. Enter the new name in the pop up window:
+
+ ![](./gcp_images/gcp_billinrename.png "GCP billing account rename")
+
+## Centralized billing account
+
+Alternatively, a centralized billing account can be set up to share with a team using a G Suite organization and a Google Billing Account linked to the G Suite organization host's credit card. The Google documentation provides a [quick start guide](https://cloud.google.com/resource-manager/docs/quickstart-organizations) to set this up.
+
+In brief, if a hypothetical user, Jon, decided to start working with the GCP, he and his administrator, Janice, would use the following workflow:
+
+- Jon would create an Organization in G Suite and add Janice as a User to the organization.
+- As the administrator, Janice has access to the credit card information, so she must set up the organization's Google billing account. Janice would log in to the GCP console using her G Suite organization Google account to set up the billing account and link a credit card in the same way mentioned [above](#step-2-create-a-gcp-billing-account).
+- Janice must add Jon to the billing account as a member with the role "Billing Account Administrator" to give him the ability to edit and use the billing account.
+
+## Monitoring billing account
+
+Keep track of GCP service charges from the billing account section of the GCP console (e.g., invoice information can be found in the Transactions section). For more information and tutorials on monitoring expenses, see [GCP documentation](https://cloud.google.com/billing/docs).
+
+Tracking the exact and estimated costs on GCP can be challenging. See [this blog post by Lukas Karlsson](https://medium.com/@lukwam/reconcile-your-monthly-gcp-invoice-with-bigquery-billing-export-b36ae0c961e) for an explanation of monitoring computing costs. The [Google Cloud pricing calculator](https://cloud.google.com/products/calculator/#id=) is also a helpful tool to estimate the cost of GCP compute resources.
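For a rough sense of scale, a monthly estimate is just hours of uptime times the hourly rate. The rate below is an assumed placeholder, not an official price - look up the real figure for your machine type and region in the pricing calculator:

```shell
HOURS=720        # roughly one month of always-on uptime
RATE=0.008       # assumed example $/hour - check the pricing calculator
awk -v h="$HOURS" -v r="$RATE" 'BEGIN { printf "~$%.2f/month\n", h * r }'
```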
+
+Next, let's spin up a GCP virtual machine!
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp2.md b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp2.md
new file mode 100644
index 000000000..b3ad97b75
--- /dev/null
+++ b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp2.md
@@ -0,0 +1,241 @@
+# Setting up a GCP instance
+
+In this section, we'll create a project, configure a GCP virtual machine (VM) instance, and connect to the VM.
+
+## Step 1: Create a project
+
+- Click on the three horizontal lines to open the navigation menu
+- Scroll down to IAM & Admin
+- Select the Manage Resources page.
+
+![](./gcp_images/gcp_project1.png "Manage resources tab")
+
+!!! tip "Pin Tabs"
+
+ The navigation menu allows you to pin any tab, which will then appear at the top under Home. Hover over any navigation menu item and click on the pin icon next to it.
+
+- Click on CREATE PROJECT
+
+![](./gcp_images/gcp_project2.png "Create Project button")
+
+- Enter a unique project name ("My First Project" in the example below).
+- Click CREATE.
+
+![](./gcp_images/gcp_createproject.png "Create Project")
+
+!!! info
+
+ There is a quota on the number of projects allowed per billing account. For free trial accounts it is set to 12, but it can be increased by submitting a request. [Learn more about project quota](https://support.google.com/cloud/answer/6330231).
+
+ The "Location" entry can be left as "No organization". (If a centralized billing account is set up, there would be options to set an "Organization" and "Location" with the G Suite organization name.)
+
+- The new project is now listed in the table, along with an auto-generated project ID. Check the box next to the project name to select it.
+
+![](./gcp_images/gcp_projectid.png "Project ID")
+
+## Step 2: Configure custom VM
+
+- On the navigation menu, scroll down to Compute Engine and select VM instances. It may take a few minutes to load.
+
+![](./gcp_images/gcp_vm.png "VM instances")
+
+- The first time you create a VM, you'll need to click Enable billing. In the popup window select SET ACCOUNT.
+
+![](./gcp_images/gcp_billingenable.png "Enable billing")
+
+- Click Create on the VM instances window. The Compute Engine setup may take a few minutes before this option becomes clickable.
+
+![](./gcp_images/gcp_createvm.png "Create VM")
+
+!!! tip
+
+ If the create VM step is taking more than 5 minutes, try refreshing the web browser page.
+
+
+Follow along with the video and written steps below to set up instance configuration options:
+
+
+
+![](./gcp_images/gcp_vmconfig1.png "VM configuration name and region")
+
+### a. Name your VM
+
+- Type your VM name in the text box.
+- Names may contain only lowercase letters, numbers, and hyphens.
+- Use hyphens `-` instead of spaces.
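The naming rules above can be checked with a small shell function (a sketch - `valid_vm_name` is a hypothetical helper, and GCP additionally caps names at 63 characters):

```shell
# returns success (0) if the name starts with a lowercase letter and
# contains only lowercase letters, digits, and hyphens (no trailing hyphen)
valid_vm_name() {
  printf '%s\n' "$1" | grep -Eq '^[a-z]([a-z0-9-]*[a-z0-9])?$'
}

valid_vm_name "my-first-vm" && echo "my-first-vm: ok"
valid_vm_name "My VM" || echo "My VM: invalid"
```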
+
+### b. Choose a Region
+
+- Select a region from the dropdown menu. In this tutorial, we selected the "us-west1 (Oregon)" region.
+- Zone is auto-selected based on choice of region.
+
+!!! note "Machine regions"
+
+ It's often best to select the region closest to your physical geographic region, but region choice depends on other factors. For example, if you are downloading data that is stored at a data center in a different region from yours, it may be more expensive to move data between regions. In this case, you'd want to select the same region as the data center. Read more about [regions and zones](https://cloud.google.com/compute/docs/regions-zones?_ga=2.51208687.-79235869.1601574422).
+
+ This tool can help you choose the region closest to you (). You may need to refresh it several times.
+
+### c. Choose machine type and configuration
+
+![](./gcp_images/gcp_vmconfig2.png "VM configuration machine type")
+
+- For this tutorial, select Series **`E2`** and Machine type **`e2-micro`**.
+- This machine type is recommended for day-to-day low cost computing.
+- The estimated monthly cost for each machine type is shown on the top right side panel.
+- Depending on the tasks you will use the VM for, you may need to choose a machine with more CPUs and memory.
+
+!!! note "Machine types"
+
+ See [GCP documentation](https://cloud.google.com/compute/docs/machine-types) for more information about machine type and [recommendations](https://cloud.google.com/compute/docs/machine-types#recommendations_for_machine_types) based on workload.
+
+### d. Customize boot disk
+
+![](./gcp_images/gcp_vmconfig3.png "VM configuration boot disk")
+
+- Click on Change.
+- The default operating system is Debian, change it to **Ubuntu**.
+- Select version **Ubuntu 20.04 LTS**.
+- For this tutorial, we'll leave the [persistent disk storage](https://cloud.google.com/persistent-disk) at the default 10 GB. You can increase the storage amount based on task requirements.
+- Click Select.
+
+![](./gcp_images/gcp_bootdisk.png "Boot disk")
+
+!!! note "Persistent disk storage"
+
+ This type of block storage allows more flexibility for computing - for example, it can be resized or accessed even after a GCP VM instance is in use or deleted. Here's a quick GCP [youtube video](https://www.youtube.com/watch?v=zovhVfou-DI&vl=en) that highlights Google's persistent disk storage features.
+
+- Click Create to initiate the VM. This step may take a few seconds to complete.
+
+### e. VM states
+
+![](./gcp_images/gcp_vm_runoptions2.png "VM controls")
+
+- Successful VM creation will be indicated by a green check mark next to the VM name.
+- The icons on the VM instances bar allow for control of VM states (left to right):
+
+=== "Refresh"
+
+ Refresh the instance.
+
+=== "Reset"
+
+ "[Reset](https://cloud.google.com/compute/docs/instances/instance-life-cycle#resetting_an_instance) performs a hard reset on the instance, which wipes the memory contents of the machine and resets the virtual machine to its initial state."
+
+=== "Suspend"
+
+ [Suspending](https://cloud.google.com/compute/docs/instances/instance-life-cycle#suspending_an_instance) "the instance will preserve its running state, similar to closing a laptop. You'll be billed for storage of the suspended VM state and any persistent disks it uses."
+
+    Note that `E2` VMs currently do not support the "Suspend" operation. Use "Stop" instead.
+
+=== "Stop"
+
+ [Stopping](https://cloud.google.com/compute/docs/instances/instance-life-cycle#stopping_an_instance) the instance is similar to suspending it, but Google doesn't charge you for VM resources while it's stopped.
+
+=== "Start/Resume"
+
+ Start/resume a suspended or stopped instance to open it again.
+
+=== "Delete"
+
+    When you are completely finished working with the VM, it can be deleted. This removes the VM configuration and any work you did in the VM.
+
+!!! tip
+
+ If you need to pause during this tutorial and want to save your VM instance and any work you did in the instance (e.g., files downloaded), click on the Stop icon to stop the VM. You'll be able to Start/Resume to start from your last session.
+
+## Step 3: Connect to your VM
+
+![](./gcp_images/gcp_connect1.png "VM details")
+
+### a. Check box by VM
+
+### b. Open Google Cloud Shell
+
+- Click on the Activate Cloud Shell icon.
+- A new panel will open on the bottom half of your screen.
+- Agree to the Google Cloud terms of service and privacy policy (first time only).
+- After starting the shell, it may take a few minutes to connect.
+- The Google Cloud Shell command prompt will have the format: **`<username>@cloudshell:~ (<project-ID>)$`**.
+
+![](./gcp_images/gcp_shell1.png "Project ID in shell")
+
+!!! info "Google Cloud Shell"
+
+ The GCP console provides a free Google Cloud Shell. This shell environment is useful for small tasks that do **not** require a lot of CPU or memory (as most bioinformatic analyses do). For example, it is a good place to learn how to use the Google shell environment without incurring cost or to access Google Cloud services (e.g., a Google Storage bucket or GCP virtual machine). [Learn more about using the Google Cloud Shell](https://cloud.google.com/shell/docs/using-cloud-shell).
+
+### c. Connect to VM
+
+- Use the `gcloud compute` command to connect to your virtual machine (`gcloud` is a tool from the Google Cloud SDK toolkit). Follow along with the video and written steps below.
+
+
+
+- You will need your **project ID, zone, and instance name**.
+- In the example command below, the project ID is `fleet-space-303706`, the zone is `us-west1-b`, and the instance name is `test-vm`.
+- The `ssh` flag indicates we are accessing the VM with [Secure Shell](https://www.ssh.com/ssh/) protocol, which we'll set up below.
+
+!!! important
+
+ Remember to **replace these values** to run the command for your virtual machine!
+
+=== "Input"
+
+ Usage:
+ ```
+    gcloud compute --project "<project-ID>" ssh --zone <zone> <instance-name>
+ ```
+
+ Example:
+
+ ```
+ gcloud compute --project "fleet-space-303706" ssh --zone us-west1-b test-vm
+ ```
+
+If you get an error, you may need to configure `gcloud`:
+
+#### Configure Google Cloud SDK
+
+To connect to the GCP VM from the command line of the Google Cloud Shell or your local machine, you need to set up the [Google Cloud Software Development Kit (SDK)](https://cloud.google.com/sdk). The Cloud SDK provides a number of important tools, like `gcloud`, that are used for accessing GCP services.
+
+[Follow the instructions](https://cloud.google.com/sdk/docs/quickstart) to download the appropriate SDK based on your Operating System. Then set up authorization:
+
+- In the terminal/command line, enter:
+
+```
+gcloud auth login
+```
+
+![](./gcp_images/gcp_authorise_shell2.png "enter verification code")
+
+- Click on the Google link. A new browser tab will open.
+- Log in to the Google account you used to set up the GCP console.
+- Click Allow to allow Google Cloud SDK to access your Google account.
+- The next page will provide a verification code. Copy/paste the code back in the terminal next to "Enter verification code:".
+
+### d. Authorize Cloud Shell
+
+- The first time you open the shell to access a VM, you will need to authorize the cloud shell.
+- Click on Authorize:
+
+![](./gcp_images/gcp_authorise_shell.png "authorise cloud shell")
+
+### e. Set up SSH keys
+
+Next, set up SSH public/private keys (first time only). This step provides an extra layer of security to protect access to your instance.
+
+![](./gcp_images/gcp_shell2.png "Setup ssh")
+
+Follow the prompts in the terminal:
+
+- "Do you want to continue (Y/n)?": type ++y++
+- "Enter passphrase (empty for no passphrase)": you can create a passphrase or it can be left empty, type ++enter++ to move on
+- "Enter same passphrase again": type passphrase if you created one, type ++enter++ to move on
+- After the VM is added as a known host, you will be prompted to enter the passphrase if one was set: "Enter passphrase for key"
+- On successful login, your command prompt in the terminal should switch to **`<username>@<instance-name>:~$`**. You're now in the VM space!
+
+!!! tip
+
+ If the Google Cloud Shell times out, a popup banner will appear where you can click Reconnect.
+
+ ![](./gcp_images/gcp_reconnect.png "Reconnect instance")
+
+Continue to the [next lesson](./gcp3.md) to run an example analysis using the VM we just configured!
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp3.md b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp3.md
new file mode 100644
index 000000000..ac40f8970
--- /dev/null
+++ b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp3.md
@@ -0,0 +1,289 @@
+# Example 1: BLAST analysis
+
+You now have a custom configured VM! Let's test our new GCP virtual machine with a protein sequence BLAST search. This is a shortened version of the full [command-line BLAST tutorial](../Command-Line-BLAST/BLAST1.md), which ran the BLAST search on the Amazon AWS cloud platform. Check out the full tutorial for more details about the specific BLAST commands.
+
+Run the following steps from the Google Cloud Shell terminal.
+
+## Step 1: Install BLAST
+
+```
+sudo apt-get update && sudo apt-get -y install python ncbi-blast+
+```
+
+- Check version:
+
+=== "Input"
+
+ ```
+ blastp -version
+ ```
+
+=== "Expected Output"
+
+ ```
+ blastp: 2.9.0+
+ Package: blast 2.9.0, build Sep 30 2019 01:57:31
+ ```
+
+
+## Step 2: Set up file permissions and temporary directory
+
+`/mnt` is a temporary file system that is large enough to run data analysis in, but is deleted when the VM is shut down.
+
+```
+# a+rwxt: give all users read/write/execute permission and set the sticky bit,
+# so users can only delete or rename their own files in this shared directory
+sudo chmod a+rwxt /mnt
+cd /mnt
+```
+
+## Step 3: Download data
+
+- Use the `curl` command to download data files stored on an OSF repository:
+
+```
+curl -o mouse.1.protein.faa.gz -L https://osf.io/v6j9x/download
+curl -o zebrafish.1.protein.faa.gz -L https://osf.io/68mgf/download
+```
+
+- Type `ls -lht` to list the downloaded files:
+
+![](./gcp_images/gcp_blastfiles.png "List of uploaded files")
+
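In `ls -lht`, `-l` requests the long listing, `-h` prints human-readable file sizes, and `-t` sorts by modification time, newest first. A small self-contained sketch of the `-t` behavior (the file names here are just placeholders):

```shell
# -t lists the most recently modified files first
mkdir -p lsdemo
touch lsdemo/older.txt
sleep 1                    # ensure the two files get distinct timestamps
touch lsdemo/newer.txt
ls -t lsdemo | head -n 1   # prints newer.txt
```
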
+- Uncompress the files:
+
+```
+gunzip *.faa.gz
+```
+
+## Step 4: Blast search
+
+- Make a smaller data subset:
+
+```
+head -n 11 mouse.1.protein.faa > mm-first.faa
+```
+
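`head -n 11` simply keeps the first 11 lines of the FASTA file. FASTA records start with a `>` header line and a sequence can wrap across several lines, so a line-count cut like this can truncate the last record; counting `>` headers shows how many records a subset holds. A self-contained sketch with a mock FASTA (names and sequences invented):

```shell
# Mock FASTA: each record is a ">" header followed by sequence lines
printf '>prot1 mock protein\nMKTAYIAKQR\nQISFVKSHFS\n>prot2 mock protein\nMSLNFLDFEQ\n' > mock.faa

# Keep the first 3 lines, as the tutorial does with 11
head -n 3 mock.faa > mock-first.faa

# Count the records in the subset by counting header lines (prints 1)
grep -c '^>' mock-first.faa
```
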
+- Make a database to search query sequences against:
+
+```
+makeblastdb -in zebrafish.1.protein.faa -dbtype prot
+```
+
+- Run a protein `blastp` search and save the output:
+
+```
+blastp -query mm-first.faa -db zebrafish.1.protein.faa -out mm-first.x.zebrafish.txt -outfmt 6
+```
+
+- Look at output results:
+
+=== "Input"
+ ```
+ less mm-first.x.zebrafish.txt
+ ```
+
+=== "Expected Output"
+
+ Output file looks like:
+ ```
+ YP_220550.1 NP_059331.1 69.010 313 97 0 4 316 10 322 1.24e-150 426
+ YP_220551.1 NP_059332.1 44.509 346 188 3 1 344 1 344 8.62e-92 279
+ YP_220551.1 NP_059341.1 24.540 163 112 3 112 263 231 393 5.15e-06 49.7
+ YP_220551.1 NP_059340.1 26.804 97 65 2 98 188 200 296 0.10 35.8
+ ```
+
+ Type ++q++ to go back to the terminal
+
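The `-outfmt 6` table has 12 tab-separated columns: query ID, subject ID, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, and bit score. Column 11 (the e-value) is a common filter; here is a small sketch using two rows modeled on the output above:

```shell
# Two rows of BLAST -outfmt 6 output (tab-separated), modeled on the results above
printf 'YP_220550.1\tNP_059331.1\t69.010\t313\t97\t0\t4\t316\t10\t322\t1.24e-150\t426\n'  > hits.txt
printf 'YP_220551.1\tNP_059340.1\t26.804\t97\t65\t2\t98\t188\t200\t296\t0.10\t35.8\n'    >> hits.txt

# Keep only hits with an e-value (column 11) below 1e-5; prints the first row only
awk -F'\t' '$11 < 1e-5' hits.txt
```
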
+## Step 5: Configure Google toolkit
+
+To save output files or download local files from your computer to the VM environment, we will use the Google Cloud Software Development Kit (SDK), which provides tools to securely access the GCP platform. The GCP VM already has the toolkit installed, but it needs to be configured and linked to your Google account. We will be using the `gcloud` and `gsutil` tools in this tutorial. More information about this process is available in the GCP support documentation on [gcloud configuration and authorization steps](https://cloud.google.com/sdk/docs/initializing).
+
+In the VM terminal, enter:
+```
+gcloud init --console-only
+```
+
+- "Choose the account you would like to use to perform operations for
+this configuration": enter ++2++ to "Log in with a new account"
+- "Do you want to continue (Y/n)?": enter ++y++
+- Click on link under "Go to the following link in your browser". A new browser tab will open. Log in with the same Google account used to sign in to the GCP console.
+- After sign in, a new page will open with a verification code. Copy this code and paste into the terminal after "Enter verification code:"
+- "Pick cloud project to use": enter the number that corresponds to the option that lists your project ID
+- "Do you want to configure a default Compute Region and Zone? (Y/n)?": enter ++n++
+
+The `gcloud` configuration should now be complete. Run the following command to check:
+
+=== "Input"
+
+ ```
+ gcloud config configurations list
+ ```
+
+=== "Expected Output"
+
+ The output should list your GCP Google account email and project ID.
+
+ ```
+ NAME IS_ACTIVE ACCOUNT PROJECT COMPUTE_DEFAULT_ZONE COMPUTE_DEFAULT_REGION
+ default True
+ ```
+
+## Step 6: Create Google Storage bucket
+
+GCP uses Google Storage buckets as file repositories. We can copy files from the VM to a bucket so they can be downloaded to a local computer, or we can upload local files to a bucket and then copy them into the VM environment. We will practice both use cases in this tutorial.
+
+Buckets can be managed from the graphical user interface (GUI) section of the GCP console or on the command line. We will create a bucket and move files on command line and then check them on the GUI. To ensure secure transfer of data to/from the cloud, we will use the `gsutil` tool.
+
+- Make a bucket by using the `gsutil mb` command. Google bucket paths always begin with "gs://". You must enter a unique name for your bucket (do not use spaces in the name). Follow along with the video and written steps below.
+
+
+
+=== "Input"
+
+ Usage:
+ ```
+    gsutil mb gs://<bucket-name>
+ ```
+
+ Example:
+ ```
+ gsutil mb gs://forblast
+ ```
+
+=== "Expected Output"
+
+ ```
+ Creating gs://forblast/...
+ ```
+
+- Bucket names, particularly auto-generated ones, can get very long and complicated. To make the name easier to type, we can create an alias for the bucket.
+
+=== "Input"
+
+ Usage:
+ ```
+    export BUCKET="gs://<bucket-name>"
+ ```
+
+ Example:
+ ```
+ export BUCKET="gs://forblast"
+ ```
+
+=== "Expected Output"
+
+ There should be no output if alias assignment is successful. To check that the alias works, enter:
+
+ ```
+ echo $BUCKET
+ ```
+
+ The output for our example is:
+
+ ```
+ gs://forblast
+ ```
+
+## Step 7: Copy file to bucket
+
+- We will copy the blast output file to the bucket so it can be downloaded to your computer.
+
+=== "Input"
+
+ ```
+ gsutil cp mm-first.x.zebrafish.txt $BUCKET
+ ```
+
+=== "Expected Output"
+
+ ```
+ Copying file://mm-first.x.zebrafish.txt [Content-Type=text/plain]...
+ / [1 files][ 268.0 B/ 268.0 B]
+ Operation completed over 1 objects/268.0 B.
+ ```
+
+- Now go to the navigation menu, scroll down to Storage, and click Browser. You should see your bucket listed ("forblast" in this example). Click on the bucket name.
+
+![](./gcp_images/gcp_storage1.png "Google Storage tab")
+
+- You should see the file we copied over. Check the box in a file's row to select it. Use the Download button above the table to download multiple selected files, or the download arrow icon at the end of a file's row to download an individual file. You can also select files and Delete them from the storage bucket.
+
+![](./gcp_images/gcp_storage2.png "Download file")
+
+## Step 8: Upload files
+
+Finally, you can upload files to the bucket to use in the VM. Follow along with the video and written steps below.
+
+
+
+Click on Upload Files, choose file(s) to upload from your computer, and click Open, or drag and drop a file (shown in the video). You should now see them in the bucket file list. To demonstrate, right-click or ++ctrl++ click on this link - ["testfile.txt"](./testfile.txt) - and save the text file to your computer. Then upload it to the bucket:
+
+![](./gcp_images/gcp_storage3.png "Upload files")
+
+Back in the VM terminal, use the `gsutil cp` command again to copy the file to the VM home directory. Replace the values below for file/folder name and the location to save them.
+
+=== "Input"
+
+ Usage:
+ ```
+ # for 1 file
+    gsutil cp $BUCKET/<file-name> <location-to-save>
+
+ # for multiple files in a folder
+ # the -r is a flag that tells gsutil to recursively copy all specified files
+ # the * is a wildcard pattern that tells gsutil to copy all files in the folder
+    gsutil cp -r $BUCKET/<folder-name>/* <location-to-save>
+ ```
+
+ Example:
+
+ - We copy the "testfile.txt" file from the Google bucket to the current directory location we're in at the terminal, which is represented by "./".
+ ```
+ gsutil cp $BUCKET/testfile.txt ./
+ ```
+
+=== "Expected Output"
+
+ ```
+ Copying gs://forblast/testfile.txt...
+ / [1 files][ 4.0 B/ 4.0 B]
+ Operation completed over 1 objects/4.0 B.
+ ```
+
+    Our VM terminal now has the test file! Check by typing `ls -lht`:
+
+ ```
+ total 134M
+ -rw-rw-r-- 1 840 Nov 24 22:08 mm-first.faa
+ -rw-rw-r-- 1 268 Nov 24 22:12 mm-first.x.zebrafish.txt
+ -rw-rw-r-- 1 49M Nov 24 22:05 mouse.1.protein.faa
+ -rw-rw-r-- 1 4 Nov 24 23:17 testfile.txt
+ -rw-rw-r-- 1 41M Nov 24 22:05 zebrafish.1.protein.faa
+ -rw-rw-r-- 1 6.8M Nov 24 22:08 zebrafish.1.protein.faa.phr
+ -rw-rw-r-- 1 415K Nov 24 22:08 zebrafish.1.protein.faa.pin
+ -rw-rw-r-- 1 37M Nov 24 22:08 zebrafish.1.protein.faa.psq
+ ```
+
+## Step 9: Exit VM
+
+To exit the VM:
+
+- type "exit" to log out
+- type "exit" to close the VM connection
+
+This brings you back to the Google Cloud Shell terminal. Type "exit" one more time to completely close the shell panel.
+
+!!! tip
+
+    Closing the VM connection does not stop the instance!
+
+## Step 10: Stop or delete the instance
+
+When you're finished using the virtual machine, be sure to stop or delete it, otherwise it will continue to incur costs.
+
+There are two options (click on the three vertical dots):
+
+- You can "Stop" the instance. This will pause the instance, so it's not running, but it will still incur storage costs. This is a good option if you want to come back to this instance (click "Start/Resume") without having to reconfigure and download files every time.
+
+- If you're completely done with the instance, you can "Delete" it. This will delete all files though, so [download](#step-7-copy-file-to-bucket) any files you want to keep!
+
+![](./gcp_images/gcp_vmstop.png "Stop or delete VM")
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp4.md b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp4.md
new file mode 100644
index 000000000..277de6ed4
--- /dev/null
+++ b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp4.md
@@ -0,0 +1,171 @@
+# Example 2: Download SRA data
+
+In this example, we'll configure a new VM and learn how to download fastq files from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database.
+
+## Step 1: Set up VM
+
+We need a new VM for this example; you can use the same project. Follow the steps from the [previous section](./gcp2.md), with these modifications:
+
+- choose a Region that begins with "us-" because the NCBI SRA data is located in the United States (any is fine, e.g., `us-west1 (Oregon)`)
+- select an **`e2-medium`** instance. We need a machine with a bit more memory than the `e2-micro` we used in the previous example.
+
+[Connect](./gcp2.md) to the VM with the Google Cloud Shell (authorize the shell and set up SSH keys if necessary).
+
+## Step 2: Install conda for Linux
+
+The VM we set up runs the Ubuntu operating system. We will use conda to install the SRA toolkit for Ubuntu.
+
+In the cloud shell, enter the following to download and install Miniconda for Linux:
+
+```
+curl -LO https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
+bash Miniconda3-latest-Linux-x86_64.sh
+```
+
+Follow the prompts to complete the conda setup - answer `yes` to all the questions!
+
+!!! note
+
+    The [SRA GitHub installation instructions](https://github.com/ncbi/sra-tools/wiki/02.-Installing-SRA-Toolkit) download and install the toolkit with a different approach. However, that approach requires interactive configuration steps, and as of February 2021 there is an error with data downloads, so we show the conda installation method instead.
+
+## Step 3: Install SRA toolkit
+
+We will create a conda environment and install SRA toolkit version 2.10.9 in it; the conda channels and toolkit version are defined in a yaml file.
+
+Make the "environment.yml" file:
+
+```
+nano environment.yml
+```
+
+Copy and paste the text below into the nano text editor:
+
+```
+channels:
+ - conda-forge
+ - bioconda
+ - defaults
+dependencies:
+ - sra-tools=2.10.9
+```
+
+Save with ++ctrl++ ++o++ and exit the editor with ++ctrl++ ++x++.
+
+Create the conda environment:
+
+```
+conda env create -n sratest -f environment.yml
+```
+
+Activate the environment:
+
+```
+conda activate sratest
+```
+
+Let's check that the installation worked. The command `fasterq-dump` (a faster version of `fastq-dump`) is used to download sequence data for NCBI accessions.
+
+=== "Input"
+
+ Take a look at the help documentation for a list of the options associated with this command:
+
+ ```
+ fasterq-dump -h
+ ```
+
+=== "Expected Output"
+
+ The top of the help documentation:
+
+ ```
+ Usage: fasterq-dump [ options ] [ accessions(s)... ]
+ Parameters:
+ accessions(s) list of accessions to process
+ Options:
+ -o|--outfile full path of outputfile (overrides usage
+ of current directory and given accession)
+ ...
+ ```
+
+## Step 4: Download fastq files
+
+Let's download fastq data files from an [*E. coli* sample](https://www.ncbi.nlm.nih.gov/sra/SRR5368359). We need the "SRR" ID:
+
+![](./gcp_images/sra_example_sample.png "NCBI SRR sample page")
+
+Download the file using the `fasterq-dump` command and specify the output (`-O`) directory as `./`, which saves outputs in the current directory:
+
+=== "Input"
+
+ ```
+ fasterq-dump SRR5368359 -O ./
+ ```
+
+=== "Expected Output"
+
+ When the command completes, the output in the shell should look like this:
+ ```
+ spots read : 2,116,469
+ reads read : 4,232,938
+ reads written : 4,232,938
+ ```
+
+There should be two fastq files in our directory that can be used for analysis!
+
+=== "Input"
+
+ ```
+ ls -lh
+ ```
+
+=== "Expected Output"
+
+ ```
+ total 1.5G
+ -rw-rw-r-- 1 767M Jan 5 02:40 SRR5368359_1.fastq
+ -rw-rw-r-- 1 767M Jan 5 02:40 SRR5368359_2.fastq
+ ```
+
+Take a look at the file!
+
+=== "Input"
+
+ ```
+ head -n 4 SRR5368359_1.fastq
+ ```
+
+=== "Expected Output"
+
+ ```
+ @SRR5368359.1 1 length=151
+ CTATATTGGTTAAAGTATTTAGTGACCTAAGTCAATAAAATTTTAATTTACTCACGGCAGGTAACCAGTTCAGAAGCTGCTATCAGACACTCTTTTTTTAATCCACACAGAGACATATTGCCCGTTGCAGT
+ CAGAATGAAAAGCTGAAAAA
+ +SRR5368359.1 1 length=151
+ C@@FFEFFHHHHHJJGIIIIIJIJJJJJJJJJIJJJJJJGJJJJJJJJJJJJJJJJIIIJI=FHGIHIEHIJJHHGHHFFFFFDEEEDEDDDDCDDDDBDDCCCDDDDDDDDDDDDC@CCCDDD>ADDCDD
+ DDCDDDDDDDDDDDDD@CDB
+ ```
+
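As the `head -n 4` output shows, a FASTQ record is always four lines: an `@` header, the sequence, a `+` separator, and a quality string. The number of reads in a file is therefore its line count divided by four, and the `_1` and `_2` files of a paired-end run should report the same number. A self-contained sketch with a tiny mock file (reads invented):

```shell
# Mock FASTQ: each record is 4 lines (@header, sequence, +, quality)
printf '@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nTTGCAATC\n+\nIIIIIIII\n' > mock_1.fastq

# Number of reads = line count / 4 (prints 2)
echo $(( $(wc -l < mock_1.fastq) / 4 ))
```
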
+## Step 5: Exit VM
+
+To exit the VM:
+
+- type "exit" to log out
+- type "exit" to close the VM connection
+
+This brings you back to the Google Cloud Shell terminal. Type "exit" one more time to completely close the shell panel.
+
+!!! tip
+
+    Closing the VM connection does not stop the instance!
+
+## Step 6: Stop or delete the instance
+
+When you're finished using the virtual machine, be sure to stop or delete it, otherwise it will continue to incur costs.
+
+There are two options (click on the three vertical dots):
+
+- You can "Stop" the instance. This will pause the instance, so it's not running, but it will still incur storage costs. This is a good option if you want to come back to this instance (click "Start/Resume") without having to reconfigure and download files every time.
+
+- If you're completely done with the instance, you can "Delete" it. This will delete all files though, so [download](./gcp3.md#step-7-copy-file-to-bucket) any files you want to keep!
+
+![](./gcp_images/gcp_vmstop.png "Stop or delete VM")
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_activatefreetrial.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_activatefreetrial.png
new file mode 100644
index 000000000..2cdc6427b
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_activatefreetrial.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_authorise_shell.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_authorise_shell.png
new file mode 100644
index 000000000..e35a712d0
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_authorise_shell.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_authorise_shell2.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_authorise_shell2.png
new file mode 100644
index 000000000..308ccc829
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_authorise_shell2.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingaccountmember.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingaccountmember.png
new file mode 100644
index 000000000..766050854
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingaccountmember.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingenable.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingenable.png
new file mode 100644
index 000000000..6bafd93f3
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingenable.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingoverview.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingoverview.png
new file mode 100644
index 000000000..fd9d965be
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingoverview.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingsetup.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingsetup.png
new file mode 100644
index 000000000..12c2ba1ee
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingsetup.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingtab.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingtab.png
new file mode 100644
index 000000000..337628f47
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billingtab.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billinrename.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billinrename.png
new file mode 100644
index 000000000..8b65d2e89
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_billinrename.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_blastfiles.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_blastfiles.png
new file mode 100644
index 000000000..00a2029bd
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_blastfiles.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_bootdisk.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_bootdisk.png
new file mode 100644
index 000000000..0ec8fd6dd
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_bootdisk.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_connect1.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_connect1.png
new file mode 100644
index 000000000..527fc26ba
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_connect1.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_console.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_console.png
new file mode 100644
index 000000000..299affa92
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_console.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_createproject.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_createproject.png
new file mode 100644
index 000000000..e1a65ab97
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_createproject.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_createvm.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_createvm.png
new file mode 100644
index 000000000..91a677109
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_createvm.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_gcshell.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_gcshell.png
new file mode 100644
index 000000000..b327a5afc
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_gcshell.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_login.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_login.png
new file mode 100644
index 000000000..9cfe20423
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_login.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_project1.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_project1.png
new file mode 100644
index 000000000..cc5918e2c
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_project1.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_project2.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_project2.png
new file mode 100644
index 000000000..a4a4f5a88
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_project2.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_projectid.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_projectid.png
new file mode 100644
index 000000000..8a5b9033a
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_projectid.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_reconnect.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_reconnect.png
new file mode 100644
index 000000000..4195b15cc
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_reconnect.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_shell1.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_shell1.png
new file mode 100644
index 000000000..8aec90429
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_shell1.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_shell2.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_shell2.png
new file mode 100644
index 000000000..b11480049
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_shell2.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_storage1.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_storage1.png
new file mode 100644
index 000000000..f1ee55e01
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_storage1.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_storage2.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_storage2.png
new file mode 100644
index 000000000..0401e0ebe
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_storage2.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_storage3.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_storage3.png
new file mode 100644
index 000000000..79f6f79e8
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_storage3.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vm.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vm.png
new file mode 100644
index 000000000..7c6ff0215
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vm.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmGCS.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmGCS.png
new file mode 100644
index 000000000..2f55a382a
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmGCS.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vm_runoptions.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vm_runoptions.png
new file mode 100644
index 000000000..8ad662986
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vm_runoptions.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vm_runoptions2.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vm_runoptions2.png
new file mode 100644
index 000000000..f9558fd31
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vm_runoptions2.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig.png
new file mode 100644
index 000000000..b8a585882
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig1.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig1.png
new file mode 100644
index 000000000..a7059a0e4
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig1.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig2.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig2.png
new file mode 100644
index 000000000..46284a086
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig2.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig3.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig3.png
new file mode 100644
index 000000000..eb0ef4528
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig3.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig4.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig4.png
new file mode 100644
index 000000000..906fc7170
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmconfig4.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmssh.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmssh.png
new file mode 100644
index 000000000..6001518af
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmssh.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmstop.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmstop.png
new file mode 100644
index 000000000..2438c7f61
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmstop.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmterminal.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmterminal.png
new file mode 100644
index 000000000..2dedaa9b1
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/gcp_vmterminal.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_config1.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_config1.png
new file mode 100644
index 000000000..0e6c00557
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_config1.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_config2.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_config2.png
new file mode 100644
index 000000000..6aab6f7b0
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_config2.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_config3.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_config3.png
new file mode 100644
index 000000000..04293216e
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_config3.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_config4.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_config4.png
new file mode 100644
index 000000000..7a93bf8d2
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_config4.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_example_sample.png b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_example_sample.png
new file mode 100644
index 000000000..d37798c87
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction-to-GCP/gcp_images/sra_example_sample.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/index.md b/docs/Bioinformatics-Skills/Introduction-to-GCP/index.md
new file mode 100644
index 000000000..2bb384775
--- /dev/null
+++ b/docs/Bioinformatics-Skills/Introduction-to-GCP/index.md
@@ -0,0 +1,40 @@
+---
+layout: page
+title: GCP Overview
+hide:
+ - toc
+---
+
+An Introduction to the Google Cloud Platform
+============================================
+
+The Google Cloud Platform (GCP) provides a number of services such as cloud-based computation and storage. All services are available through the platform's console page, which also monitors account billing and user permissions/roles. GCP cloud computing resources are useful for conducting large-scale genomic analyses that would otherwise take too long or crash local computers.
+
+In this tutorial, we'll set up a GCP billing account and demonstrate how to use two GCP services: Google Compute Engine and Google Storage buckets.
+
+Est. Time | Lesson name | Description
+--- | --- | ---
+15 mins | [Setting up a GCP account](./gcp1.md) | How to set up a GCP billing account?
+30 mins | [Setting up a GCP instance](./gcp2.md) | How to start a GCP instance?
+30 mins | [Example 1: BLAST analysis](./gcp3.md) | How to do a BLAST search on GCP?
+30 mins | [Example 2: Download SRA data](./gcp4.md) | How to download SRA data on GCP?
+
+!!! note "Learning Objectives"
+
+ - introduce researchers to cloud computing resources and how to use them
+
+ - learn how to connect to an instance (virtual server) on the Google Cloud Platform to do a simple protein BLAST search and download SRA data.
+
+ - learn how to transfer files to or from the instance using Google Storage buckets
+
+ - learn how to terminate an instance
+
+=== "Est. Cost"
+
+ < $1.00 to run both examples
+
+=== "Prerequisites"
+
+ - Technology: Users must be comfortable with using a terminal window (GCP shell provided in the interface). Please use the **Chrome web browser** for setting up and connecting to a GCP virtual machine. Some GCP features do not work on other web browsers.
+
+    - Financial: Users need a valid credit card to set up a GCP billing account. However, new users are eligible for a free 3-month, $300 trial, which can be activated during billing account setup.
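+
+As a preview of the bucket lessons, transferring files between a VM and a Google Storage bucket comes down to two `gsutil` commands. This is a minimal sketch, and the bucket name `my-bucket` is a placeholder:
+
+```
+# copy a local file from the VM up to a Google Storage bucket
+gsutil cp testfile.txt gs://my-bucket/
+
+# copy it back from the bucket to the current directory
+gsutil cp gs://my-bucket/testfile.txt .
+```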
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/sratoolkit_config_steps.md b/docs/Bioinformatics-Skills/Introduction-to-GCP/sratoolkit_config_steps.md
new file mode 100644
index 000000000..37d59fcd8
--- /dev/null
+++ b/docs/Bioinformatics-Skills/Introduction-to-GCP/sratoolkit_config_steps.md
@@ -0,0 +1,73 @@
+# Extra configuration steps
+
+This page is a holding space for configuration steps that were needed at some point but are no longer required in the setup steps or examples.
+
+1. The BLAST example used to require a VM with firewall settings enabled, but no longer appears to. The SRA toolkit does not use firewall settings, so we removed the firewall configuration to streamline the VM setup steps.
+2. Downloading the SRA toolkit with `wget` requires manual configuration, and that configuration step has been broken since February 2021. We switched to a conda installation, which does not require configuration.
+
+## 1. Firewall config
+### e. Firewall setting
+
+![](./gcp_images/gcp_vmconfig4.png "VM configure firewall")
+
+- Check the box by **Allow HTTP traffic** which opens port 80 (HTTP) and allows you to access the virtual machine.
+
+!!! important
+
+ If the Firewall settings are not enabled, you may get an error message when trying to connect to the VM:
+ `Insufficient Permission: Request had insufficient authentication scopes`
+
+
+## 2. Configure SRA toolkit
+
+There are three configuration steps. The configuration instructions are detailed on the [SRA GitHub page](https://github.com/ncbi/sra-tools/wiki/03.-Quick-Toolkit-Configuration).
+
+For the second step, we need an empty directory to store cached files. This command makes a directory called "test":
+
+```
+mkdir test
+```
+
+Now, enter:
+
+```
+vdb-config -i
+```
+
+A new panel will open.
+
+### Enable remote access
+
+This setting tells sra-tools to look for data from remote servers at NCBI, Amazon Web Services (AWS), or GCP.
+
+The first configuration step is to ensure there is an ++x++ in the brackets next to "Enable Remote Access":
+
+![](./gcp_images/sra_config1.png "SRA configure remote access")
+
+### Configure cache
+
+This configuration sets up a persistent cache for downloaded files, so they do not need to be accessed remotely multiple times.
+
+- Type ++c++ to open the "Cache" tab.
+- An ++x++ should appear next to “enable local file-caching”. If not, type ++i++ to select it.
+- Type ++o++ to choose the “location of user-repository”. A green and yellow panel will open up.
+- Use the ++down++ arrow key on your keyboard or your mouse to navigate to the empty "test" directory we made in the previous step, then type ++enter++.
+- When your directory is selected, click the “OK” button (a red bar will appear next to "OK") and type ++enter++.
+- Type ++y++ to change the location to the "test" directory.
+
+![](./gcp_images/sra_config2.png "SRA configure cache")
+
+Check that the correct directory is printed under "location of user-repository":
+
+![](./gcp_images/sra_config3.png "SRA configure cache complete")
+
+### Report cloud instance identity
+
+This setting tells sra-tools that we are using a GCP instance. Using this setting also improves file download speed.
+
+- Type ++g++ to open the GCP tab.
+- Type ++r++ to select “report cloud instance identity”.
+
+![](./gcp_images/sra_config4.png "SRA configure cloud instance")
+
+Type ++s++ to save the settings and ++o++ to select OK. Then type ++x++ to exit the configuration page and return to the cloud shell. Configuration is complete!
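+
+If you want to confirm the settings were saved, the toolkit writes its configuration to a settings file. The path below is where sra-tools conventionally stores it, though it may differ on your system:
+
+```
+# print the cached-file location saved by vdb-config
+grep "root" ~/.ncbi/user-settings.mkfg
+```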
diff --git a/docs/Bioinformatics-Skills/Introduction-to-GCP/testfile.txt b/docs/Bioinformatics-Skills/Introduction-to-GCP/testfile.txt
new file mode 100644
index 000000000..7739993cf
--- /dev/null
+++ b/docs/Bioinformatics-Skills/Introduction-to-GCP/testfile.txt
@@ -0,0 +1 @@
+I am a test file!
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/.pages b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/.pages
new file mode 100644
index 000000000..ea4865665
--- /dev/null
+++ b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/.pages
@@ -0,0 +1,7 @@
+nav:
+ - introtoaws1.md
+ - introtoaws2.md
+ - introtoaws3.md
+ - Connect to Instance: introtoaws4.md
+ - Screen Command: introtoaws5_Screen.md
+ - Terminate Instance: introtoaws5.md
\ No newline at end of file
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/Terminate.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/Terminate.png
index d2ed7758d..f4a0283fb 100644
Binary files a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/Terminate.png and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/Terminate.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_12.PNG b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_12.PNG
deleted file mode 100644
index c33c05cb2..000000000
Binary files a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_12.PNG and /dev/null differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_12.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_12.png
new file mode 100644
index 000000000..bce9d5bcc
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_12.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_2.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_2.png
index 22623d9f5..c8c7d65d9 100644
Binary files a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_2.png and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_2.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_3.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_3.png
index cd9905069..5d74b7ec5 100644
Binary files a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_3.png and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_3.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_4.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_4.png
index 6e3f84bc6..d0f12aa61 100644
Binary files a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_4.png and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_4.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_6.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_6.png
index a36b91b7e..49e26efa5 100644
Binary files a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_6.png and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_6.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_7.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_7.png
index d90a5863a..37f7b14d6 100644
Binary files a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_7.png and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_7.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_9.PNG b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_9.PNG
deleted file mode 100644
index 895628b09..000000000
Binary files a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_9.PNG and /dev/null differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_9.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_9.png
new file mode 100644
index 000000000..d04ce6fb1
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_9.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_launch.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_launch.png
index 008ec7f5e..4e5844b81 100644
Binary files a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_launch.png and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/aws_launch.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/connect_1.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/connect_1.png
new file mode 100644
index 000000000..bdce8c090
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/connect_1.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/connect_2.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/connect_2.png
new file mode 100644
index 000000000..a17186675
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/connect_2.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/connect_3.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/connect_3.png
new file mode 100644
index 000000000..9ab22bd74
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/connect_3.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/connect_4.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/connect_4.png
new file mode 100644
index 000000000..259f4869e
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/connect_4.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_1.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_1.png
index 4bec89482..466a35259 100644
Binary files a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_1.png and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_1.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_2.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_2.png
index 2300da290..228d71f6d 100644
Binary files a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_2.png and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_2.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_3.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_3.png
index e62e84082..b2960b2b4 100644
Binary files a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_3.png and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_3.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_3_2.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_3_2.png
index da01276e0..35b48708c 100644
Binary files a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_3_2.png and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/mobaxterm_3_2.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/original_ssh_terminal.png b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/original_ssh_terminal.png
new file mode 100644
index 000000000..0df822718
Binary files /dev/null and b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/images-aws/original_ssh_terminal.png differ
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws1.md b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws1.md
index 6c97a670b..3dba534e2 100644
--- a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws1.md
+++ b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws1.md
@@ -1,20 +1,29 @@
---
layout: page
+title: AWS Overview
+hide:
+ - toc
---
-# Introduction to Amazon Web Services
+Introduction to Amazon Web Services
+====================================
+
Amazon Web Services (AWS) is a subsidiary of Amazon that provides on-demand cloud computing platforms and APIs to individuals, companies and governments, on a metered pay-as-you-go basis. Subscribers can pay for a single virtual AWS computer, a dedicated physical computer, or clusters of either. AWS cloud computing resources are useful for conducting large-scale genomic analyses that would otherwise take too long or crash local computers.
Est. time | Lesson name | Description
--- | --- | ---
30 mins | [Setting up an AWS instance](./introtoaws3.md) | How to start and configure an AWS instance?
-10 mins | [Connect to an instance](./introtoaws4.md) | How to begin working on your AWS instance?
+15 mins | [Connecting to an instance](./introtoaws4.md) | How to begin working on your AWS instance?
+20 mins | [Using the Screen Command](./introtoaws5_Screen.md) | How to run multiple screens and switch between tabs?
+10 mins | [Terminating an instance](./introtoaws5.md) | How to stop or terminate your AWS instance?
!!! note "Learning Objectives"
- - introduce researchers to cloud computing resources and how to use them
-
+ - introduce researchers to cloud computing resources
+ - learn to set up a cloud computer
+ - learn to connect to the cloud computer
+ - learn to run multiple screen sessions and switch between them
- learn how to terminate an instance
=== "Prerequisites"
@@ -25,7 +34,8 @@ Est. time | Lesson name | Description
- Time: You need to have an Amazon Web Services account. AWS account setup needs approval by AWS, and approval times can range from minutes to days.
-
=== "Tutorial Resources"
- [Vidlet: Setting up an AWS instance](./introtoaws2.md)
\ No newline at end of file
+ - [Vidlet: Setting up an AWS instance](./introtoaws2.md)
+
+ - [Screen cheat sheet](../../Cheat-Sheets/screen_cheatsheet.md)
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws2.md b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws2.md
index 2ff0eab2a..bdc9e8883 100644
--- a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws2.md
+++ b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws2.md
@@ -1,24 +1,12 @@
# Setting up an AWS instance video walk-through
-=== "Windows"
- You may choose to follow the walkthrough tutorial video first or proceed to follow the [step-by-step tutorial](./introtoaws3.md). This tutorial includes directions for connecting via __Windows__.
-
-
-=== "Mac Os"
- You may choose to follow the walkthrough tutorial video first or proceed to follow the [step-by-step tutorial](./introtoaws3.md). This tutorial includes directions for connecting via __MAC OS__.
-
-
-
-
-
-
-
-
-
-
-
-
+=== "Windows :fontawesome-brands-windows:"
+ You may choose to follow the walkthrough tutorial video first or proceed to follow the [step-by-step tutorial](./introtoaws3.md). This tutorial includes directions for connecting via **Windows**.
+
+=== "macOS :fontawesome-brands-apple:"
+ You may choose to follow the walkthrough tutorial video first or proceed to follow the [step-by-step tutorial](./introtoaws3.md). This tutorial includes directions for connecting via **macOS**.
+
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws3.md b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws3.md
index 5cfa019f5..38e14bbfc 100644
--- a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws3.md
+++ b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws3.md
@@ -1,26 +1,31 @@
-# Setting up an AWS instance
+---
+layout: page
+title: Setting up an AWS instance
+---
-To start, we will learn how to set-up Ubuntu 20.04 Pro LTS open source software operating system. Ubuntu 20.04 Pro LTS is one of the programs offered in the Amazon Free Tier as well as one of the most popular open source operating systems.
+In this lesson, we will learn to set up an instance with Ubuntu, an open source operating system that is part of the Amazon Free Tier program.
Follow along with these steps and/or watch our [walk-through tutorial](./introtoaws2.md) to get started!
-### Step 1: Log in to an AWS account
+## Step 1: Log in to an AWS account
-Go to [Amazon Web Services](https://aws.amazon.com) in a web browser. Select the "My Account" menu option "AWS Management Console". Log in with your username & password.
+* Go to [Amazon Web Services](https://aws.amazon.com)
+* Click on My Account
+* Select AWS Management Console from the drop down menu.
+* Alternatively, click on Sign In to the Console.
+* Log in with your username & password as a **Root user**.
![AWS Management Console](./images-aws/aws_1.png "AWS my account button")
-!!! Note
+!!! info "Account Setup"
- If you need to create an account, please follow the [AWS instructions for creating an account](https://aws.amazon.com/premiumsupport/knowledge-center/create-and-activate-aws-account/).
+ If you need to create an account, please follow the [AWS instructions for creating an account](https://aws.amazon.com/premiumsupport/knowledge-center/create-and-activate-aws-account/). You will need a credit card to set up the account. New accounts could take up to 24 hours to be activated.
-!!! Warning
- If you are creating a new account, it could take up to 24 hours to be activated. You'll need a credit card to set up the account.
+## Step 2: Select region
-### Step 2: Choose virtual machine
-
-For this tutorial, select the AWS region that is closest to your current geographic location. The AWS region of your remote machine is displayed on the top right of this page. Click on it and choose the location that best describes the region you are currently located.
+* Select the AWS region closest to your current geographic location. The current region is displayed in the top right corner.
+* Click on it and choose the region where you are currently located. In this tutorial, we have selected **N. California**.
![AWS Dashboard](./images-aws/aws_2.png "AWS amazon machine selection")
@@ -28,81 +33,125 @@ For this tutorial, select the AWS region that is closest to your current geograp
The default region is automatically displayed in the AWS Dashboard. The [choice of region](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-region.html) has implications on fees, speed, and performance.
-Click on "Services" (upper left):
+## Step 3: Choose virtual machine
+
+* Click on Services (upper left corner):
![AWS Services](./images-aws/aws_3.png "AWS Services button")
-Click on "EC2":
+* Click on EC2:
![EC2](./images-aws/aws_4.png "AWS EC2 button")
-!!! Note
+!!! Note "Amazon EC2"
- Amazon Elastic Cloud Computing features virtual computing environments called instances. These instances can vary in configurations of CPU, memory, storage, networking capacity. For the purposes of future tutorials, we will launch Ubuntu 20.04 Pro LTS. LTS releases are the ‘enterprise grade’ releases of Ubuntu and are utilized the most.
+ [Amazon Elastic Cloud Computing (Amazon EC2)](https://aws.amazon.com/ec2/?ec2-whats-new.sort-by=item.additionalFields.postDateTime&ec2-whats-new.sort-order=desc) features virtual computing environments called instances. They have varying combinations of CPU, memory, storage, and networking capacity, and give you the flexibility to choose the appropriate mix of resources for your applications.
-Click on "Launch Instance":
+* Click on Launch Instance:
![Launch Instance](./images-aws/aws_5.png "AWS launch button")
-Select "AWS Marketplace" on the left hand side tab:
-![AWS Marketplace](./images-aws/aws_6.png "AWS marketplace button")
+## Step 3: Choose an Amazon Machine Image (AMI)
-### Step 3: Choose an Amazon Machine Image (AMI)
+An Amazon Machine Image provides the template for the root volume of an instance (operating system, application server, and applications). It is akin to the Operating System (OS) on a computer.
-An Amazon Machine Image is a special type of virtual appliance that is used to create a virtual machine within the Amazon Elastic Compute Cloud. It is a template for the root volume of an instance (operating system, application server, and applications).
+* Select AWS Marketplace on the left hand side tab:
+
+![AWS Marketplace](./images-aws/aws_6.png "AWS marketplace button")
-Type `Ubuntu 20.04 Pro LTS` in the search bar. Click "Select":
+* Type `Ubuntu Pro` in the search bar. Choose `Ubuntu Pro 20.04 LTS` by clicking Select:
![AMI](./images-aws/aws_7.png "AWS Ubuntu AMI")
-Click "Continue":
+!!! info "Ubuntu 20.04 AMI"
-![Ubuntu Pro](./images-aws/aws_9.PNG "Ubuntu Pro information")
+    `Ubuntu 20.04` was released in 2020 and is the latest version. This is a **Long Term Support (LTS)** release, which means it will receive software updates and security fixes. Since it is a `Pro` version, support will last for ten years, until 2030.
-### Step 4: Choose an instance type
-Amazon EC2 provides a wide selection of instance types optimized to fit different use cases. Instances are virtual servers that can run applications. They have varying combinations of CPU, memory, storage, and networking capacity, and give you the flexibility to choose the appropriate mix of resources for your applications. Learn more about instance types and how they can meet your computing needs.
+* Click Continue on the popup window:
-Select the row with `t2.micro`, the free tier eligible option:
+![Ubuntu Focal](./images-aws/aws_9.png "Ubuntu Focal information")
+
+## Step 4: Choose an instance type
+
+Amazon EC2 provides a wide selection of instance types optimized to fit different use cases. You can consider instances to be similar to the hardware that will run your OS and applications. [Learn more about instance types and how they can meet your computing needs](https://aws.amazon.com/ec2/instance-types/).
+
+* For this tutorial we will select the row with `t2.micro` which is free tier eligible:
![t2.micro](./images-aws/aws_8.png "t2 micro instance type")
-!!! Note
+!!! Note "Free Tier Eligible"
+
+    The Free tier eligible tag lets us know that this particular operating system is covered by the [Free Tier program](https://aws.amazon.com/free/?all-free-tier.sort-by=item.additionalFields.SortRank&all-free-tier.sort-order=asc), where you can use (limited) services without being charged. Limits may be based on how much storage you have access to and/or how many hours of compute you can use in one month.
+
+* You can proceed to launch the instance with default configurations by clicking on Review and Launch.
- The Free Tier Eligible tag lets us know that this particular operating system is covered by the Free Tier program where you use (limited) services without being charged. Limits could be based on how much storage you have access to and/or how many hours of compute you can perform in a one month.
+## Step 5: Optional configurations
-### Step 5: Optional Configurations
+There are several optional setup configurations.
-There are several optional set up configurations. You can either click "Review and Launch" now to start the instance we've configured thus far in the tutorial without these additional configurations or as necessary, click on the following tabs to continue configuring. Start the first option by clicking "Next: Configure Instance Details" on the AWS page.
+* Start the first option by clicking Next: Configure Instance Details on the AWS page.
-=== "Configure Instance Details"
+=== "Configure Instance"
- Configure the instance to suit your requirements. You can launch multiple instances from the same AMI, request Spot instances to take advantage of the lower pricing, assign an access management role to the instance, and more.
+ [Configure the instance to suit your requirements](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Configure_Instance.html). You can:
- A Spot Instance is an unused EC2 instance that is available for less than the On-Demand price. Because Spot Instances enable you to request unused EC2 instances at steep discounts, you can lower your Amazon EC2 costs significantly.
+    * change the number of instances to launch
+    * select the subnet to use
+    * modify Stop or Terminate behaviors
+    * control whether the instance should install patches while in use
+    * request Spot Instances
+
+ !!! info "Spot Instance"
+
+ A [Spot Instance](https://aws.amazon.com/ec2/spot/?cards.sort-by=item.additionalFields.startDateTime&cards.sort-order=asc) is an unused EC2 instance that is available for less than the On-Demand price. Because Spot Instances enable you to request unused EC2 instances at steep discounts, you can lower your Amazon EC2 costs significantly.
=== "Add Storage"
- Your instance will be launched with the following storage device settings. You can attach additional EBS volumes and instance store volumes to your instance, or edit the settings of the root volume. You can also attach additional EBS volumes after launching an instance, but not instance store volumes. Learn more about storage options in Amazon EC2.
+    * Your instance comes with built-in storage called the **instance store**, which is useful for temporary data storage. The default root volume on a `t2.micro` is 8 GB.
+    * For data you want to retain longer, use across multiple instances, or encrypt, it is best to use [**Amazon Elastic Block Store (Amazon EBS) volumes**](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AmazonEBS.html).
+    * Attaching EBS volumes to an instance is similar to connecting external hard drives to a computer.
+ * Click on Add New Volume for additional storage.
+
+ !!! info "Free Storage"
+
+        You can get up to 30 GB of EBS General Purpose (SSD) or Magnetic storage when using Free Tier instances.
=== "Add Tags"
- A tag consists of a case-sensitive key-value pair. For example, you could define a tag with key = Name and value = Webserver. A copy of a tag can be applied to volumes, instances or both. Tags will be applied to all instances and volumes. Learn more about tagging your Amazon EC2 resources.
+ * Tags are useful to categorize your AWS resources: instances and volumes.
+ * A tag consists of a case-sensitive key-value pair. Some examples: GTEx-RNAseq, General-GWAS, KF-GWAS.
+ * [Learn more about tagging your Amazon EC2 resources](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_Tags.html).
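
    The same tags can be applied from the AWS CLI. A minimal sketch, assuming you have credentials configured; the instance ID is a placeholder:

    ```shell
    # Placeholder instance ID; tags are case-sensitive Key=...,Value=... pairs
    aws ec2 create-tags \
        --resources i-0123456789abcdef0 \
        --tags Key=Name,Value=GTEx-RNAseq
    ```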
=== "Configure Security Group"
- A security group is a set of firewall rules that control the traffic for your instance. On this page you can add rules to allow specific traffic to reach your instance. You can create a new security group or select from an existing one.
+ * A security group acts as a firewall between the outside world and your EC2 instance.
+ * It blocks or allows connections based on port number and IP address.
+ * You can create a new security group or select from an existing one.
+ * [Learn more about Security groups for EC2 instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-security-groups.html).
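
    As an aside, a security group rule can also be added from the AWS CLI. This is a sketch with placeholder IDs, not a step in this console tutorial; it opens port 22 (SSH) to a single example IP address:

    ```shell
    # Placeholder group ID and IP; a /32 CIDR limits access to one address
    aws ec2 authorize-security-group-ingress \
        --group-id sg-0123456789abcdef0 \
        --protocol tcp \
        --port 22 \
        --cidr 203.0.113.25/32
    ```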
+
+## Step 6: Review and launch instance
-### Step 6: Review and Launch instance
+The last tab in the setup is **Review**, which summarizes all the selected configurations for the instance.
-After configuration settings are complete, click "Review and Launch" and "Launch". If you are launching an AWS instance for the first time, you will need to generate a key pair.
+* Click Launch after review.
![launch instance](./images-aws/aws_launch.png "launch the instance")
-Choose the "Create a new key pair" option from the drop down menu. Under key pair name, type "amazon" and click "save". The default location for saving files on a Mac is the "Downloads" folder -- that's where your key pair can be found. Next time you launch an instance, you can reuse the key pair you just generated.
+### Step 6a: Key pair
+If you are launching an AWS instance for the first time, you will need to generate a key pair.
-If you have a previously generated key pair, you can reuse it to launch an instance. For this tutorial, we are calling the key pair "amazon.pem".
+* Choose the Create a new key pair option from the drop down menu.
+* Type any name under **Key pair name**. In this tutorial, we name it `amazon`; the downloaded file will be `amazon.pem`.
+* Click Download Key Pair to obtain the `.pem` file to your local machine. You can access the `.pem` file from the `Downloads` folder which is typically the default location for saving files. Next time you launch an instance, you can reuse the key pair you just generated.
+* If you have a previously generated key pair, you can reuse it to launch an instance using Choose an existing key pair option.
+
+!!! warning
+
+ Do not select **Proceed without a key pair** option since you will not be able to connect to the instance.
+
+* Check the acknowledgement box, and click Launch Instances.
![pem key](./images-aws/aws_10.png "key pair set up")
@@ -110,20 +159,25 @@ If you have a previously generated key pair, you can reuse it to launch an insta
For security purposes, the SSH (Secure Shell) protocol uses encryption to secure the connection between a client and a server. All user authentication, commands, output, and file transfers are encrypted to protect against attacks on the network. Public key authentication further improves security by freeing users from remembering complicated passwords.
-Then select your key pair, check the acknowledgement box, and click "Launch Instance". Now you should see:
+### Step 6b: Launch status
+
+You will be directed to the **Launch Status** page where the green colored box on top indicates a successful launch!
+
+ * Click on the first hyperlink, which is the instance ID. Your instance ID will be different.
![SSH](./images-aws/aws_11.png "Instance ID link")
-Click on this first hyperlink, in the image above, "i-038c58bfbe9612c57". Your hyperlink may be different.
-![Remote Host](./images-aws/aws_12.PNG "AWS instance running page")
+### Step 6c: Public DNS
-This page shows you a list of all your active instances. Users may launch as many instances as they wish. Just remember that every instance costs money if you don't qualify for the Free Tier. On this page, there is a "Public DNS" address, with the format `ec2-XXX-YYY-AAA.compute-1.amazon.aws.com`. You'll need this address to connect to your AWS computer.
+The instance console page shows you a list of all your active instances. Users may launch as many instances as they wish. Just remember that every instance costs money if you don't qualify for the Free Tier.
+* Obtain the **Public DNS** address with the format `ec2-XX-XX-X-XXX.us-yyyy-y.compute-1.amazon.aws.com` located under the Details tab.
-You have now successfully launched your AWS instance! You will need the Public DNS address from this amazon webpage to access your AWS computer, so do not close the page yet.
+![Remote Host](./images-aws/aws_12.png "AWS instance running page")
-If you happen to close the webpage on accident, [click on this link](https://us-west-1.console.aws.amazon.com/ec2/v2/home?region=us-west-1#Instances:sort=instanceId)
+You have now successfully launched your AWS instance! You will need the Public DNS address from this amazon webpage to access your AWS instance, so do not close the page yet.
-Continue on to the next lesson to learn how to connect to your AWS computer!
+If you happen to close the webpage by accident, [click on this link](https://us-west-1.console.aws.amazon.com/ec2/v2/home?region=us-west-1#Instances:sort=instanceId).
+Continue on to the next lesson to learn how to connect to your AWS instance!
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws4.md b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws4.md
index 0f228d132..26a2fab8a 100644
--- a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws4.md
+++ b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws4.md
@@ -1,6 +1,51 @@
-# Connect to your Instance
+---
+layout: page
+title: Connecting to your Instance
+hide:
+ - toc
+---
-=== "Windows Instruction"
+=== "Web Browser"
+
+ This is an OS-agnostic way of connecting to your AWS instance. The advantage of using this method is that Windows users do not need to worry about downloading an SSH client such as [MobaXterm](https://mobaxterm.mobatek.net/). The main disadvantage is that you cannot download files from the instance to your local machine via this web terminal interface.
+
+ ## Step 1: Find your launched instance
+ * Go to the page that lists all your instances:
+
+ ## Step 2: Select the instance
+
+ * Check the box next to your running instance.
+
+ ![](images-aws/connect_1.png)
+
+ ## Step 3: Connect
+
+ * Click on the Connect button on the top of the screen.
+
+ ![](images-aws/connect_2.png)
+
+ ## Step 4: EC2 instance connect tab
+
+ * On this page, make sure the **EC2 Instance Connect** tab is selected (orange highlight).
+ * Click the Connect button located at the bottom of the page.
+ * Do not change the default username. It should read **ubuntu**.
+
+ ![](images-aws/connect_3.png)
+
+ ## Step 5: Web browser terminal tab
+
+ * A terminal window will open up in a new tab.
+
+ ![](images-aws/connect_4.png)
+
+ !!! bug "Timeout"
+
+ The web browser terminal will become unresponsive after some inactivity. If that happens, make sure to close the terminal window and reconnect by following Steps 3 & 4.
+
+ Congratulations! You have successfully connected to your remote computer. You can download files onto your instance and install software programs via this web browser terminal.
+
+
+=== "Windows :fontawesome-brands-windows:"
Ok, so you've created a running computer. How do you get to it?
@@ -14,74 +59,66 @@
## Step 2: Start a new session
+ * Click on Session, located in the top left-hand corner.
+ * Choose SSH.
+ * Click OK.
+
![mobaxterm1](./images-aws/mobaxterm_1.png "start new session")
![Remote Host](./images-aws/mobaxterm_2.png "select SSH session type")
- ## Step 3: Set up session settings
-
- ### Specify the session key
+ ## Step 3: Set up SSH settings
- Enter the public DNS address from the [AWS instance](https://us-west-1.console.aws.amazon.com/ec2/v2/home?region=us-west-1#Instances:sort=instanceId) page in the "Remote host" box. It will look something like this: `ec2-XXX-YYY-AAA.compute-1.amazon.aws.com`. Enter `ubuntu` for "Specify username".
+ * Enter the public DNS address from the [AWS instance](https://us-west-1.console.aws.amazon.com/ec2/v2/home?region=us-west-1#Instances:sort=instanceId) page in the **Remote host** box. It will look something like this: `ec2-XXX-YYY-AAA.compute-1.amazon.aws.com`.
+ * Enter **ubuntu** for **Specify username**.
![Hostname](./images-aws/mobaxterm_3.png "remote host ec2 address")
- Under "Advanced SSH settings", check the box by "Use private key" and search for the path to your "amazon.pem" key pair file/
+ * Under **Advanced SSH settings**, check the box by **Use private key** and search for the path to your **amazon.pem** key pair file.
![Private Key](./images-aws/mobaxterm_3_2.png "use private key file path")
- Click "OK" to complete session set up.
+ * Click OK to complete session set up.
If you see this screen and `ubuntu@ip-###-##-#-##:~$` as the command prompt, your AWS instance computer is ready for use!
![Ubuntu Terminal](./images-aws/mobaxterm_4.png "ubuntu command prompt")
- You can now use the AWS instance to run command line programs and run analyses. With MobaXterm, you can transfer files between your local computer and the remote instance by dragging and dropping files between MobaXterm's "SCP" tab (located on the left-hand side of the MobaXterm window) and your local computer's file explorer.
+ ## Step 4: Transferring files
+
+ With MobaXterm, you can transfer files between your local computer and the remote instance by dragging and dropping files between MobaXterm's **SCP** tab (located on the left-hand side of the MobaXterm window) and your local computer's file explorer.
![SCP Tab](./images-aws/Mobaxterm_transfer1.png "SCP tab")
![Transfer File](./images-aws/Mobaxterm_transfer2.png "transfer file windows")
- ## Step 4: Terminating the Instance
-
- Once you have completed your tasks and are sure you do not need the instance any longer, you may terminate the instance by returning to [AWS Management Console](https://us-west-1.console.aws.amazon.com/ec2/v2/home?region=us-west-1#Instances:sort=instanceId).
- !!! Warning
-
- If you simply close the MobaXterm terminal, the instance will continue to run and incur cost; it has not been terminated. You must go the instance on the AWS webpage to terminate it. Terminating an instance will erase all the work you have done on the instance! Be sure to download files from the remote instance to your local computer or other storage space before terminating the instance.
-
- - Click on "Services"
- - Click "EC2"
- - Click "Instance" on the left hand side bar menu and it should bring you to the list of running instances on your account.
- - Click on the instance you would like to terminate
- - Click "Actions"
- - Click "Instance State"
- - Select "Terminate"
-
- ![Terminate](./images-aws/Terminate.png "terminate instance button")
-
-
-=== "Mac OS"
+=== "macOS :fontawesome-brands-apple:"
Ok, so you've created a running computer. How do you get to it?
- The main thing you'll need is the network name of your new computer. To retrieve this, go to the [AWS instance view](https://us-west-1.console.aws.amazon.com/ec2/v2/home?region=us-west-1#Instances:sort=instanceId), click on the instance, and find the "Public DNS". This is the public name of your computer on the internet.
+ The main thing you'll need is the network name of your new computer. To retrieve this, go to the [AWS instance view](https://us-west-1.console.aws.amazon.com/ec2/v2/home?region=us-west-1#Instances:sort=instanceId), click on the instance, and find the **Public DNS** under the Details tab. This is the public name of your computer on the internet.
## Step 1: Locate private key
- Find the private key file; it is the .pem file you downloaded when starting up the EC2 instance. We called it "amazon.pem". It should be in your Downloads folder. Move it onto your desktop.
+ Find the private key file; it is the `.pem` file you downloaded when starting up the EC2 instance. We called it **amazon.pem**. It should be in your Downloads folder. In this lesson, we move it to the desktop for ease of access and compatibility with our lesson commands.
## Step 2: Login to remote instance
- Start Terminal and change the permissions on the .pem file for security purposes (removes read, write, and execute permissions for all users except the owner (you)):
+ * Start Terminal and change the permissions on the `.pem` file for security purposes. Your private key must not be publicly visible.
+ * Run the following command so that only the owner (i.e., you) can read the file.
+
```
- chmod og-rwx ~/Desktop/amzon.pem
+ chmod 400 ~/Desktop/amazon.pem
```
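
    To confirm the permissions took effect, you can list the file. The sketch below uses a stand-in key file so it is safe to try anywhere; on your machine, run `ls -l` on the real `~/Desktop/amazon.pem` instead:

    ```shell
    # Stand-in key file for illustration; substitute your real amazon.pem
    touch amazon.pem
    chmod 400 amazon.pem
    ls -l amazon.pem    # the permissions column should read -r--------
    ```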
- Connect to remote instance:
+ * Connect to remote instance:
+
```
    ssh -i ~/Desktop/amazon.pem ubuntu@ec2-???-???-???-???.compute-1.amazonaws.com
```
+
where `ec2-???-???-???-???.compute-1.amazonaws.com` is the Public DNS we copied earlier.
@@ -92,7 +129,7 @@
- To use `scp` (secure copy) with a key pair use the following command:
```
- scp -i /directory/to/amazon.pem ubuntu@ec2-xx-xxx-xxx.compute-1.amazonaws.com:path/to/file /your/local/directory/files/to/download
+ scp -i ~/Desktop/amazon.pem ubuntu@ec2-xx-xxx-xxx.compute-1.amazonaws.com:path/to/file /your/local/directory/files/to/download
```
- You may also download a file from the remote instance download folder by archiving it:
@@ -104,43 +141,17 @@
- You can download all archived files from the remote instance by entering:
```
- scp - i/directory/to/amazon.pem ubuntu@ec2-xx-xx-xxx-xxx.compute-1.amazonaws.com:~/* /your/local/directory/files/to/download
+ scp -i ~/Desktop/amazon.pem ubuntu@ec2-xx-xx-xxx-xxx.compute-1.amazonaws.com:~/* /your/local/directory/files/to/download
```
### Copying files from local computer to remote instance
- - Your private key must not be publicly visible. Run the following command so that only the root user can read the file:
-
- ```
- chmod 400 amazon.pem
- ```
-
- To use `scp` with a key pair use the following command:
```
- scp -i /directory/to/amazon.pem /your/local/file/to/copy ubuntu@ec2-xx-xx-xxx-xxx.compute-1.amazonaws.com:path/to/file
+ scp -i ~/Desktop/amazon.pem /your/local/file/to/copy ubuntu@ec2-xx-xx-xxx-xxx.compute-1.amazonaws.com:path/to/file
```
!!! note
        You need to make sure that the remote user (here, `ubuntu`) has permission to write to the target directory. In this example, if `~/path/to/file` was created by you, it should be fine.
-
-
- ## Step 4: Terminating the Instance
-
- Once you have completed your tasks and are sure you do not need the instance any longer, you may terminate the instance by returning to [AWS Management Console](https://us-west-1.console.aws.amazon.com/console/home?region=us-west-1).
-
-
- !!! warning
-
- If you simply close the MobaXterm terminal, the instance will continue to run and incur cost; it has not been terminated. You must go the instance on the AWS webpage to terminate it. Terminating an instance will erase all the work you have done on the instance! Be sure to download files from the remote instance to your local computer or other storage space before terminating the instance.
-
- - Click on "Services"
- - Click "EC2"
- - Click "Instance" on the left hand side bar menu and it should bring you to the list of running instances on your account.
- - Click on the instance you would like to terminate
- - Click "Actions"
- - Click "Instance State"
- - Select "Terminate"
-
- ![Terminate](./images-aws/Terminate.png "terminate instance button")
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws5.md b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws5.md
new file mode 100644
index 000000000..d173f1c50
--- /dev/null
+++ b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws5.md
@@ -0,0 +1,19 @@
+# Terminating an Instance
+
+Once you have completed your tasks and are sure you do not need the instance any longer, you may terminate the instance by returning to [AWS Management Console](https://us-west-1.console.aws.amazon.com/console/home?region=us-west-1).
+
+
+!!! warning
+
+    If you simply close the terminal window (MobaXterm or the browser terminal), the instance will continue to run and incur cost; it has not been terminated. You must go to the instance on the AWS webpage to terminate it.
+
+ Terminating an instance will erase all the work you have done on the instance! Be sure to download files from the remote instance to your local computer or other storage space before terminating the instance.
+
+- Click on Services
+- Click EC2
+- Click Instances in the left sidebar menu; this brings you to the list of running instances on your account.
+- Select the instance you would like to terminate
+- Click Instance state
+- Select Terminate instance
+
+![Terminate](./images-aws/Terminate.png "terminate instance button")
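
For reference, an instance can also be terminated from the AWS CLI. This is a sketch, assuming configured credentials; the instance ID is a placeholder you would replace with your own from the instance list:

```shell
# Placeholder instance ID; copy yours from the instance list first.
# Termination is irreversible and erases the instance's storage.
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0
```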
diff --git a/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws5_Screen.md b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws5_Screen.md
new file mode 100644
index 000000000..0b6c11dcc
--- /dev/null
+++ b/docs/Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws5_Screen.md
@@ -0,0 +1,204 @@
+# Using the Screen Command
+
+Screen (GNU Screen) is a terminal multiplexer: you can start a terminal session and then open multiple screens inside that session. Processes running in Screen will continue to run when their window is not visible, even if your session gets disconnected or times out. You can even close the terminal connection to the AWS instance and commands running in screens will keep going!
+
+The goal of this tutorial is to learn how to run and switch between multiple screen sessions. Check out the [screen cheat sheet](../../Cheat-Sheets/screen_cheatsheet.md) for commonly used commands.
+
+## Video Walkthrough
+
+
+## Installing Screen
+
+To install screen, run the following command:
+
+=== "AWS Instance Code"
+ ```
+ sudo apt update
+ sudo apt-get install screen
+ ```
+
+=== "Expected Output"
+ ```
+ ubuntu@ip-172-31-7-6:~$ sudo apt-get install screen
+ Reading package lists... Done
+ Building dependency tree
+ Reading state information... Done
+ screen is already the newest version (4.8.0-1).
+ 0 upgraded, 0 newly installed, 0 to remove and 7 not upgraded.
+ ```
+
+The first command refreshes your instance's package lists; the second performs the actual installation.
+
+## Running Screen
+
+To help tell the different terminal screens apart, type this command on the original terminal window (before you try out `screen`):
+
+=== "AWS Instance Code"
+
+ ```
+ echo "this is the original terminal window"
+ ```
+
+!!! Warning
+ Do not clear the screen. We want to make sure the "this is the original terminal window" text lingers when we toggle back to this window later.
+
+ This message may not be visible in the AWS browser terminal.
+
+
+Then, type the following into the same AWS instance to start a new screen session.
+
+=== "AWS Instance Code"
+ ```
+ screen
+ ```
+=== "Expected Output"
+ ```
+ GNU Screen version 4.08.00 (GNU) 05-Feb-20
+
+ Copyright (c) 2018-2020 Alexander Naumov, Amadeusz Slawinski
+ Copyright (c) 2015-2017 Juergen Weigert, Alexander Naumov, Amadeusz Slawinski
+ Copyright (c) 2010-2014 Juergen Weigert, Sadrul Habib Chowdhury
+ Copyright (c) 2008-2009 Juergen Weigert, Michael Schroeder, Micah Cowan,
+ Sadrul Habib Chowdhury
+ Copyright (c) 1993-2007 Juergen Weigert, Michael Schroeder
+ Copyright (c) 1987 Oliver Laumann
+
+ This program is free software; you can redistribute it and/or modify it under
+ the terms of the GNU General Public License as published by the Free Software
+ Foundation; either version 3, or (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful, but WITHOUT
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+ FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License along with
+ this program (see the file COPYING); if not, see
+
+ [Press Space for next page; Return to end.]
+ ```
+Press ++space++ (twice) or ++return++ to get to the command prompt. You are now on a new screen!
+
+!!! Note
+ If you plan to work with multiple screens, it helps to give each screen a unique name to tell them apart. You can name your screen by adding the `-S` flag and a name:
+ ```
+ screen -S <session_name>
+ ```
+
+## Using screen
+
+Let's run a program in the new screen window to test it out:
+
+=== "AWS Instance Code"
+ ```
+ top
+ ```
+The `top` command shows running Linux processes. It provides a dynamic, real-time view of the system: summary information at the top, followed by the list of processes or threads currently managed by the Linux kernel.
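
Note that `top` normally runs interactively. If you only want a one-off snapshot, for example to log resource use from a script, batch mode is handy:

```shell
# -b = batch mode (no interactive screen), -n 1 = print one snapshot and exit
top -b -n 1 | head -n 5
```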
+
+While the top command is still running, create a screen tab by clicking ++ctrl+a+c++. You should see a new blank terminal.
+
+Let's run a command in this blank terminal:
+
+=== "AWS Instance Code"
+ ```
+ ping google.com
+ ```
+=== "Expected Output"
+ ```
+ PING google.com (172.217.4.206) 56(84) bytes of data.
+ 64 bytes from ord37s19-in-f14.1e100.net (172.217.4.206): icmp_seq=1 ttl=100 time=17.4 ms
+ 64 bytes from ord37s19-in-f14.1e100.net (172.217.4.206): icmp_seq=2 ttl=100 time=17.4 ms
+ 64 bytes from ord37s19-in-f14.1e100.net (172.217.4.206): icmp_seq=3 ttl=100 time=17.5 ms
+ 64 bytes from ord37s19-in-f14.1e100.net (172.217.4.206): icmp_seq=4 ttl=100 time=17.5 ms
+ .....
+ ```
+
+A ping test checks whether your computer can reach another host over the network. It is commonly used to troubleshoot connectivity and to measure response time.
+
+Now, we have three running "screens":
+
+- Screen 1) The original ssh terminal you saw at login. You typed the `echo` command in it.
+- Screen 2) A screen that's running the `top` command.
+- Screen 3) A screen that's running the `ping google.com` command.
+
+!!! tip
+
+ Think of the first screen command entered into your original terminal like opening a new browser window (i.e., our screen 2).
+
+ - After starting a screen, you can open multiple tabs within the same window using ++ctrl+a+c++ (i.e., our screen 3).
+
+ - You can toggle between the multiple tabs using ++ctrl+a+p++. You can have commands running in each tab.
+
+ - Detaching the screen allows you to go back to the original window/panel.
+
+ - If you start another screen from the original terminal window, it would be like opening another browser window, instead of adding a tab to an existing window.
+
+### Switching between tabs
+
+ To switch between the two new tabs, i.e., our screen 2 and screen 3, type ++ctrl+a+p++. You cannot toggle to the original terminal screen (i.e. screen 1) with this shortcut.
+
+### Detaching screens
+
+ To detach a screen session and return to your original SSH terminal, type ++ctrl+a+d++. You will be taken back to the terminal window with the `echo` command:
+
+![](./images-aws/original_ssh_terminal.png)
+
+To list your current screen sessions type:
+
+=== "AWS Instance Code"
+ ```
+ screen -ls
+ ```
+
+=== "Expected Output"
+ ```
+ ubuntu@ip-172-31-7-6:~$ screen -ls
+ There is a screen on:
+ 2683.pts-0.ip-172-31-7-6 (02/09/21 19:41:19) (Detached)
+ 1 Socket in /run/screen/S-ubuntu.
+ ```
+
+!!! Note
+ Typing `screen -ls` on the original terminal shows you only one screen ID because you have **one screen (i.e. browser window) with two tabs**. You can open another screen window by typing `screen` into the original terminal.
+
+From the output of `screen -ls` above, we use the screen id to reconnect to the screen:
+
+=== "AWS Instance Code"
+
+ ```
+ screen -r 2683.pts-0.ip-172-31-7-6
+ ```
+ or
+ ```
+ screen -r 2683
+ ```
+
+You should see screen 2, which you previously created. Once again, you can toggle between the screen 2 and screen 3 tabs by typing ++ctrl+a+p++.
+
+## Quitting screens
+
+You can quit the `top` output by typing `q` or ++ctrl+c++.
+
+To end a screen session, toggle into the session and type:
+
+=== "AWS Console Code"
+ ```
+ exit
+ ```
+
+If no other screen sessions are open, you will fall back to the original SSH terminal (screen 1). If another screen session is open, you will enter that screen session. You can type `exit` until all screen sessions are closed.
+
+!!! Warning
+ - Typing `exit` into a screen permanently closes that screen session and all the analyses that are conducted in it.
+
+ - You may need to type `exit` twice to get the "screen is terminating" message. But be sure to check!
+
+ ```
+ exit
+ There are stopped jobs.
+
+ exit
+ [screen is terminating]
+ ```
+ Typing `exit` too many times will exit the entire terminal!
+
+Don't forget to check out our [screen cheat sheet](../../Cheat-Sheets/screen_cheatsheet.md) if you need a quick reference to screen commands.
diff --git a/docs/Bioinformatics-Skills/Kids-First/Advanced-KF-Portal-Queries/.pages b/docs/Bioinformatics-Skills/Kids-First/Advanced-KF-Portal-Queries/.pages
index 830a9ff3b..788abd21e 100644
--- a/docs/Bioinformatics-Skills/Kids-First/Advanced-KF-Portal-Queries/.pages
+++ b/docs/Bioinformatics-Skills/Kids-First/Advanced-KF-Portal-Queries/.pages
@@ -2,4 +2,5 @@ nav:
- KF_9_AdvancedQuery.md
- KF_10_AndOr.md
- KF_11_JointQuery.md
- - KF_12_CheckingQueries.md
\ No newline at end of file
+ - KF_12_CheckingQueries.md
+ - KF_13_SavingQueries.md
diff --git a/docs/Bioinformatics-Skills/Kids-First/Advanced-KF-Portal-Queries/KF_13_SavingQueries.md b/docs/Bioinformatics-Skills/Kids-First/Advanced-KF-Portal-Queries/KF_13_SavingQueries.md
new file mode 100644
index 000000000..7817ad67e
--- /dev/null
+++ b/docs/Bioinformatics-Skills/Kids-First/Advanced-KF-Portal-Queries/KF_13_SavingQueries.md
@@ -0,0 +1,71 @@
+---
+layout: page
+title: Saving Queries
+---
+
+Saving Queries
+================
+
+There are a few options for saving and sharing your query searches. These options save time re-entering filters and are great for sharing search results with collaborators!
+
+!!! note
+
+ We ran these queries in January 2021. If you are doing this tutorial later, you may see more participants.
+
+## Save participant sets
+
+On the Explore Data section of the portal, you can save specific **lists of participants**. These lists can be applied in future queries as filters. Since this option saves a particular set of participants, it **does not update** with new data automatically.
+
+For example, select all participants between 0 and 5 years of age at the time of their diagnosis:
+
+- Click Clinical tab
+- Select Age at Diagnosis filter, enter 0 and 5 in the value fields with year as unit and click Apply
+- To save this particular set of participants, click Save participants set and Save as new set
+
+![](../images-kf/KF_saveset1.png "Save participants set")
+
+Give your set a name (here, "age0to5") and click Save:
+
+![](../images-kf/KF_saveset2.png "Name participants set")
+
+The saved set will be added to the My Participants Sets panel on the portal Dashboard:
+
+![](../images-kf/KF_saveset3.png "Dashboard saved participants sets")
+
+The saved participant set is useful if you want to add this particular list of participants to a query. Remember that if new data is added to the portal, the saved set will not update. To include new participants between 0 and 5 years old, we would have to redo the Age at Diagnosis filter.
+
+To add the saved participants list to a query in the Explore Data page:
+
+- First clear all existing filters by clicking the :fontawesome-regular-trash-alt: icon, then click DELETE
+- Select My sets, use the dropdown menu to select the saved set ("age0to5")
+- Click Add to query
+
+![](../images-kf/KF_saveset4.png "My sets filter")
+
+The filter field is now updated. Instead of saying "Age at Diagnosis", it says the filter is for "Participant ID is any of age0to5".
+
+![](../images-kf/KF_saveset5.png "Participants ID query")
+
+## Save virtual studies
+
+On the Explore Data page, we can save query searches as a **virtual study**. This option will save all the filters in your query. Unlike the participant sets, this virtual study **will update** if new data is added to the portal and fits the query filters.
+
+For example, let's save a joint query virtual study that [we built in our previous lesson](./KF_11_JointQuery.md). After entering query terms, click on Save:
+
+![](../images-kf/KF_savevirtualstudy1.png "Save virtual study")
+
+Give the virtual study a name and description, then click Save:
+
+![](../images-kf/KF_savevirtualstudy2.png "Name and study description")
+
+The saved virtual study will be added to the My Saved Queries panel on the portal Dashboard:
+
+![](../images-kf/KF_savevirtualstudy3.png "My Saved Queries")
+
+## Share virtual studies
+
+On the Explore Data page, we can also share a URL for virtual studies! This is a great way to share queries with collaborators. In order to view the URL, collaborators also need an [account on the Kids First portal](../Portal-Setup-And-Permissions/KF_3_KF_Registration.md).
+
+Click on Share and copy the short URL:
+
+![](../images-kf/KF_savevirtualstudy4.png "Share virtual study")
diff --git a/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset1.png b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset1.png
new file mode 100644
index 000000000..34ffdddd3
Binary files /dev/null and b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset1.png differ
diff --git a/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset2.png b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset2.png
new file mode 100644
index 000000000..5b5e5ba06
Binary files /dev/null and b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset2.png differ
diff --git a/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset3.png b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset3.png
new file mode 100644
index 000000000..cb1680ecd
Binary files /dev/null and b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset3.png differ
diff --git a/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset4.png b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset4.png
new file mode 100644
index 000000000..62c0178fd
Binary files /dev/null and b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset4.png differ
diff --git a/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset5.png b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset5.png
new file mode 100644
index 000000000..773deecf7
Binary files /dev/null and b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_saveset5.png differ
diff --git a/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_savevirtualstudy1.png b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_savevirtualstudy1.png
new file mode 100644
index 000000000..f3016097a
Binary files /dev/null and b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_savevirtualstudy1.png differ
diff --git a/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_savevirtualstudy2.png b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_savevirtualstudy2.png
new file mode 100644
index 000000000..e65fb4367
Binary files /dev/null and b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_savevirtualstudy2.png differ
diff --git a/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_savevirtualstudy3.png b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_savevirtualstudy3.png
new file mode 100644
index 000000000..31640a44b
Binary files /dev/null and b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_savevirtualstudy3.png differ
diff --git a/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_savevirtualstudy4.png b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_savevirtualstudy4.png
new file mode 100644
index 000000000..6808f73cf
Binary files /dev/null and b/docs/Bioinformatics-Skills/Kids-First/images-kf/KF_savevirtualstudy4.png differ
diff --git a/docs/Bioinformatics-Skills/Kids-First/index.md b/docs/Bioinformatics-Skills/Kids-First/index.md
index 3c7c0546b..9c68301e3 100644
--- a/docs/Bioinformatics-Skills/Kids-First/index.md
+++ b/docs/Bioinformatics-Skills/Kids-First/index.md
@@ -1,9 +1,13 @@
---
layout: page
title: KF Portal Overview
+hide:
+ - toc
---
-**Kids First (KF) Lessons**
+Kids First & Cavatica Lessons
+===================================
+
The [NIH Common Fund-supported Gabriella Miller Kids First Data Resource
Center](https://kidsfirstdrc.org/) (KFDRC) enables researchers, clinicians, and
@@ -46,16 +50,17 @@ Est. Time | Lesson name | Description
10 mins | [Ands & Ors](Advanced-KF-Portal-Queries/KF_10_AndOr.md) | Use conditional statements to filter
10 mins | [Joint Queries](Advanced-KF-Portal-Queries/KF_11_JointQuery.md) | Link multiple filters as joint queries
10 mins | [Checking Queries](Advanced-KF-Portal-Queries/KF_12_CheckingQueries.md) | Interpret query results
+10 mins | [Saving Queries](Advanced-KF-Portal-Queries/KF_13_SavingQueries.md) | Save participant sets or search term combinations
Importing and downloading KF Data:
Est. Time | Lesson name | Description
--- | --- | ---
10 mins | [Push to Cavatica](KF_7_PushToCavatica.md) | Move KF data to Cavatica
-10 mins | [Data Download Options](Download_Data/index.md) | Options for downloading data to your local computer
+10 mins | [Data Download Options](Download_Data/index.md) | Options for downloading data to your local computer
10 mins | [Data Download via KF Portal](Download_Data/Data-Download-Via-KF-Portal.md) | Local download from the KF portal
20 mins | [Data Download via Cavatica](Download_Data/Data-Download-Via-Cavatica.md) | Local download from Cavatica
-30 mins | [Data Upload to Cavatica](Upload_Data.md) | Using the Command Line Uploader tool to move files from AWS to Cavatica
+30 mins | [Data Upload to Cavatica](Upload_Data.md) | Using the Command Line Uploader tool to move files from AWS to Cavatica
!!! note "Learning Objectives"
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/.pages b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/.pages
new file mode 100644
index 000000000..77c9ce86e
--- /dev/null
+++ b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/.pages
@@ -0,0 +1,11 @@
+nav:
+ - rna_seq_1.md
+ - rna_seq_2.md
+ - rna_seq_3.md
+ - rna_seq_4.md
+ - rna_seq_5.md
+ - rna_seq_6.md
+ - rna_seq_7.md
+ - rna_seq_8.md
+
+
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/10_Cavatica.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/10_Cavatica.png
new file mode 100644
index 000000000..a694d0828
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/10_Cavatica.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/11_Cavatica.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/11_Cavatica.png
new file mode 100644
index 000000000..2d86beb95
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/11_Cavatica.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/12_Cavatica.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/12_Cavatica.png
new file mode 100644
index 000000000..7627cadbb
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/12_Cavatica.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/13_Cavatica.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/13_Cavatica.png
new file mode 100644
index 000000000..e821339da
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/13_Cavatica.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/14_Cavatica.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/14_Cavatica.png
new file mode 100644
index 000000000..91e0a43ef
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/14_Cavatica.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/15_Cavatica.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/15_Cavatica.png
new file mode 100644
index 000000000..a2b6ed71d
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/15_Cavatica.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/16_Cavatica.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/16_Cavatica.png
new file mode 100644
index 000000000..46f9858a0
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/16_Cavatica.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/17_Cavatica.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/17_Cavatica.png
new file mode 100644
index 000000000..2f952801b
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/17_Cavatica.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/18_Cavatica.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/18_Cavatica.png
new file mode 100644
index 000000000..7b967d3ae
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/18_Cavatica.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/19_Cavatica.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/19_Cavatica.png
new file mode 100644
index 000000000..5c2c8d0f7
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/19_Cavatica.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/1_KFDRC.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/1_KFDRC.png
new file mode 100644
index 000000000..67a6dae7c
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/1_KFDRC.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/20_Cavatica.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/20_Cavatica.png
new file mode 100644
index 000000000..afea4f4cc
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/20_Cavatica.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/21_Cavatica.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/21_Cavatica.png
new file mode 100644
index 000000000..2cc0296a7
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/21_Cavatica.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/22_Cavatica.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/22_Cavatica.png
new file mode 100644
index 000000000..3cfb5ae86
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/22_Cavatica.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/2_KFDRC.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/2_KFDRC.png
new file mode 100644
index 000000000..abdbcbaa1
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/2_KFDRC.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/3_KFDRC.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/3_KFDRC.png
new file mode 100644
index 000000000..87ba7793c
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/3_KFDRC.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/4_KFDRC.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/4_KFDRC.png
new file mode 100644
index 000000000..bb579d2ea
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/4_KFDRC.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/5_KFDRC.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/5_KFDRC.png
new file mode 100644
index 000000000..121b06702
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/5_KFDRC.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/6_KFDRC.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/6_KFDRC.png
new file mode 100644
index 000000000..cf4897669
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/6_KFDRC.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/7_KFDRC.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/7_KFDRC.png
new file mode 100644
index 000000000..697e37ab3
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/7_KFDRC.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/8_KFDRC.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/8_KFDRC.png
new file mode 100644
index 000000000..5f5dbffa8
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/8_KFDRC.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/9_KFDRC.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/9_KFDRC.png
new file mode 100644
index 000000000..abd0ce5e2
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/9_KFDRC.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-1.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-1.png
new file mode 100644
index 000000000..ecc32734c
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-1.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-10.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-10.png
new file mode 100644
index 000000000..69417b9d0
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-10.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-2.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-2.png
new file mode 100644
index 000000000..4a60898d5
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-2.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-3.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-3.png
new file mode 100644
index 000000000..face5a03a
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-3.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-4.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-4.png
new file mode 100644
index 000000000..01d96d627
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-4.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-5.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-5.png
new file mode 100644
index 000000000..145da4659
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-5.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-6.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-6.png
new file mode 100644
index 000000000..0409d3b5f
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-6.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-7.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-7.png
new file mode 100644
index 000000000..df5ae57ab
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-7.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-8.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-8.png
new file mode 100644
index 000000000..1ffedaa9d
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-8.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-9.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-9.png
new file mode 100644
index 000000000..8640f9120
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-6-9.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-1.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-1.png
new file mode 100644
index 000000000..6d85f2d1f
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-1.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-2.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-2.png
new file mode 100644
index 000000000..0e8bf51b1
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-2.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-3.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-3.png
new file mode 100644
index 000000000..113ca8555
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-3.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-4.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-4.png
new file mode 100644
index 000000000..0cf2ddd3f
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-4.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-5.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-5.png
new file mode 100644
index 000000000..45b77a8a5
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-5.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-6.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-6.png
new file mode 100644
index 000000000..a22f3c144
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-7-6.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-1.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-1.png
new file mode 100644
index 000000000..ec476d2f3
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-1.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-2.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-2.png
new file mode 100644
index 000000000..754b55d03
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-2.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-3.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-3.png
new file mode 100644
index 000000000..ae23b402b
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-3.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-4.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-4.png
new file mode 100644
index 000000000..2c4ac8383
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-4.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-5.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-5.png
new file mode 100644
index 000000000..c308172c7
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-5.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-6.png b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-6.png
new file mode 100644
index 000000000..7a4864e2e
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-8-6.png differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-workflow.jpeg b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-workflow.jpeg
new file mode 100644
index 000000000..78e7e6778
Binary files /dev/null and b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-images/rna-seq-workflow.jpeg differ
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-supporting-docs/.Rapp.history b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-supporting-docs/.Rapp.history
new file mode 100644
index 000000000..e69de29bb
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-supporting-docs/Cancer_DGE_Analysis.R b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-supporting-docs/Cancer_DGE_Analysis.R
new file mode 100644
index 000000000..0642d0016
--- /dev/null
+++ b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-supporting-docs/Cancer_DGE_Analysis.R
@@ -0,0 +1,172 @@
+# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ # Author: Saranya Canchi
+ # Filename: Cancer_DGE_Analysis.R
+ # Purpose: R script for differential gene expression between
+ # pediatric cancer types using DESeq2
+ # Version: 1.0
+ # Date: 01/22/2021
+# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+# Install libraries####
+
+if (!requireNamespace("BiocManager", quietly = TRUE))
+ install.packages("BiocManager")
+BiocManager::install(c("tximport",
+ "regionReport",
+ "org.Hs.eg.db",
+ "pcaExplorer",
+ "BiocStyle"),
+ update=FALSE,
+ ask=FALSE)
+
+# Load libraries####
+
+library(GenomicFeatures)
+library(DESeq2)
+library(tximport)
+library(org.Hs.eg.db)
+library(pcaExplorer)
+library(regionReport)
+library(ggplot2)
+library(knitr)
+
+# Generate transcript and gene name table for gene level summary####
+## Using GenomicFeatures pkg to read in the reference GTF files - following steps from DESeq2 app
+
+data_dir <- "/sbgenomics/project-files"
+txdb <- makeTxDbFromGFF(file= file.path(data_dir,"Homo_sapiens.GRCh38.84.gtf"))
+k <- keys(txdb, keytype = "TXNAME")
+
+## HGNC gene name is not available in the list of filters; using the Ensembl gene name for one-to-one mapping.
+
+tx2gene <- select(txdb, k, "GENEID", "TXNAME")
+head(tx2gene)
+
+# Read the phenotype data####
+
+pheno_data <- read.csv(file.path(data_dir,"phenotype_filtered.csv"))
+str(pheno_data)
+
+## Converting covariates of interest to factors
+### Setting Ependymoma as reference factor level for histology variable
+
+pheno_data$histology <- factor(pheno_data$histology,
+ levels=c("Ependymoma", "Medulloblastoma"))
+pheno_data$tumor_location <- factor(pheno_data$tumor_location)
+pheno_data$diagnosis_age_range <- factor(pheno_data$diagnosis_age_range)
+
+# Generate gene level summary using tximport####
+
+head(list.files(data_dir))
+files <- file.path(data_dir, pheno_data$name)
+names(files) <- pheno_data$sample_id
+head(files)
+txi_sum <- tximport(files,
+ type="kallisto",
+ tx2gene=tx2gene,
+ ignoreTxVersion = TRUE)
+names(txi_sum)
+head(txi_sum$counts)
+
+# DESeq2 import and analysis####
+
+dds_cancer <- DESeqDataSetFromTximport(txi_sum,
+ colData = pheno_data,
+ design = ~ diagnosis_age_range + tumor_location + histology)
+head(rownames(dds_cancer))
+
+## Generating additional gene info to add to dds object
+
+cancer_gene_map <- mapIds(org.Hs.eg.db,
+ keys=rownames(dds_cancer),
+ column="SYMBOL",
+ keytype="ENSEMBL",
+ multiVals="first")
+cancer_gene_map <- stack(cancer_gene_map)
+head(cancer_gene_map)
+colnames(cancer_gene_map)=c("gene_symbol","ensembl_gene_id")
+
+### Check row for row line up and add gene symbol column
+
+all(rownames(dds_cancer) == cancer_gene_map$ensembl_gene_id)
+mcols(dds_cancer) <- cbind(mcols(dds_cancer), cancer_gene_map$gene_symbol)
+colnames(rowData(dds_cancer))[1]="gene_symbol"
+
+# PCA interactive analysis - pcaExplorer####
+## Select "Try again" in the popup window to open the app in a new browser tab.
+
+pcaExplorer(dds=dds_cancer,
+ annotation=cancer_gene_map)
+
+## Alternatively use Variance stabilization transformation to generate PCA plot
+
+output_dir <- "/sbgenomics/output-files"
+vsd_cancer <- vst(dds_cancer, blind=FALSE)
+
+### Using histology and tumor_location variables
+
+png(filename=file.path(output_dir,"PCA_histology_tumor-location.png"))
+plotPCA(vsd_cancer, intgroup=c("histology","tumor_location"))
+dev.off()
+
+### Using histology and diagnosis_age_range variables
+
+png(filename=file.path(output_dir,"PCA_histology_diagnosis-age.png"))
+plotPCA(vsd_cancer, intgroup=c("histology","diagnosis_age_range"))
+dev.off()
+
+### Note: The PCA plots in the report generated by the DESeq2 app use rlog transformation.
+# We can create the same plot using the following code:
+#rld_cancer <- rlog(dds_cancer, blind=FALSE)
+#plotPCA(rld_cancer, intgroup=c("histology"))
+
+
+# DGE analysis####
+
+dds_cancer <- DESeq(dds_cancer,
+ fit="parametric",
+ test='Wald',
+ betaPrior=TRUE,
+ parallel=TRUE)
+
+res_cancer <- results(dds_cancer,
+ contrast=c("histology","Medulloblastoma","Ependymoma"),
+ alpha=0.05)
+summary(res_cancer)
+resultsNames(dds_cancer)
+
+plotMA(res_cancer,
+ ylim=range(res_cancer$log2FoldChange, na.rm=TRUE))
+
+## Write the results to a table
+
+res_cancer_order <- res_cancer[order(res_cancer$pvalue),]
+write.csv(as.data.frame(res_cancer_order),
+ file=file.path(output_dir,"Cancer_DESeq2_DGE_results.csv"))
+
+## Write normalized counts to a file
+
+norm_counts <- counts(dds_cancer,normalized=TRUE)
+write.table(norm_counts,
+ file=file.path(output_dir,"Cancer_DESeq2_normalized_counts.txt"),
+ sep="\t",
+ quote=FALSE,
+ col.names = NA)
+
+# Create a report with all the visualizations from the DESeq2 vignette####
+
+dir.create(file.path(output_dir,"DESeq2-Report"),
+ showWarnings = FALSE,
+ recursive = TRUE)
+
+report <- DESeq2Report(dds = dds_cancer,
+ project = 'Pediatric Cancer DGE Analysis with DESeq2',
+ intgroup = c('histology','diagnosis_age_range'),
+ res = res_cancer,
+ outdir = file.path(output_dir,"DESeq2-Report"),
+ output = 'Cancer-DESeq2-Report',
+ theme = theme_bw())
+
+save.image(file=file.path(output_dir,paste0("Cancer_DGE_",Sys.Date(), ".env.Rdata")))
+
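A common follow-up to the script above is pulling out only the significant hits from the results table it writes. The sketch below is illustrative only (the mock data frame, gene IDs, and thresholds are made up for demonstration; they are not prescribed by the tutorial):

```r
# Illustrative only: filtering a DESeq2-style results table for
# significant genes. The mock data frame stands in for the CSV the
# script writes (Cancer_DESeq2_DGE_results.csv).
res_df <- data.frame(
  log2FoldChange = c(2.1, -0.3, -3.4, 0.1),
  pvalue         = c(1e-6, 0.40, 2e-8, 0.70),
  padj           = c(4e-6, 0.55, 1e-7, 0.80),
  row.names      = c("ENSG01", "ENSG02", "ENSG03", "ENSG04"))

# Keep genes with adjusted p < 0.05 and at least a two-fold change
sig <- subset(res_df, padj < 0.05 & abs(log2FoldChange) > 1)
rownames(sig)  # "ENSG01" "ENSG03"
```

The same `subset()` call works unchanged on `as.data.frame(res_cancer)` from the script.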
+
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-supporting-docs/Cancer_DGE_Analysis_Automate.R b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-supporting-docs/Cancer_DGE_Analysis_Automate.R
new file mode 100644
index 000000000..3772aec3d
--- /dev/null
+++ b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna-seq-supporting-docs/Cancer_DGE_Analysis_Automate.R
@@ -0,0 +1,110 @@
+# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ # Author: Saranya Canchi
+ # Filename: Cancer_DGE_Analysis_Automate.R
+ # Purpose: R script for differential gene expression between
+ # pediatric cancer types using DESeq2
+ # Version: 1.0
+ # Date: 01/22/2021
+# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+# Install libraries####
+
+if (!requireNamespace("BiocManager", quietly = TRUE))
+ install.packages("BiocManager")
+BiocManager::install(c("tximport",
+ "regionReport",
+ "BiocStyle"),
+ update=FALSE,
+ ask=FALSE)
+
+# Load libraries####
+
+library(GenomicFeatures)
+library(DESeq2)
+library(tximport)
+library(regionReport)
+library(ggplot2)
+library(knitr)
+
+# Generate transcript and gene name table for gene level summary####
+## Using GenomicFeatures pkg to read in the reference GTF files - following steps from DESeq2 app
+
+data_dir <- "/sbgenomics/project-files"
+txdb <- makeTxDbFromGFF(file= file.path(data_dir,"Homo_sapiens.GRCh38.84.gtf"))
+k <- keys(txdb, keytype = "TXNAME")
+
+## HGNC gene name is not available in the list of filters; using the Ensembl gene name for one-to-one mapping.
+
+tx2gene <- select(txdb, k, "GENEID", "TXNAME")
+
+# Read the phenotype data####
+
+pheno_data <- read.csv(file.path(data_dir,"phenotype_filtered.csv"))
+
+## Converting covariates of interest to factors
+### Setting Ependymoma as reference factor level for histology variable
+
+pheno_data$histology <- factor(pheno_data$histology, levels=c("Ependymoma", "Medulloblastoma"))
+pheno_data$tumor_location <- factor(pheno_data$tumor_location)
+pheno_data$diagnosis_age_range <- factor(pheno_data$diagnosis_age_range)
+
+# Generate gene level summary using tximport####
+
+head(list.files(data_dir))
+files <- file.path(data_dir, pheno_data$name)
+names(files) <- pheno_data$sample_id
+txi_sum <- tximport(files,
+ type="kallisto",
+ tx2gene=tx2gene,
+ ignoreTxVersion = TRUE)
+
+# DESeq2 import and analysis####
+
+dds_cancer <- DESeqDataSetFromTximport(txi_sum,
+ colData = pheno_data,
+ design = ~ diagnosis_age_range + tumor_location + histology)
+
+# DGE analysis####
+
+dds_cancer <- DESeq(dds_cancer,
+ fit="parametric",
+ test='Wald',
+ betaPrior=TRUE,
+ parallel=TRUE)
+
+res_cancer <- results(dds_cancer,
+ contrast=c("histology","Medulloblastoma","Ependymoma"),
+ alpha=0.05)
+
+## Write the results to a table
+
+output_dir <- "/sbgenomics/output-files"
+res_cancer_order <- res_cancer[order(res_cancer$pvalue),]
+write.csv(as.data.frame(res_cancer_order),
+ file=file.path(output_dir,"Cancer_DESeq2_DGE_results.csv"))
+
+## Write normalized counts to a file
+
+norm_counts <- counts(dds_cancer,normalized=TRUE)
+write.table(norm_counts,
+ file=file.path(output_dir,"Cancer_DESeq2_normalized_counts.txt"),
+ sep="\t",
+ quote=FALSE,
+ col.names = NA)
+
+# Create a report with all the visualizations from the DESeq2 vignette####
+
+dir.create(file.path(output_dir,"DESeq2-Report"),
+ showWarnings = FALSE,
+ recursive = TRUE)
+
+report <- DESeq2Report(dds = dds_cancer,
+ project = 'Pediatric Cancer DGE Analysis with DESeq2',
+ intgroup = c('histology','diagnosis_age_range'),
+ res = res_cancer,
+ outdir = file.path(output_dir,"DESeq2-Report"),
+ output = 'Cancer-DESeq2-Report',
+ theme = theme_bw())
+
+save.image(file=file.path(output_dir,paste0("Cancer_DGE_",Sys.Date(), ".env.Rdata")))
+
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_1.md b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_1.md
new file mode 100644
index 000000000..1a3984cd8
--- /dev/null
+++ b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_1.md
@@ -0,0 +1,62 @@
+---
+layout: page
+title: RNAseq Tutorial Overview
+hide:
+ - toc
+---
+
+Differential Gene Expression Analysis on Cavatica Cloud Platform
+============================================
+
+**RNA sequencing (RNAseq)** is a high-throughput technique that provides qualitative and quantitative information about RNA biology, including transcriptome-wide expression quantification, discovery of novel genes and gene isoforms, and differential expression.
+
+The goal of this tutorial is to enable you to:
+
+1. create virtual cancer cohorts using the NIH Common Fund-supported Gabriella Miller Kids First Data Portal (KF Portal).
+2. analyze differential gene expression (DGE) on Cavatica, an integrated cloud-based platform.
+
+You will learn two different approaches to DGE analysis using open access human cancer data on Cavatica: **(a)** using a public workflow app and **(b)** running code from an analysis script on an instance with an RStudio computational environment.
+
+**Table of contents**
+
+| Est. Time| Lesson Name | Description|
+| ---|--------|--------|
+| 10 mins |[An Introduction to RNAseq](./rna_seq_2.md)| Background about RNAseq
+| 20 mins |[Selecting Kids First Cancer Cohort](./rna_seq_3.md)| Select Kids First open access cancer RNAseq files and push to Cavatica |
+| 20 mins |[Cavatica - View, Filter, Tag and Download](./rna_seq_4.md) | Filter imported data, tag and download relevant metadata from Cavatica |
+| 20 mins |[Setup DESeq2 Public App](./rna_seq_5.md)| Setting up the workflow app based on DESeq2 on Cavatica |
+| 15 mins |[Phenotype File and Upload to Cavatica](./rna_seq_6.md) | Reformat metadata file and upload it to Cavatica |
+| 50 mins |[Analysis with DESeq2 Public App](./rna_seq_7.md) | Run the DESeq2 app with appropriate inputs and computational settings |
+| 60 mins |[Analysis using Data Cruncher](./rna_seq_8.md) | Analysis on an instance in the RStudio environment |
+
+!!! note "Learning Objectives"
+ * learn to build virtual cohorts on the KF portal
+ * learn to navigate project folder and perform file operations on Cavatica
+ * learn to upload and download data from Cavatica
+ * learn to search, copy, and edit public workflow apps on Cavatica
+ * learn to perform differential gene expression (DGE) analysis using DESeq2 app
+ * learn to set up an analysis environment and execute code for DGE analysis
+
+=== "Prerequisites"
+ * Setup: Integrated login accounts on the Kids First Data Portal & Cavatica - Follow our lessons on account setup and connecting the two accounts.
+ - [Setup Kids First account](../Kids-First/Portal-Setup-And-Permissions/KF_3_KF_Registration.md)
+ - [Register for Cavatica](../Kids-First/Portal-Setup-And-Permissions/KF_4_Cavatica_Registration.md)
+ - [Connect KF and Cavatica](../Kids-First/Portal-Setup-And-Permissions/KF_5_ConnectingAccounts.md)
+
+ !!! note "Login Credentials"
+
+ You do not need an **eRA Commons ID** to do this lesson!
+
+ * Background: Knowledge of biology and rudimentary genetics.
+ * Technology: Basic knowledge of R and command line. Familiarity with RStudio is useful.
+ * Financial: Pilot funds ($100) are provided to every Cavatica user with a linked KF account.
+ * Time: Initial account setup may take from a few hours to a day for verification. Setup of an eRA Commons ID may take days and is institution dependent.
+=== "Est. Cost"
+ - DESeq2 app < $1.00
+ - Analysis with R < $1.00
+=== "Tutorial Resources"
+ - [Kids First Data Portal](https://kidsfirstdrc.org)
+ - [Cavatica Documentation](https://docs.cavatica.org/docs/getting-started)
+ - [Playlist of video tutorials explaining concepts used in RNAseq analysis](https://www.youtube.com/playlist?list=PLblh5JKOoLUJo2Q6xK4tZElbIvAACEykp)
+ - [DESeq2 vignette](https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#how-do-i-use-vst-or-rlog-data-for-differential-testing)
+ - [tximport](https://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/tximport.html)
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_2.md b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_2.md
new file mode 100644
index 000000000..a24127c03
--- /dev/null
+++ b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_2.md
@@ -0,0 +1,43 @@
+---
+layout: page
+title: An Introduction to RNAseq
+---
+
+RNAseq applies next generation sequencing to explore and quantify gene expression. The information stored in DNA is organized into genes, which encode proteins, the functional products necessary for cell function. Although all cells in an organism contain the same DNA, gene expression varies widely across cell types and cell states. Different biological conditions and disorders, as well as mutations, can influence which genes are turned on or off and how strongly they are expressed.
+
+!!! info "DNA to Protein"
+
+ Learn more about the [biological processes involved in conversion of DNA to proteins](https://www.nature.com/scitable/topicpage/translation-dna-to-mrna-to-protein-393/).
+
+RNAseq data can be used to study the transcriptome, the collection of all transcript readouts from a cell. Transcriptome data can be used for many types of analyses. In this lesson we focus on one such application, **Differential Gene Expression (DGE)** analysis, which determines which genes are expressed at different levels between the conditions or groups of interest. The identified genes offer biological insight into the processes and pathways affected by the chosen experimental conditions.
+
+A typical RNAseq workflow is highlighted in the schematic diagram below. The orange boxes highlight the steps you will do in this tutorial!
+
+![RNAseq workflow](../rna-seq-images/rna-seq-workflow.jpeg "RNAseq workflow")
+
+The first steps involve extraction, purification, and quality checks of RNA from the biological samples, followed by library preparation to convert the RNA to cDNA (complementary DNA) fragments, which are then sequenced.
+The generated raw reads are quality checked and either aligned against a reference genome/transcriptome (if available) or used for *de novo* assembly. Expression abundance estimates are then generated; if estimated at the transcript level, the values are summarized for gene-level analysis. The expression count data are used for statistical modeling and testing to identify differentially expressed genes, which can be examined further via visualization and downstream functional analysis.
+
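The transcript-to-gene summarization step mentioned above can be sketched in a few lines of Python. This is a toy illustration of the idea only (tximport, discussed later in this tutorial, does this for real data); the transcript IDs, counts, and tx2gene mapping are invented.

```python
# Hypothetical sketch of transcript-to-gene summarization: transcript-level
# abundance estimates (as produced by a tool like Kallisto) are summed per
# gene. The IDs, counts, and tx2gene mapping below are invented.

tx_counts = {"ENST001": 120.0, "ENST002": 30.0, "ENST003": 55.0}
tx2gene = {"ENST001": "GENE_A", "ENST002": "GENE_A", "ENST003": "GENE_B"}

gene_counts = {}
for tx, count in tx_counts.items():
    gene = tx2gene[tx]
    gene_counts[gene] = gene_counts.get(gene, 0.0) + count

print(gene_counts)  # → {'GENE_A': 150.0, 'GENE_B': 55.0}
```

The gene-level table produced this way is what downstream statistical testing operates on.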
+!!! info "RNAseq Resources"
+
+ Learn more about RNAseq through this [video tutorial by StatQuest](https://www.youtube.com/watch?v=tlf6wYJrwKY&list=PLblh5JKOoLUJo2Q6xK4tZElbIvAACEykp&index=1). You can also follow this [end to end RNAseq workflow](https://www.bioconductor.org/packages/devel/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html) that uses well known [Bioconductor packages](http://bioconductor.org).
+
+## Experimental Plan
+
+In this tutorial, we will evaluate differences in gene expression between [pediatric **Medulloblastoma** and **Ependymoma**](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2719002/).
+
+**Medulloblastoma**
+
+* is a common malignant childhood brain tumor
+* typically occurs in the 4th ventricle region of the brain
+* has five different histological types
+* subtypes impact the prognosis and response to therapy
+
+**Ependymoma**
+
+* is a broad group of tumors
+* often arises from lining of the ventricles in the brain
+* can also occur in the central canal in the spinal cord
+* anatomical distribution impacts prognosis
+
+We will use the [Kids First Data Portal (KF Portal)](https://kidsfirstdrc.org) to build a virtual cohort containing the two pediatric cancers and select pre-processed transcript abundance files. We will then proceed with the analysis on [Cavatica](https://cavatica.sbgenomics.com), an integrated cloud-based platform.
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_3.md b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_3.md
new file mode 100644
index 000000000..4e3db7302
--- /dev/null
+++ b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_3.md
@@ -0,0 +1,108 @@
+---
+layout: page
+title: Selecting Kids First Cancer Cohort
+---
+
+Selecting Kids First Cancer Cohort
+====================================
+
+The [Gabriella Miller Kids First Pediatric Data Portal (KF portal)](https://kidsfirstdrc.org) hosts datasets at the intersection of childhood development and cancer, spanning over 16,000 samples, with new data added regularly.
+
+!!! tip "Kids First Data Portal"
+
+ Check out our lessons on Kids First to learn more about the [different Data Portal features](../Kids-First/Exploring-Data-in-the-KF-Portal/KF_5_Explore.md) and [building simple to complex queries](../Kids-First/Advanced-KF-Portal-Queries/KF_9_AdvancedQuery.md).
+
+The KF portal hosts data with different access levels, including open (processed files, reports, plots, etc.) and controlled (raw sequencing files, histological images, etc.). For this tutorial, we will use **open access pre-processed files** generated using [Kallisto (v0.43.1)](http://pachterlab.github.io/kallisto//releases/2017/03/20/v0.43.1), which uses pseudoalignment to quantify transcript abundance from raw data.
+
+!!! info "KFDRC RNAseq workflow"
+
+    The [Kids First RNAseq pipeline](https://github.com/kids-first/kf-rnaseq-workflow) uses multiple tools/packages for expression detection and fusion calls. The workflow requires raw FASTQ files (controlled access) as input and generates multiple outputs, including the Kallisto transcript quantification files. All the output files of this pipeline are available on the portal as open access data. Beyond the restricted access to the raw inputs, it is also computationally taxing to run this workflow on many files, which is why we use the pre-computed outputs.
+
+## Step 1: Filter for open access data
+
+* Log in to the [KF portal](https://kidsfirstdrc.org/)
+* Select the File Repository tab
+* Select the Browse All option under Filter.
+
+![File Repository](../rna-seq-images/1_KFDRC.png "File Repository")
+
+!!! note "Data Summary"
+
+ At the time of the tutorial (Jan 2021), the portal contained a total of 88,728 files. Since new datasets are constantly uploaded to the KF portal, the query numbers may change when run in the future.
+
+* Select the Access filter listed under FILE field
+* Select Open value
+* Click View Results to update selection. This results in 18,162 files.
+
+![Open access filter](../rna-seq-images/2_KFDRC.png "Open access filter")
+
+## Step 2: Apply File Filters to obtain RNAseq files
+
+Select the File Filters tab and apply the following filters:
+
+* **Experimental Strategy** --> RNA-Seq
+* **Data Type** --> Gene expression
+* **File Format** --> tsv
+
+This results in 1,477 files.
+
+![File filters](../rna-seq-images/3_KFDRC.png "File filters")
+
+## Step 3: Select cancer type
+
+Switch to the Clinical Filter tab and apply:
+
+* **Diagnosis (Source Text)** --> Medulloblastoma and Ependymoma.
+
+This filters the number of files to 235.
+
+![Cancer type](../rna-seq-images/4_KFDRC.png "Cancer type")
+
+## Step 4: Subset cohort
+
+To reduce possible sources of variation from sex and race, we subset further to include data from only white male patients.
+
+Under the Clinical Filters tab select:
+
+* **Gender** --> Male
+* **Race** --> White
+
+This results in 99 files.
+
+![Subset by Clinical Filters](../rna-seq-images/5_KFDRC.png "Subset by Clinical Filters")
+
+## Step 5: Copy files to Cavatica
+
+!!! important "Important"
+
+ It is crucial to ensure the Cavatica integrations are enabled to allow for file transfers. Find more details in our [Push to Cavatica lesson](../Kids-First/KF_7_PushToCavatica.md). You **do not** have to have the Data Repository Integrations set up to continue with this lesson.
+
+* Click on the ANALYZE IN CAVATICA button.
+* Select the CREATE A PROJECT option and provide an appropriate name for your folder. In this tutorial, **`cancer-dge`** was chosen as the project name.
+* Use the SAVE option to create the project.
+
+![Create project on Cavatica](../rna-seq-images/6_KFDRC.png "Create project on Cavatica")
+
+Following project creation, the option will update to enable copying of the selected files to Cavatica.
+
+![Copy files to Cavatica](../rna-seq-images/7_KFDRC.png "Copy files to Cavatica")
+
+Successful copying of the files to the project folder will result in a pop-up box summarizing the details along with a link to view the project folder on Cavatica. If the pop-up box disappears before you have a chance to click on the project link, you can [login to Cavatica](https://cavatica.sbgenomics.com){:target="_blank"} and follow the steps to [view files in Cavatica](./rna_seq_4.md#step-1-view-files-in-cavatica).
+
+![Successful copy to Cavatica](../rna-seq-images/8_KFDRC.png "Successful copy to Cavatica")
+
+!!! info "Query link"
+
+    The KF portal enables sharing of queries with their unique filter combinations, including as a short URL. Log in to your KF account and [click on the query link](https://p.kfdrc.org/s/6ic){:target="_blank"} to obtain the selected cohort.
+
+ ![Sharing query](../rna-seq-images/9_KFDRC.png "Sharing query")
+
+ [You can learn more about the different options to save/share queries in the KF portal from our lesson](../Kids-First/Advanced-KF-Portal-Queries/KF_13_SavingQueries.md){:target="_blank"}.
+
+In our next lesson, we will explore the newly created project folder and files on the Cavatica platform!
+
+## Media resources
+
+A video walkthrough of the cancer cohort selection on Kids First portal:
+
+
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_4.md b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_4.md
new file mode 100644
index 000000000..fb6703b1e
--- /dev/null
+++ b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_4.md
@@ -0,0 +1,98 @@
+---
+layout: page
+title: Cavatica - View, Filter, Tag and Download
+---
+
+To view the project folder on Cavatica, click the link in the pop-up box that appears in the KF portal after the files are successfully copied; this opens the Cavatica login page.
+Alternatively, you can [login to Cavatica](https://cavatica.sbgenomics.com){:target="_blank"} in a new tab.
+
+## Step 1: View files in Cavatica
+
+* Select the newly created project folder under the Projects tab.
+* The Dashboard of the project folder has three panels: Description, Members and Analyses.
+* Click on the Files tab to list all the project files.
+
+![Files tab in project homepage](../rna-seq-images/10_Cavatica.png "Files tab in project homepage")
+
+* Click the Type: All filter to open a drop-down box listing the type and number of files: 99 compressed tsv files.
+
+![Total files in project folder](../rna-seq-images/11_Cavatica.png "Total files in project folder")
+
+
+## Step 2: Apply filters to subset cohort
+
+Before we proceed to the Differential Gene Expression Analysis (DGE analysis), it is a good idea to examine the metadata associated with our selected cohort. Since we aim to keep the experimental design simple, we will further filter down to remove possible sources of variation.
+
+The columns visible in the table are the platform default options. Click on :fontawesome-solid-columns: on the right hand corner and select any columns to view from the metadata list.
+
+![Edit table columns](../rna-seq-images/12_Cavatica.png "Edit table columns")
+
+![Custom columns](../rna-seq-images/13_Cavatica.png "Custom columns"){: align=right width=52%}
+
+Here we have selected:
+
+ - Age at diagnosis
+ - Vital status
+ - tumor_location
+ - histology
+ - histology_type
+
+
+!!! info "Age at diagnosis"
+
+    Any age metadata field is recorded in days by default, which explains the large numeric values in the Age at diagnosis column.
+
+Each of these columns has multiple values. To filter the data using values within multiple metadata columns, use the :fontawesome-solid-plus: sign to add a filter. If you cannot see the :fontawesome-solid-plus: button, refresh your browser, as your session may have timed out.
+
+![Apply additional filters](../rna-seq-images/14_Cavatica.png "Apply additional filters")
+
+* First, we filter to only include surviving patients. Click on :fontawesome-solid-plus: and
+choose Vital status, then select **Alive** from the sub-menu.
+
+![Vital status filter](../rna-seq-images/15_Cavatica.png "Vital status filter")
+
+* Since patients could have presented with multiple cancers over the diagnostic timeline, the histology metadata has other values in addition to the cancer types of interest. Click :fontawesome-solid-plus: again, this time choosing histology and selecting both **Medulloblastoma** & **Ependymoma**.
+
+![histology filter](../rna-seq-images/16_Cavatica.png "histology filter")
+
+* To ensure we compare cancers from the first presentation in each patient, we eliminate recurrent or progressive subtypes using the histology_type filter, following the same steps as before. This time select only **Initial CNS Tumor**.
+
+![histology_type filter](../rna-seq-images/17_Cavatica.png "histology_type filter")
+
+The tumor_location metadata column has some values that include multiple anatomically distinct locations separated by a **`;`**. This may indicate that the tumor had spread to multiple locations at first occurrence.
+
+* We filter using the tumor_location metadata, choosing only values **without** the **`;`**. Select the eleven distinct values for tumor_location (not including those with **`;`**, **`Not Reported`**, and **`Other locations NOS`**). You can see the complete list in the screen capture below.
+
+![tumor_location filter](../rna-seq-images/18_Cavatica.png "tumor_location filter")
+
+This results in a total of 50 files from our initial 99 copied files.
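As a rough illustration of what this filter does (outside the Cavatica UI), here is a Python sketch that keeps only single-site tumor_location values; the example values are hypothetical.

```python
# Illustrative sketch of the tumor_location filter: keep only single-site
# values, dropping multi-site entries (containing ";") and uninformative
# labels. The example values below are hypothetical.

locations = [
    "Posterior Fossa",
    "Spinal Cord",
    "Posterior Fossa;Spinal Cord",  # multi-site entry -> excluded
    "Not Reported",                 # uninformative -> excluded
    "Other locations NOS",          # uninformative -> excluded
]

excluded_labels = {"Not Reported", "Other locations NOS"}
keep = [loc for loc in locations
        if ";" not in loc and loc not in excluded_labels]
print(keep)  # → ['Posterior Fossa', 'Spinal Cord']
```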
+
+## Step 3: Create tags & download filtered dataset
+
+To enable quick access to the filtered data without having to re-run all the metadata filters, we can create tags for the filtered data.
+
+
+* Select all the files by clicking on :material-square-rounded-outline: in the column header and click on :fontawesome-solid-tags:Tags tab.
+
+![All filtered files](../rna-seq-images/19_Cavatica.png "All filtered files")
+
+* Type the name of the tag and click Add new tag.
+
+![Add new tag](../rna-seq-images/20_Cavatica.png "Add new tag")
+
+!!! tip "Tag Names"
+
+    While you can use any tag name you see fit, we use **DGE-FILTER-DATA** in this lesson; using the same name will keep your screen consistent with the lesson screenshots.
+
+* Click Apply. If you wish to remove a tag, click the :fontawesome-solid-times: next to the tag name to delete it.
+
+![Apply new tag](../rna-seq-images/21_Cavatica.png "Apply new tag")
+
+The filtered files are now tagged. We need to download and modify the metadata file which will be used as the accompanying phenotype file for our DGE analysis in the next lesson. To download:
+
+* Click on the :fontawesome-solid-ellipsis-h: button on the right corner.
+* Select Export metadata manifest from filtered files.
+
+![Download filtered metadata](../rna-seq-images/22_Cavatica.png "Download filtered metadata")
+
+In our next lesson, we will learn to set up the DESeq2 app in our project folder.
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_5.md b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_5.md
new file mode 100644
index 000000000..0fdd0910f
--- /dev/null
+++ b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_5.md
@@ -0,0 +1,89 @@
+---
+layout: page
+title: Setup DESeq2 Public App
+---
+
+[DESeq2](https://bioconductor.org/packages/release/bioc/html/DESeq2.html) is a Bioconductor package used to perform DGE analysis by fitting the [negative binomial model](https://www.statisticshowto.com/negative-binomial-experiment/) to the count data. It requires a counts table as input along with a phenotype file describing the experimental groups.
+
+DESeq2 performs multiple steps including:
+
+* estimating size factors to account for differences in library depth
+* estimating gene-wise dispersions to generate accurate estimates of within-group variation
+* shrinkage of dispersion estimates, which reduces false positives in the DGE analysis
+* hypothesis testing using the [Wald test](https://www.statisticshowto.com/wald-test/) or [Likelihood Ratio test](https://www.statisticshowto.com/likelihood-ratio-tests/)
+
+DESeq2 automatically removes outlier genes from the analysis using [Cook's distance](https://www.statisticshowto.com/cooks-distance/) and filters genes with low counts, which improves detection power by making the multiple testing adjustment of the p-values less severe. Refer to the [DESeq2 vignette](https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html) for a more detailed explanation, helpful suggestions, and examples.
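The size-factor estimation step above can be illustrated with a minimal Python sketch of DESeq2's median-of-ratios method. This is a conceptual toy, not the package's actual implementation: the 3-gene x 4-sample count matrix is invented, and real data contain tens of thousands of genes.

```python
# A minimal sketch of median-of-ratios size-factor estimation (the
# "estimating size factors" step). The count matrix is invented.
import math

counts = [  # rows = genes, columns = samples
    [100, 200, 400, 100],
    [50, 100, 200, 50],
    [30, 60, 120, 30],
]

def geomean(xs):
    # geometric mean of one gene across samples (the pseudo-reference)
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def median(xs):
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

refs = [geomean(row) for row in counts]
n_samples = len(counts[0])

# size factor per sample = median over genes of (count / pseudo-reference)
size_factors = [
    median([counts[g][j] / refs[g] for g in range(len(counts))])
    for j in range(n_samples)
]
print([round(sf, 3) for sf in size_factors])
```

Dividing each sample's counts by its size factor corrects for library-depth differences, which is what produces the normalized counts reported later.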
+
+Cavatica offers DESeq2 as a standalone public app, which consists of a [Common Workflow Language (CWL)](https://www.commonwl.org) wrapper around a script with functions from the DESeq2 package. In this lesson we learn to copy, edit, and set up the DESeq2 app in the project folder with the cancer data files.
+
+!!! info "Terminology"
+
+ * Count data - represents the number of sequence reads that originated from a particular gene
+ * Dispersion - a measure of spread or variability in the data. DESeq2 dispersion estimates are inversely related to the mean and directly related to variance
+ * LFC - log2 fold change
+
+## Step 1: Search & copy DESeq2 app
+
+!!! tip "Vidlets"
+
+    We recommend watching the vidlets first before following the step-by-step written instructions.
+
+The first step is to obtain a copy of the DESeq2 app in the project folder.
+
+ * Click the Apps tab which is currently empty and click :fontawesome-solid-plus: Add App button which opens the list of Public Apps.
+ * You can find the DESeq2 app by typing "DESEQ" in the search bar.
+ * In the DESeq2 app box select the Other versions drop down box and click on the version 1.18.1.
+ * This opens the app in a new tab where you can click on the :fontawesome-solid-ellipsis-h: on the right hand corner and click Copy.
+ * Select the project folder **`cancer-dge`** (or the project name you have chosen) and click Copy.
+ * Navigate to your project Dashboard using Projects drop down menu and view the app under the Apps tab. You can also click the project link in the popup box that appears on top of the page.
+
+
+
+
+## Step 2: Edit DESeq2 app (Optional)
+
+!!! important "DESeq2 App Version"
+
+ The IgnoreTxVersion bug was fixed in **Revision 17** of the DESeq2 1.18.1 app and will be the default selection when you copy the app. Follow the steps in this section if using older Revision versions of DESeq2 1.18.1 app.
+
+The DESeq2 app has a bug with the IgnoreTxVersion parameter that can be rectified by editing the app using the tool editor.
+
+* To do so, click on DESeq2 in the Apps tab. This opens the app page.
+* Click the Edit button on right hand upper corner which prompts a popup box with a warning message about losing update notifications for the original app. Click Proceed to editing.
+* In the DESeq2's tool editor, find the IgnoreTxVersion input port and click on it.
+* In the Value transform field of the port, click on **</>**, enter the following code and click Save.
+
+ ```
+    {
+        if ($job.inputs.ignoreTxVersion) {
+            return "TRUE"
+        } else {
+            return "FALSE"
+        }
+    }
+ ```
+
+* Click :fontawesome-regular-save: icon on the top right hand corner to add a revision note.
+* On the app page, the revision history is updated to read Revision 1.
+
+
+
+
+## Step 3: Obtain reference gene annotation
+
+A reference gene annotation file in GTF format is required by the DESeq2 app to summarize the transcript-level abundances contained in the [Kallisto](http://pachterlab.github.io/kallisto//releases/2017/03/20/v0.43.1) files for gene-level analysis. Internally, [tximport](https://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/tximport.html), another Bioconductor package, is used to obtain the gene-level summary.
+
+* Navigate to the Files tab and edit the metadata columns to show the Reference genome column. To do so, click on the :fontawesome-solid-columns: icon and select Reference genome. All files in this dataset used the GRCh38 (hg38) *Homo sapiens* genome assembly released by the Genome Reference Consortium.
+* Click on Data drop down menu and click on Public Reference Files.
+* This takes you to a new page for Public Files.
+* Click the Type: All button to open a drop-down list and select **GTF**.
+* From the results, select **Homo_sapiens.GRCh38.84.gtf** which is the ENSEMBL Release 84 version of the Human gene annotation in GTF format.
+* Click on Copy and select the project folder with the cancer files.
+* Select Copy in the popup window.
+* A notification menu will highlight the successful copy of the file and clicking on the project folder name will take you to the Files tab in folder.
+* Check for the reference file using the Type: All button and select **GTF**.
+
+
+
+In our next lesson, we will learn to edit our previously downloaded phenotype file and upload it to Cavatica!
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_6.md b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_6.md
new file mode 100644
index 000000000..038a5eba3
--- /dev/null
+++ b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_6.md
@@ -0,0 +1,85 @@
+---
+layout: page
+title: Phenotype File and Upload to Cavatica
+---
+
+One of the necessary input files for DGE analysis using DESeq2 is the **phenotype file** which lists the experimental groups and associated metadata for use in our design. You previously [downloaded the metadata manifest for the filtered files from Cavatica](./rna_seq_4.md#step-3-create-tags-download-filtered-dataset).
+In this lesson, you will learn to modify the phenotype data and upload it from your local computer back into Cavatica. Microsoft Excel is used in this lesson for data modification, but you can use any equivalent program to accomplish the same tasks.
+
+## Step 1: Change column order
+
+The DESeq2 app requires the **Phenotype data** input file in CSV format, with the Sample ID in the first column.
+
+* The default column order of the downloaded metadata manifest has **id** in the first column.
+
+![Default column order](../rna-seq-images/rna-seq-6-1.png "Default column order")
+
+* Rearrange the order by using cut/insert to move the **sample_id** to the first column.
+
+![Reorder column order](../rna-seq-images/rna-seq-6-2.png "Reorder column order")
+
+## Step 2: Convert age at diagnosis to intervals
+
+The unit for **age_at_diagnosis** is days, with a wide spread of values across the experimental groups, so it needs to be included in the design as a covariate. It is more meaningful to convert the continuous values of this variable into a small number of defined bins, in this case age ranges. Here, we split the ages into five-year bins capped at twenty years of age.
+
+* Create a new column **age_at_diagnosis_yrs** and enter the formula below. This column lists **age_at_diagnosis** in years, rounded to three decimal places.
+
+```
+=ROUND(N2/365,3)
+```
+
+![Age at diagnosis in years](../rna-seq-images/rna-seq-6-3.png "Age at diagnosis in years")
+
+* Sort the newly created column from largest to smallest value.
+    * One entry, corresponding to biospecimen BS_BA6AZWB3, was collected from a patient at 36.5 years of age, who is considered an adult.
+    * Since we are studying differences between pediatric cancer types, delete this entry from the table, leaving 49 rows.
+
+* Next, create a new column **diagnosis_age_range** and enter the formula below. This column converts the age at diagnosis in years to intervals spanning five years.
+
+```
+=LOOKUP(Z2,{0,5,10,15},{"0-5","5-10","10-15","15-20"})
+```
+
+![Diagnosis age range](../rna-seq-images/rna-seq-6-4.png "Diagnosis age range")
+
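If you prefer scripting to spreadsheets, the two formulas above can be sketched in Python. This is an assumed-equivalent sketch; the day values below are invented examples.

```python
# A Python sketch of the two spreadsheet formulas above: convert
# age_at_diagnosis from days to years, then bin into five-year ranges.
# The day values are invented examples.

def age_in_years(days):
    return round(days / 365, 3)  # mirrors =ROUND(N2/365,3)

def age_range(years):
    # mirrors =LOOKUP(...,{0,5,10,15},{"0-5","5-10","10-15","15-20"})
    for lower, label in [(15, "15-20"), (10, "10-15"), (5, "5-10"), (0, "0-5")]:
        if years >= lower:
            return label

for days in [730, 2555, 4380, 6570]:
    yrs = age_in_years(days)
    print(yrs, age_range(yrs))  # e.g. 2.0 0-5
```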
+!!! note "Column Names"
+
+ If you choose your own version of column names for the newly created columns, remember to substitute those names in the appropriate input files for the DESeq2 app in the following lesson.
+
+
+## Step 3: Upload phenotype file to Cavatica
+
+* Save and export the modified metadata file in CSV format to your local computer.
+
+!!! important
+
+    * It is crucial to name the CSV file **phenotype_filtered.csv** so the R scripts can run without errors. If you choose a different file name, be sure to update it in the R script before executing.
+
+    * Avoid empty rows in your exported CSV, as they may cause the DESeq2 app to error. Select only the rows and columns with values before using the File -> Save As option. You can also check the number of rows in the file on Cavatica after uploading to ensure you have 49 rows and 27 columns.
+
+* Access the Files tab in your project folder on Cavatica and click the :fontawesome-solid-plus: Add files.
+
+![Add files](../rna-seq-images/rna-seq-6-5.png "Add files")
+
+* Select Your Computer as source to add files. You can either browse files or Drag & drop files from your local system.
+
+![Choose Your Computer](../rna-seq-images/rna-seq-6-6.png "Choose Your Computer")
+
+* Click on Start upload to add the files to Cavatica.
+
+![Start upload](../rna-seq-images/rna-seq-6-7.png "Start upload")
+
+* A successful upload results in a popup box, with the Status updated to {==UPLOADED==}. Click the :fontawesome-solid-times: in the top right-hand corner to close it.
+
+![Successful upload](../rna-seq-images/rna-seq-6-8.png "Successful upload")
+
+* Navigate back to the Files tab in your project folder and select Type: CSV to confirm the upload.
+
+![Check upload](../rna-seq-images/rna-seq-6-9.png "Check upload")
+
+* Click on the file name to preview the contents of the phenotype file.
+
+![Preview csv file](../rna-seq-images/rna-seq-6-10.png "Preview csv file")
+
+
+We have completed all the initial setup necessary and are now ready to run the DESeq2 app for DGE analysis. Continue to the next lesson to learn more!
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_7.md b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_7.md
new file mode 100644
index 000000000..cc40a97f9
--- /dev/null
+++ b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_7.md
@@ -0,0 +1,103 @@
+---
+layout: page
+title: Analysis with DESeq2 Public App
+---
+
+## Step 1: Select inputs
+
+* Access the DESeq2 app under Apps.
+* Click :fontawesome-solid-play: Run to open the app task page.
+* Under Inputs, click the :fontawesome-solid-folder-open: Select files icon next to each data type.
+    * For Expression data, use the Type option to choose **TSV.GZ** files and subset using :fontawesome-solid-tags:Tags to select **DGE-FILTER-DATA**. Select all filtered files by clicking on :material-square-rounded-outline: in the left corner of the table and click Save selection.
+ * For Gene annotation, the files list is updated to show the **GTF** file. Choose the file and click Save selection.
+ * For Phenotype data, the file list is updated to show the **CSV** file. Choose the file and click Save selection.
+
+
+
+## Step 2: Update app settings & execute
+
+* Provide an **Analysis title**. In this lesson, **Cancer_DGE** was used as the title.
+* **Control variables** represent potential confounders in the data that need to be controlled for in the test for differential expression. You can add more than one variable for this field using the :fontawesome-solid-plus: button. In this tutorial, **tumor_location** and **diagnosis_age_range** are two metadata variables that contribute additional biological variability to the expression levels of the genes.
+* Input the column name from the uploaded phenotype file for **Covariate of interest** which captures the experimental groups we are interested in pairwise comparison. In this tutorial, **histology** designates the two different pediatric cancers that we wish to compare.
+* The default value for **FDR cutoff** is 0.1. Set the FDR (false discovery rate) to **0.05**, which means that we expect 5% of the genes called differentially expressed to be false positives.
+* **Factor level - reference** represents the denominator for the log2 fold change (LFC), i.e., the condition/group we compare against. Enter **Ependymoma** as the reference factor. Swapping the reference and test factor levels reverses the direction of the log fold change.
+* **Factor level - test** represents the numerator for the LFC. Enter **Medulloblastoma** as the test factor.
+* Select the **Quantification tool** used to calculate transcript abundance from the drop down menu. The expression data for our data were generated using **kallisto**.
+* **IgnoreTxVersion** corresponds to the `ignoreTxVersion` argument of the `tximport` package, which strips transcript version numbers so transcript IDs match the annotation. Set it to **True**.
+* DESeq2 allows shrinkage of the LFC, which uses information from all genes to generate accurate estimates. Although LFC shrinkage does not change the total number of genes identified as significantly differentially expressed, it is useful for downstream assessment of results. Set **log2 fold change shrinkage** to **True**.
+* Click :fontawesome-solid-play: Run on the right hand corner to initiate the analysis.
+
+!!! note "Default Settings"
+
+    We left the other fields in the app settings at the default `No value` setting.
+
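The FDR cutoff set above corresponds to the Benjamini-Hochberg procedure used for the adjusted p-values. A minimal Python sketch of the idea, with invented p-values (this is not the app's actual code):

```python
# A minimal sketch of the Benjamini-Hochberg procedure behind the FDR
# cutoff: find the largest rank k with p_(k) <= (k/m) * alpha, then call
# the k smallest p-values significant. The p-values are invented.

def bh_significant(pvalues, alpha=0.05):
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # smallest p first
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * alpha:
            k_max = rank
    return {order[r] for r in range(k_max)}  # indices deemed significant

pvals = [0.001, 0.012, 0.014, 0.04, 0.20, 0.74]
print(sorted(bh_significant(pvals)))  # → [0, 1, 2]
```

Note that a p-value can pass because of later ranks: 0.014 exceeds its own per-test threshold of 0.05/6 but is still called significant here, which is exactly how BH gains power over Bonferroni.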
+
+## Step 3: Explore analysis outputs
+
+Upon successful completion of the task, the label next to the task name updates to {==COMPLETED==}. The execution details, along with the Price and Duration for the task, are listed below the task name. For this lesson, the DESeq2 app took 36 minutes to complete with a total cost of $0.14.
+
+!!! info "Email notification"
+
+ An email is sent from The Seven Bridges Team to the email ID associated with your Cavatica account whenever a task starts and when the task is completed. Learn more about [managing the notifications for your project](https://docs.sevenbridges.com/docs/manage-email-notifications).
+
+The generated outputs are listed under the Outputs section:
+
+### DESeq2 analysis results
+
+This is a CSV file named {Analysis title}.out.csv, generated using the `results()` function of the DESeq2 package. It contains gene-level statistics.
+
+![DESeq2 results table](../rna-seq-images/rna-seq-7-1.png "DESeq2 results table")
+
+| Column Header | Description |
+| :--- | :-------- |
+| baseMean | mean of normalized counts for all samples|
+| log2FoldChange | log-ratio of a gene's expression values in two different conditions|
+| lfcSE | standard error |
+| stat | Wald statistic |
+| pvalue | Wald test p-value |
+| padj | [Benjamini-Hochberg](https://www.statisticshowto.com/benjamini-hochberg-procedure/) adjusted p-value |
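
Once downloaded, the results CSV can be filtered for significant genes in a few lines. A hypothetical Python sketch: the rows are invented, and the column names follow the table above.

```python
# Hypothetical sketch of filtering the downloaded {Analysis title}.out.csv
# for significant genes (padj < 0.05). The rows are invented; DESeq2
# reports NA padj for filtered/outlier genes, which we skip.
import csv
import io

results_csv = """gene,baseMean,log2FoldChange,lfcSE,stat,pvalue,padj
GENE_A,1500.2,2.5,0.3,8.3,1e-16,1e-14
GENE_B,80.5,-0.1,0.2,-0.5,0.62,0.71
GENE_C,420.0,-1.8,0.4,-4.5,7e-6,3e-4
GENE_D,5.1,0.9,0.8,1.1,0.27,NA
"""

significant = []
for row in csv.DictReader(io.StringIO(results_csv)):
    if row["padj"] != "NA" and float(row["padj"]) < 0.05:
        significant.append(row["gene"])

print(significant)  # → ['GENE_A', 'GENE_C']
```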
+
+### HTML report
+
+The file with name {Analysis title}.{deseq2_app_version}.summary_report.b64html is a summary report. This report contains information on the inputs, plots from exploratory analysis, details of the DGE analysis along with the R Session info which includes a list of all the packages along with the version number for reproducibility.
+
+![DESeq2 report](../rna-seq-images/rna-seq-7-2.png "DESeq2 report")
+
+One of the plots under the exploratory analysis section is the principal component analysis (PCA) plot based on the expression values. PCA is a technique used to emphasize variation and highlight patterns in a dataset. To learn more, we encourage you to explore [StatQuest's video on PCA](https://www.youtube.com/watch?v=_UVHneBUBW0&list=PLblh5JKOoLUJo2Q6xK4tZElbIvAACEykp&index=22).
+
+In the dataset used in this analysis, the separation of the data along the x-axis (PC1) is greater than along the y-axis (PC2), indicating that the between-group variation is greater than the within-group variation.
+
+![PCA plot](../rna-seq-images/rna-seq-7-3.png "PCA plot")
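
For intuition, PCA on an expression matrix can be sketched in a few lines of Python with NumPy. This is an illustrative toy, not the app's internal R code; the 4-sample x 3-gene matrix of values is invented.

```python
# An illustrative PCA sketch: center a samples-by-genes matrix and project
# onto principal components via SVD. The matrix below is invented.
import numpy as np

X = np.array([
    [2.0, 0.0, 1.0],
    [4.0, 0.2, 1.1],
    [6.0, 0.1, 0.9],
    [8.0, 0.3, 1.0],
])

Xc = X - X.mean(axis=0)                  # center each gene
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pcs = Xc @ Vt.T                          # sample coordinates on PC1, PC2, ...
var_explained = S**2 / np.sum(S**2)      # fraction of variance per PC
print(var_explained.round(3))            # PC1 dominates for this toy matrix
```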
+
+A summary of the DGE analysis indicates that 10,830 genes are upregulated and 8,591 genes are downregulated in Medulloblastoma when compared to Ependymoma pediatric cancer.
+
+![Analysis summary](../rna-seq-images/rna-seq-7-4.png "Analysis summary")
+
+These results are visualized in a MA plot which shows the mean of the normalized counts versus the LFC for all genes tested. The red colored dots represent genes that are significantly differentially expressed between the two cancer types.
+
+![MA plot](../rna-seq-images/rna-seq-7-5.png "MA plot")
+
+### Normalized counts
+
+This is a TXT file named {Analysis title}.raw_counts.txt. It contains the counts normalized using the estimated sample-specific normalization factors.
+
+![Normalized counts](../rna-seq-images/rna-seq-7-6.png "Normalized counts")
+
+### RData files
+
+This is an R workspace image named {Analysis title}.env.RData. It contains all the app-defined objects, including vectors, matrices, data frames, lists, and functions, from the R working environment.
+
+## Step 4: Tag & download analysis outputs
+
+You can easily tag these files and download them to your local computer. You can also click on each file to preview its content on Cavatica.
+
+* Navigate to the Files tab.
+* Use the Type drop-down menu to select B64HTML, CSV, RDATA, and TXT.
+* Select all files with {Analysis title} in the name.
+* Click on :fontawesome-solid-tags:Tags, add a new tag, and click Apply. Here **DESeq2-Output** was used as the tag name.
+* Click :fontawesome-solid-download: Download to obtain a local copy of the files. The files will be downloaded to your computer's default download location (e.g., Downloads on macOS).
+
+
+
+In the next lesson, we will learn the second approach: using an RStudio computational environment to perform DGE analysis!
diff --git a/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_8.md b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_8.md
new file mode 100644
index 000000000..95c2b106c
--- /dev/null
+++ b/docs/Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_8.md
@@ -0,0 +1,165 @@
+---
+layout: page
+title: Analysis using Data Cruncher
+---
+
+So far we have explored running DGE analysis using a public app based on DESeq2. In the second approach, we will set up an interactive analysis on an instance running the RStudio computational environment. We will run a DGE workflow using an analysis script, and generate reports and plots.
+
+!!! note "DGE Tools"
+
+    While there are several established tools for DGE analysis, including [DESeq2](https://bioconductor.org/packages/release/bioc/html/DESeq2.html), [EdgeR](https://bioconductor.org/packages/release/bioc/html/edgeR.html), and [Limma-Voom](https://genomebiology.biomedcentral.com/articles/10.1186/gb-2014-15-2-r29), we will use DESeq2 in our script so that you can compare the output between the two approaches.
+
+
+## Step 1: Starting Data Cruncher
+
+* Click the Interactive Analysis tab located in the right-hand corner below your account settings menu.
+* Select Open in the Data Cruncher panel.
+* Click on Create your first analysis, which appears the first time you are setting up.
+* In the popup box, select RStudio for Environment. Provide an analysis name in the box. Here, **Cancer_DGE** was used to title the analysis. Click Next when done.
+* In the `Compute requirements` tab, we will use the default instance type (c5.2xlarge, $0.49/hr). We increase the `Suspend time`, which is the period of inactivity after which the instance is stopped automatically and the analysis is saved, from 30 to 60 minutes.
+* Click Start the analysis. This initializes the analysis, which involves setting up the instance and preparing the analysis environment.
+
+
+
+!!! info "Instance Types"
+
+ You can find details on all available US instances from Amazon Web Services (AWS) on [Cavatica's Platform Documentation](https://docs.sevenbridges.com/docs/list-of-available-amazon-web-services-instances).
+
+## Step 2: Navigate the analysis editor and load the script
+
+After the instance is initialized, you will be automatically directed to the analysis editor, which in this case is the RStudio interface.
+
+!!! info "RStudio IDE"
+
+    [Read more about the different panes and options of the RStudio interface](https://georgejmount.com/tourofrstudio/), the integrated development environment (IDE) for the R programming language.
+
+### Directory structure
+
+The editor is associated with a directory structure to help you navigate the working space. You can access it via the Files/Packages/Plots/Help/Viewer pane in the bottom right-hand corner of RStudio.
+
+```
+/sbgenomics
+├── output-files
+├── project-files
+├── projects
+└── workspace
+```
+
+!!! important "Important"
+
+    The `project-files` directory, which contains all the input files, is a read-only file system, while you have read-write permissions for the `workspace` and `output-files` directories.
+
+* **workspace** is the default working directory for the analysis. You can use the RStudio Upload option to get files from your local computer into the workspace.
+* **output-files** can be used as the directory to save all the outputs from your analysis. If not specified, the files are saved to the workspace.
+* **project-files** is the directory containing all the input files from the current project. Since it is a read-only file system, no changes can be made to these files via the editor interface.
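For scripts that run inside Data Cruncher, the convention above can be captured in a small helper. This is a hypothetical Python sketch (the tutorial's actual analysis scripts are written in R) that prefers `/sbgenomics/output-files` when it exists and falls back to the current directory, e.g. when testing locally; the `output_dir` helper and `results.txt` filename are inventions for illustration.

```python
import os

# Hypothetical helper: write results to the platform's output directory
# when present, otherwise fall back to the current working directory.
def output_dir(preferred="/sbgenomics/output-files"):
    return preferred if os.path.isdir(preferred) else os.getcwd()

out_path = os.path.join(output_dir(), "results.txt")
with open(out_path, "w") as fh:
    fh.write("analysis results\n")
print(out_path)
```

Writing results under `output-files` (rather than the read-only `project-files`) keeps them cleanly separated from the inputs when the session is saved.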
+
+
+
+
+### Session outputs
+
+The generated output and environment files from an active session are saved when the analysis is stopped by clicking :fontawesome-solid-stop: Stop, located in the top right-hand corner. You can access the session files via the Files tab in your project folder.
+
+The Data Cruncher comes with a set of pre-installed libraries, which vary depending on the environment you chose during setup. We chose the default RStudio environment, `SB Bioinformatics - R 4.0`, which is loaded with a [set of CRAN and Bioconductor libraries](https://docs.sevenbridges.com/docs/about-libraries-in-a-data-cruncher-analysis).
+
+!!! important "Installing additional libraries"
+
+    Although the output files, environment, and history of the session are saved when the analysis editor is stopped, any installed libraries persist only for the session and must be re-installed each time the instance is restarted.
+
+
+## Step 3: Run analysis script
+
+You will need to download an analysis script for this step. We provide two versions of the analysis script, based on how you choose to execute it in RStudio. Click on your preferred option and save the file:
+
+ (a) [version to execute automatically using `Source`](./rna-seq-supporting-docs/Cancer_DGE_Analysis_Automate.R)
+ (b) [version to execute the code in chunks using the `Run` option](./rna-seq-supporting-docs/Cancer_DGE_Analysis.R).
+
+The (b) version of the script is run manually and contains some additional packages and lines of code to allow for interactive exploration of the data prior to analysis. The DGE analysis and all the generated output are otherwise identical between the two versions.
+
+Upload the script file to the **workspace** directory. View the upload steps in the [vidlet](#upload). Briefly:
+
+    * Click on the Upload option in the Files/Packages/Plots/Help/Viewer pane.
+    * Click Choose File to select the file from your local computer.
+    * Once uploaded, click on the script file name to open it in the script editor pane (top left-hand corner).
+    * To execute, go to **[Step 3a](#step3a)** if you chose the (a) version or **[Step 3b](#step3b)** if you chose (b).
+
+!!! important "Phenotype File Name"
+
+    For the scripts to run error-free, ensure that the name of the phenotype CSV file is [**"phenotype_filtered.csv"**](./rna_seq_6.md#step-3-upload-phenotype-file-to-cavatica). If your CSV file has a different name, update the R script before execution.
+
+### Step 3a: Execute using `Source` version
+
+To get started, click on the down arrow next to Source and select Source with Echo. This will echo the code, including comments, as it is executed.
+
+![Source with Echo](../rna-seq-images/rna-seq-8-1.png "Source with Echo")
+
+This process will take about 15-20 minutes. Once completed, you will get a popup window asking whether to open the HTML report. Click Try Again to open the report in a new tab.
+
+![Popup window](../rna-seq-images/rna-seq-8-2.png "Popup window")
+
+Alternatively, you can click Cancel in the popup window and subsequently click :fontawesome-solid-stop: Stop to view the files in your project folder.
+
+![Stop analysis](../rna-seq-images/rna-seq-8-3.png "Stop analysis")
+
+For cost and time comparison between the two approaches, we ran the automated version, choosing the option to view the output files in the project folder; it took 25 minutes to run and cost $0.20. You are now ready to view your output. Go to **[Step 4](#step4)**.
+
+
+### Step 3b: Execute using `Run` version
+
+You can also execute the code by selecting one or more lines of code and clicking the Run option or pressing ++ctrl+enter++. This gives you greater flexibility to explore and understand the output of each line of code.
+
+The first step is installing the packages necessary for DGE analysis, which takes approximately 17 minutes. Highlight the package install section as shown in the image below and click Run.
+
+![Use Run](../rna-seq-images/rna-seq-8-4.png "Use Run")
+
+This version includes the Bioconductor package [`pcaExplorer`](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2879-1), which provides interactive visualization of RNAseq datasets based on [Principal Components Analysis](https://www.youtube.com/watch?v=_UVHneBUBW0&list=PLblh5JKOoLUJo2Q6xK4tZElbIvAACEykp&index=22).
+
+After running **lines 1-99** of the R script, you should see an interactive output from the `pcaExplorer()` command. Watch the video below to learn how to use `pcaExplorer` for the filtered cancer dataset.
+
+
+
+When you are finished running the R script, click :fontawesome-solid-stop: Stop to view the output files in your Cavatica project folder.
+
+!!! bug "Login Timeout"
+
+    It is possible to be logged out of Cavatica despite having an active RStudio session. If this occurs, you will be unable to stop the analysis from within the editor using :fontawesome-solid-stop: Stop.
+
+    * Log in to Cavatica in a new tab or window.
+    * Navigate to the Data Cruncher session via either the Interactive Analysis tab or the `ANALYSES` pane on your project home page.
+
+    ![Data Cruncher quick access](../rna-seq-images/rna-seq-8-5.png "Data Cruncher quick access"){: width=70%}
+
+    * Click :fontawesome-solid-stop: Stop on the session page.
+
+## Step 4: View output files
+
+All the session files and the generated outputs are saved after the analysis is stopped and are accessible on the session page.
+
+![Output Files](../rna-seq-images/rna-seq-8-6.png "Output Files")
+
+The tag for the session changes from {==RUNNING==} to {==SAVED==}. Similar to the DESeq2 app, four output files are generated:
+
+* **Cancer_DESeq2_DGE_results.csv** contains the ordered table of gene-level statistics generated using the `results()` function in the DESeq2 package.
+* **Cancer_DESeq2_normalized_counts.txt** contains counts normalized using the estimated sample-specific normalization factors.
+* **DESeq2-Report** folder which contains the HTML report generated using [regionReport](https://f1000research.com/articles/4-105/v2). The report contains all the visualizations along with the associated code from the [DESeq2 vignette](https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#how-do-i-use-vst-or-rlog-data-for-differential-testing).
+* **Cancer_DGE_{Date}.env.RData** is the R workspace image that includes all the objects and variables generated from the code. The `.RData` listed under Workspace is saved by default by the Data Cruncher.
+
+!!! info "Output Differences"
+
+    Although the DGE results are the same between the two analysis approaches, there are some differences between the two `html` reports, since they are not generated by exactly the same code. The MA plots generated using Data Cruncher use blue to signify significant genes, and the counts plot uses points instead of bars.
+
+All the files are clickable for preview on Cavatica. You can either download individual files by clicking on the file name or follow the steps to [tag and download the files listed in the Analysis with DESeq2 Public App](./rna_seq_7.md#step-4-tag-download-analysis-outputs) lesson.
+
+## Conclusion
+
+This concludes the RNAseq on Cavatica tutorial. We hope that you found the tutorial helpful and will continue to use cloud computing for your analysis!
+
+!!! note "Key Points"
+
+    * The [Kids First Portal](https://kidsfirstdrc.org) is the go-to resource for pediatric cancer & structural birth defects datasets.
+    * Go from data to analysis in one step using [Cavatica](https://cavatica.sbgenomics.com), the cloud-based analysis platform integrated into the Kids First Portal.
+    * You can filter, view, and download data from Cavatica.
+    * You can upload data to Cavatica from multiple sources, including your local machine.
+    * You can search, copy, and modify a public app on Cavatica.
+    * You can set up and successfully run the DESeq2 app by choosing appropriate inputs.
+    * You can set up a virtual computational environment running RStudio and analyze data by executing code from a script.
diff --git a/docs/Bioinformatics-Skills/Snakemake/index.md b/docs/Bioinformatics-Skills/Snakemake/index.md
index 688c84daa..527429772 100644
--- a/docs/Bioinformatics-Skills/Snakemake/index.md
+++ b/docs/Bioinformatics-Skills/Snakemake/index.md
@@ -1,9 +1,12 @@
---
layout: page
title: Snakemake Overview
+hide:
+ - toc
---
-**An introduction to Snakemake for workflow management**
+An introduction to Snakemake for workflow management
+============================================
Workflow management systems help to automate analyses and make them easier to maintain, reproduce, and share with others. In this tutorial, we will walk through the basic steps for creating a [variant calling](https://www.ebi.ac.uk/training-beta/online/courses/human-genetic-variation-introduction/variant-identification-and-analysis/) workflow with the Snakemake workflow management system.
diff --git a/docs/Bioinformatics-Skills/Snakemake/snakemake_0.md b/docs/Bioinformatics-Skills/Snakemake/snakemake_0.md
index 4b07bfdaa..330288e4a 100644
--- a/docs/Bioinformatics-Skills/Snakemake/snakemake_0.md
+++ b/docs/Bioinformatics-Skills/Snakemake/snakemake_0.md
@@ -10,6 +10,6 @@ Workflow systems help you automate and manage the inputs, outputs, and commands
[Snakemake](https://snakemake.readthedocs.io/en/stable/) is a Python-based workflow system ([see 2012 publication](https://academic.oup.com/bioinformatics/article/28/19/2520/290322)). The name "Snakemake" comes from the fact that it's written in (and can be extended by) the Python programming language.
-Snakemake works by looking at a file, called a "Snakefile", that contains rules for creating output files. Generally, each rule is defined as a step in the workflow. Snakemake uses the rules and command line options to figure how the rules relate to each other so it can manage the workflow steps.
+Snakemake works by looking at a file, called a "Snakefile", that contains rules for creating output files. Generally, each rule is defined as a step in the workflow. Snakemake uses the rules and command line options to figure out how the rules relate to each other so it can manage the workflow steps.
Let's get started!
diff --git a/docs/Bioinformatics-Skills/index.md b/docs/Bioinformatics-Skills/index.md
index ecfd8ac16..ee109cbd2 100644
--- a/docs/Bioinformatics-Skills/index.md
+++ b/docs/Bioinformatics-Skills/index.md
@@ -6,7 +6,7 @@ title: Overview
Bioinformatics Skills
=======================
-Tutorials in this section provide lessons for finding data sets, setting up compute environments for analysis, and running hands-on bioinformatics analyses.
+Tutorials in this section provide lessons for finding data sets, setting up compute environments for analysis, and running hands-on bioinformatics analyses.
Common Fund Data - Finding Datasets for Analysis:
@@ -17,6 +17,7 @@ Software and Compute Set up for Analysis:
- [Set up Conda Computing Environment](install_conda_tutorial.md)
- [Introduction to Amazon Web Services](Introduction_to_Amazon_Web_Services/introtoaws1.md)
+- [Introduction to Google Cloud Platform](Introduction-to-GCP/index.md)
Bioinformatics Analysis:
@@ -24,4 +25,4 @@ Bioinformatics Analysis:
- [GWAS in the Cloud](GWAS-in-the-cloud/index.md)
- [Snakemake Workflow Management](Snakemake/index.md)
- [Simulate Illumina Reads](Simulate_Illumina_Reads.md)
-
+- [RNAseq on Cavatica](RNAseq-on-Cavatica/rna_seq_1.md)
diff --git a/docs/Bioinformatics-Skills/install_conda_tutorial.md b/docs/Bioinformatics-Skills/install_conda_tutorial.md
index 298f45fb0..4b9d4d9b7 100644
--- a/docs/Bioinformatics-Skills/install_conda_tutorial.md
+++ b/docs/Bioinformatics-Skills/install_conda_tutorial.md
@@ -94,10 +94,8 @@ Check the version of your new conda installation:
Conda uses channels to look for available software installations. These are some good channels to set up:
```
+conda config --add channels defaults
conda config --add channels bioconda
-```
-
-```
conda config --add channels conda-forge
```
@@ -109,7 +107,7 @@ There is always a `(base)` conda environment. You can then create new environmen
conda create -n
```
-This takes a few minutes (you'll see the message "Solving environment"). Conda will then ask you to confirm the location of the new environment. Type `y`.
+This takes a few minutes (you'll see the message "Solving environment"). Conda will then ask you to confirm the location of the new environment. Type ++y++.
More options to customize the environment are documented under the help page for this command: `conda create -h`.
@@ -143,7 +141,7 @@ The basic command for installing packages is:
conda install -y
```
-It will ask if you want to install dependencies. Type `y`. This command will show a list of the software installed in this environment:
+It will ask if you want to install dependencies. Type ++y++. This command will show a list of the software installed in this environment:
```
conda list -n
diff --git a/docs/CFDE-Internal-Training/.pages b/docs/CFDE-Internal-Training/.pages
index cd43df667..92c0d0d1c 100644
--- a/docs/CFDE-Internal-Training/.pages
+++ b/docs/CFDE-Internal-Training/.pages
@@ -3,4 +3,5 @@ nav:
- Website-Style-Guide
- cfdebot_website_editing.md
- Identifying MIME Types: MIME-type
- - ProtectedBranch_HowTo.md
+ - ProtectedBranch_HowTo.md
+ - github_auth_setup.md
diff --git a/docs/CFDE-Internal-Training/MIME-type/index.md b/docs/CFDE-Internal-Training/MIME-type/index.md
index aa52bf75a..2baa91b12 100644
--- a/docs/CFDE-Internal-Training/MIME-type/index.md
+++ b/docs/CFDE-Internal-Training/MIME-type/index.md
@@ -1,9 +1,12 @@
---
layout: page
title: MIME type Overview
+hide:
+ - toc
---
-**An introduction to MIME types for file formats**
+An Introduction to MIME types for File Formats
+=================================================
A MIME type or media type is a form of identification for file formats and contents transmitted over the internet. It is useful to specify the data identification label of a file to allow software to properly interpret and render the data. This is especially important for Common Fund (CF) programs who may undertake data transfers over the internet and thus, have to ensure the data integrity along with data formats for a successful transfer. In this tutorial, we will describe how to determine MIME type for single and multiple files, and create custom MIME types specific to the file format.
diff --git a/docs/CFDE-Internal-Training/Website-Style-Guide/0index.md b/docs/CFDE-Internal-Training/Website-Style-Guide/0index.md
index f3904b2fa..5395f0d6a 100644
--- a/docs/CFDE-Internal-Training/Website-Style-Guide/0index.md
+++ b/docs/CFDE-Internal-Training/Website-Style-Guide/0index.md
@@ -1,4 +1,12 @@
-# Contributing to the nih-cfde Training and Engagement website
+---
+layout: page
+title: Website Style Guide
+hide:
+ - toc
+---
+
+Contributing to the nih-cfde Training and Engagement website
+===============================================================
This is a style guide for content on the CFDE [training website](https://cfde-training-and-engagement.readthedocs-hosted.com/en/latest/).
@@ -14,11 +22,11 @@ Time | Section | About
=== "Prerequisites"
- To contribute to the CFDE's website, you must be onboarded to the CFDE. Contact us for help if you're interested in contributing training materials!
+    To contribute to the CFDE's website, you must be onboarded to the CFDE. Contact us for help if you're interested in contributing training materials!
=== "Resources"
- - [Tutorial template markdown doc](./tutorial_template_docs/TutorialTemplate.md)
+ - [Get started with new tutorials using the tutorial template markdown doc](https://github.com/nih-cfde/training-and-engagement/blob/dev/docs/CFDE-Internal-Training/Website-Style-Guide/tutorial_template_docs/TutorialTemplate.md)
- [Powerpoint slide template](https://drive.google.com/drive/u/0/folders/14dOaf7-G4k7rCw5mL2Q5jdRWXrO0Y5i-)
diff --git a/docs/CFDE-Internal-Training/Website-Style-Guide/3TutorialComponents.md b/docs/CFDE-Internal-Training/Website-Style-Guide/3TutorialComponents.md
index 51c920df9..2ec1e1242 100644
--- a/docs/CFDE-Internal-Training/Website-Style-Guide/3TutorialComponents.md
+++ b/docs/CFDE-Internal-Training/Website-Style-Guide/3TutorialComponents.md
@@ -28,7 +28,11 @@ Tutorials should consist primarily of original content. If lesson material is ad
## Tutorial structure
-All tutorials should begin with landing page information. For longer tutorials that are split over multiple pages, start the tutorial steps on a new page (more details below). See the [tutorial template](./tutorial_template_docs/TutorialTemplate.md) for a page outline. For one-page tutorials, the landing "page" information may all be on the same page as the tutorial steps.
+All tutorials should begin with landing page information. For longer tutorials that are split over multiple pages, start the tutorial steps on a new page (more details below). For one-page tutorials, the landing "page" information may all be on the same page as the tutorial steps.
+
+!!! tip
+
+ See the [markdown tutorial template](https://github.com/nih-cfde/training-and-engagement/blob/dev/docs/CFDE-Internal-Training/Website-Style-Guide/tutorial_template_docs/TutorialTemplate.md) for a page outline. You can copy/paste the markdown to start new tutorials.
### Tutorial landing page components
diff --git a/docs/CFDE-Internal-Training/Website-Style-Guide/tutorial_template_docs/TutorialTemplate.md b/docs/CFDE-Internal-Training/Website-Style-Guide/tutorial_template_docs/TutorialTemplate.md
index 857e9373f..9909223bb 100644
--- a/docs/CFDE-Internal-Training/Website-Style-Guide/tutorial_template_docs/TutorialTemplate.md
+++ b/docs/CFDE-Internal-Training/Website-Style-Guide/tutorial_template_docs/TutorialTemplate.md
@@ -3,7 +3,7 @@ layout: page
title: Overview
---
-Note, overview/landing pages should have the yaml file header, e.g.,:
+Note, overview/landing pages should start with the yaml file header, e.g.,:
```
---
layout: page
diff --git a/docs/CFDE-Internal-Training/cfdebot_website_editing.md b/docs/CFDE-Internal-Training/cfdebot_website_editing.md
index 5579b2435..052af446d 100644
--- a/docs/CFDE-Internal-Training/cfdebot_website_editing.md
+++ b/docs/CFDE-Internal-Training/cfdebot_website_editing.md
@@ -70,11 +70,11 @@ The website created by the `published-documentation` repo pulls some docs that a
Follow the general steps above, with the following additional steps:
- Push your changes to the `preview` branch first to check the rendered website.
-- If the changes look as you expected, make a PR of your branch to `dev` and tag the admin team (@ACharbonneau and @marisalim), who will check the changes and approve. Approved changes will periodically be promoted to the `stable` branch to be rendered on the public website.
+- If the changes look as you expected, make a PR of your branch to `dev` and tag the admin team (@ACharbonneau or @marisalim), who will check the changes and approve. Approved changes will periodically be promoted to the `stable` branch to be rendered on the public website.
### B) **To edit documents that are in the sub-module repos**
-The cfde-bot's process for checking changes to the sub-module repos (`the-fair-cookbook` and `specifications-and-documentation`) is slightly different:
+The cfde-bot's process for checking changes to the sub-module repos (`the-fair-cookbook` and `c2m2`) is slightly different:
- The `published-documentation` cfde-bot checks hourly for changes to the sub-module repo's `master` branch. Thus, changes should be made directly in these repositories.
@@ -89,7 +89,7 @@ The cfde-bot's process for checking changes to the sub-module repos (`the-fair-c
Reminder: you must be onboarded to the CFDE to edit these repositories:
- `the-fair-cookbook` repo: [https://github.com/nih-cfde/the-fair-cookbook](https://github.com/nih-cfde/the-fair-cookbook)
-- `specifications-and-documentation` repo: [https://github.com/nih-cfde/specifications-and-documentation](https://github.com/nih-cfde/specifications-and-documentation)
+- `c2m2` repo: [https://github.com/nih-cfde/c2m2](https://github.com/nih-cfde/c2m2)
#### Step 2: Make changes *directly* on the `master` branch
@@ -102,7 +102,7 @@ The bot will automatically create preview branches (`update--preview`) if
If the website build checks all pass, the bot will then automatically merge:
- `update-fair-preview` into `cookbookpreview`, and will build a preview site for you to browse at: [https://cfde-published-documentation.readthedocs-hosted.com/en/cookbookpreview/](https://cfde-published-documentation.readthedocs-hosted.com/en/cookbookpreview/)
-- `update-specsdocs-preview` into `specspreview`, and will build a preview site for you to browse at: [https://cfde-published-documentation.readthedocs-hosted.com/en/specspreview/](https://cfde-published-documentation.readthedocs-hosted.com/en/specspreview/)
+- `update-c2m2-preview` into `c2m2preview`, and will build a preview site for you to browse at: [https://cfde-published-documentation.readthedocs-hosted.com/en/c2m2preview/](https://cfde-published-documentation.readthedocs-hosted.com/en/c2m2preview/)
#### Step 5: Publishing your changes
diff --git a/docs/CFDE-Internal-Training/github_auth_setup.md b/docs/CFDE-Internal-Training/github_auth_setup.md
new file mode 100644
index 000000000..1f5cbbfef
--- /dev/null
+++ b/docs/CFDE-Internal-Training/github_auth_setup.md
@@ -0,0 +1,147 @@
+# Setting up Github Authentication
+
+
+By mid-2021, Github will complete its [transition](https://github.blog/2020-12-15-token-authentication-requirements-for-git-operations/) to requiring a personal access token (PAT) key instead of a password to connect to Github remotely (e.g., using `git` on your local computer to work on remote branches).
+
+In this tutorial, we will show you how to enable two-factor authentication (optional) and generate a PAT.
+
+!!! note "Learning Objectives"
+
+ - learn how to set up two-factor authentication
+ - learn how to set up a personal access token
+
+=== "Est. Time"
+
+ 30 mins
+
+=== "Prerequisites"
+
+ - GitHub account
+ - Git installed on your computer
+ - Access to a Unix shell
+ - Basic command line skills
+
+=== "Tutorial Resources"
+
+ - [Github documentation on two-factor authentication](https://docs.github.com/en/free-pro-team@latest/github/authenticating-to-github/configuring-two-factor-authentication)
+ - [Github documentation on personal access token](https://docs.github.com/en/free-pro-team@latest/github/authenticating-to-github/creating-a-personal-access-token)
+ - [Github documentation on updating credentials](https://docs.github.com/en/free-pro-team@latest/github/using-git/updating-credentials-from-the-macos-keychain)
+
+### Step 1: Go to Github account settings
+
+- Click on Settings in the drop-down menu under your Github profile picture at the top right.
+
+### Step 2a: Set up two-factor authentication
+
+While this step is optional, it is a good security measure to protect your account.
+
+- Click on Account security. On this [page](https://github.com/settings/security), scroll past the change password section to the two-factor authentication section.
+
+![](./images-github-auth/0-account-security.png "account security tab")
+
+- Click Enable two-factor authentication.
+
+![](./images-github-auth/1-two-factor-auth.png "enable two factor auth button")
+
+### Step 2b: Choose how to receive codes
+
+There are two options for receiving the two-factor authentication code.
+
+![](./images-github-auth/2-two-factor-auth-phone-set-up.png "set up phone")
+
+The recommended method is to generate the code with a phone app, such as Authy, 1Password, or LastPass Authenticator. The Duo Security app also works. For this option, click Set up using an app.
+
+The second option is to receive the code via text message to your phone. This option is only available in certain countries. For detailed steps on this method, see the Github [documentation](https://docs.github.com/en/free-pro-team@latest/github/authenticating-to-github/configuring-two-factor-authentication#configuring-two-factor-authentication-using-text-messages).
+
+### Step 2c: Save recovery codes
+
+The next page will show a series of recovery codes; you will need these codes to regain access to your account if you ever lose it. Download, print, or copy these codes to a safe place, then click Next.
+
+![](./images-github-auth/3-save-recovery-codes.png "save recovery codes")
+
+### Step 2d: Enable two-factor authentication
+
+If you chose to set up two-factor authentication with a phone app, open the app and scan the QR code. Enter the six-digit code from the app in the text box below the QR code on Github. After you click Enable, the two-factor authentication setup is complete!
+
+You can test it by logging out of Github and logging back in - the phone app will provide a six-digit code to enter as part of login.
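For the curious: authenticator apps generate these six-digit codes locally using the TOTP algorithm (RFC 6238) from the shared secret embedded in the QR code. The standard-library Python sketch below is purely illustrative and not required for this tutorial; it is checked against the published RFC 6238 test vector.

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32, t=None, digits=6, step=30):
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for a base32 secret."""
    key = base64.b32decode(secret_b32.upper())
    counter = int((time.time() if t is None else t) // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = (struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF) % (10 ** digits)
    return str(code).zfill(digits)

# RFC 6238 test vector: secret "12345678901234567890", time 59s, 8 digits
secret = base64.b32encode(b"12345678901234567890").decode()
print(totp(secret, t=59, digits=8))  # → 94287082
```

Because the code is derived only from the secret and the current time window, Github and your phone can agree on it without your phone ever contacting Github.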
+
+### Step 3a: Generate a PAT
+
+Navigate to Developer settings located on the left panel of Account settings.
+
+![](./images-github-auth/0-developer-settings.png "developer settings tab")
+
+This will take you to a new page. On the left panel, click on Personal access tokens.
+
+Click on Generate new token. Give it a name in the **Note** text box - this can be a nickname to help you remember what the token is for and when it was created.
+
+Scopes set the permissions granted to the token for the various functionality of a repo. To set the scope for your user account, check the box next to **repo** and select all the tasks pertaining to a private repo that apply.
+
+![](./images-github-auth/4-generate-pat.png "Generate new token")
+
+!!! info "Update Scope"
+
+    You can run into an OAuth error if the original PAT doesn't include the correct scope - for example, you may want to include **workflow** in your scope to edit workflow files remotely:
+
+ > refusing to allow a Personal Access Token to create or update workflow `....` without workflow scope
+
+    To update the scopes associated with your PAT:
+
+    - generate a new PAT key with the updated repo scopes
+    - delete the GitHub credentials in Keychain (on macOS) or in the Git Credential Manager (on Windows)
+    - update the git credentials ([Step 3b](#updatekeychain))
+
+    Alternatively, you can use the [**Git Credential Manager Core**](https://github.com/microsoft/Git-Credential-Manager-Core), a cross-platform git credential helper that will request the correct scopes.
+
+Then scroll down and click Generate token.
+
+!!! warning
+
+ Be sure to save the token somewhere safe (e.g., password manager). After you leave this page, the token will no longer be viewable.
+
+The token is a string of letters and numbers and appears in the green box, just below the blue box warning you to make a copy of the token. **Keep this page open - we will need to use the PAT key instead of our password to log in at the command line.**
+
+![](./images-github-auth/5-personal-access-token.png "new token")
+
+### Step 3b: Update keychain with PAT
+
+If you have saved your Github password with a password manager (e.g., `osxkeychain` on macOS) to work on Github repositories remotely, it needs to be updated with the PAT we generated. If your Github password is not managed by a password manager, continue to [Step 3c](#enterPAT).
+
+!!! note
+
+    If you normally enter your user name and password when you `git push` local changes to Github, you'll need to enter the PAT key instead of your password.
+
+From the terminal, check whether the `credential.helper` is set on your `git` configurations:
+
+=== "Input"
+
+ ```
+ git config --list
+ ```
+
+=== "Expected Output"
+
+    On MacOS, it may show:
+ ```
+ credential.helper=osxkeychain
+ ```
+
+In this example, we will delete the saved password from `osxkeychain`, so that it can be updated with the PAT key. Type ++enter++ after each of the commands below at the terminal. After entering `protocol=https` you need to press ++enter++ **twice**. If the commands are successful, there should be no output in the terminal.
+
+```
+git credential-osxkeychain erase
+host=github.com
+protocol=https
+```
+
+### Step 3c: Enter PAT as password
+
+The next time you `git push` changes from your local computer to a remote Github repository, enter your user name and the PAT key from [Step 3a](#generatePAT) as the password.
+
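+For example, the next push might look like this at the terminal (the user name and branch are placeholders, and the exact prompts can vary by git version):
+
+```
+$ git push origin main
+Username for 'https://github.com': your-username
+Password for 'https://github.com': <paste your PAT key here>
+```
+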
+!!! tip
+
+ You may want to `git push` a test change (that can be deleted later) to a remote repository you work on now, so that you do not lose the PAT key!
+
+If you have a password manager, it should "remember" the PAT key so it will not need to be entered the next time you use `git`.
+
+For other options to update your Github credentials with the PAT key, see the Github [documentation](https://docs.github.com/en/free-pro-team@latest/github/using-git/updating-credentials-from-the-macos-keychain).
diff --git a/docs/CFDE-Internal-Training/images-github-auth/0-account-security.png b/docs/CFDE-Internal-Training/images-github-auth/0-account-security.png
new file mode 100644
index 000000000..4cd455b02
Binary files /dev/null and b/docs/CFDE-Internal-Training/images-github-auth/0-account-security.png differ
diff --git a/docs/CFDE-Internal-Training/images-github-auth/0-developer-settings.png b/docs/CFDE-Internal-Training/images-github-auth/0-developer-settings.png
new file mode 100644
index 000000000..b6d6a33b5
Binary files /dev/null and b/docs/CFDE-Internal-Training/images-github-auth/0-developer-settings.png differ
diff --git a/docs/CFDE-Internal-Training/images-github-auth/1-two-factor-auth.png b/docs/CFDE-Internal-Training/images-github-auth/1-two-factor-auth.png
new file mode 100644
index 000000000..08c9740d2
Binary files /dev/null and b/docs/CFDE-Internal-Training/images-github-auth/1-two-factor-auth.png differ
diff --git a/docs/CFDE-Internal-Training/images-github-auth/2-two-factor-auth-phone-set-up.png b/docs/CFDE-Internal-Training/images-github-auth/2-two-factor-auth-phone-set-up.png
new file mode 100644
index 000000000..8da24b33e
Binary files /dev/null and b/docs/CFDE-Internal-Training/images-github-auth/2-two-factor-auth-phone-set-up.png differ
diff --git a/docs/CFDE-Internal-Training/images-github-auth/3-save-recovery-codes.png b/docs/CFDE-Internal-Training/images-github-auth/3-save-recovery-codes.png
new file mode 100644
index 000000000..fcd7a9bc7
Binary files /dev/null and b/docs/CFDE-Internal-Training/images-github-auth/3-save-recovery-codes.png differ
diff --git a/docs/CFDE-Internal-Training/images-github-auth/4-generate-pat.png b/docs/CFDE-Internal-Training/images-github-auth/4-generate-pat.png
new file mode 100644
index 000000000..4d2c29477
Binary files /dev/null and b/docs/CFDE-Internal-Training/images-github-auth/4-generate-pat.png differ
diff --git a/docs/CFDE-Internal-Training/images-github-auth/5-personal-access-token.png b/docs/CFDE-Internal-Training/images-github-auth/5-personal-access-token.png
new file mode 100644
index 000000000..9ff3342de
Binary files /dev/null and b/docs/CFDE-Internal-Training/images-github-auth/5-personal-access-token.png differ
diff --git a/docs/CFDE-Internal-Training/index.md b/docs/CFDE-Internal-Training/index.md
index 9f5d9c87f..220f4af83 100644
--- a/docs/CFDE-Internal-Training/index.md
+++ b/docs/CFDE-Internal-Training/index.md
@@ -21,3 +21,4 @@ Common Fund Programs
GitHub:
- [Working with Protected Branches](ProtectedBranch_HowTo.md)
+ - [Setting up Github authentication](github_auth_setup.md)
diff --git a/docs/Cheat-Sheets/.pages b/docs/Cheat-Sheets/.pages
index 6cf79de08..0754576cf 100644
--- a/docs/Cheat-Sheets/.pages
+++ b/docs/Cheat-Sheets/.pages
@@ -3,3 +3,4 @@ nav:
- Bash and nano commands: bash_cheatsheet.md
- Conda commands: conda_cheatsheet.md
- Snakemake commands: snakemake_cheatsheet.md
+ - Screen commands: screen_cheatsheet.md
diff --git a/docs/Cheat-Sheets/conda_cheatsheet.md b/docs/Cheat-Sheets/conda_cheatsheet.md
index e71a004f1..c21232cb5 100644
--- a/docs/Cheat-Sheets/conda_cheatsheet.md
+++ b/docs/Cheat-Sheets/conda_cheatsheet.md
@@ -1,12 +1,19 @@
# Conda Command Cheat Sheet
+Commonly used conda commands:
+
conda | Description
--- | ---
-`conda create -n ` | create a new conda environment. You can include other flags to customize the environment more.
-`conda activate ` | activate conda environment
-`conda install -y ` | install software in conda environment
-`conda deactivate` | deactivate conda environment
-`conda info --envs` | list conda environments, `*` will be next to the environment you are currently in
-`conda list -n ` | list software installed in this conda environment. Or simply, `conda list`.
-`conda info` | information about your conda environment
-`conda env remove --name ` | remove a conda environment
+`conda create -n <env_name>` or `conda create -n <env_name> <software>` | Create a new conda environment. You can include other flags to further customize the environment and install software into it at creation time. For example, `conda create -n fastqc_env fastqc` will install the FastQC program into a conda environment called `fastqc_env`.
+`conda env create -n <env_name> -f <yaml_file>` | Create a new conda environment using specifications from a yaml file. For example, `conda env create -n test -f environment.yml`.
+`conda install -y <software>` | Install software in the current conda environment
+`conda activate <env_name>` | Activate a conda environment
+`conda deactivate` | Deactivate the current conda environment
+`conda info --envs` or `conda env list` | Both commands list conda environments; a `*` marks the environment you are currently in
+`conda list -n <env_name>` | List software installed in a given conda environment. Or simply, `conda list` for the current environment.
+`conda info` | Information about your conda installation and active environment
+`conda search <software>` | Search for available software versions
+`conda env remove --name <env_name>` | Remove a conda environment
+
+
+Download the official [conda cheat sheet](https://docs.conda.io/projects/conda/en/latest/user-guide/cheatsheet.html) for more commands.
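+
+A typical workflow stringing these commands together might look like this (the environment and package names are examples only; this assumes conda is already installed):
+
+```
+conda create -n fastqc_env fastqc   # create an environment with FastQC installed
+conda activate fastqc_env           # switch into it
+conda list                          # confirm what is installed
+conda deactivate                    # switch back out when done
+```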
diff --git a/docs/Cheat-Sheets/index.md b/docs/Cheat-Sheets/index.md
index c4fe37a72..87dad7ba0 100644
--- a/docs/Cheat-Sheets/index.md
+++ b/docs/Cheat-Sheets/index.md
@@ -6,8 +6,9 @@ title: Overview
Cheat Sheets
==============
-The cheat sheets below are a quick reference to commonly used commands in bash, conda, and Snakemake.
+Quick reference guides to commonly used command-line commands:
- [Bash and nano Commands](./bash_cheatsheet.md)
- [Conda Commands](./conda_cheatsheet.md)
- [Snakemake Commands](./snakemake_cheatsheet.md )
+- [Screen Commands](./screen_cheatsheet.md)
diff --git a/docs/Cheat-Sheets/screen_cheatsheet.md b/docs/Cheat-Sheets/screen_cheatsheet.md
new file mode 100644
index 000000000..f09c339c5
--- /dev/null
+++ b/docs/Cheat-Sheets/screen_cheatsheet.md
@@ -0,0 +1,38 @@
+# Screen Cheat Sheet
+
+Command | Description
+--------|---------
+++ctrl+a+c++ | Creates a new window so that you can work in more than one window per screen session
+++ctrl+a+n++ | Switches to the next window if you have more than one window open
+++ctrl+a+p++ | Switches to the previous window if you have more than one window open
+++ctrl+a+d++ or ++ctrl+a++ then ++ctrl+d++ | Detaches a screen session without killing the processes running in it
+++ctrl+a++ then ++ctrl+a++ | Toggles between the current and previous window
+`exit` | Kills the current window permanently (closing the last window ends the session)
+
+## Getting In
+Command | Description
+--------|---------
+`screen -S <session_name>` | Start a new screen session with a session name
+`screen -ls` | List running sessions/screens
+`screen -r` | Attach to a running session
+`screen -r <session_name>` | Attach to a running session with a name
+
+## Getting Out
+Command | Description
+--------|---------
+`screen -d <session_name>` | Detach a running session
+++ctrl+a+d++ | Detaches a screen session without killing the processes running in it
+++ctrl+a++ then ++shift+d++ ++shift+d++ | Detach and logout ("power detach")
+`screen -S <session_name> -X quit` | Kill a screen session while in a detached state
+
+## Toggling
+Command | Description
+--------|---------
+++ctrl+a+c++ | Create new window
+++ctrl+a+n++ or ++ctrl+a++ then ++space++ | Change to next window in list
+++ctrl+a+p++ or ++ctrl+a++ then ++backspace++ | Change to previous window in list
+
+## Help
+Command | Description
+--------|---------
+`screen --help` | See help (`man screen` for the full manual)
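+
+Putting it together, a typical detach/reattach cycle might look like this (the session name `align` is an example):
+
+```
+screen -S align      # start a named session and launch a long-running job in it
+                     # press Ctrl-a d to detach while the job keeps running
+screen -ls           # later, list running sessions
+screen -r align      # reattach to the named session
+```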
diff --git a/docs/Release-Notes/.pages b/docs/Release-Notes/.pages
index 1d9f80d98..826ea8073 100644
--- a/docs/Release-Notes/.pages
+++ b/docs/Release-Notes/.pages
@@ -1,5 +1,4 @@
nav:
- index.md
+ - December-2020.md
- October-2020.md
-
-
diff --git a/docs/Release-Notes/December-2020.md b/docs/Release-Notes/December-2020.md
new file mode 100644
index 000000000..4793f6da1
--- /dev/null
+++ b/docs/Release-Notes/December-2020.md
@@ -0,0 +1,25 @@
+---
+layout: page
+title: December 2020 Release
+---
+
+December 2020 Release
+=================
+
+**Updated December 18, 2020**
+
+New Tutorials
+
+
+- [Simulate Illumina Reads](../Bioinformatics-Skills/Simulate_Illumina_Reads.md)
+- [Uploading Data to Cavatica](../Bioinformatics-Skills/Kids-First/Upload_Data.md)
+
+Updates and Fixes
+
+- [Editing MkDocs Websites with cfde-bot](../CFDE-Internal-Training/cfdebot_website_editing.md): PR steps updated
+- [Updates to Kids First Portal tutorials](../Bioinformatics-Skills/Kids-First/index.md): Screenshots and text updated to match new versions seen on Kids First Website
+- [Edits to Kids First Data Download terminal screencasts](../Bioinformatics-Skills/Kids-First/Download_Data/Data-Download-Via-Cavatica.md): Speeds up the screencasts, fixes typos, and removes some redundancy
+
+Website Features
+
+- Home page: Re-design of home page and addition of carousel feature
diff --git a/docs/Release-Notes/index.md b/docs/Release-Notes/index.md
index e09d7dedc..178384a58 100644
--- a/docs/Release-Notes/index.md
+++ b/docs/Release-Notes/index.md
@@ -1,25 +1,28 @@
---
layout: page
-title: December 2020 Release
+title: February 2021 Release
---
Latest Release
=================
-**Updated December 18, 2020**
+**Updated February 24, 2021**
New Tutorials
-
-- [Simulate Illumina Reads](../Bioinformatics-Skills/Simulate_Illumina_Reads.md)
-- [Uploading Data to Cavatica](../Bioinformatics-Skills/Kids-First/Upload_Data.md)
+- [RNAseq on Cavatica](../Bioinformatics-Skills/RNAseq-on-Cavatica/rna_seq_1.md)
+- [Using the Screen Command on AWS](../Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws5_Screen.md)
+- [Introduction to Google Cloud Platform](../Bioinformatics-Skills/Introduction-to-GCP/index.md)
+- [Saving Queries on the Kids First Data Resource Portal](../Bioinformatics-Skills/Kids-First/Advanced-KF-Portal-Queries/KF_13_SavingQueries.md)
+- [Setting Up Github Authentication](../CFDE-Internal-Training/github_auth_setup.md)
Updates and Fixes
-- [Editing MkDocs Websites with cfde-bot](../CFDE-Internal-Training/cfdebot_website_editing.md): PR steps updated
-- [Updates to Kids First Portal tutorials](../Bioinformatics-Skills/Kids-First/index.md): Screenshots and text updated to match new versions seen on Kids First Website
-- [Edits to Kids First Data Download terminal screencasts](../Bioinformatics-Skills/Kids-First/Download_Data/Data-Download-Via-Cavatica.md): Speeds up the screencasts, fixes typos, and removes some redundancy
+- [GWAS in the Cloud](../Bioinformatics-Skills/GWAS-in-the-cloud/index.md): updated AWS screenshots to match the new user interface
+- [Introduction to Amazon Web Services](../Bioinformatics-Skills/Introduction_to_Amazon_Web_Services/introtoaws1.md): updated AWS tutorial based on workshop feedback
+- [Editing MkDocs Websites with cfde-bot](../CFDE-Internal-Training/cfdebot_website_editing.md): updated Github repo names
+- [CFDE Portal Use Cases](../Bioinformatics-Skills/CFDE-Portal/index.md): lesson landing page added back
Website Features
-- Home page: Re-design of home page and addition of carousel feature
+- Home page: added upcoming workshop events link, updated home page formatting
diff --git a/docs/TrainingRepoReleasePlan/TrainingRepo-Release-Plan.md b/docs/TrainingRepoReleasePlan/TrainingRepo-Release-Plan.md
index 84b63128f..c5bb27d1f 100644
--- a/docs/TrainingRepoReleasePlan/TrainingRepo-Release-Plan.md
+++ b/docs/TrainingRepoReleasePlan/TrainingRepo-Release-Plan.md
@@ -29,6 +29,10 @@ Our current release dates are set to coincide with NIH deliverable dates.
#### Labels Format
+Single event-based label
+Name: *Training-Release*
+
+Month-based label
Name: release.month(short form)-release.year
For example, *Oct-2020*.
@@ -58,7 +62,7 @@ The PR author should ensure:
- colorblindness test https://www.toptal.com/designers/colorfilter/
- correct rendering of website (preview of their branch using autogenerated RTD link)
- resolution of any merge conflicts
-- release label - tagging the upcoming release, e.g. `Oct-2020`
+- release label - tagging with **`Training-Release`** is required; you can also add a month-based label, e.g. `Oct-2020`, which is optional
- PR type label - `new` for new content, `feature` for new features, `fixes` for updates and fixes to existing content
- assignment to the appropriate project - project board for a given release for tracking
- linking of related issues - if applicable
@@ -151,7 +155,7 @@ To access the draft
- use the `Edit` button on the right top corner
- modify the release titles if applicable, add date and save draft
-The release information will also be documented in a markdown file `index.md`, hosted within `Release Notes` folder under the `/docs` folder structure of the training repo in Github. Editing of the the website release version will include removing `Website Features` category, intra linking of tutorials mentioned and documenting changes to existing tutorials under the `Updates and Fixes` category. Addition of enhancements features like vidlets, images, screencasts etc to existing tutorials can be highlighted in a separate category (Enhancements or Improvements) if applicable.
+The release information will also be documented in a markdown file `index.md`, hosted within the `Release Notes` folder under the `/docs` folder structure of the training repo in Github. For every new release, copy the existing contents of `index.md` to a new file named `{Month}-{year}.md` before entering the latest release notes into `index.md`. Editing the website release version will include removing the `Website Features` category, intra-linking the tutorials mentioned, and documenting changes to existing tutorials under the `Updates and Fixes` category. Additions of enhancement features like vidlets, images, screencasts, etc. to existing tutorials can be highlighted in a separate category (Enhancements or Improvements) if applicable.
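+
+For example, archiving ahead of the December 2020 release might look like this at the terminal (run from the repo root; the file name follows the {Month}-{year} pattern):
+
+```
+cd docs/Release-Notes
+cp index.md December-2020.md   # archive the current notes under a month-stamped name
+                               # then edit index.md with the latest release notes
+```
+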
The release notes will also be referenced in the landing page of the training website.
diff --git a/docs/stylesheets/extra.css b/docs/stylesheets/extra.css
index 0b3700890..47b9598b2 100644
--- a/docs/stylesheets/extra.css
+++ b/docs/stylesheets/extra.css
@@ -195,3 +195,8 @@ a:active {
color: #00AB6C;
}
+.highlight_txt {
+ color: #9f0bde;
+ background-color: #ededed;
+ padding: .25em;
+}
diff --git a/docs/templates/main.html b/docs/templates/main.html
deleted file mode 100644
index 3b8b23471..000000000
--- a/docs/templates/main.html
+++ /dev/null
@@ -1,25 +0,0 @@
-{% extends "base.html" %}
-{% block analytics %}
-
-
-
-
-{% endblock %}
-
-{% block disqus %}
- {% include "partials/integrations/disqus.html" %}
-{% endblock %}
-
-
diff --git a/mkdocs.yml b/mkdocs.yml
index 5dfbd8f21..2316bddfa 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -9,10 +9,11 @@ theme:
palette:
#scheme: preference
primary: white
- #accent: white
+ #accent: teal
features:
- navigation.tabs
- navigation.instant
+ #- navigation.sections
favicon: images/CFDE-logo-white-outline.png
logo: images/CFDE-logo.png
custom_dir: custom
@@ -28,10 +29,15 @@ plugins:
markdown_extensions:
- admonition
+#- def_list
+- pymdownx.critic
- pymdownx.details
+- pymdownx.highlight
- pymdownx.superfences
- pymdownx.tabbed
-- pymdownx.emoji
+- pymdownx.emoji:
+ emoji_index: !!python/name:materialx.emoji.twemoji
+ emoji_generator: !!python/name:materialx.emoji.to_svg
- pymdownx.tabbed
- pymdownx.extra
- pymdownx.superfences
@@ -40,6 +46,7 @@ markdown_extensions:
- pymdownx.keys
- pymdownx.inlinehilite
+
- codehilite:
guess_lang: false
- toc:
diff --git a/requirements.txt b/requirements.txt
index 0cc65a168..18e08f368 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,4 +1,4 @@
-mkdocs-material
+mkdocs-material==6.2.6
mkdocs-jupyter
mkdocs-git-revision-date-localized-plugin
mkdocs-awesome-pages-plugin