Course about scientific programming, from algorithm to implementation.
Scientific programming is much like performing a wet-lab experiment: there is a research question, an experimental design, materials, and results. One important difference is that once you have run a computational experiment, repeating it becomes a lot easier. This course assumes programming experience and covers various aspects of scientific programming for developing software that answers biological questions. During the course, students will apply professional software development standards and the values of research integrity, particularly reproducibility. Regarding algorithm development, the course focuses on programming approaches such as parallel computing, client-server communication, and automating statistical analysis with interactive notebooks.
- Create software to implement an algorithm answering a biological question
- Learn about parallel computing
- Learn about server-client interaction
- Learn about making multivariate statistics reproducible
- Apply version control in software development
- Explain algorithms and implementations in software in appropriate documentation
Week 1 provides a lecture introducing the course, and the practical is a GitHub 101 (students are free to use GitLab or another alternative).
- Week 2,3: Server-Client Interaction
- Week 4,5: Making multivariate statistics reproducible
- Week 6,7: Parallel computing
The lecture outlines the format of the course, the topics, the assessment, and the grading.
The practical (GitHub 101) covers:
- making a GitHub account (or GitLab)
- creating a repository (private or public)
- creating files
- writing commit messages
- adding all required files (LICENSE, AUTHORS, README, etc)
- making a tag and matching release (with release notes)
Lecture 1 discusses these topics:
- application programming interfaces (APIs)
- history of web APIs: SOAP, XMPP, REST
- synchronous versus asynchronous API calls
- the JavaScript + HTML powerhouse
- Wikidata and SPARQL
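The difference between synchronous and asynchronous calls can be illustrated with a request to the Wikidata SPARQL endpoint. The following is a minimal sketch, not the course template: the names `buildSparqlUrl`, `runSparql`, and `EXAMPLE_QUERY` are hypothetical, and it assumes the public Wikidata Query Service endpoint and a runtime that provides a global `fetch` (a modern web browser or Node.js 18 or later).

```javascript
// Public Wikidata Query Service endpoint (assumed here; other endpoints work the same way).
const WDQS_ENDPOINT = "https://query.wikidata.org/sparql";

// Illustrative query: five items that are an instance of (P31) house cat (Q146).
const EXAMPLE_QUERY = `
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
} LIMIT 5`;

// Synchronous step: building the request URL returns immediately.
function buildSparqlUrl(query, endpoint = WDQS_ENDPOINT) {
  const params = new URLSearchParams({ query, format: "json" });
  return `${endpoint}?${params.toString()}`;
}

// Asynchronous step: the network call returns a Promise, so the
// caller keeps running while the server prepares the answer.
async function runSparql(query) {
  const response = await fetch(buildSparqlUrl(query), {
    headers: { Accept: "application/sparql-results+json" },
  });
  if (!response.ok) {
    throw new Error(`SPARQL request failed with HTTP ${response.status}`);
  }
  return response.json();
}

// Usage (left commented out so that loading this file makes no network call):
// runSparql(EXAMPLE_QUERY).then((data) => console.log(data.results.bindings));
// console.log("This line is printed before the query results arrive.");
```

In the commented usage, the second `console.log` runs first because `runSparql` only schedules the request. The results are handled later, when the Promise resolves.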
Lecture 2 discusses these topics:
- JSON format
- d3.js
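To make the JSON topic concrete, the sketch below parses a small response in the W3C SPARQL Query Results JSON format, which is the format a SPARQL endpoint returns when JSON is requested. The response text is hand-written placeholder data with `example.org` URIs, not real query output, and the helper name `simplifyBindings` is hypothetical.

```javascript
// Hand-written placeholder response in the SPARQL Query Results JSON format.
const exampleJson = `{
  "head": { "vars": ["item", "itemLabel"] },
  "results": {
    "bindings": [
      {
        "item": { "type": "uri", "value": "http://example.org/entity/A" },
        "itemLabel": { "type": "literal", "xml:lang": "en", "value": "first example" }
      },
      {
        "item": { "type": "uri", "value": "http://example.org/entity/B" }
      }
    ]
  }
}`;

// Flatten each binding into a plain object: { variableName: value }.
function simplifyBindings(data) {
  return data.results.bindings.map((binding) => {
    const row = {};
    for (const name of data.head.vars) {
      // A variable may be unbound in a given row, so fall back to null.
      row[name] = binding[name] ? binding[name].value : null;
    }
    return row;
  });
}

// JSON.parse turns the text into JavaScript objects and arrays.
const rows = simplifyBindings(JSON.parse(exampleJson));
console.log(rows);
```

An array of flat objects like `rows` is a convenient input for data-driven libraries such as d3.js, which typically bind one array element to one page element.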
Practical 1 covers the following topics:
- look at the Wikidata Query Service and learn from the examples (click the Examples button)
- set up a GitHub repository for the first assignment (see the practical of Week 1)
- use the two JavaScript examples (htmljs.template.html and wikidata.template.html) to create a simple, running HTML example that runs a simple SPARQL query
- learn how to use the web browser console
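As a rough illustration of the last two points, the sketch below turns query results into an HTML list and prints intermediate values to the console. It is not taken from the template files: the function names, the sample rows, and the element id `results` are all assumptions made for this example.

```javascript
// Escape characters that have a special meaning in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build an unordered list with one <li> per row.
function rowsToHtmlList(rows, labelKey) {
  const items = rows.map((row) => `<li>${escapeHtml(row[labelKey])}</li>`);
  return `<ul>${items.join("")}</ul>`;
}

// Placeholder rows standing in for simplified SPARQL results.
const sampleRows = [{ label: "first <example>" }, { label: "second example" }];
const html = rowsToHtmlList(sampleRows, "label");

// Debugging aids: both calls write to the web browser console.
console.log(html);
console.table(sampleRows); // prints an array of objects as a table

// In a browser, place the list in the page if the (hypothetical)
// element with id "results" exists; outside a browser this is skipped.
if (typeof document !== "undefined") {
  const target = document.getElementById("results");
  if (target) {
    target.innerHTML = html;
  }
}
```

Escaping the labels before inserting them matters because query results are external data and may contain characters such as `<` that the browser would otherwise interpret as markup.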
The Wikibase SDK JavaScript library is a dependency of the template file wikidata.template.html. It provides the functionality to execute a SPARQL query against the Wikidata SPARQL endpoint and retrieve the results as JSON in the browser. The Wikibase SDK is released under the MIT License. For additional information, please visit the https://github.com/maxlath/wikibase-sdk repository.
- Programming in the Life Sciences #4: communication from within HTML
- Programming in the Life Sciences #5: converting the results into HTML
- Programming in the Life Sciences #10: JavaScript Object Notation (JSON)
- Programming in the Life Sciences #11: HTML
- Programming in the Life Sciences #19: debugging
- Programming in the Life Sciences #20: extracting data from JSON
- How to learn D3.js
Practical 2 ...
Programming languages and file formats used: HTML, JavaScript, REST API, SPARQL, JSON
Lecture 1 discusses these topics:
- boiling points of alkanes
- the concept of QSAR
- molecular descriptors (and rcdk)
- Partial Least Squares
- R Markdown
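The idea behind QSAR, predicting a property from molecular descriptors, can be sketched with the simplest possible case: one descriptor and ordinary least squares. This is a deliberately simplified stand-in, not the course method. The course uses R, rcdk descriptors, and Partial Least Squares, whereas this sketch uses JavaScript only for consistency with the other examples. The function names are hypothetical, and the boiling points are approximate, rounded literature values that should be checked against a reference source before any real use.

```javascript
// Fit y = slope * x + intercept by ordinary least squares.
function fitLine(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("need two arrays of equal length with at least two points");
  }
  const n = xs.length;
  const meanX = xs.reduce((sum, x) => sum + x, 0) / n;
  const meanY = ys.reduce((sum, y) => sum + y, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (ys[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
  }
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

// Apply a fitted model to a new descriptor value.
function predict(model, x) {
  return model.slope * x + model.intercept;
}

// Descriptor: number of carbon atoms in an unbranched alkane (methane to octane).
const carbonCounts = [1, 2, 3, 4, 5, 6, 7, 8];
// Property: boiling point in degrees Celsius (approximate, rounded values).
const boilingPointsC = [-162, -89, -42, -1, 36, 69, 98, 126];

const model = fitLine(carbonCounts, boilingPointsC);
console.log("slope (degrees C per carbon):", model.slope.toFixed(1));
console.log("predicted boiling point for 9 carbons:", predict(model, 9).toFixed(0));
```

The relation between chain length and boiling point is visibly curved, so a straight line is only a first approximation. Capturing that curvature and using many correlated descriptors at once is part of the motivation for methods such as Partial Least Squares.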
Lecture 2 discusses these topics:
Practical 1 ...
Practical 2 ...
Programming languages and file formats used: R, Markdown, multivariate statistics
Lecture 1 discusses these topics:
Lecture 2 discusses these topics:
Practical 1 ...
Practical 2 ...
Programming languages and file formats used: Nextflow, Groovy