We need recipes for common bioinformatics tasks

e · l · n
May 27, 2024

Ad-hoc tasks in bioinformatics can involve an immense number of individual operations that need to be performed to achieve a certain goal. Often each of these is regarded as rather "standard" or "routine". Despite this, it is quite hard to find an authoritative set of "recipes" for how to do them.

So I started to think that there needs to be a collection of bioinformatics "recipes": a sort of "cookbook" for common bioinformatics tasks.

It turns out that there are a number of such resources available though:

  1. First, there is bioinformatics.recipes, which is almost exactly the type of content I was after, with one caveat:
    • It would be great if the recipes were kept under version control such as git and hosted on a code hosting platform like GitHub. That would make it easier to fork, clone and use the included code, and also to contribute fixes or improvements to the documentation.
  2. Then, somebody also mentioned this list of methods primers in the Nature journal.
    • My comment here is that this is more high-level information about various methods, which is also great, but not exactly the same thing as a recipe.
  3. Finally there is of course BioStars, which is a huge resource of questions and answers over the last decades.
    • The main problem here is that a lot of the questions (and answers) are quite old, sometimes 10+ years, which makes it hard to judge whether an answer is still relevant. The selection of topics is also naturally limited to the questions people have actually asked, which might not cover the problem area evenly.
  4. Also, there are lists of common best-practice pipelines, like the one at nf-core, which is a hugely useful resource.
    • But it mostly covers high-level analyses, and typically does not provide pipelines for common smaller tasks, such as using BLAST to cut out a gene from a reference genome.
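
To make the "smaller task" idea concrete, here is a minimal sketch of what a recipe for cutting a BLAST hit out of a reference genome might look like. It assumes the hit comes from BLAST's default tabular output (`-outfmt 6`), where columns 2, 9 and 10 hold the subject ID, start and end as 1-based inclusive coordinates, and where a hit on the minus strand has start greater than end. The function names are hypothetical and only there for illustration; a real recipe would more likely lean on an established library or tool.

```python
# Sketch only: helper names are hypothetical; the column order is that of
# BLAST tabular output (-outfmt 6) with default fields.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Parse an iterable of FASTA lines into {sequence_id: sequence}."""
    records, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            parts = line[1:].split()
            # The ID is the first whitespace-separated token of the header.
            name, chunks = (parts[0] if parts else ""), []
        elif line:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def cut_hit(genome, blast_line):
    """Extract the subject region of one tabular BLAST hit from the genome."""
    fields = blast_line.rstrip("\n").split("\t")
    sseqid, sstart, send = fields[1], int(fields[8]), int(fields[9])
    lo, hi = min(sstart, send), max(sstart, send)
    # BLAST coordinates are 1-based and inclusive; Python slices are 0-based.
    region = genome[sseqid][lo - 1:hi]
    # sstart > send means the hit lies on the minus strand.
    return reverse_complement(region) if sstart > send else region


# Toy data, made up for illustration.
genome = read_fasta([">chr1 demo", "AAACCC", "GGGTTT"])
plus_hit = "geneX\tchr1\t100.0\t6\t0\t0\t1\t6\t4\t9\t1e-5\t12.0"
minus_hit = "geneX\tchr1\t100.0\t6\t0\t0\t1\t6\t6\t1\t1e-5\t12.0"
print(cut_hit(genome, plus_hit))
print(cut_hit(genome, minus_hit))
```

Even a task this small involves a few decisions (coordinate conventions, strand handling, how FASTA headers are parsed) that are easy to get subtly wrong, which is exactly where a shared, reviewed recipe would help.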

With this in mind, is it perhaps time to create a bioinformatics recipe resource based on something like GitHub, where the community can crowd-source these recipes both in terms of code and documentation?

Or, is there already something like this available?

If not, below are some random ideas about how to do this: