
Following up on the previous bioinformatics tool chest post, I thought I’d cover Bioconductor next. Bioconductor is actually an offshoot of the R project.
Now hold on, I know what you’re thinking. “But you talked about R last time, why do we have to talk about R again?!?” It’s simple, really. Though Bioconductor is a derivative of R, its purpose is unique enough to deserve its own post.
Bioconductor (or BioC) is an open-source derivative of R focused on facilitating the analysis of genomic data. One might ask, why should I care? If you perform any kind of high-throughput SNP genotyping or gene expression analysis, this software suite gives you immediate access to free, open-source, extremely powerful data analysis options. Got Affymetrix CEL files for expression data? No problem. Bioconductor can load, normalize, analyze, and summarize that data for you. How about SNP genotyping data? Again, no problem. Want to check the copy number of your SNP data? You’ll have several options. Many Bioconductor packages are built using S4 classes and methods (whose exact definitions are unimportant for this article). The advantage of that coding system is that you can use and extend existing classes to implement your own custom-designed analysis methods. And even better, once you’ve worked out a new method, you can incorporate it into a package and submit it to Bioconductor for everyone to use!
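If you're curious what "use and extend existing classes" looks like in practice, here is a minimal sketch using only the methods package that ships with R. To be clear about the assumptions: the class names (ToyExpressionSet, ToyBatchSet), the generic (sampleMeans), and the tiny data matrix are all made up for illustration and are not taken from any Bioconductor package. Real BioC classes are richer, but the extension pattern is the same idea.

```r
# Toy S4 sketch. All class and function names below are hypothetical,
# invented for this example; they are not part of Bioconductor.
library(methods)

# A minimal container: a numeric matrix of probes (rows) by samples (columns).
setClass("ToyExpressionSet", representation(exprs = "matrix"))

# Extend the container with an extra slot rather than rewriting it.
setClass("ToyBatchSet",
         contains = "ToyExpressionSet",
         representation(batch = "character"))

# A generic function plus a method for the base class: mean signal per sample.
setGeneric("sampleMeans", function(object) standardGeneric("sampleMeans"))
setMethod("sampleMeans", "ToyExpressionSet",
          function(object) colMeans(object@exprs))

# Specialise the method for the subclass: reuse the parent's method,
# then average the per-sample means within each batch.
setMethod("sampleMeans", "ToyBatchSet", function(object) {
  means <- callNextMethod()
  tapply(means, object@batch, mean)
})

# Made-up data: 2 probes measured on 4 samples, split into two batches.
m <- matrix(c(1, 3, 5, 7, 2, 4, 6, 8), nrow = 2,
            dimnames = list(c("probe1", "probe2"),
                            c("s1", "s2", "s3", "s4")))
toy <- new("ToyBatchSet", exprs = m, batch = c("a", "a", "b", "b"))

result <- sampleMeans(toy)
print(result)
```

The point of the design is that ToyBatchSet inherits everything ToyExpressionSet can already do, so a custom analysis only has to describe what is different. Any code written against the parent class keeps working on the subclass.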
The bottom line is this: if you need powerful, customizable, freely available analysis software (and who doesn’t, after spending ridiculous amounts of money running many samples on high-throughput technology?), then Bioconductor is a viable choice. If you have genomic data, give BioC a try, and if it’s useful to you, build your own packages for the whole community.
