<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Bioinformatics Blog</title>
	<atom:link href="http://bioinformatics.whatheblog.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://bioinformatics.whatheblog.com</link>
	<description>One base pair at a time...</description>
	<lastBuildDate>Tue, 20 Oct 2009 05:37:10 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.5</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Metagenomics Resources</title>
		<link>http://bioinformatics.whatheblog.com/2009/10/metagenomics-resources/</link>
		<comments>http://bioinformatics.whatheblog.com/2009/10/metagenomics-resources/#comments</comments>
		<pubDate>Tue, 20 Oct 2009 05:37:10 +0000</pubDate>
		<dc:creator>Abbas</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[Technical]]></category>
		<category><![CDATA[genome. microbiome]]></category>
		<category><![CDATA[metagenomics]]></category>

		<guid isPermaLink="false">http://bioinformatics.whatheblog.com/?p=39</guid>
		<description><![CDATA[
MEGAN MEtaGenome ANalyzer. A stand-alone metagenome analysis tool.
Metagenomics and Our Microbial Planet A website on metagenomics and the vital role of microbes on Earth from the National Academies.
The New Science of Metagenomics: Revealing the Secrets of Our Microbial Planet A report released by the National Research Council in March 2007. Also, see the Report In [...]]]></description>
			<content:encoded><![CDATA[<ul>
<li><a class="external text" rel="nofollow" href="http://www-ab.informatik.uni-tuebingen.de/software/megan/">MEGAN</a> MEtaGenome ANalyzer. A stand-alone metagenome analysis tool.</li>
<li><a class="external text" rel="nofollow" href="http://dels.nas.edu/metagenomics/">Metagenomics and Our Microbial Planet</a> A website on metagenomics and the vital role of microbes on Earth from the <a class="external text" rel="nofollow" href="http://nationalacademies.org/">National Academies.</a></li>
<li><a class="external text" rel="nofollow" href="http://books.nap.edu/catalog.php?record_id=11902">The New Science of Metagenomics: Revealing the Secrets of Our Microbial Planet</a> A report released by the National Research Council in March 2007. Also, see the <a class="external text" rel="nofollow" href="http://dels.nas.edu/dels/rpt_briefs/metagenomics_brief_final.pdf">Report In Brief.</a></li>
<li><a class="external text" rel="nofollow" href="http://img.jgi.doe.gov/m">IMG/M</a> The Integrated Microbial Genomes system, for metagenome analysis by the DOE-JGI.</li>
<li><a class="external text" rel="nofollow" href="http://camera.calit2.net/index.php">CAMERA</a> Cyberinfrastructure for Metagenomics, data repository and tools for metagenomics research.</li>
<li><a class="external text" rel="nofollow" href="http://www.scq.ubc.ca/?p=509">A good overview of metagenomics from the Science Creative Quarterly</a></li>
<li><a class="external text" rel="nofollow" href="http://www.genomesonline.org/gold.cgi?want=Metagenomes">list of Metagenome Projects from genomesonline.org</a></li>
<li><a class="external text" rel="nofollow" href="http://metagenomics.nmpdr.org/">MG-RAST</a> publicly available, free, metagenomics annotation pipeline and repository for pyrosequences, Sanger sequences, and other sequence approaches.</li>
<li><a class="mw-redirect" title="Human microbiome project" href="http://en.wikipedia.org/wiki/Human_microbiome_project">Human microbiome project</a></li>
<li><a class="external text" rel="nofollow" href="http://www.metahit.eu/">MetaHIT</a> official website for the EU-funded project : Metagenomics of the Human Intestinal Tract</li>
<li><a class="external text" rel="nofollow" href="http://annotathon.univ-mrs.fr/">Annotathon</a> Bioinformatics Training Through Metagenomic Sequence Annotation</li>
<li><a class="external text" rel="nofollow" href="http://www.highveld.com/pages/metagenomics.html">Metagenomics</a> Metagenomics research and applications</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://bioinformatics.whatheblog.com/2009/10/metagenomics-resources/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Next Gen. Sequencing</title>
		<link>http://bioinformatics.whatheblog.com/2009/10/next-gen-sequencing/</link>
		<comments>http://bioinformatics.whatheblog.com/2009/10/next-gen-sequencing/#comments</comments>
		<pubDate>Mon, 12 Oct 2009 18:50:32 +0000</pubDate>
		<dc:creator>Abbas</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[nextgen]]></category>
		<category><![CDATA[videos]]></category>

		<guid isPermaLink="false">http://bioinformatics.whatheblog.com/?p=38</guid>
		<description><![CDATA[With IBM tossing its hat into the ring of &#8220;next-next-generation&#8221; sequencing, I&#8217;m starting to get lost as to which generation is which. For the moment, I&#8217;m sort of lumping things together, while I wait to see how the field plays out. In my mind, first generation is anything that requires chain termination, second generation is [...]]]></description>
			<content:encoded><![CDATA[<p>With IBM tossing its hat into the ring of &#8220;next-next-generation&#8221; sequencing, I&#8217;m starting to get lost as to which generation is which. For the moment, I&#8217;m sort of lumping things together, while I wait to see how the field plays out. In my mind, first generation is anything that requires chain termination, second generation is chemical-based pyrosequencing, and third generation is single-molecule sequencing based on a nano-scale mechanical process. It&#8217;s a crude divide, but it seems to have some consistency.</p>
<p>At any rate, I decided I&#8217;d collect a few videos to illustrate each one. For Sanger, there are a LOT of videos, many of which are quite excellent, but I only wanted one. (Sorry if I didn&#8217;t pick yours.) For second and third generation DNA sequencing videos, the selection kind of flattens out, and two of them come from corporate sites, rather than YouTube &#8211; which seems to be the general consensus repository of technology videos.</p>
<p>Personally, I find it interesting to see how each group is selling themselves. You&#8217;ll notice some videos press heavily on the technology, while others focus on the workflow.</p>
<p>As an aside, I also find it interesting to look for places where the illustrations don&#8217;t make sense&#8230; there&#8217;s a lovely place in the 454 video where two strands of DNA split from each other on the bead, leaving the two full strands and a complete primer sequence&#8230; mysterious! (Yes, I do enjoy looking for inconsistencies when I go to the movies.)</p>
<p>Ok, get out your popcorn.</p>
<p><span style="font-weight: bold;">First Generation:</span><br />
Sanger Entry: <a href="http://www.youtube.com/watch?v=oYpllbI0qF8">Link</a><br />
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="344" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/oYpllbI0qF8&amp;hl=en&amp;fs=1&amp;" /><embed type="application/x-shockwave-flash" width="425" height="344" src="http://www.youtube.com/v/oYpllbI0qF8&amp;hl=en&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p><span style="font-weight: bold;">Second Generation:</span><br />
Pyrosequencing Entry: <a href="http://www.youtube.com/watch?v=nFfgWGFe0aA">Link</a><br />
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="344" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/nFfgWGFe0aA&amp;hl=en&amp;fs=1&amp;" /><embed type="application/x-shockwave-flash" width="425" height="344" src="http://www.youtube.com/v/nFfgWGFe0aA&amp;hl=en&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p><span id="more-38"></span></p>
<p>Helicos Entry: <a href="http://www.youtube.com/watch?v=TboL7wODBj4">Link</a><br />
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="426" height="259" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/TboL7wODBj4&amp;hl=en&amp;fs=1&amp;" /><embed type="application/x-shockwave-flash" width="426" height="259" src="http://www.youtube.com/v/TboL7wODBj4&amp;hl=en&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>Illumina (Corporate site): <a href="http://www.illumina.com/downloads/GASequencingVid/illumina_sequencing_hi.swf">Link</a><br />
(<a href="http://www.illumina.com/downloads/GASequencingVid/illumina_sequencing_hi.swf">Click to see the Flash animation</a>)</p>
<p>454 Entry: <a href="http://www.youtube.com/watch?v=bFNjxKHP8Jc">Link</a><br />
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="344" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/bFNjxKHP8Jc&amp;hl=en&amp;fs=1&amp;" /><embed type="application/x-shockwave-flash" width="425" height="344" src="http://www.youtube.com/v/bFNjxKHP8Jc&amp;hl=en&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p><span style="font-weight: bold;">Third Generation:</span></p>
<p>Pacific Biosciences: <a href="http://www.pacificbiosciences.com/video_lg.html">Link</a><br />
(<a href="http://www.pacificbiosciences.com/video_lg.html">Click to see the Flash Video</a>)</p>
<p>Oxford Nanopore Entry: <a href="http://www.youtube.com/watch?v=HbjAMJehSlg">Link</a><br />
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="419" height="255" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/HbjAMJehSlg&amp;hl=en&amp;fs=1&amp;" /><embed type="application/x-shockwave-flash" width="419" height="255" src="http://www.youtube.com/v/HbjAMJehSlg&amp;hl=en&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>IBM&#8217;s Entry: <a href="http://www.youtube.com/watch?v=wvclP3GySUY">Link</a><br />
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="344" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/wvclP3GySUY&amp;hl=en&amp;fs=1&amp;" /><embed type="application/x-shockwave-flash" width="425" height="344" src="http://www.youtube.com/v/wvclP3GySUY&amp;hl=en&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>Note: If I&#8217;ve missed something, please let me know.  I&#8217;m happy to add to this post whenever something new comes up.</p>
<p class="blogger-labels">Labels: <a rel="tag" href="http://www.fejes.ca/labels/Sequencing.html">Sequencing</a>, <a rel="tag" href="http://www.fejes.ca/labels/SMRT.html">SMRT</a>, <a rel="tag" href="http://www.fejes.ca/labels/Solexa__Illumina.html">Solexa/Illumina</a></p>
]]></content:encoded>
			<wfw:commentRss>http://bioinformatics.whatheblog.com/2009/10/next-gen-sequencing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NYTimes on Probiotics</title>
		<link>http://bioinformatics.whatheblog.com/2009/10/nytimes-on-probiotics/</link>
		<comments>http://bioinformatics.whatheblog.com/2009/10/nytimes-on-probiotics/#comments</comments>
		<pubDate>Mon, 12 Oct 2009 05:12:03 +0000</pubDate>
		<dc:creator>Abbas</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[probiotics]]></category>

		<guid isPermaLink="false">http://bioinformatics.whatheblog.com/?p=37</guid>
		<description><![CDATA[There was an article on probiotics in the New York Times today. Written by Tara Parker-Pope, it addresses some important issues about probiotics that are rarely covered in the press (see Well &#8211; Probiotics &#8211; Looking Underneath the Yogurt Label &#8211; NYTimes.com).
On the one hand, the article does a decent job of pointing out that there is great [...]]]></description>
			<content:encoded><![CDATA[<p>There was an article on probiotics in the New York Times today. Written by Tara Parker-Pope, it addresses some important issues about probiotics that are rarely covered in the press (see <a href="http://www.nytimes.com/2009/09/29/health/29well.html?_r=1">Well &#8211; Probiotics &#8211; Looking Underneath the Yogurt Label &#8211; NYTimes.com</a>).</p>
<p>On the one hand, the article does a decent job of pointing out that there is great strain to strain variation among microbes labelled as probiotics. In this regard there is a great quote by Gregor Reid:</p>
<blockquote><p>“Lactobacillus is just the bacterium,” said Gregor Reid, director of the Canadian Research and Development Center for Probiotics. “To say a product contains Lactobacillus is like saying you’re bringing <a title="More articles about George Clooney" href="http://topics.nytimes.com/top/reference/timestopics/people/c/george_clooney/index.html?inline=nyt-per">George Clooney</a> to a party. It may be the actor, or it may be an 85-year-old guy from Atlanta who just happens to be named George Clooney. With probiotics, there are strain-to-strain differences.”</p></blockquote>
<blockquote><p><span id="more-37"></span></p></blockquote>
<p>Personally, I think the article did a poor job of discussing one of the real complexities of probiotics (and actually any drug), in that it seems to suggest that particular strains either will or will not be useful for certain ailments. In reality, the human gut is a horribly complex place, and the effectiveness of particular strains is no doubt going to depend on health status, history, other microbes being present, gender, age, genetics, and much much more. Thus it would have been good to include some more discussion of just how complex the interaction between probiotics and &#8220;health&#8221; is likely to be.</p>
<p>Interestingly, the article suggests:</p>
<blockquote><p>Consumers interested in probiotics should look for products that list the specific strain on the label and offer readers easy access to scientific studies supporting the claims. A good place to find studies on various probiotic strains is the Web site <a href="http://www.pubmed.gov/" target="_">www.PubMed.gov</a>.</p></blockquote>
<p>On the one hand, I am very happy that the Times is suggesting consumers look up information in PubMed, a great resource. On the other hand, much of the published work on probiotics is still hidden from consumers behind the veil of corporate and society publishing practices. Perhaps the Times author had access to all these articles. But the consumer right now does not. Too bad the Times missed a chance to discuss this important component of getting consumers involved in making their own health decisions.</p>
]]></content:encoded>
			<wfw:commentRss>http://bioinformatics.whatheblog.com/2009/10/nytimes-on-probiotics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Look at Cancer Biology</title>
		<link>http://bioinformatics.whatheblog.com/2009/10/new-look-at-cancer-biology/</link>
		<comments>http://bioinformatics.whatheblog.com/2009/10/new-look-at-cancer-biology/#comments</comments>
		<pubDate>Mon, 12 Oct 2009 05:08:07 +0000</pubDate>
		<dc:creator>Abbas</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[cancer]]></category>
		<category><![CDATA[watson]]></category>

		<guid isPermaLink="false">http://bioinformatics.whatheblog.com/?p=36</guid>
		<description><![CDATA[Sure, James Watson has been known, especially recently, to say some outrageous things. But here is something I think everyone, scientists and the public alike, should read &#8211; an opinion piece in the NY Times today by Watson (Op-Ed Contributor &#8211; To Fight Cancer, Know the Enemy &#8211; NYTimes.com)
This piece is worth reading because it [...]]]></description>
			<content:encoded><![CDATA[<p>Sure, James Watson has been known, especially recently, to say some outrageous things. But here is something I think everyone, scientists and the public alike, should read &#8211; an opinion piece in the NY Times today by Watson (<a href="http://www.nytimes.com/2009/08/06/opinion/06watson.html?_r=2">Op-Ed Contributor &#8211; To Fight Cancer, Know the Enemy &#8211; NYTimes.com</a>).</p>
<p>This piece is worth reading because it contains some critical ideas and wisdom that have been missing in discussions of the fight against cancer.</p>
<p>First, Watson discusses the critical importance of basic science and says that when he expressed this importance to the National Cancer Institute advisory board many years ago, he was eventually booted off.</p>
<p>Second, he discusses how we have only recently begun to understand the basic biology of cancer (he also mentions how the human genome project has helped in this). The genome project will, he says, allow for the determination of most/all of the major genetic changes that occur in cancer cells.</p>
<p><span id="more-36"></span></p>
<p>Third, he discusses some limitations of the FDA drug approval process that limit the ability to test combinations of drugs which Watson believes will be needed in the fight against cancer.</p>
<p>Fourth, he suggests that the National Cancer Institute should help support small biotech companies in the development of new drugs since venture capital has dried up for such endeavors.</p>
<p>As usual, Watson would not be Watson if he did not say something potentially controversial. In this piece, the most controversial thing is probably his claim that the National Cancer Institute has become &#8220;a largely rudderless ship in dire need of a bold captain who will settle only for total victory.&#8221; Now, I do not have any opinion about this since I have not followed NCI or its leadership. But it is certainly worth considering Watson&#8217;s opinion here.</p>
<p>In the end, Watson says the time is now to reinvigorate the &#8220;War on Cancer.&#8221; Despite misgivings about many things he has been up to recently, I found myself agreeing with almost everything he said in this piece. Again, definitely worth a read.</p>
]]></content:encoded>
			<wfw:commentRss>http://bioinformatics.whatheblog.com/2009/10/new-look-at-cancer-biology/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Pervasive Effects of an Antibiotic on the Gut</title>
		<link>http://bioinformatics.whatheblog.com/2009/10/the-pervasive-effects-of-an-antibiotic-on-the-human-gut-microbiota/</link>
		<comments>http://bioinformatics.whatheblog.com/2009/10/the-pervasive-effects-of-an-antibiotic-on-the-human-gut-microbiota/#comments</comments>
		<pubDate>Sun, 11 Oct 2009 21:45:02 +0000</pubDate>
		<dc:creator>Abbas</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[metagenomics]]></category>
		<category><![CDATA[microbiome]]></category>

		<guid isPermaLink="false">http://bioinformatics.whatheblog.com/?p=35</guid>
		<description><![CDATA[The Pervasive Effects of an Antibiotic on the Human Gut Microbiota, as Revealed by Deep 16S rRNA Sequencing
Dethlefsen L, Huse S, Sogin ML, Relman DA
PLoS Biology Vol. 6, No. 11, e280 doi:10.1371/journal.pbio.0060280

A paper in PLOS Biology from the Relman lab investigates the effect of a treatment with the antibiotic ciprofloxacin on the bacteria in the [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://biology.plosjournals.org/perlserv/?request=get-document&amp;doi=10.1371%2Fjournal.pbio.0060280&amp;ct=1&amp;SESSID=dd86d748dd6a153ff711b9ea44203c7f">The Pervasive Effects of an Antibiotic on the Human Gut Microbiota, as Revealed by Deep 16S rRNA Sequencing</a></p>
<div>Dethlefsen L, Huse S, Sogin ML, Relman DA<br />
PLoS Biology Vol. 6, No. 11, e280 doi:10.1371/journal.pbio.0060280</div>
<div></div>
<div>A paper in PLOS Biology from the Relman lab investigates the effect of a treatment with the antibiotic ciprofloxacin on the bacteria in the intestine. They collected over 7,000 full-length 16S rDNA sequences (1100-1400 bp) by Sanger sequencing and over 900,000 reads (~250 bp) from 454 sequencing of the V3 and the V6 regions.</div>
<div></div>
<div>There are many important results in this paper, but it is particularly relevant that 454 sequencing reveals more taxonomic variation with greater stability than traditional sequencing. In my own work, I have found that sequence variants that occur only once in the experiment cannot be used to differentiate samples. Deep sequencing reveals more taxa, and also reduces the frequency of singletons. A rare sequence variant (OTU) that occurs only once in the ~7000 full-length sequences occurs about 65 times in the 454 data set, providing more than enough &#8220;probability of detection&#8221; to be used for comparisons between samples.</div>
<div>&#8220;This set of 7,208 sequences is among the largest datasets of full-length 16S rRNA sequences from the human microbiota (or any environment), the rarefaction curves for V6 and V3 tag pyrosequencing eventually rise higher and display more curvature toward the horizontal than the OTU0.01 curve. These features show that a single run of the [454] FLX sequencer targeting V6 or V3 tags from the human gut microbiota can reveal more taxa, and capture a larger proportion of the detectable taxa, than a more extensive effort directed toward full-length 16S rRNA clone sequencing.&#8221;</div>
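<div>To make the &#8220;probability of detection&#8221; point concrete, here is a rough back-of-the-envelope sketch. The 7,208 figure comes from the paper; the 450,000 reads for a single region is only an assumed round number for illustration (roughly half of the 900,000+ reads), and the variable names are made up.</div>

```shell
# Full-length Sanger sequences reported in the paper.
SANGER=7208
# Assumed number of 454 reads for one region (illustrative only, not from the paper).
READS_454=450000

awk -v n="$SANGER" -v m="$READS_454" 'BEGIN {
    f = 1 / n    # frequency implied by a single observation among n sequences
    printf "expected 454 count: %.0f\n", m * f
    # Chance that a variant at frequency f appears at least once in n draws.
    printf "chance of at least one Sanger hit: %.2f\n", 1 - exp(n * log(1 - f))
}'
```

<div>Under these assumptions the variant would be expected a few dozen times in the 454 data, the same order of magnitude as the figure quoted above, while a variant that rare is far from guaranteed to show up even once in a Sanger library of this size.</div>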
]]></content:encoded>
			<wfw:commentRss>http://bioinformatics.whatheblog.com/2009/10/the-pervasive-effects-of-an-antibiotic-on-the-human-gut-microbiota/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Converting between Unix and Windows text files?</title>
		<link>http://bioinformatics.whatheblog.com/2009/04/how-do-i-convert-between-unix-and-windows-text-files/</link>
		<comments>http://bioinformatics.whatheblog.com/2009/04/how-do-i-convert-between-unix-and-windows-text-files/#comments</comments>
		<pubDate>Sat, 11 Apr 2009 22:09:48 +0000</pubDate>
		<dc:creator>Abbas</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Technical]]></category>
		<category><![CDATA[converting]]></category>
		<category><![CDATA[mac]]></category>
		<category><![CDATA[newline]]></category>
		<category><![CDATA[unix]]></category>
		<category><![CDATA[windows]]></category>

		<guid isPermaLink="false">http://bioinformatics.whatheblog.com/?p=34</guid>
		<description><![CDATA[The format of Windows and Unix text files differs slightly. In Windows, lines end with both the line feed and carriage return ASCII characters, but Unix uses only a line feed.  As a consequence, some Windows applications will not show the line breaks in Unix-format files.  Likewise, Unix programs may display the carriage [...]]]></description>
			<content:encoded><![CDATA[<p>The format of Windows and <a href="http://kb.iu.edu/data/agat.html">Unix</a> text files differs slightly. In Windows, lines end with both the line feed and carriage return <a href="http://kb.iu.edu/data/afht.html">ASCII</a> characters, but Unix uses only a line feed.  As a consequence, some Windows applications will not show the line breaks in Unix-format files.  Likewise, Unix programs may display the carriage returns in Windows text files with <code>Ctrl-m</code> (<code> ^M </code>) characters at the end of each line.</p>
<p>There are many ways to solve this problem. This document provides instructions for using <a href="http://kb.iu.edu/data/aerg.html">FTP</a>, screen capture, <a href="http://kb.iu.edu/data/acux.html">unix2dos</a> and <a href="http://kb.iu.edu/data/acux.html">dos2unix</a>, <code>tr</code>, <a href="http://kb.iu.edu/data/afja.html">awk</a>, <a href="http://kb.iu.edu/data/afhp.html">Perl</a>, and <a href="http://kb.iu.edu/data/adxz.html">vi</a> to do the conversion.  Before you use these utilities, the files you are converting must first be on a Unix computer.</p>
<p><strong>Note:</strong> In the instructions below, replace <code>unixfile.txt</code> with the name of the Unix file you are transferring, and replace <code>winfile.txt</code> with the name of the Windows file you are transferring.</p>
<h3>FTP</h3>
<p>When using an FTP program to move a text file between Unix and Windows, be sure the file is transferred in <a href="http://kb.iu.edu/data/afht.html">ASCII</a> format. This will ensure that the document is transformed into a text format appropriate for the host.  Some FTP programs, especially graphical applications like Hummingbird FTP, do this automatically.  If you are using FTP from the command line, however, before you begin the file transfer, be sure to enter at the FTP prompt:</p>
<p><span class="example"> ascii</span></p>
<p><strong>Note:</strong> You need to use a client that supports secure FTP to transfer files to and from Indiana University&#8217;s central systems. For more, see <a href="http://kb.iu.edu/data/ahjh.html">At IU, what SSH/SFTP clients are supported and where can I get them?</a></p>
<h3>Screen capture</h3>
<p>You can also convert files from Unix to Windows format when transferring them to a PC with a communications program by selecting ASCII text download.  Select this option with your communications program to capture all the text subsequently displayed to your screen, and then enter at the Unix prompt:</p>
<p><span class="example"> cat unixfile.txt</span></p>
<p>Most communications programs will add carriage returns to the stream of text as they save it to your computer&#8217;s hard drive.  Once the file has finished displaying, abort the text download.</p>
<p><strong>Note:</strong> This method may be slow for large text files. Also, no error checking is performed on the file as it is transferred.</p>
<h3><code>dos2unix</code> and <code>unix2dos</code></h3>
<p>On systems using <a href="http://kb.iu.edu/data/agjq.html">Solaris</a>, the utilities <code>dos2unix</code> and <code>unix2dos</code> are available.  These utilities provide a straightforward method for converting files from the Unix command line.</p>
<p>To use either command, simply type the command followed by the name of the file you wish to convert, and the name of a file which will contain the converted results.  Thus, to convert a Windows file to a Unix file, at the Unix prompt, enter:</p>
<p><span class="example"> dos2unix winfile.txt unixfile.txt</span></p>
<p><span id="more-34"></span></p>
<p>To convert a Unix file to Windows, enter:</p>
<p><span class="example"> unix2dos unixfile.txt winfile.txt</span></p>
<p><strong>Note:</strong> These utilities are available only on Solaris systems.  To determine what variety of Unix is running on your computer, see <a href="http://kb.iu.edu/data/aaya.html">In Unix, how can I display information about the operating system?</a></p>
<h3><code>tr</code></h3>
<p>You can use <code>tr</code> to remove all carriage returns and <code>Ctrl-z</code> (<code> ^Z </code>) characters from a Windows file by entering:</p>
<p><span class="example"> tr -d '\15\32' &lt; winfile.txt &gt; unixfile.txt</span></p>
<p>You cannot use <code>tr</code> to convert a document from Unix format to Windows.</p>
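<p>In that command, octal <code>15</code> is the carriage return and octal <code>32</code> is <code>Ctrl-z</code>. A small way to see the effect on a throwaway string, assuming a <code>printf</code> that accepts octal escapes (the helper name <code>strip_dos_chars</code> is hypothetical):</p>

```shell
# Hypothetical helper: delete carriage returns (octal 15) and Ctrl-z (octal 32) from stdin.
strip_dos_chars() {
    tr -d '\15\32'
}

# A two-line Windows-style sample ending in Ctrl-z: 9 bytes in total.
printf 'ab\r\ncd\r\n\032' | wc -c

# The same sample after filtering: "ab", "cd" and two line feeds, 6 bytes.
printf 'ab\r\ncd\r\n\032' | strip_dos_chars | wc -c
```

<p>Because <code>tr</code> only deletes or maps single characters, it can drop the three unwanted bytes here but has no way to insert a carriage return before each line feed.</p>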
<h3><code>awk</code></h3>
<p>To use <a href="http://kb.iu.edu/data/afja.html">awk</a> to convert a Windows file to Unix, at the Unix prompt, enter:</p>
<p><span class="example"> awk '{ sub("\r$", ""); print }' winfile.txt &gt; unixfile.txt</span></p>
<p>To convert a Unix file to Windows using <code>awk</code>, at the command line, enter:</p>
<p><span class="example"> awk 'sub("$", "\r")' unixfile.txt &gt; winfile.txt</span></p>
<p>On some systems, the version of <code>awk</code> may be old and not include the function <code>sub</code>.  If so, try the same command, but with <code>gawk</code> or <code>nawk</code> replacing <code>awk</code>.</p>
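<p>One way to gain confidence in either direction is a round trip on a scratch file: convert to Windows format, convert back, and compare with the original. This is a minimal sketch, assuming an <code>awk</code> with <code>sub</code>; the <code>roundtrip_*</code> file names are invented for the example.</p>

```shell
# Create a small Unix-format file (hypothetical name) to experiment on.
printf 'alpha\nbeta\ngamma\n' > roundtrip_unix.txt

# Unix to Windows: append a carriage return to every line.
awk '{ sub("$", "\r"); print }' roundtrip_unix.txt > roundtrip_win.txt

# Windows back to Unix: strip the trailing carriage return.
awk '{ sub("\r$", ""); print }' roundtrip_win.txt > roundtrip_back.txt

# cmp -s prints nothing and exits 0 only when the files are byte-identical.
if cmp -s roundtrip_unix.txt roundtrip_back.txt
then
    echo "round trip OK"
else
    echo "round trip changed the file"
fi
```

<p>The intermediate file should be exactly one byte per line larger than the original, which is a quick sanity check that the carriage returns were really added.</p>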
<h3>Perl</h3>
<p>To convert a Windows text file to a Unix text file using <a href="http://kb.iu.edu/data/afhp.html">Perl</a>, at the Unix <a href="http://kb.iu.edu/data/agvf.html">shell</a> prompt, enter:</p>
<p><span class="example"> perl -p -e 's/\r$//' &lt; winfile.txt &gt; unixfile.txt</span></p>
<p>To convert from a Unix text file to a Windows text file with Perl, at the Unix shell prompt, enter:</p>
<p><span class="example"> perl -p -e 's/\n/\r\n/' &lt; unixfile.txt &gt; winfile.txt</span></p>
<p>You must use single quotation marks in either command line.  This prevents your shell from trying to evaluate anything inside.  Perl is installed on all <a href="http://kb.iu.edu/data/ahaw.html">UITS</a> shared central Unix systems.</p>
<h3>vi</h3>
<p>In <a href="http://kb.iu.edu/data/adxz.html">vi</a>, you can remove the carriage return ( <code>^M</code> ) characters with the following command:</p>
<p><span class="example"> :1,$s/^M//g</span></p>
<p><strong>Note:</strong> To input the <code>^M</code> character, press <code>Ctrl-v </code>, then press <code>Enter</code> or <code>return</code>.</p>
]]></content:encoded>
			<wfw:commentRss>http://bioinformatics.whatheblog.com/2009/04/how-do-i-convert-between-unix-and-windows-text-files/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Linux: Who&#8217;s on the server???</title>
		<link>http://bioinformatics.whatheblog.com/2009/03/linux-whos-on-the-server/</link>
		<comments>http://bioinformatics.whatheblog.com/2009/03/linux-whos-on-the-server/#comments</comments>
		<pubDate>Thu, 19 Mar 2009 15:19:16 +0000</pubDate>
		<dc:creator>Eli Roberson</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[Technical]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[ssh]]></category>

		<guid isPermaLink="false">http://bioinformatics.whatheblog.com/?p=33</guid>
		<description><![CDATA[Linux? You geeks use Linux?
If you work in science, and you work on big datasets (such as analyzing next generation sequencing data), chances are that you use Linux for some of your work. I frequent several of our lab&#8217;s Red Hat servers for data analysis and code development purposes. However, these aren&#8217;t just my servers [...]]]></description>
			<content:encoded><![CDATA[<h2>Linux? You geeks use Linux?</h2>
<p>If you work in science, and you work on big datasets (such as <a href="2009/02/next-gen-tools/">analyzing</a> <a href="solid.appliedbiosystems.com">next</a> <a href="http://www.454.com">generation</a> <a href="http://www.illumina.com/pages.ilmn?ID=203">sequencing</a> data), chances are that you use <a href="http://en.wikipedia.org/wiki/Linux">Linux</a> for some of your work. I frequent several of our lab&#8217;s Red Hat servers for data analysis and code development purposes. However, these aren&#8217;t just my servers to use. Other lab members and, depending on the server, IT staff use them too. I try to remember to check and see who is on and what they&#8217;re running before getting too involved with something that&#8217;s going to hog memory or processor time. But, of course, I don&#8217;t always remember.</p>
<p>I decided to automate the check so that remembering is no longer required. With a short shell script plus a few lines in my profile file, every ssh login immediately displays the relevant information without my having to invoke anything manually.</p>
<h2>Shell Script</h2>
<p>The code is based on the Bash shell, so it may or may not apply to your ssh login. I keep the shell script in my /home/user directory with the name &#8220;.greeting.sh&#8221;. The leading period just hides it from standard &#8220;ls&#8221; queries so it doesn&#8217;t add to the clutter in my home directory. The code for &#8220;.greeting.sh&#8221; follows between the lines of # signs:</p>
<p>##################################################<br />
#!/bin/bash</p>
<p># Gather basic facts about the session and the machine<br />
UNAME=`whoami`<br />
TIME=`date`<br />
HOST=`hostname`<br />
UCNT=`users | wc -w`<br />
ULST=`users`<br />
# Sum the %CPU column of ps (skipping the header line) and round to an integer;<br />
# on multi-core machines the total can exceed 100<br />
PROC=`ps aux | awk 'NR &gt; 1 { s += $3 }; END {printf("%d\n", s + 0.5);}'`<br />
# Used memory as a rounded percentage of total memory<br />
MPCT=`free | grep Mem | awk '{printf("%d\n", $3 / $2 * 100 + 0.5);}'`<br />
MYSHELL=`echo $SHELL`</p>
<p>echo<br />
echo "$TIME"<br />
echo "Shell: $MYSHELL"<br />
echo "Hello $UNAME! Welcome to $HOST!"</p>
<p># users lists one name per login session, so 2 or more means company<br />
if [ $UCNT -ge 2 ]<br />
then<br />
echo "$UCNT users are currently logged into $HOST:"<br />
echo "$ULST"<br />
else<br />
echo "No other users currently logged in."<br />
fi</p>
<p>echo "System Status:"</p>
<p>if [ $PROC -ge 80 ]<br />
then<br />
echo "High processor usage at ${PROC}%"<br />
elif [ $PROC -ge 50 ]<br />
then<br />
echo "Medium processor usage at ${PROC}%"<br />
else<br />
echo "Low processor usage at ${PROC}%"<br />
fi</p>
<p>if [ $MPCT -ge 80 ]<br />
then<br />
echo "High memory usage at ${MPCT}%"<br />
elif [ $MPCT -ge 50 ]<br />
then<br />
echo "Medium memory usage at ${MPCT}%"<br />
else<br />
echo "Low memory usage at ${MPCT}%"<br />
fi</p>
<p>echo</p>
<p>exit 0<br />
##################################################</p>
<p>When I log in, the code above prints the date, my current shell, a greeting with the hostname, whether other users are logged in (and the list of users if others are on), and information about current processor and memory usage. I customize this script depending on the primary use of the server. If you have a server that should always be running a certain program, add a line that looks for that program. If it were called &#8220;myprogram&#8221; you could add the following line to the script:</p>
<p>PROG=`ps aux | grep -v grep | grep myprogram | wc -l`</p>
<p>PROG then holds the number of matching processes: 1 if a single instance is running, more if there are several, and 0 if the program isn&#8217;t running at all. By adding a test later in the script, such as checking whether $PROG -ge 1, a message could print saying whether or not the program is running.</p>
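<p>The test described above could be sketched roughly as follows. This is only an illustration, not part of my actual script: the function name report_program_status and the wording of the messages are made up, and &#8220;myprogram&#8221; is still just a placeholder.</p>

```shell
#!/bin/bash

# Hypothetical helper: print a status line for a program, given the
# number of matching processes and the program's name.
report_program_status() {
    local count=$1
    local name=$2
    if [ "$count" -ge 1 ]
    then
        echo "$name is running ($count instance(s))."
    else
        echo "WARNING: $name is not running."
    fi
}

# Count matching processes the same way as the PROG line above.
PROG=`ps aux | grep -v grep | grep myprogram | wc -l`
report_program_status $PROG "myprogram"
```

<p>Keeping the message logic in a small function makes it easy to reuse for several programs; just compute one count per program and call the function once for each.</p>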
<p>Take note! Don&#8217;t forget to alter the permissions on the script to allow execution, using something like &#8220;chmod +x .greeting.sh&#8221;. Also note that the variables are defined using backticks (same key as the ~ on standard US QWERTY keyboards), not single quotes.</p>
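<p>If you want to try both of those notes without touching your real home directory, something like the following sketch should do. It assumes a scratch directory created with mktemp and uses a one-line stand-in for the greeting script; it also shows that Bash accepts $( ) as an alternative to backticks, which some people find easier to read.</p>

```shell
#!/bin/bash

# Work in a temporary directory so nothing in the real home directory changes.
DEMO_DIR=$(mktemp -d)
DEMO_SCRIPT="$DEMO_DIR/.greeting.sh"

# A one-line stand-in for the real greeting script.
printf '#!/bin/bash\necho "Hello from the greeting script"\n' > "$DEMO_SCRIPT"

# Without the execute bit the script cannot be run directly.
chmod +x "$DEMO_SCRIPT"

# Backticks and $( ) both capture the output of a command.
OLD_STYLE=`"$DEMO_SCRIPT"`
NEW_STYLE=$("$DEMO_SCRIPT")

echo "$OLD_STYLE"
echo "$NEW_STYLE"

# Clean up the scratch directory.
rm -r "$DEMO_DIR"
```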
<h2>Automatically running</h2>
<p>The script isn&#8217;t much use if you have to run it manually (if I remembered to do that, why would I need a script?), so I set it to run automatically immediately following an ssh login. As I said before, I use Bash on most of our Linux servers. For this shell, each user can have a file called &#8220;.bash_profile&#8221; in their home directory. Bash reads this profile file at the start of every login shell, which includes interactive ssh logins, and it is typically used to set common environment variables, like PATH. By adding code to run the greeting script, the output from the script will be displayed immediately after login. Example code to add to the bottom of your profile file:</p>
<p>##################################################<br />
if [ -e "/home/user/.greeting.sh" ]<br />
then<br />
/home/user/.greeting.sh<br />
fi<br />
##################################################</p>
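<p>A slightly more defensive variant of the snippet above might look like the sketch below. It is an assumption on my part rather than what I actually run: it uses $HOME instead of a hard-coded /home/user, tests -x (exists and is executable) instead of -e, and wraps the check in a function I have named run_greeting purely for illustration.</p>

```shell
# Hypothetical helper: run the greeting script only if it exists
# and is executable; otherwise do nothing.
run_greeting() {
    local script="$1"
    if [ -x "$script" ]
    then
        "$script"
    fi
}

# $HOME expands to the current user's home directory, e.g. /home/user.
run_greeting "$HOME/.greeting.sh"
```

<p>With the -x test, a script that has not been made executable yet is skipped silently instead of producing a &#8220;Permission denied&#8221; error on every login.</p>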
<p>That&#8217;s all there is to it. A simple but powerful script to automatically give you information on server login. Feel free to adapt it to your system and purpose.</p>
]]></content:encoded>
			<wfw:commentRss>http://bioinformatics.whatheblog.com/2009/03/linux-whos-on-the-server/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Craig Venter: On the verge of creating synthetic life</title>
		<link>http://bioinformatics.whatheblog.com/2009/02/craig-venter-on-the-verge-of-creating-synthetic-life/</link>
		<comments>http://bioinformatics.whatheblog.com/2009/02/craig-venter-on-the-verge-of-creating-synthetic-life/#comments</comments>
		<pubDate>Wed, 25 Feb 2009 08:25:29 +0000</pubDate>
		<dc:creator>Abbas</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[life]]></category>
		<category><![CDATA[synthetic]]></category>
		<category><![CDATA[venter]]></category>

		<guid isPermaLink="false">http://bioinformatics.whatheblog.com/?p=32</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p><object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/nKZ-GjSaqgo&#038;hl=en&#038;fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/nKZ-GjSaqgo&#038;hl=en&#038;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object></p>
]]></content:encoded>
			<wfw:commentRss>http://bioinformatics.whatheblog.com/2009/02/craig-venter-on-the-verge-of-creating-synthetic-life/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Next Generation Seq Tools</title>
		<link>http://bioinformatics.whatheblog.com/2009/02/next-gen-tools/</link>
		<comments>http://bioinformatics.whatheblog.com/2009/02/next-gen-tools/#comments</comments>
		<pubDate>Wed, 25 Feb 2009 07:46:50 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Technical]]></category>
		<category><![CDATA[next-gen]]></category>
		<category><![CDATA[sequencing]]></category>

		<guid isPermaLink="false">http://bioinformatics.whatheblog.com/?p=31</guid>
		<description><![CDATA[Something I came across.
Integrated solutions
* CLCbio Genomics Workbench &#8211; de novo and reference assembly of Sanger, 454, Solexa, Helicos, and SOLiD data. Commercial next-gen-seq software that extends the CLCbio Main Workbench software. Includes SNP detection, browser and other features. Runs on Windows, Mac OS X and Linux.
* NextGENe &#8211; de novo and reference assembly of [...]]]></description>
			<content:encoded><![CDATA[<p>Something I came across.</p>
<p>Integrated solutions<br />
* CLCbio Genomics Workbench &#8211; de novo and reference assembly of Sanger, 454, Solexa, Helicos, and SOLiD data. Commercial next-gen-seq software that extends the CLCbio Main Workbench software. Includes SNP detection, browser and other features. Runs on Windows, Mac OS X and Linux.</p>
<p>* NextGENe &#8211; de novo and reference assembly of Illumina and SOLiD data. Uses a novel Condensation Assembly Tool approach where reads are joined via &#8220;anchors&#8221; into mini-contigs before assembly. Requires Win or MacOS.</p>
<p>* SeqMan Genome Analyser &#8211; Software for Next Generation sequence assembly of Illumina, 454 Life Sciences and Sanger data integrating with Lasergene Sequence Analysis software for additional analysis and visualization capabilities. Can use a hybrid templated/de novo approach. Early release commercial software. Compatible with Windows® XP X64 and Mac OS X 10.4.</p>
<p><span id="more-31"></span></p>
<p>Align/Assemble to a reference</p>
<p>* Bowtie &#8211; Ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of 25 million reads per hour on a typical workstation with 2 gigabytes of memory. Link to discussion thread here. Written by Ben Langmead and Cole Trapnell.</p>
<p>* ELAND &#8211; Efficient Large-Scale Alignment of Nucleotide Databases. Whole genome alignments to a reference genome. Written by Illumina author Anthony J. Cox for the Solexa 1G machine.</p>
<p>* EULER &#8211; Short read assembly. By Mark J. Chaisson and Pavel A. Pevzner from UCSD (published in Genome Research).</p>
<p>* Exonerate &#8211; Various forms of alignment (including Smith-Waterman-Gotoh) of DNA/protein against a reference. Authors are Guy St C Slater and Ewan Birney from EMBL. C for POSIX.</p>
<p>* GMAP &#8211; GMAP (Genomic Mapping and Alignment Program) for mRNA and EST Sequences. Developed by Thomas Wu and Colin Watanabe at Genentech. C/Perl for Unix.</p>
<p>* MOSAIK &#8211; Reference guided aligner/assembler. Written by Michael Strömberg at Boston College.</p>
<p>* MAQ &#8211; Mapping and Assembly with Qualities (renamed from MAPASS2). Particularly designed for Illumina-Solexa 1G Genetic Analyzer, and has preliminary functions to handle ABI SOLiD data. Written by Heng Li from the Sanger Centre.</p>
<p>* MUMmer &#8211; MUMmer is a modular system for the rapid whole genome alignment of finished or draft sequence. Released as a package providing an efficient suffix tree library, seed-and-extend alignment, SNP detection, repeat detection, and visualization tools. Version 3.0 was developed by Stefan Kurtz, Adam Phillippy, Arthur L Delcher, Michael Smoot, Martin Shumway, Corina Antonescu and Steven L Salzberg &#8211; most of whom are at The Institute for Genomic Research in Maryland, USA. POSIX OS required.</p>
<p>* Novocraft &#8211; Tools for reference alignment of paired-end and single-end Illumina reads. Uses a Needleman-Wunsch algorithm. Available free for evaluation, educational use and for use on open not-for-profit projects. Requires Linux or Mac OS X.</p>
<p>* RMAP &#8211; Assembles 20 &#8211; 64 bp Solexa reads to a FASTA reference genome. By Andrew D. Smith and Zhenyu Xuan at CSHL. (published in BMC Bioinformatics). POSIX OS required.</p>
<p>* SeqMap &#8211; Works like ELand, can do 3 or more bp mismatches and also INDELs. Written by Hui Jiang from the Wong lab at Stanford. Builds available for most OS&#8217;s.</p>
<p>* SHRiMP &#8211; Assembles to a reference sequence. Developed with Applied Biosystems&#8217; colourspace genomic representation in mind. Authors are Michael Brudno and Stephen Rumble at the University of Toronto.</p>
<p>* Slider &#8211; An application for the Illumina Sequence Analyzer output that uses the probability files instead of the sequence files as an input for alignment to a reference sequence or a set of reference sequences. Authors are from BCGSC. Paper is here.</p>
<p>* SOAP &#8211; SOAP (Short Oligonucleotide Alignment Program). A program for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences. Author is Ruiqiang Li at the Beijing Genomics Institute. C++ for Unix.</p>
<p>* SSAHA &#8211; SSAHA (Sequence Search and Alignment by Hashing Algorithm) is a tool for rapidly finding near exact matches in DNA or protein databases using a hash table. Developed at the Sanger Centre by Zemin Ning, Anthony Cox and James Mullikin. C++ for Linux/Alpha.</p>
<p>* SXOligoSearch &#8211; SXOligoSearch is a commercial platform offered by the Malaysian based Synamatix. Will align Illumina reads against a range of Refseq RNA or NCBI genome builds for a number of organisms. Web Portal. OS independent.</p>
<p>de novo Align/Assemble<br />
* MIRA2 &#8211; MIRA (Mimicking Intelligent Read Assembly) is able to perform true hybrid de-novo assemblies using reads gathered through 454 sequencing technology (GS20 or GS FLX). Compatible with 454, Solexa and Sanger data. Linux OS required.</p>
<p>* SHARCGS &#8211; De novo assembly of short reads. Authors are Dohm JC, Lottaz C, Borodina T and Himmelbauer H. from the Max-Planck-Institute for Molecular Genetics.</p>
<p>* SSAKE &#8211; Version 2.0 of SSAKE (23 Oct 2007) can now handle error-rich sequences. Authors are René Warren, Granger Sutton, Steven Jones and Robert Holt from Canada&#8217;s Michael Smith Genome Sciences Centre. Perl/Linux.</p>
<p>* VCAKE &#8211; De novo assembly of short reads with robust error correction. An improvement on early versions of SSAKE.</p>
<p>* Velvet &#8211; Velvet is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454. Need about 20-25X coverage and paired reads. Developed by Daniel Zerbino and Ewan Birney at the European Bioinformatics Institute (EMBL-EBI).</p>
<p>SNP/Indel Discovery<br />
* ssahaSNP &#8211; ssahaSNP is a polymorphism detection tool. It detects homozygous SNPs and indels by aligning shotgun reads to the finished genome sequence. Highly repetitive elements are filtered out by ignoring those kmer words with high occurrence numbers. More tuned for ABI Sanger reads. Developers are Adam Spargo and Zemin Ning from the Sanger Centre. Compaq Alpha, Linux-64, Linux-32, Solaris and Mac.</p>
<p>* PolyBayesShort &#8211; A re-incarnation of the PolyBayes SNP discovery tool developed by Gabor Marth at Washington University. This version is specifically optimized for the analysis of large numbers (millions) of high-throughput next-generation sequencer reads, aligned to whole chromosomes of model organism or mammalian genomes. Developers at Boston College. Linux-64 and Linux-32.</p>
<p>* PyroBayes &#8211; PyroBayes is a novel base caller for pyrosequences from the 454 Life Sciences sequencing machines. It was designed to assign more accurate base quality estimates to the 454 pyrosequences. Developers at Boston College.</p>
<p>Genome Annotation/Genome Browser/Alignment Viewer/Assembly Database<br />
* STADEN &#8211; Includes GAP4. GAP5 once completed will handle next-gen sequencing data. A partially implemented test version is available here<br />
* EagleView &#8211; An information-rich genome assembler viewer. EagleView can display a dozen different types of information including base quality and flowgram signal. Developers at Boston College.</p>
<p>* XMatchView &#8211; A visual tool for analyzing cross_match alignments. Developed by Rene Warren and Steven Jones at Canada&#8217;s Michael Smith Genome Sciences Centre. Python/Win or Linux.</p>
<p>* SAM &#8211; Sequence Assembly Manager. Whole Genome Assembly (WGA) Management and Visualization Tool. It provides a generic platform for manipulating, analyzing and viewing WGA data, regardless of input type. Developers are Rene Warren, Yaron Butterfield, Asim Siddiqui and Steven Jones at Canada&#8217;s Michael Smith Genome Sciences Centre. MySQL backend and Perl-CGI web-based frontend/Linux.</p>
<p>ChIP-Seq/BS-Seq<br />
* FindPeaks &#8211; Performs analysis of ChIP-Seq experiments. It uses a naive algorithm for identifying regions of high coverage, which represent Chromatin Immunoprecipitation enrichment of sequence fragments, indicating the location of a bound protein of interest. Original algorithm by Matthew Bainbridge, in collaboration with Gordon Robertson. Current code and implementation by Anthony Fejes. Authors are from Canada&#8217;s Michael Smith Genome Sciences Centre. JAVA/OS independent. Latest versions available as part of the Vancouver Short Read Analysis Package.</p>
<p>* CHiPSeq &#8211; Program used by Johnson et al. (2007) in their Science publication</p>
<p>* BS-Seq &#8211; The source code and data for the &#8220;Shotgun Bisulphite Sequencing of the Arabidopsis Genome Reveals DNA Methylation Patterning&#8221; Nature paper by Cokus et al. (Steve Jacobsen&#8217;s lab at UCLA). POSIX.</p>
<p>* SISSRs &#8211; Site Identification from Short Sequence Reads. BED file input. Raja Jothi @ NIH. Perl.</p>
<p>* QuEST &#8211; Quantitative Enrichment of Sequence Tags. Sidow and Myers Labs at Stanford. From the 2008 publication Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data. (C++)</p>
<p>Alternate Base Calling<br />
* Rolexa &#8211; R-based framework for base calling of Solexa data. Project publication</p>
<p>* Alta-cyclic &#8211; &#8220;a novel Illumina Genome-Analyzer (Solexa) base caller&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://bioinformatics.whatheblog.com/2009/02/next-gen-tools/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Bioinformatics Tool Chest: Why You Should Be Using Firefox</title>
		<link>http://bioinformatics.whatheblog.com/2009/02/bioinformatics-tool-chest-why-you-should-be-using-firefox/</link>
		<comments>http://bioinformatics.whatheblog.com/2009/02/bioinformatics-tool-chest-why-you-should-be-using-firefox/#comments</comments>
		<pubDate>Sat, 07 Feb 2009 04:05:48 +0000</pubDate>
		<dc:creator>Eli Roberson</dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://bioinformatics.whatheblog.com/?p=30</guid>
		<description><![CDATA[Firefox?!?!
I know what you&#8217;re thinking. &#8220;Come on. A browser? As a bioinformatics tool?&#8221; You might actually be surprised. I think that most people who do research spend at least some amount of time online trying to track down information. Maybe it&#8217;s a protein name, or DNA elements in a chromosome segment. Maybe it&#8217;s a certain paper [...]]]></description>
			<content:encoded><![CDATA[<h3>Firefox?!?!</h3>
<p>I know what you&#8217;re thinking. &#8220;Come on. A browser? As a bioinformatics tool?&#8221; You might actually be surprised. I think that most people who do research spend at least some amount of time online trying to track down information. Maybe it&#8217;s a protein name, or DNA elements in a chromosome segment. Maybe it&#8217;s a certain paper or topic through PubMed. Personally, I spend a good amount of time searching out answers. Furthermore, I switch between databases and websites across tabs to get information from different sources. Could there be a way to search faster?</p>
<h3>Keyword Search To The Rescue!</h3>
<p>Luckily, there is a faster way: the keyword search. Basically the keyword search will allow you to make a bookmark shortcut to any search box using a keyword. Once a keyword search has been saved that particular search can be invoked with just the keyword. I frequently use the UCSC Genome Browser for research, so I&#8217;ll use this as an example.</p>
<h3>How To</h3>
<ol>
<li>Navigate to the <a href="http://genome.ucsc.edu">UCSC Genome Browser</a> main page.</li>
<li>In the top navigation panel click &#8220;Genomes&#8221;.</li>
<li>The default page should be the Human genome browser. If you are interested in a different organism you can certainly change it using the drop-down boxes. There should be an input box labeled &#8220;position or search term&#8221;. Right click in the box.</li>
<li>In the pop-up menu select &#8220;Add a Keyword for This Search&#8230;&#8221;. An &#8220;Add Bookmark&#8221; window will appear.</li>
<li>In the &#8220;Name&#8221; box type a descriptive name. In this case use &#8220;UCSC Human Search&#8221;.</li>
<li>In the &#8220;Keyword&#8221; box type the keyword you want to use. In this case use &#8220;ucsc&#8221;.</li>
<li>Press the &#8220;Add&#8221; button to save this search.</li>
</ol>
<p>Let&#8217;s test the keyword. Open a new blank Firefox tab by pressing CTRL+T or File -&gt; New Tab. In the address bar type &#8220;ucsc MECP2&#8221; and press enter. The &#8220;ucsc&#8221; keyword triggers the query &#8220;MECP2&#8221; to be run through the search box we saved. After a few seconds a window for the UCSC browser should appear listing possible genes matching the symbol MECP2. If you had navigated to the UCSC Browser directly and typed MECP2 into the search box, the results would have been the same.</p>
<p>What about direct chromosome positions? Let&#8217;s try it. Clear the text from the URL bar, type &#8220;ucsc chr1:1-20000000&#8221;, and press enter. The page should change to show the first 20,000,000 base pairs of chromosome 1.</p>
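<p>Behind the scenes, a keyword bookmark is essentially a URL with a %s placeholder where your search term goes. The shell sketch below imitates that substitution. It is not Firefox&#8217;s actual code: the function name expand_keyword is made up, the URL template is only my assumption of what a saved UCSC search might look like (the bookmark Firefox really saves may contain more parameters), and no URL-encoding of the query is attempted.</p>

```shell
#!/bin/bash

# Hypothetical helper: fill a query into a bookmark URL template by
# replacing the first %s placeholder. No URL-encoding is attempted.
expand_keyword() {
    local template="$1"
    local query="$2"
    case "$template" in
        *%s*)
            # Keep the text before and after the first %s.
            echo "${template%%\%s*}${query}${template#*\%s}"
            ;;
        *)
            # No placeholder: the bookmark is just a plain URL.
            echo "$template"
            ;;
    esac
}

# Assumed template; the bookmark Firefox saves may look different.
UCSC="http://genome.ucsc.edu/cgi-bin/hgTracks?position=%s"

expand_keyword "$UCSC" "MECP2"
expand_keyword "$UCSC" "chr1:1-20000000"
```

<p>Seen this way, typing &#8220;ucsc MECP2&#8221; is just shorthand for visiting the saved URL with MECP2 dropped into the placeholder.</p>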
<p>What other uses could it have? What about a &#8220;pubmed&#8221; keyword search? Or an Ensembl search? It can be particularly powerful if you combine these searches. If you were researching Rett Syndrome, you could in one tab search for &#8220;pubmed Rett Syndrome&#8221;. After reading a few papers and finding information on MECP2 in Rett Syndrome, all you have to do is hit CTRL+T to open another tab. Then type &#8220;ucsc MECP2&#8221; to find it in the genome browser. If you had a saved search for the NCBI Protein database you could go even further by opening yet another tab and typing &#8220;protein MECP2_HUMAN&#8221; (assuming your keyword was protein). The result would be a page about the MECP2 protein in humans where you could get the amino acid sequence. Your specific search set would depend on which databases you search most frequently in your research.</p>
<p>This kind of time savings can really add up. Plus you can show off your cool new hack to friends when they&#8217;re trying to search for something.</p>
]]></content:encoded>
			<wfw:commentRss>http://bioinformatics.whatheblog.com/2009/02/bioinformatics-tool-chest-why-you-should-be-using-firefox/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
