<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Buried Treasure</title>
	<atom:link href="http://larsjuhljensen.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://larsjuhljensen.wordpress.com</link>
	<description>A computational biologist cleans up his disk</description>
	<lastBuildDate>Sat, 24 Dec 2011 09:08:39 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='larsjuhljensen.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Buried Treasure</title>
		<link>http://larsjuhljensen.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://larsjuhljensen.wordpress.com/osd.xml" title="Buried Treasure" />
	<atom:link rel='hub' href='http://larsjuhljensen.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Analysis: Christmas no longer in vogue!</title>
		<link>http://larsjuhljensen.wordpress.com/2011/12/24/analysis-christmas-no-longer-in-vogue/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/12/24/analysis-christmas-no-longer-in-vogue/#comments</comments>
		<pubDate>Sat, 24 Dec 2011 09:07:33 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[christmas]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1580</guid>
		<description><![CDATA[I have just made an alarming discovery: judging from the biomedical literature, researchers appear to increasingly ignore Christmas. My plan was to make a funny Christmas post looking at trivialities such as when during the year Christmas-related papers are posted. To this end, I did a trivial text-mining analysis that pulled out all papers mentioning [...]]]></description>
			<content:encoded><![CDATA[<p>I have just made an alarming discovery: judging from the biomedical literature, researchers appear to increasingly ignore Christmas.</p>
<p>My plan was to make a funny Christmas post looking at trivialities such as when during the year Christmas-related papers are posted. To this end, I did a trivial text-mining analysis that pulled out all papers mentioning &#8220;Christmas&#8221;, &#8220;Xmas&#8221;, or &#8220;X-mas&#8221; in the title or abstract. As a first check of the data, I looked at how many papers were published each year and was surprised to find only 20-30 in a typical year. To eliminate random fluctuations due to the low counts, I thus binned the data into decades before plotting the temporal trend (black dots are actual data points, red curve is a quadratic trendline):</p>
<p><img src="http://larsjuhljensen.files.wordpress.com/2011/12/christmas.png?w=380" alt="" title="Temporal statistics of &quot;Christmas&quot; in biomedical abstracts"   class="aligncenter size-full wp-image-1583" /></p>
<p>The shocking result is that the frequency of Christmas-related papers has steadily dropped to less than half of what it was in the 1950s!</p>
<p>How can this be? I can think of several possibilities, and you are welcome to come up with more in the comments:</p>
<ul>
<li>We are running out of new funny things to say about Christmas.</li>
<li>An increasing proportion of researchers come from countries in which Christmas is not widely celebrated.</li>
<li>Researchers have collectively stopped believing in Santa, as funding has dried up.</li>
</ul>
<p>Merry Christmas Everyone!</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/12/24/analysis-christmas-no-longer-in-vogue/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/12/christmas.png" medium="image">
			<media:title type="html">Temporal statistics of &#34;Christmas&#34; in biomedical abstracts</media:title>
		</media:content>
	</item>
		<item>
		<title>Resource: You want REST with that GreenMamba?</title>
		<link>http://larsjuhljensen.wordpress.com/2011/10/25/resource-you-want-rest-with-that-greenmamba/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/10/25/resource-you-want-rest-with-that-greenmamba/#comments</comments>
		<pubDate>Tue, 25 Oct 2011 07:12:46 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Resource]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[greenmamba]]></category>
		<category><![CDATA[mamba]]></category>
		<category><![CDATA[REST]]></category>
		<category><![CDATA[tool]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1507</guid>
		<description><![CDATA[When you set up a GreenMamba resource, you get not only a web interface for human users, but also a REST web service API meant for scripts to interact with your tool. The REST interface is such an integral part of GreenMamba, that it is not even optional &#8211; you get it whether you want [...]]]></description>
			<content:encoded><![CDATA[<p>When you set up a GreenMamba resource, you get not only a web interface for human users, but also a REST web service API meant for scripts to interact with your tool. The REST interface is such an integral part of GreenMamba, that it is not even optional &#8211; you get it whether you want it or not. However, since the purpose of setting up GreenMamba resources is to make your tools and databases accessible to others, we cannot think of a good reason why you would not want to expose them as web services when it takes no extra work.</p>
<p>To illustrate how command-line tools can be accessed as web services, we return to the Motifs tool described in <a href="http://larsjuhljensen.wordpress.com/2011/10/19/resource-turning-a-command-line-tool-into-a-web-tool-with-greenmamba/">an earlier blog post</a>. In addition to having an HTML web interface, it is accessible as a REST web service through the following URL (here shown as a GET request for simplicity; POST requests are also supported):</p>
<p><code>http://localhost:8080/REST/Motifs?fasta=<em>sequences</em>&amp;<br />
motif=<em>regular expression</em></code></p>
<p>The name and parameters for the web service map one-to-one to the resource name and command-line arguments specified in the inifile:</p>
<p><code>[Motifs]<br />
command : greenmamba/examples/motifs.pl $motif @fasta</code></p>
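<p>As a minimal sketch of this mapping, the Python snippet below pulls the parameter names out of the command line and assembles the corresponding GET URL. The helper names <code>command_parameters</code> and <code>rest_url</code> and the short example sequence are made up for illustration; they are not part of GreenMamba, and the snippet only builds the URL without contacting a server.</p>

```python
import re
from urllib.parse import urlencode


def command_parameters(command):
    """Return the names that follow a $ or @ sigil in an inifile command line."""
    return re.findall(r"[$@](\w+)", command)


def rest_url(base, resource, command, values):
    """Build base/REST/resource?name=value... with one pair per command parameter."""
    names = command_parameters(command)
    # urlencode percent-escapes characters such as [ ] and newlines.
    query = urlencode([(name, values[name]) for name in names])
    return f"{base}/REST/{resource}?{query}"


# Command line copied from the [Motifs] section of the inifile.
command = "greenmamba/examples/motifs.pl $motif @fasta"

# Hypothetical input values, chosen only to show the escaping.
url = rest_url(
    "http://localhost:8080",
    "Motifs",
    command,
    {"motif": "R.[RK]R.", "fasta": ">seq1\nMARRPRHSIY"},
)
print(url)
```

<p>The resulting string can be handed to any HTTP client; for long sequences a POST request is probably the better choice, since GET URLs have practical length limits.</p>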
<p>GreenMamba also provides a REST web service API around any database that you configure through the inifile, although it is admittedly not as elegant as it could be. However, there is not much need for an API in this case, as the database functionality of GreenMamba is only intended for databases that are so small that they can easily be downloaded in their entirety instead.</p>
<p>In case of a <a href="http://larsjuhljensen.wordpress.com/2011/10/21/resource-combining-tools-and-databases-into-a-single-greenmamba-web-resource/">GreenMamba metatool</a> there is not a corresponding web service per se. However, because a metatool is made up of a list of subtools that all have their individual sections in the inifile, each of the underlying tools has a REST web service. All the functionalities of a metatool are thus nonetheless exposed as web services.</p>
<p>It should be noted that the REST web services provided by GreenMamba return the output from the underlying tools as is. It may thus be worthwhile to change (or write post-processing scripts for) tools so that they produce simple tabular output. This will both make GreenMamba format the output more nicely in the HTML web interface and make the REST web services more usable.</p>
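<p>To show why simple tabular output pays off on the client side, here is a small Python sketch that parses tab-delimited text into rows. It assumes the tool prints one record per line with tab-separated fields; the three example columns are invented, since the real columns depend entirely on the underlying tool.</p>

```python
import csv
import io


def parse_tabular(text):
    """Split tab-delimited tool output into a list of rows, skipping blank lines."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row]


# Hypothetical output: sequence name, position, matched residues.
example_output = "seq1\t12\tRKKR\nseq2\t40\tRRRR\n"

rows = parse_tabular(example_output)
for name, position, match in rows:
    print(name, int(position), match)
```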
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/10/25/resource-you-want-rest-with-that-greenmamba/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>
	</item>
		<item>
		<title>Resource: Adding bells &amp; whistles to GreenMamba</title>
		<link>http://larsjuhljensen.wordpress.com/2011/10/24/resource-adding-bells-whistles-to-greenmamba/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/10/24/resource-adding-bells-whistles-to-greenmamba/#comments</comments>
		<pubDate>Mon, 24 Oct 2011 06:32:39 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Resource]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[greenmamba]]></category>
		<category><![CDATA[mamba]]></category>
		<category><![CDATA[tool]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1475</guid>
		<description><![CDATA[My latest blog post ended at the stage where we had combined the Instances database and the Motifs tool into a single metatool. In this post I will show how little it takes to add the bells and whistles that turn it into the complete, professional web resource that I showed as a teaser in [...]]]></description>
			<content:encoded><![CDATA[<p>My latest blog post ended at the stage where we had combined <a href="https://larsjuhljensen.wordpress.com/2011/10/18/resource-turning-an-excel-sheet-into-a-web-accessible-database-with-greenmamba/">the Instances database</a> and <a href="https://larsjuhljensen.wordpress.com/2011/10/20/resource-improving-a-greenmamba-web-resource-with-a-custom-input-form/">the Motifs tool</a> into <a href="http://larsjuhljensen.wordpress.com/2011/10/21/resource-combining-tools-and-databases-into-a-single-greenmamba-web-resource/">a single metatool</a>. In this post I will show how little it takes to add the bells and whistles that turn it into the complete, professional web resource that I showed as a teaser in the first blog post of this series.</p>
<p>You may not want green to be the design color used throughout your web interface. This is easily changed by adding a line like <code>color : #083D65</code> to your inifile. You can use named colors instead of hex values if you prefer. Whichever color you pick will be used throughout the web interface to ensure a consistent design.</p>
<p>In the simple default design the frame changes size when changing between the Motifs and Instances input forms because the forms are not equally wide. This can easily be changed by setting a fixed width for all forms by adding a line such as <code>width : 650px</code>. You do not necessarily have to specify the width in pixels; any units permitted in cascading style sheets can be used.</p>
<p>Most bioinformatics web resources require one or more pages to explain what the resource is all about. Such pages can easily be provided within the GreenMamba framework by adding lines with the same syntax as <code>page_home</code>. If you add a <code>page_about</code> line, you will get an ABOUT menu item at the top right, which when clicked will show the provided HTML text wrapped within the GreenMamba layout to provide a consistent look. There is nothing magic to the word &#8220;about&#8221;; for example, if you write <code>page_download</code> you will get a page named DOWNLOAD.</p>
<p>You may also want to add a footer at the bottom of every page that, for example, mentions who made the resource, whom to contact in case of scientific questions or technical problems, and possibly points to one or more papers that describe the tools and that the user is requested to cite. To insert a footer you simply add a line to the inifile with the keyword <code>footer</code> followed by the text you want shown; this text can contain HTML code.</p>
<p>If you set up a Mamba server to host a single resource, you will want the Mamba server to automatically direct users to the main input form in case they access the server without requesting a specific page. For example, we would want to redirect requests for localhost:8080 to localhost:8080/HTML/ELM. This can be done in the <code>[REWRITE]</code> section in the inifile, which allows you to specify simple URL rewrite rules similar to what can be done in Apache.</p>
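<p>Conceptually, such a rule is just a lookup from a requested path to the path that should be served instead. The Python sketch below illustrates the idea with exact matching on the path; this is an assumption made for the example, and the Mamba server may well match rules differently.</p>

```python
def rewrite(path, rules):
    """Return the target of a rule that matches the path exactly, else the path itself."""
    return rules.get(path, path)


# One rule: send requests for the server root to the ELM input form.
rules = {"/": "/HTML/ELM"}

print(rewrite("/", rules))
print(rewrite("/HTML/ELM", rules))
```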
<p>Below is the inifile required to set up the complete ELM example resource as it was shown in <a href="http://larsjuhljensen.wordpress.com/2011/10/17/resource-turning-databases-and-tools-into-web-resources-with-greenmamba/">the first blog post</a> of this series:</p>
<p><code>[SERVER]<br />
host       : localhost<br />
port       : 8080<br />
plugins    : ./greenmamba</p>
<p>[REWRITE]<br />
/          : /HTML/ELM</p>
<p>[Instances]<br />
database   : greenmamba/examples/instances.tsv</p>
<p>[Motifs]<br />
command    : greenmamba/examples/motifs.pl $motif @fasta<br />
page_home  : greenmamba/examples/motifs_home.html</p>
<p>[ELM]<br />
tools      : Motifs; Instances;<br />
color      : #083D65<br />
width      : 650px<br />
footer     : Disclaimer: This ELM mirror only serves as an example for the GreenMamba framework. For scientific purposes, please use <a href="http://elm.eu.org" target="_blank">the real ELM server</a> instead.<br />
page_about : greenmamba/examples/elm_about.html</code></p>
<p>Starting up the Mamba server with this inifile and accessing localhost:8080 yields this interface:</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example5_input_large.png"><img src="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example5_input_small.png?w=380" alt="" title="GreenMamba example 5 input page"   class="aligncenter size-full wp-image-1315" /></a></p>
<p>Clicking the ABOUT link brings up the contents of the file <code>elm_about.html</code> wrapped with the GreenMamba design elements:</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example5_about_large.png"><img src="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example5_about_small.png?w=380" alt="" title="GreenMamba example 5 about page"   class="aligncenter size-full wp-image-1490" /></a></p>
<p>In case you want to include pictures or other content on your pages, you do not need a separate web server to host this. Mamba implements a simple web server that you can use for this purpose; all you have to do is to add a <code>www_dir : &lt;directory&gt;</code> line to the <code>[SERVER]</code> section of the inifile and place the files you want to serve within the specified directory.</p>
<p>Finally, the output pages of the metatool are also formatted to follow the design specified in the inifile. The header shows the name of the metatool, the color matches that of the other pages, the menu with links to the pages is shown, and the footer is included:</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example5_output_large.png"><img src="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example5_output_small.png?w=380" alt="" title="GreenMamba example 5 output page"   class="aligncenter size-full wp-image-1492" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/10/24/resource-adding-bells-whistles-to-greenmamba/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example5_input_small.png" medium="image">
			<media:title type="html">GreenMamba example 5 input page</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example5_about_small.png" medium="image">
			<media:title type="html">GreenMamba example 5 about page</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example5_output_small.png" medium="image">
			<media:title type="html">GreenMamba example 5 output page</media:title>
		</media:content>
	</item>
		<item>
		<title>Resource: Combining tools and databases into a single GreenMamba web resource</title>
		<link>http://larsjuhljensen.wordpress.com/2011/10/21/resource-combining-tools-and-databases-into-a-single-greenmamba-web-resource/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/10/21/resource-combining-tools-and-databases-into-a-single-greenmamba-web-resource/#comments</comments>
		<pubDate>Fri, 21 Oct 2011 07:01:51 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Resource]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[greenmamba]]></category>
		<category><![CDATA[mamba]]></category>
		<category><![CDATA[tool]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1437</guid>
		<description><![CDATA[In the four previous blog posts I introduced the GreenMamba framework (download) and showed how it can be used to turn simple tab-delimited files or command-line tools into web resources with a bare minimum of effort. In this post I will show how easy it is to configure multiple databases or tools to run [...]]]></description>
			<content:encoded><![CDATA[<p>In the <a href="https://larsjuhljensen.wordpress.com/2011/10/17/resource-turning-databases-and-tools-into-web-resources-with-greenmamba/">four</a> <a href="https://larsjuhljensen.wordpress.com/2011/10/18/resource-turning-an-excel-sheet-into-a-web-accessible-database-with-greenmamba/">previous</a> <a href="https://larsjuhljensen.wordpress.com/2011/10/19/resource-turning-a-command-line-tool-into-a-web-tool-with-greenmamba/">blog</a> <a href="https://larsjuhljensen.wordpress.com/2011/10/20/resource-improving-a-greenmamba-web-resource-with-a-custom-input-form/">posts</a> I introduced the GreenMamba framework (<a href="http://tinyurl.com/greenmamba">download</a>) and showed how it can be used to turn simple tab-delimited files or command-line tools into web resources with a bare minimum of effort. In this post I will show how easy it is to configure multiple databases or tools to run under the same Mamba server and how to make them accessible as a single web resource.</p>
<p>To illustrate this, I will take the Instances database and the Motifs tool and turn them into a web resource called ELM (the name of the database from which the instance data and motifs were obtained in the first place). The following inifile is all it takes to do so:</p>
<p><code>[SERVER]<br />
host       : localhost<br />
port       : 8080<br />
plugins    : ./greenmamba</p>
<p>[Instances]<br />
database   : greenmamba/examples/instances.tsv</p>
<p>[Motifs]<br />
command    : greenmamba/examples/motifs.pl $motif @fasta<br />
page_home  : greenmamba/examples/motifs_home.html</p>
<p>[ELM]<br />
tools      : Motifs; Instances;</code></p>
<p>The <code>[SERVER]</code> section is exactly as in all the previous examples, instructing the Mamba server to run on localhost port 8080 and to import the GreenMamba plugin. The <code>[Instances]</code> section configures a simple database called Instances based on the tab-delimited file <code>instances.tsv</code>, and the <code>[Motifs]</code> section configures a web tool called Motifs that runs the Perl script <code>motifs.pl</code>. These two sections are unchanged compared to the previous blog posts and have here simply been put into the same inifile, which is how one hosts multiple databases or tools under the same Mamba server. The last section, <code>[ELM]</code>, is the only new part. It instructs GreenMamba to configure a metatool called ELM that combines the two tools Motifs and Instances.</p>
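<p>To make the structure of this inifile concrete, here is a rough Python sketch that reads it with the standard <code>configparser</code> module and resolves the metatool into its subtools. This only illustrates how the sections relate to each other; it assumes the file is close enough to standard INI syntax for <code>configparser</code>, and Mamba's own parser may behave differently. The helper <code>subtools</code> is a made-up name.</p>

```python
import configparser

# The inifile from this post, embedded as a string for the example.
INIFILE = """
[SERVER]
host       : localhost
port       : 8080
plugins    : ./greenmamba

[Instances]
database   : greenmamba/examples/instances.tsv

[Motifs]
command    : greenmamba/examples/motifs.pl $motif @fasta
page_home  : greenmamba/examples/motifs_home.html

[ELM]
tools      : Motifs; Instances;
"""

# Interpolation is disabled so that values are returned verbatim.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(INIFILE)


def subtools(parser, metatool):
    """Return the names listed on the tools line of a metatool section."""
    names = parser[metatool]["tools"].split(";")
    return [name.strip() for name in names if name.strip()]


for name in subtools(parser, "ELM"):
    # A section with a database line is a database; otherwise it runs a command.
    kind = "database" if "database" in parser[name] else "command"
    print(name, kind, parser[name][kind])
```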
<p>Starting the Mamba server with this inifile and accessing http://localhost:8080/HTML/ELM yields the following web interface:</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example4_input1_large.png"><img src="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example4_input1_small.png?w=380" alt="" title="GreenMamba example 4 input page 1"   class="aligncenter size-full wp-image-1440" /></a></p>
<p>As you can see, what used to be a tool called Motifs has now become a tab within the resource ELM that shows the same (customized) input form. Similarly, the database Instances has become a tab within the same resource:</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example4_input2_large.png"><img src="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example4_input2_small.png?w=380" alt="" title="GreenMamba example 4 input page 2"   class="aligncenter size-full wp-image-1442" /></a></p>
<p>If you press the submit button for Motifs or Instances, you will get output that is formatted as it was when using Motifs and Instances as separate resources, the only change being that the header says ELM. In the next blog post, I will show how the design of GreenMamba web resources can be further customized and how design changes are consistently applied throughout all the individual tools that make up the metatool.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/10/21/resource-combining-tools-and-databases-into-a-single-greenmamba-web-resource/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example4_input1_small.png" medium="image">
			<media:title type="html">GreenMamba example 4 input page 1</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example4_input2_small.png" medium="image">
			<media:title type="html">GreenMamba example 4 input page 2</media:title>
		</media:content>
	</item>
		<item>
		<title>Resource: Improving a GreenMamba web resource with a custom input form</title>
		<link>http://larsjuhljensen.wordpress.com/2011/10/20/resource-improving-a-greenmamba-web-resource-with-a-custom-input-form/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/10/20/resource-improving-a-greenmamba-web-resource-with-a-custom-input-form/#comments</comments>
		<pubDate>Thu, 20 Oct 2011 10:45:41 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Resource]]></category>
		<category><![CDATA[greenmamba]]></category>
		<category><![CDATA[mamba]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1394</guid>
		<description><![CDATA[In the previous blog post I showed how you can use GreenMamba (download) to turn a command-line tool into a simplistic web tool with a minimum of effort. Sometimes, however, you will want to put in just a bit more effort and use a custom input form instead of the default one. The default input [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://larsjuhljensen.wordpress.com/2011/10/19/resource-turning-a-command-line-tool-into-a-web-tool-with-greenmamba/">the previous blog post</a> I showed how you can use GreenMamba (<a href="http://tinyurl.com/greenmamba">download</a>) to turn a command-line tool into a simplistic web tool with a minimum of effort. Sometimes, however, you will want to put in just a bit more effort and use a custom input form instead of the default one.</p>
<p>The default input page that was automatically created by GreenMamba based on the syntax of the command alone allowed the user to enter a motif in the form of a regular expression:</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example2_input_large1.png"><img src="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example2_input_small1.png?w=380" alt="" title="GreenMamba example 2 input page"   class="aligncenter size-full wp-image-1374" /></a></p>
<p>Suppose we would rather allow the user to select one of the 166 motifs from <a href="http://elm.eu.org/">the ELM database</a> through an input page looking like this:</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example3_input_large.png"><img src="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example3_input_small.png?w=380" alt="" title="GreenMamba example 3 input page"   class="aligncenter size-full wp-image-1422" /></a></p>
<p>To achieve this, we add one line to the inifile, which instructs GreenMamba to use the custom HTML in the file <code>motifs_home.html</code> instead of the auto-generated input page:</p>
<p><code>[SERVER]<br />
host       : localhost<br />
port       : 8080<br />
plugins   : ./greenmamba</p>
<p>[Motifs]<br />
command   : greenmamba/examples/motifs.pl $motif @fasta<br />
page_home : greenmamba/examples/motifs_home.html</code></p>
<p>The file <code>motifs_home.html</code> contains the following piece of HTML code (numerous <code>&lt;option&gt;</code> lines for different ELMs replaced with <code>...</code> for brevity):</p>
<p><small><code>Select the Eukaryotic Linear Motif to search:&lt;br /&gt;<br />
&lt;select name='motif'&gt;<br />
&lt;option value='[ILV]..[R][VF][GS].'&gt;CLV_MEL_PAP_1&lt;/option&gt;<br />
&lt;option value='(.RK)|(RR[^KR])'&gt;CLV_NDR_NDR_1&lt;/option&gt;<br />
&lt;option value='R.[RK]R.'&gt;CLV_PCSK_FUR_1&lt;/option&gt;<br />
...<br />
&lt;/select&gt;&lt;br /&gt;<br />
&lt;br /&gt;<br />
Enter the sequences to be searched in FASTA format:&lt;br/&gt;<br />
&lt;textarea name='fasta' cols='80' rows='20'&gt;&amp;gt;MYB_HUMAN<br />
MARRPRHSIYSSDEDDEDFEMCDHDYDGLLPKSGKRHLGKTRWTREEDEKLKKLVEQNGT<br />
DDWKVIANYLPNRTDVQCQHRWQKVLNPELIKGPWTKEEDQRVIELVQKYGPKRWSVIAK<br />
HLKGRIGKQCRERWHNHLNPEVKKTSWTEEEDRIIYQAHKRLGNRWAEIAKLLPGRTDNA<br />
IKNHWNSTMRRKVEQEGYLQESSKASQPAVATSFQKNSHLMGFAQAPPTAQLPATGQPTV<br />
NNDYSYYHISEAQNVSSHVPYPVALHVNIVNVPQPAAAAIQRHYNDEDPEKEKRIKELEL<br />
LLMSTENELKGQQVLPTQNHTCSYPGWHSTTIADHTRPHGDSAPVSCLGEHHSTPSLPAD<br />
PGSLPEESASPARCMIVHQGTILDNVKNLLEFAETLQFIDSFLNTSSNHENSDLEMPSLT<br />
STPLIGHKLTVTTPFHRDQTVKTQKENTVFRTPAIKRSILESSPRTPTPFKHALAAQEIK<br />
YGPLKMLPQTPSHLVEDLQDVIKQESDESGIVAEFQENGPPLLKKIKQEVESPTDKSGNF<br />
FCSHHWEGDSLNTQLFTQTSPVADAPNILTSSVLMAPASEDEDNVLKAFTVPKNRSLASP<br />
LQPCSSTWEPASCGKMEEQMTSSSQARKYVNAFSARTLVM&lt;/textarea&gt;</code></small></p>
<p>Note that this is not a complete HTML page but only the piece of HTML code that goes between the <code>&lt;form&gt;</code> and <code>&lt;/form&gt;</code> tags (minus the submit button). Also note that the names of the input fields must match the handles specified under <code>command</code> in the inifile (e.g. <code>fasta</code> and <code>motif</code>); if they do not, GreenMamba will have no idea where to insert the user input in the command.</p>
<p>The example above is unusually complex due to the mapping of ELM names to regular expressions. Usually your custom HTML forms will be far shorter. In those cases you may not even want to store the custom HTML in a separate file and instead provide the HTML on a single line inside the inifile, which GreenMamba supports.</p>
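<p>The matching rule can be checked before starting the server. The following is a minimal Python sketch, not part of GreenMamba: the function names are made up for illustration, and the handle syntax (<code>$name</code> or <code>@name</code>) is assumed from the inifile examples in this post.</p>

```python
import re

# Assumed handle syntax, taken from the inifile examples in this post:
# "$name" or "@name" inside the command string.
HANDLE_PATTERN = re.compile(r"[$@](\w+)")


def handles_in_command(command):
    """Return the set of handle names used in an inifile command line."""
    return set(HANDLE_PATTERN.findall(command))


def mismatched_fields(command, field_names):
    """Return (handles without a form field, form fields without a handle)."""
    handles = handles_in_command(command)
    fields = set(field_names)
    return sorted(handles - fields), sorted(fields - handles)


command = "greenmamba/examples/motifs.pl $motif @fasta"
print(mismatched_fields(command, ["motif", "fasta"]))      # ([], [])
print(mismatched_fields(command, ["motif", "sequences"]))  # (['fasta'], ['sequences'])
```

<p>The field names are passed in as a plain list here; collecting them from the custom HTML file is left out to keep the sketch short.</p>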
<p>Finally, it should be pointed out that this customization step is entirely optional. You do not have to edit HTML forms to set up GreenMamba web resources, but you have the flexibility to do so if you want to.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/10/20/resource-improving-a-greenmamba-web-resource-with-a-custom-input-form/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example2_input_small1.png" medium="image">
			<media:title type="html">GreenMamba example 2 input page</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example3_input_small.png" medium="image">
			<media:title type="html">GreenMamba example 3 input page</media:title>
		</media:content>
	</item>
		<item>
		<title>Resource: Turning a command-line tool into a web tool with GreenMamba</title>
		<link>http://larsjuhljensen.wordpress.com/2011/10/19/resource-turning-a-command-line-tool-into-a-web-tool-with-greenmamba/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/10/19/resource-turning-a-command-line-tool-into-a-web-tool-with-greenmamba/#comments</comments>
		<pubDate>Wed, 19 Oct 2011 05:44:21 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Resource]]></category>
		<category><![CDATA[greenmamba]]></category>
		<category><![CDATA[mamba]]></category>
		<category><![CDATA[tool]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1353</guid>
		<description><![CDATA[In two previous blog posts we introduced the GreenMamba framework (download) and showed how it can be used to easily set up a web database from an Excel sheet or tab-delimited file. However, the primary motivation for developing GreenMamba was to make it as simple as possible to turn command-line tools, e.g. sequence-based prediction methods, [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1353&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>In two previous blog posts we <a href="http://larsjuhljensen.wordpress.com/2011/10/17/resource-turning-databases-and-tools-into-web-resources-with-greenmamba/">introduced the GreenMamba framework</a> (<a href="http://tinyurl.com/greenmamba">download</a>) and showed how it can be used to easily <a href="http://larsjuhljensen.wordpress.com/2011/10/18/resource-turning-an-excel-sheet-into-a-web-accessible-database-with-greenmamba/">set up a web database</a> from an Excel sheet or tab-delimited file. However, the primary motivation for developing GreenMamba was to make it as simple as possible to turn command-line tools, e.g. sequence-based prediction methods, into full-fledged web tools.</p>
<p>The work that would normally be required to do so is to install a web server, create an HTML page with an input form, and code a CGI script that receives the input from the form, converts the input data into command-line arguments, executes the command-line tool, and returns the result. This is not terribly difficult provided that you know how to configure a web server (e.g. <a href="http://httpd.apache.org/">Apache</a>) and write CGI scripts. However, it takes considerable time to design a consistent, professional-looking HTML web interface that handles both input and output and works correctly on all major web browsers.</p>
<p>With GreenMamba, setting up a command-line tool as a web tool requires only a few lines in the inifile describing the name and command syntax of the tool. To exemplify this, we have made a small example Perl script that searches for a regular expression in a set of protein or DNA sequences in a FASTA file, both of which are provided by the user. The following inifile is all it takes to turn that Perl script into a web tool:</p>
<p><code>[SERVER]<br />
host       : localhost<br />
port       : 8080<br />
plugins    : ./greenmamba</p>
<p>[Motifs]<br />
command    : greenmamba/examples/motifs.pl $motif @fasta</code></p>
<p>The <code>[SERVER]</code> section should be familiar from <a href="http://larsjuhljensen.wordpress.com/2011/10/18/resource-turning-an-excel-sheet-into-a-web-accessible-database-with-greenmamba/">the previous blog post</a>, and the last two lines specify that we have a tool called Motifs, which should run the Perl script <code>motifs.pl</code> with two arguments, <code>$motif</code> and <code>@fasta</code>. The difference between handles starting with <code>$</code> and <code>@</code> is that the former will be replaced with the input data, whereas the latter will be replaced with the name of a temporary file containing the input data. In the example, the script is to be run with a regular expression (<code>$motif</code>) as the first argument and the name of a FASTA file (<code>@fasta</code>) as the second argument.</p>
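<p>To make the difference between the two kinds of handles concrete, here is a small Python sketch of how such an expansion could work. It is an illustration only and not GreenMamba's actual code: the function <code>expand_command</code> and the form data are invented for this example, and the sketch builds an argument list rather than a shell command string.</p>

```python
import os
import re
import tempfile

HANDLE_PATTERN = re.compile(r"([$@])(\w+)")


def expand_command(command, form_data):
    """Expand handles; return (argument list, list of temporary file names)."""
    temp_files = []
    args = []
    for token in command.split():
        match = HANDLE_PATTERN.fullmatch(token)
        if match is None:
            args.append(token)  # ordinary word, kept as-is
        elif match.group(1) == "$":
            args.append(form_data[match.group(2)])  # value passed directly
        else:
            # "@": write the value to a temporary file and pass its name
            fd, path = tempfile.mkstemp(suffix=".txt")
            with os.fdopen(fd, "w") as out:
                out.write(form_data[match.group(2)])
            temp_files.append(path)
            args.append(path)
    return args, temp_files


args, temp_files = expand_command(
    "greenmamba/examples/motifs.pl $motif @fasta",
    {"motif": "R.[RK]R.", "fasta": ">seq1\nMARRPRHS\n"},
)
print(args[:2])  # ['greenmamba/examples/motifs.pl', 'R.[RK]R.']
for path in temp_files:
    os.remove(path)  # clean up once the command has finished
```

<p>Keeping the arguments as a list means they can be handed to something like <code>subprocess.run</code> without passing user input through a shell.</p>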
<p>Based on the command-line syntax given in the inifile alone, GreenMamba creates the following rudimentary web interface, which can be accessed through http://localhost:8080/HTML/Motifs (here shown with a query):</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example2_input_large1.png"><img src="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example2_input_small1.png?w=380" alt="" title="GreenMamba example 2 input page"   class="aligncenter size-full wp-image-1374" /></a></p>
<p>The names of the various handles (@fasta and $motif) are used as labels for the input fields. It is thus possible to improve the interface a bit simply by giving the handles more descriptive names (underscores will be shown as spaces). GreenMamba also allows the use of a customized input form, which will be explained in an upcoming blog post.</p>
<p>In the example above, pressing the submit button causes GreenMamba to take the command from the inifile, replace <code>$motif</code> with the content of the <code>motif</code> text field, replace <code>@fasta</code> with the name of a temporary file into which the content of the <code>fasta</code> textarea has been written, and execute the resulting command using a system call. Subsequently, the output of the command is read and the temporary files are deleted. In this particular case, the script produces tab-delimited output, which GreenMamba automatically detects and formats as an HTML table in the output page:</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example2_output_large2.png"><img src="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example2_output_small2.png?w=380" alt="" title="GreenMamba example 2 output page"   class="aligncenter size-full wp-image-1379" /></a></p>
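<p>The Perl script itself is not shown in this post. As a rough illustration of the kind of tool being wrapped, a Python equivalent might look like the sketch below. The output columns (sequence name, 1-based start, end, matched text) are an assumption made for this example and not necessarily what <code>motifs.pl</code> prints, and the toy sequence is made up.</p>

```python
import re


def read_fasta(lines):
    """Yield (name, sequence) pairs from the lines of a FASTA file."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # the name is the first word after ">"
            name, chunks = (line[1:].split() or [""])[0], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def search_motif(pattern, lines):
    """Return tab-delimited hits: name, start (1-based), end, matched text."""
    regex = re.compile(pattern)
    rows = []
    for name, sequence in read_fasta(lines):
        # finditer reports non-overlapping matches only
        for match in regex.finditer(sequence):
            rows.append("\t".join(
                [name, str(match.start() + 1), str(match.end()), match.group()]))
    return rows


example = [">toy_protein", "MARRPRHS", "KRAKRS"]
for row in search_motif("R.[RK]R.", example):
    print(row)
```

<p>Because the rows are tab-delimited, output of this shape is what gets rendered as a table in the output page.</p>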
<p>If the output is not tab-delimited, it is by default shown as plain pre-formatted text. However, through the .ini file you can change it to handle several other types of output including comma-separated values, HTML, and several image formats. We will likely add support for more formats in the future.</p>
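<p>How GreenMamba decides that output is tab-delimited is not described here. One simple heuristic, shown purely as an assumption of how such a check could be done, is to require that every non-empty line contains the same non-zero number of tabs:</p>

```python
def looks_tab_delimited(text):
    """Guess whether every non-empty line has the same number (>= 1) of tabs."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return False
    tab_counts = {line.count("\t") for line in lines}
    return len(tab_counts) == 1 and tab_counts.pop() >= 1


print(looks_tab_delimited("name\tstart\nseq1\t10\n"))    # True
print(looks_tab_delimited("plain text\nmore text\n"))   # False
```

<p>A real implementation would probably need to be more forgiving, for example with trailing empty columns.</p>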
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/10/19/resource-turning-a-command-line-tool-into-a-web-tool-with-greenmamba/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example2_input_small1.png" medium="image">
			<media:title type="html">GreenMamba example 2 input page</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example2_output_small2.png" medium="image">
			<media:title type="html">GreenMamba example 2 output page</media:title>
		</media:content>
	</item>
		<item>
		<title>Resource: Turning an Excel sheet into a web-accessible database with GreenMamba</title>
		<link>http://larsjuhljensen.wordpress.com/2011/10/18/resource-turning-an-excel-sheet-into-a-web-accessible-database-with-greenmamba/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/10/18/resource-turning-an-excel-sheet-into-a-web-accessible-database-with-greenmamba/#comments</comments>
		<pubDate>Tue, 18 Oct 2011 08:15:31 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Resource]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[greenmamba]]></category>
		<category><![CDATA[mamba]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1323</guid>
		<description><![CDATA[Anyone who has worked with computational biology for many years will be familiar with the following situation: from collaborators you have received an Excel spreadsheet, which is generously referred to as a “database”, and you now need to make the data accessible to the world. One could obviously simply provide the file for download; however, [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1323&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Anyone who has worked with computational biology for many years will be familiar with the following situation: from collaborators you have received an Excel spreadsheet, which is generously referred to as a “database”, and you now need to make the data accessible to the world. One could obviously simply provide the file for download; however, it would be much preferred if the data could be searched through a simple web interface.</p>
<p>This is not a particularly difficult job, but it is a fair amount of work to do. Typically you would need to set up a database (be that an SQL database or something else), write a CGI script that queries the database and formats the result as an HTML table, and spend some time on web design to make the input and output pages look aesthetically pleasing. It all takes a lot of time that you would probably rather spend on doing something more productive. Consequently this is often not done at all, and data sets that might be of value to others are thus never made available.</p>
<p>One of the key features of the GreenMamba project (see <a href="http://larsjuhljensen.wordpress.com/2011/10/17/resource-turning-databases-and-tools-into-web-resources-with-greenmamba/">previous blog post</a> on the topic) is to make it as easy as possible to turn any regular Excel spreadsheet into a web database with nearly no work involved. In fact, all it takes is the following four steps:</p>
<ol>
<li><a href="http://tinyurl.com/greenmamba">Download</a> and unpack Mamba.</li>
<li>Save your spreadsheet in tab-delimited format with column names in the first line.</li>
<li>Add the following two lines to your <code>.ini</code> file:<br />
<code>[NameOfDatabase]<br />
database : my_spreadsheet.tsv</code></li>
<li>Start the Mamba server (<code>./mambasrv my_database.ini</code>).</li>
</ol>
<p>To exemplify this, we downloaded the complete list of 1743 known instances of Eukaryotic Linear Motifs from the <a href="http://elm.eu.org">ELM database</a>. The following inifile is all it takes to turn the resulting tab-delimited file into a simple web-accessible database:</p>
<p><code>[SERVER]<br />
host       : localhost<br />
port       : 8080<br />
plugins    : ./greenmamba</p>
<p>[Instances]<br />
database   : greenmamba/examples/instances.tsv</code></p>
<p>The <code>[SERVER]</code> section specifies the host and port on which the Mamba web server runs, and the <code>plugins</code> variable specifies where to load the plugins that enable the GreenMamba framework; it should always be set to this value. The <code>[Instances]</code> tag specifies the name of the database, and the <code>database</code> variable points to the tab-delimited version of the spreadsheet. After starting the Mamba server, you can access http://localhost:8080/HTML/Instances to see the following query interface (here shown with a query):</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example1_input_large.png"><img src="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example1_input_small.png?w=380" alt="" title="GreenMamba example 1 input page"   class="aligncenter size-full wp-image-1326" /></a></p>
<p>Upon submitting the query, GreenMamba retrieves all lines that match the search criteria and formats them as an output page:</p>
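<p>Conceptually, this retrieval step amounts to filtering the lines of the tab-delimited file. The Python sketch below shows one way to do that with the standard <code>csv</code> module. It is not GreenMamba's implementation: the function name, the case-insensitive substring matching, and the toy rows are all assumptions made for illustration, and only the column names are borrowed from the ELM instances file.</p>

```python
import csv
import io


def search_rows(tsv_file, criteria):
    """Return rows whose columns contain all given search strings.

    criteria maps a column name to a substring (case-insensitive match);
    empty search strings are ignored, like blank fields in a query form.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    hits = []
    for row in reader:
        if all(text.lower() in (row.get(column) or "").lower()
               for column, text in criteria.items() if text):
            hits.append(row)
    return hits


# Toy data; the protein names are invented for this example
toy = io.StringIO(
    "ELMIdentifier\tProteinName\n"
    "CLV_PCSK_FUR_1\tPROTEIN_A\n"
    "CLV_NDR_NDR_1\tPROTEIN_B\n"
)
for hit in search_rows(toy, {"ELMIdentifier": "fur", "ProteinName": ""}):
    print(hit["ELMIdentifier"], hit["ProteinName"])
```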
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example1_output_large.png"><img src="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example1_output_small-e1318856867224.png?w=380" alt="" title="GreenMamba example 1 output page"   class="aligncenter size-full wp-image-1326" /></a></p>
<p>One could set up a nicer and simpler version of the database by filtering the tab-delimited file a bit. For example, one might want to leave out the columns ELMType (which is redundant with ELMIdentifier), Accessions, InstanceLogic, Evidence, PDB, and Organism (which is redundant with ProteinName) and rename ELMIdentifier to ELM and ProteinName to Protein. This would result in a simpler query form and a more concise results table. Doing this is left as an exercise for the interested reader.</p>
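<p>For readers who would like a starting point, the column filtering could be sketched in Python as follows. The column names are the ones listed above; the function name and the toy input are made up, and the sketch assumes that the first line of the file holds the column names.</p>

```python
import csv
import io

DROP = {"ELMType", "Accessions", "InstanceLogic", "Evidence", "PDB", "Organism"}
RENAME = {"ELMIdentifier": "ELM", "ProteinName": "Protein"}


def simplify(in_file, out_file):
    """Copy a tab-delimited file, dropping and renaming columns."""
    reader = csv.DictReader(in_file, delimiter="\t")
    kept = [name for name in reader.fieldnames if name not in DROP]
    writer = csv.writer(out_file, delimiter="\t", lineterminator="\n")
    writer.writerow([RENAME.get(name, name) for name in kept])
    for row in reader:
        writer.writerow([row[name] for name in kept])


# Toy input with a subset of the columns; the values are invented
source = io.StringIO(
    "ELMIdentifier\tELMType\tProteinName\tOrganism\n"
    "CLV_PCSK_FUR_1\tCLV\tPROTEIN_A\tHomo sapiens\n"
)
target = io.StringIO()
simplify(source, target)
print(target.getvalue())
```

<p>With real data, the two <code>StringIO</code> objects would be replaced by files opened for reading and writing, and the resulting file would be the one named under <code>database</code> in the inifile.</p>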
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/10/18/resource-turning-an-excel-sheet-into-a-web-accessible-database-with-greenmamba/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example1_input_small.png" medium="image">
			<media:title type="html">GreenMamba example 1 input page</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example1_output_small-e1318856867224.png" medium="image">
			<media:title type="html">GreenMamba example 1 output page</media:title>
		</media:content>
	</item>
		<item>
		<title>Resource: Turning databases and tools into web resources with GreenMamba</title>
		<link>http://larsjuhljensen.wordpress.com/2011/10/17/resource-turning-databases-and-tools-into-web-resources-with-greenmamba/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/10/17/resource-turning-databases-and-tools-into-web-resources-with-greenmamba/#comments</comments>
		<pubDate>Mon, 17 Oct 2011 11:22:34 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Resource]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[mamba]]></category>
		<category><![CDATA[greenmamba]]></category>
		<category><![CDATA[command-line]]></category>
		<category><![CDATA[tool]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1301</guid>
		<description><![CDATA[Today, the users of bioinformatics databases and tools increasingly rely on being able to access them through web interfaces. Almost all major databases and most of the commonly used tools can be accessed in this manner, which is mostly good news from the users' perspective. However, in my experience from teaching on numerous courses, these [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1301&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Today, the users of bioinformatics databases and tools increasingly rely on being able to access them through web interfaces. Almost all major databases and most of the commonly used tools can be accessed in this manner, which is mostly good news from the users' perspective. However, in my experience from teaching on numerous courses, these users have never worked with a command line and thus typically run their head against a wall the moment they have to do anything slightly more specialized than, for example, running a BLAST search or making a multiple alignment.</p>
<p>The reason for this is simple: specialist tools and databases are typically not made available through user-friendly web interfaces, because they have too few users to make it worthwhile to create such an interface. Worse yet, the tools are in many cases not even distributed, because the many dependencies and lack of documentation would result in too many questions if one were to distribute them. Consequently, almost every bioinformatician that I have spoken to about this has one or more resources that they are currently not sharing &#8211; not because they are not willing to share, but because sharing would imply too much extra work. To address this problem, we have developed a web server that allows you to easily wrap existing databases and tools with a web interface like the one shown below.</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example5_input_large.png"><img src="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example5_input_small.png?w=380" alt="" title="GreenMamba example 5 input page"   class="aligncenter size-full wp-image-1315" /></a></p>
<p>In my group we are involved in the development and maintenance of many bioinformatics web resources, and I have thus been pushing the development of a reusable infrastructure. The result of this is the Python framework Mamba, which has primarily been developed by Sune Frankild and myself. Briefly, Mamba is a network-centric, multi-threaded queuing system that deals with the many technical aspects related to network communication with the clients and server-side resource management. All the specific work pertaining to a resource is done by modules that run under the Mamba server. GreenMamba is one such Mamba module, which based on a simple configuration file can provide a complete web interface around a tab-delimited data file or a command-line tool.</p>
<p>It is thus with great pleasure that we can now release the first version of the Mamba queuing system and GreenMamba wrapper under <a href="http://www.opensource.org/licenses/BSD-3-Clause">the BSD license</a>. We hope that by eliminating most of the work in setting up bioinformatics web resources, it will encourage people to make available data sets and tools that hitherto were not worth the time and effort to set up.</p>
<p>Over the next days and weeks, I plan to publish a series of blog posts that illustrate how one can use this framework to wrap a web interface around existing databases and command-line tools with practically no work. Impatient people are welcome to <a href="http://tinyurl.com/greenmamba">download the software</a> and look in the greenmamba/examples directory.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/10/17/resource-turning-databases-and-tools-into-web-resources-with-greenmamba/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/10/greenmamba_example5_input_small.png" medium="image">
			<media:title type="html">GreenMamba example 5 input page</media:title>
		</media:content>
	</item>
		<item>
		<title>Live: Lecture by Nobel Laureate Avram Hershko</title>
		<link>http://larsjuhljensen.wordpress.com/2011/05/24/live-lecture-by-nobel-laurate-avram-hershko/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/05/24/live-lecture-by-nobel-laurate-avram-hershko/#comments</comments>
		<pubDate>Tue, 24 May 2011 09:05:38 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Live]]></category>
		<category><![CDATA[cell cycle]]></category>
		<category><![CDATA[degradation]]></category>
		<category><![CDATA[ubiquitylation]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1250</guid>
		<description><![CDATA[Today I am at the symposium &#8220;Protein Chemistry ‐ Applications to Combat Diseases&#8221;, which takes place in Copenhagen a few minutes' walk from where I work. This morning started with a keynote lecture by Nobel Laureate Avram Hershko on regulation of the cell division cycle by ubiquitin‐mediated protein degradation. This post is just a very quick [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1250&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Today I am at the symposium <a href="http://www.biopeople.dk/index.php?id=266">&#8220;Protein Chemistry ‐ Applications to Combat Diseases&#8221;</a>, which takes place in Copenhagen a few minutes' walk from where I work.</p>
<p>This morning started with a keynote lecture by Nobel Laureate <a href="http://en.wikipedia.org/wiki/Avram_Hershko">Avram Hershko</a> on regulation of the cell division cycle by ubiquitin‐mediated protein degradation. This post is just a very quick write-up and a few photos made during and immediately after his presentation.</p>
<p><img class="aligncenter size-full wp-image-1251" title="Avram Hershko" src="http://larsjuhljensen.files.wordpress.com/2011/05/hershko.jpg?w=380" alt="Avram Hershko presenting in Copenhagen"   /></p>
<p>Most of the early work on ubiquitylation was done on model proteins, most of which were extracellular. Interestingly, what spurred Avram Hershko on to study ubiquitylation of physiologically relevant proteins was the early work on cyclin degradation for which <a href="http://en.wikipedia.org/wiki/Tim_Hunt">Tim Hunt</a> received the Nobel Prize. Tim Hunt speculated that there was a <em>cyclin protease</em> that would break down cyclins. However, Avram Hershko showed in JBC papers in 1991 and 1994 that cyclins are in fact not degraded by a specific protease, but are rather targeted for proteasomal degradation by a specific ubiquitin ligase. One year later his group identified this ubiquitin ligase to be what is now known as the <a href="http://en.wikipedia.org/wiki/Anaphase-promoting_complex">Anaphase Promoting Complex (APC) / Cyclosome (APC/C)</a>.</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/05/apccyclins.jpg"><img class="aligncenter size-medium wp-image-1252" title="APC/C &amp; cyclin degradation" src="http://larsjuhljensen.files.wordpress.com/2011/05/apccyclins.jpg?w=300&#038;h=225" alt="The role of APC/C in ubiquitylation and degradation of cyclins" width="300" height="225" /></a></p>
<p>In addition to being crucial for degradation of cyclins, APC/C is also required for entry into <a href="http://en.wikipedia.org/wiki/Anaphase">anaphase</a> of the cell cycle (hence the name Anaphase Promoting Complex). This is because it is responsible for targeting the <a href="http://en.wikipedia.org/wiki/Securin">securin</a> protein for degradation, which in turn releases <a href="http://en.wikipedia.org/wiki/Separase">separase</a> activity to degrade the <a href="http://en.wikipedia.org/wiki/Cohesin">cohesin</a> rings that hold together <a href="http://en.wikipedia.org/wiki/Sister_chromatids">sister chromatids</a>.</p>
<p>Having worked on other cell-cycle proteins for many years, Avram Hershko has in recent years turned his interest back to APC/C, more specifically to understanding how the inhibition of APC/C is released, which in turn leads to the whole series of events described above.</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/05/apccheckpoint.jpg"><img class="aligncenter size-medium wp-image-1253" title="APC/C &amp; checkpoint inhibition" src="http://larsjuhljensen.files.wordpress.com/2011/05/apccheckpoint.jpg?w=300&#038;h=225" alt="Release of APC/C from checkpoint inhibition" width="300" height="225" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/05/24/live-lecture-by-nobel-laurate-avram-hershko/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/05/hershko.jpg" medium="image">
			<media:title type="html">Avram Hershko</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/05/apccyclins.jpg?w=300" medium="image">
			<media:title type="html">APC/C &#38; cyclin degradation</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/05/apccheckpoint.jpg?w=300" medium="image">
			<media:title type="html">APC/C &#38; checkpoint inhibition</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Toward doing science</title>
		<link>http://larsjuhljensen.wordpress.com/2011/05/20/analysis-toward-doing-science/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/05/20/analysis-toward-doing-science/#comments</comments>
		<pubDate>Fri, 20 May 2011 09:36:38 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[publishing]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=17</guid>
		<description><![CDATA[Yesterday, Rangarajan and coworkers published a paper in BMC Bioinformatics entitled &#8220;Toward an interactive article: integrating journals and biological databases&#8221;. Not many hours later Neil Saunders made the following tweet commenting on it: This reminded me of a draft blog post that I wrote in 2008 on the use of the word &#8220;toward(s)&#8221; in article [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=17&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Yesterday, Rangarajan and coworkers published <a href="http://www.biomedcentral.com/1471-2105/12/175">a paper</a> in <a href="http://www.biomedcentral.com/bmcbioinformatics">BMC Bioinformatics</a> entitled &#8220;Toward an interactive article: integrating journals and biological databases&#8221;. Not many hours later <a href="http://nsaunders.wordpress.com/">Neil Saunders</a> made the following tweet commenting on it:</p>
<p><img src="http://larsjuhljensen.files.wordpress.com/2011/05/toward-tweet1.png?w=380" alt="Can we ban use of &quot;toward(s)&quot; in article titles?" title="Toward tweet"   class="aligncenter size-full wp-image-1230" /></p>
<p>This reminded me of a draft blog post that I wrote in 2008 on the use of the word &#8220;toward(s)&#8221; in article titles, and I decided that it was time to update the plot and finally publish it. The background was that I had the gut feeling that there was a somewhat disturbing trend, namely that more and more papers use these words in the title. I thus went to Medline and counted the fraction of papers from each year having a title starting with &#8220;toward&#8221; or &#8220;towards&#8221; (I also included them if &#8220;toward(s)&#8221; appeared inside the title following a colon, semicolon, or dash):</p>
<p><img src="http://larsjuhljensen.files.wordpress.com/2011/05/toward.png?w=380" alt="" title="The fraction of &quot;toward(s)&quot; articles as function of time"   class="aligncenter size-full wp-image-1245" /></p>
<p>The plot shows that the fraction of articles with “toward(s)” in the title is rapidly rising; it has more than tripled over the past two decades. There is thus no doubt that the use of &#8220;toward(s)&#8221; in article titles is a trend in biomedical publishing.</p>
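<p>A count of this kind could be sketched roughly as follows. This is only an illustration and not the script behind the plot: the regular expression, the function name <code>toward_fraction_by_year</code>, and the example titles are all assumptions, and a real run would need to parse Medline records into (year, title) pairs first.</p>

```python
import re
from collections import Counter

# Matches "toward" or "towards" at the start of a title, or directly after
# a colon, semicolon, or dash inside the title (a rough approximation).
TOWARD = re.compile(r"(?:^|[:;\u2013\u2014-])\s*towards?\b", re.IGNORECASE)

def toward_fraction_by_year(records):
    """records: iterable of (year, title) pairs; returns {year: fraction}."""
    totals = Counter()
    hits = Counter()
    for year, title in records:
        totals[year] += 1
        if TOWARD.search(title):
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in totals}

# Made-up titles purely for illustration, not real Medline records.
example = [
    (1990, "Structure of a hypothetical protein"),
    (1990, "Towards a hypothetical vaccine"),
    (2010, "Toward an example title"),
    (2010, "Example method: towards better examples"),
    (2010, "A step toward nothing in particular"),
    (2010, "Another example title"),
]
fractions = toward_fraction_by_year(example)
```

<p>Note that a title merely containing the word mid-sentence, such as the fifth example, is deliberately not counted.</p>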
<p>As is often the case with statistics, though, this analysis answers only one question but leads to several new ones. Are we increasingly selling our papers on what we hope to do soon rather than on what we have actually done? Or have we just become more honest by now adding the word &#8220;toward(s)&#8221; where we might have left it out in the past?</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/05/20/analysis-toward-doing-science/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/05/toward-tweet1.png" medium="image">
			<media:title type="html">Toward tweet</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/05/toward.png" medium="image">
			<media:title type="html">The fraction of &#34;toward(s)&#34; articles as function of time</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: 10butnotMe</title>
		<link>http://larsjuhljensen.wordpress.com/2011/05/05/analysis-10butnotme/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/05/05/analysis-10butnotme/#comments</comments>
		<pubDate>Thu, 05 May 2011 14:48:55 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[genomics genetics]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=981</guid>
		<description><![CDATA[About five years ago George Church announced the Personal Genome Project (PGP). A very interesting aspect of this project is that all data are released under the Creative Commons Zero waiver. This includes not only the genetic data, but also some medical information and even the identity of each individual. Although PGP has enrolled more [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=981&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>About five years ago <a href="http://arep.med.harvard.edu/gmc/">George Church</a> announced the <a href="http://www.personalgenomes.org/">Personal Genome Project</a> (PGP). A very interesting aspect of this project is that all data are released under the <a href="http://creativecommons.org/publicdomain/zero/1.0/">Creative Commons Zero waiver</a>. This includes not only the genetic data, but also some medical information and even the identity of each individual.</p>
<p>Although PGP has enrolled more than a thousand individuals, it is presently only possible to download <a href="http://www.personalgenomes.org/public/">data on ten individuals</a>. It is obviously pointless to attempt to link genotype to phenotype based on such a small number of individuals. However, I wondered if any meaningful structure would emerge if I calculated the <a href="http://en.wikipedia.org/wiki/Hamming_distance">Hamming distances</a> for all pairs of individuals, that is, the number of SNPs by which they differ (<a href="http://www.box.net/shared/g60iyey1m5">download</a>).</p>
<p>No sooner said than done. I downloaded all available SNP data from PGP (including array and exome sequencing data), calculated all pairwise SNP distances, and visualized the results as a heatmap along with the faces of the individuals (click for a larger version of the figure):</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2010/08/pgp_matrix_large.png"><img src="http://larsjuhljensen.files.wordpress.com/2010/08/pgp_matrix_small.png?w=380" alt="Number of SNP differences between PGP10 individuals" title="PGP10 distance matrix (small)"   class="aligncenter size-full wp-image-983" /></a></p>
<p>Individual #10 stands out as being genetically most dissimilar from everyone else, which is unsurprising as he is the only African American in the study. I next tried to similarly define the genetically most average individual, that is, the individual who is most similar to everyone else. If one defines this as the individual with the lowest sum of differences, the answer is individual #7. However, because the origins of his grandparents are unknown, it is difficult to conclude anything interesting based on this.</p>
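<p>The distance calculation and the &#8220;lowest sum of differences&#8221; criterion can be sketched as below. This is a minimal illustration, not the code used for the figure: it assumes every individual has a genotype call at the same list of SNPs (the real data would need handling of missing calls), and the function names and toy genotypes are invented.</p>

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length genotype lists differ."""
    if len(a) != len(b):
        raise ValueError("genotype lists must have the same length")
    return sum(x != y for x, y in zip(a, b))

def pairwise_distances(genotypes):
    """genotypes: {individual: list of genotype calls at the same SNPs}."""
    dist = {name: {name: 0} for name in genotypes}
    for p, q in combinations(genotypes, 2):
        d = hamming(genotypes[p], genotypes[q])
        dist[p][q] = d
        dist[q][p] = d
    return dist

def most_average(dist):
    """Individual with the lowest sum of distances to everyone else."""
    return min(dist, key=lambda name: sum(dist[name].values()))

# Toy genotype calls, invented for illustration only.
toy = {
    "A": ["AA", "AG", "CC", "TT"],
    "B": ["AA", "GG", "CC", "TT"],
    "C": ["AG", "GG", "CT", "TT"],
    "D": ["GG", "AA", "TT", "CC"],
}
dist = pairwise_distances(toy)
```

<p>With these toy calls, <code>most_average(dist)</code> picks individual B, whose distances to the others sum to 7.</p>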
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/05/05/analysis-10butnotme/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2010/08/pgp_matrix_small.png" medium="image">
			<media:title type="html">PGP10 distance matrix (small)</media:title>
		</media:content>
	</item>
		<item>
		<title>Announcement: Bioinformatics 2011</title>
		<link>http://larsjuhljensen.wordpress.com/2011/03/31/announcement-bioinformatics-2011/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/03/31/announcement-bioinformatics-2011/#comments</comments>
		<pubDate>Thu, 31 Mar 2011 07:49:58 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Announcement]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1201</guid>
		<description><![CDATA[<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1201&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://research.med.helsinki.fi/bioinformatics2011/"><img src="http://larsjuhljensen.files.wordpress.com/2011/03/bioinformatics-2011.jpg?w=380" alt="Bioinformatics 2011" title="bioinformatics-2011"   class="aligncenter size-full wp-image-1170" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/03/31/announcement-bioinformatics-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/03/bioinformatics-2011.jpg" medium="image">
			<media:title type="html">bioinformatics-2011</media:title>
		</media:content>
	</item>
		<item>
		<title>Announcement: EMBO practical course on protein bioinformatics tools</title>
		<link>http://larsjuhljensen.wordpress.com/2011/03/13/announcement-embo-practical-course-on-protein-bioinformatics-tools/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/03/13/announcement-embo-practical-course-on-protein-bioinformatics-tools/#comments</comments>
		<pubDate>Sun, 13 Mar 2011 11:00:10 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Announcement]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1169</guid>
		<description><![CDATA[<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1169&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.embl.de/training/events/2011/AIN11-01"><img src="http://larsjuhljensen.files.wordpress.com/2011/03/proteinbioinformaticstools-2011.jpg?w=380" alt="" title="ProteinBioinformaticsTools-2011"   class="aligncenter size-full wp-image-1170" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/03/13/announcement-embo-practical-course-on-protein-bioinformatics-tools/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/03/proteinbioinformaticstools-2011.jpg" medium="image">
			<media:title type="html">ProteinBioinformaticsTools-2011</media:title>
		</media:content>
	</item>
		<item>
		<title>Commentary: Intel&#8217;s take on GPU computing</title>
		<link>http://larsjuhljensen.wordpress.com/2011/02/07/commentary-intels-take-on-gpu-computing/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/02/07/commentary-intels-take-on-gpu-computing/#comments</comments>
		<pubDate>Mon, 07 Feb 2011 14:39:22 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Commentary]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[GPU computing]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1094</guid>
		<description><![CDATA[A week or two ago, I published a post in which I argued that most papers that report order-of-magnitude speedup of a bioinformatics algorithm by using graphics processors (GPUs) did so based on straw man comparisons: Massively parallel GPU implementations were compared to CPU implementations that did not make full use of the multi-core [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1094&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>A week or two ago, I published <a href="http://larsjuhljensen.wordpress.com/2011/01/28/commentary-the-gpu-computing-fallacy/">a post</a> in which I argued that most papers that report order-of-magnitude speedup of a bioinformatics algorithm by using graphics processors (GPUs) did so based on straw man comparisons:</p>
<ul>
<li>Massively parallel GPU implementations were compared to CPU implementations that did not make full use of the multi-core and SIMD (Single Instruction, Multiple Data) features.</li>
<li>The performance comparisons were done using very expensive GPU processing cards that cost as much as, if not more than, the host computers.</li>
</ul>
<p>It turns out that Lee and coworkers from <a href="http://www.intel.com/">Intel Corporation</a> have performed a comparison that addresses both of these issues (thanks to <a href="http://bergmanlab.smith.man.ac.uk/">Casey Bergman</a> for making me aware of this). It appeared in 2010 in the proceedings of the <a href="http://isca2010.inria.fr/">37th International Symposium on Computer Architecture</a>:</p>
<blockquote><p><strong>Debunking the 100X GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU</strong></p>
<p>Recent advances in computing have led to an explosion in the amount of data being generated. Processing the ever-growing data in a timely manner has made throughput computing an important aspect for emerging applications. Our analysis of a set of important throughput computing kernels shows that there is an ample amount of parallelism in these kernels which makes them suitable for today’s multi-core CPUs and GPUs. In the past few years there have been many studies claiming GPUs deliver substantial speedups (between 10X and 1000X) over multi-core CPUs on these kernels. To understand where such large performance difference comes from, we perform a rigorous performance analysis and find that after applying optimizations appropriate for both CPUs and GPUs the performance gap between an Nvidia GTX280 processor and the Intel Core i7 960 processor narrows to only 2.5x on average. In this paper, we discuss optimization techniques for both CPU and GPU, analyze what architecture features contributed to performance differences between the two architectures, and recommend a set of architectural features which provide significant improvement in architectural efficiency for throughput kernels.
</p></blockquote>
<p>Without wanting to question the integrity of the researchers, I read the paper with a critical mind for an obvious reason: they work for Intel, which is a major player in the CPU market, but not in the GPU market. One always needs to have a critical mind when economic interests are involved. However, it is my clear impression that the researchers did their very best to make each and every algorithm run as fast as possible on the GPU as well as on the CPU.</p>
<p>The only factor I found that may have tipped the balance a bit in favor of the CPU was the choice of CPU and GPU. Whereas the Intel Core i7 960 was launched in October 2009, the Nvidia GTX 280 was launched in June 2008. That is a difference of 16 months, which by application of <a href="http://en.wikipedia.org/wiki/Moore's_law">Moore&#8217;s law</a> skews the results by almost 2x in favor of the CPU. The average speed-up provided by a high-end gaming GPU over a high-end CPU on this selection of algorithms is thus likely to be 4-5x. However, this advantage drops to about 3-4x if one corrects for the additional cost of the GPU, and to around 2x if one corrects for energy efficiency rather than for initial investment.</p>
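<p>The Moore&#8217;s law correction can be reproduced with a few lines of arithmetic. The sketch below assumes a doubling period of 18 months, which is not stated in the post but is consistent with the &#8220;almost 2x&#8221; figure; the function and variable names are made up for illustration.</p>

```python
def moore_factor(months, doubling_months=18.0):
    """Expected performance growth over `months`, given a doubling period."""
    return 2.0 ** (months / doubling_months)

# Average GPU-over-CPU speedup reported by Lee and coworkers.
measured_gap = 2.5
# Months between the GTX 280 (June 2008) and the Core i7 960 (October 2009).
launch_gap = 16

# Assumed 18-month doubling period; gives a skew of roughly 1.85.
skew = moore_factor(launch_gap)
# Speedup after correcting for the newer CPU; roughly 4.6.
adjusted_gap = measured_gap * skew
```

<p>With a 24-month doubling period instead, the skew would be closer to 1.6 and the corrected speedup about 4x, so the 4-5x estimate is not very sensitive to this assumption.</p>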
<p>The findings of Lee and coworkers are consistent with my own conclusions, which were based on comparing two GPU-accelerated implementations of BLAST. In the case of BLAST, the price-performance of GPU implementations ended up worse than that of the CPU implementation. Lee and coworkers found that for a wide variety of highly data-parallel algorithms (none of which are directly related to bioinformatics), only a modest speedup was attained. Not even a single algorithm got anywhere close to the promises of 100x or 1000x speedup, and a couple of algorithms ended up being slower on the GPU than on the CPU. This confirms my view that GPUs are presently not an attractive alternative to CPUs for most scientific computing needs.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/02/07/commentary-intels-take-on-gpu-computing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>
	</item>
		<item>
		<title>Update: The BuzzCloud for 2010</title>
		<link>http://larsjuhljensen.wordpress.com/2011/02/04/update-the-buzzcloud-for-2010/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/02/04/update-the-buzzcloud-for-2010/#comments</comments>
		<pubDate>Fri, 04 Feb 2011 11:35:53 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Update]]></category>
		<category><![CDATA[text mining]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1103</guid>
		<description><![CDATA[With about one month&#8217;s delay relative to the release of the new baseline of PubMed, here is the updated BuzzCloud visualization for what was hot and upcoming in 2010 (click image for larger interactive version): Here is a quick overview of some of the trends that I found interesting: Geoepidemiology. A bit of searching in PubMed [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1103&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>With about one month&#8217;s delay relative to the release of the new baseline of PubMed, here is the updated <a href="http://larsjuhljensen.wordpress.com/2008/02/29/resource-the-buzzcloud-visualization-of-buzzwords/">BuzzCloud visualization</a> for what was hot and upcoming in 2010 (click image for larger interactive version):</p>
<p><a href="http://red.jensenlab.org/BuzzClouds/BuzzCloud2010.html"><img src="http://larsjuhljensen.files.wordpress.com/2011/02/buzzcloud2010.png?w=380" alt="" title="BuzzCloud 2010"   class="aligncenter size-full wp-image-1104" /></a></p>
<p>Here is a quick overview of some of the trends that I found interesting:</p>
<ul>
<li><strong>Geoepidemiology.</strong> A bit of searching in PubMed reveals this to be a buzzword primarily due to the journal <a href="http://www.elsevier.com/wps/find/journaldescription.cws_home/622356/description">Autoimmunity Reviews</a>, which for some reason decided to publish 19 papers with this word in the title in 2010.</li>
<li><strong>Network pharmacology</strong> and <strong>systems pharmacology.</strong> Due to my personal interests, these buzzwords caught my eye although they were mentioned in only 7 and 11 papers from 2010, respectively. I would have been more pleased if one of those had not been in <a href="http://en.wikipedia.org/wiki/Medical_Hypotheses">a journal with a history of publishing pseudoscience</a>.</li>
<li><strong>Metatranscriptomics</strong> and <strong>viral metagenomics</strong>. With metagenomics becoming reality rather than mere buzz, related and derived terms are predictably following suit.</li>
<li><strong>Orbitrap technology</strong> and <strong>iTRAQ proteomics</strong>. Like metagenomics, large-scale proteomics has become an established field. This is well reflected by two of the best-known proteomics technologies appearing in the 2010 BuzzCloud.</li>
<li><strong>Astrology.</strong> Falling firmly in the &#8220;dislike&#8221; category, I can only hope that it will be gone in next year&#8217;s BuzzCloud.</li>
</ul>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/larsjuhljensen.wordpress.com/1103/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/larsjuhljensen.wordpress.com/1103/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/larsjuhljensen.wordpress.com/1103/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/larsjuhljensen.wordpress.com/1103/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gofacebook/larsjuhljensen.wordpress.com/1103/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/facebook/larsjuhljensen.wordpress.com/1103/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gotwitter/larsjuhljensen.wordpress.com/1103/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/twitter/larsjuhljensen.wordpress.com/1103/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/larsjuhljensen.wordpress.com/1103/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/larsjuhljensen.wordpress.com/1103/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/larsjuhljensen.wordpress.com/1103/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/larsjuhljensen.wordpress.com/1103/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/larsjuhljensen.wordpress.com/1103/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/larsjuhljensen.wordpress.com/1103/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1103&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/02/04/update-the-buzzcloud-for-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/02/buzzcloud2010.png" medium="image">
			<media:title type="html">BuzzCloud 2010</media:title>
		</media:content>
	</item>
		<item>
		<title>Announcement: MPIB summer school on computational MS-based proteomics</title>
		<link>http://larsjuhljensen.wordpress.com/2011/02/02/announcement-mpib-summer-school-on-computational-ms-based-proteomics/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/02/02/announcement-mpib-summer-school-on-computational-ms-based-proteomics/#comments</comments>
		<pubDate>Wed, 02 Feb 2011 11:00:19 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Announcement]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1176</guid>
		<description><![CDATA[<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1176&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.biochem.mpg.de/en/rd/summerschool/"><img src="http://larsjuhljensen.files.wordpress.com/2011/03/maxquant-2011.jpg?w=380" alt="" title="MaxQuant-2011"   class="aligncenter size-full wp-image-1177" /></a></p>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/larsjuhljensen.wordpress.com/1176/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/larsjuhljensen.wordpress.com/1176/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/larsjuhljensen.wordpress.com/1176/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/larsjuhljensen.wordpress.com/1176/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gofacebook/larsjuhljensen.wordpress.com/1176/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/facebook/larsjuhljensen.wordpress.com/1176/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gotwitter/larsjuhljensen.wordpress.com/1176/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/twitter/larsjuhljensen.wordpress.com/1176/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/larsjuhljensen.wordpress.com/1176/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/larsjuhljensen.wordpress.com/1176/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/larsjuhljensen.wordpress.com/1176/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/larsjuhljensen.wordpress.com/1176/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/larsjuhljensen.wordpress.com/1176/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/larsjuhljensen.wordpress.com/1176/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1176&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/02/02/announcement-mpib-summer-school-on-computational-ms-based-proteomics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/03/maxquant-2011.jpg" medium="image">
			<media:title type="html">MaxQuant-2011</media:title>
		</media:content>
	</item>
		<item>
		<title>Announcement: EMBO practical course on computational biology</title>
		<link>http://larsjuhljensen.wordpress.com/2011/02/02/announcement-embo-practical-course-on-computational-biology/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/02/02/announcement-embo-practical-course-on-computational-biology/#comments</comments>
		<pubDate>Wed, 02 Feb 2011 10:00:06 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Announcement]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1185</guid>
		<description><![CDATA[<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1185&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://cwp.embo.org/pc11-07/"><img src="http://larsjuhljensen.files.wordpress.com/2011/03/computationalbiology-2011.jpeg?w=380" alt="" title="ComputationalBiology-2011"   class="aligncenter size-full wp-image-1186" /></a></p>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/larsjuhljensen.wordpress.com/1185/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/larsjuhljensen.wordpress.com/1185/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/larsjuhljensen.wordpress.com/1185/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/larsjuhljensen.wordpress.com/1185/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gofacebook/larsjuhljensen.wordpress.com/1185/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/facebook/larsjuhljensen.wordpress.com/1185/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gotwitter/larsjuhljensen.wordpress.com/1185/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/twitter/larsjuhljensen.wordpress.com/1185/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/larsjuhljensen.wordpress.com/1185/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/larsjuhljensen.wordpress.com/1185/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/larsjuhljensen.wordpress.com/1185/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/larsjuhljensen.wordpress.com/1185/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/larsjuhljensen.wordpress.com/1185/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/larsjuhljensen.wordpress.com/1185/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1185&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/02/02/announcement-embo-practical-course-on-computational-biology/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/03/computationalbiology-2011.jpeg" medium="image">
			<media:title type="html">ComputationalBiology-2011</media:title>
		</media:content>
	</item>
		<item>
		<title>Commentary: The GPU computing fallacy</title>
		<link>http://larsjuhljensen.wordpress.com/2011/01/28/commentary-the-gpu-computing-fallacy/</link>
		<comments>http://larsjuhljensen.wordpress.com/2011/01/28/commentary-the-gpu-computing-fallacy/#comments</comments>
		<pubDate>Fri, 28 Jan 2011 11:13:34 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Commentary]]></category>
		<category><![CDATA[alignment]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[GPU computing]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=1019</guid>
		<description><![CDATA[Modern graphics processors (GPUs) deliver considerably more brute force computational power than traditional processors (CPUs). With NVIDIA&#8217;s launch of CUDA, general purpose GPU computing has become greatly simplified, and many research groups around the world have consequently experimented with how one can harvest the power of GPUs to speed up scientific computing. This is also [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1019&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Modern graphics processors (GPUs) deliver considerably more brute force computational power than traditional processors (CPUs). With NVIDIA&#8217;s launch of <a href="http://www.nvidia.com/object/cuda_home_new.html">CUDA</a>, general purpose GPU computing has become greatly simplified, and many research groups around the world have consequently experimented with how one can harvest the power of GPUs to speed up scientific computing.</p>
<p>This is also the case for bioinformatics algorithms. NVIDIA advertises a number of applications that have been adapted to make use of GPUs, including several <a href="http://www.nvidia.com/object/bio_info_life_sciences.html">applications for bioinformatics and life sciences</a>, which supposedly speed up bioinformatics algorithms by an order of magnitude or more. </p>
<p>In this commentary I will focus primarily on two GPU-accelerated versions of NCBI-BLAST, namely <a href="https://sites.google.com/site/liuweiguohome/software">CUDA-BLAST</a> and <a href="http://eudoxus.cheme.cmu.edu/gpublast/gpublast.html">GPU-BLAST</a>. I do so not to specifically criticize these two programs, but because BLAST is the single most commonly used bioinformatics tool and thus a prime example for illustrating whether GPU acceleration of bioinformatics algorithms pays off.</p>
<p>Whereas CUDA-BLAST to the best of my knowledge has not been published in a peer-reviewed journal, GPU-BLAST is described in <a href="http://dx.doi.org/10.1093/bioinformatics/btq644">the following Bioinformatics paper</a> by Vouzis and Sahinidis:</p>
<blockquote><p><strong>GPU-BLAST: using graphics processors to accelerate protein sequence alignment</strong></p>
<p><strong>Motivation:</strong> The Basic Local Alignment Search Tool (BLAST) is one of the most widely used bioinformatics tools. The widespread impact of BLAST is reflected in over 53 000 citations that this software has received in the past two decades, and the use of the word ‘blast’ as a verb referring to biological sequence comparison. Any improvement in the execution speed of BLAST would be of great importance in the practice of bioinformatics, and facilitate coping with ever increasing sizes of biomolecular databases.</p>
<p><strong>Results:</strong> Using a general-purpose graphics processing unit (GPU), we have developed GPU-BLAST, an accelerated version of the popular NCBI-BLAST. The implementation is based on the source code of NCBI-BLAST, thus maintaining the same input and output interface while producing identical results. In comparison to the sequential NCBI-BLAST, the speedups achieved by GPU-BLAST range mostly between 3 and 4.</p></blockquote>
<p>It took me a while to figure out from where the 3-4x speedup came. I eventually found it in Figure 4B of the paper. GPU-BLAST achieves an approximately 3.3x speedup over NCBI-BLAST in only one situation, namely if it is used to perform ungapped sequence similarity searches and only one of six CPU cores is used:</p>
<p><img class="aligncenter size-full wp-image-1047" title="Vouzis-Fig4B" src="http://larsjuhljensen.files.wordpress.com/2011/01/vouzis-fig4b.png?w=380" alt=""   /></p>
<p><em>Speedup of GPU-BLAST over NCBI-BLAST as function of number of CPU threads used. Figure by Vouzis and Sahinidis.</em></p>
<p>The vast majority of use cases for BLAST require gapped alignments, however, in which case GPU-BLAST never achieves even a 3x speedup on the hardware used by the authors. Moreover, nobody concerned about the speed of BLAST would buy a multi-core server and leave all but one core idle. The most relevant speedup is thus the speedup achieved by using all CPU cores and the GPU vs. only the CPU cores, in which case GPU-BLAST achieves only a 1.5x speedup over NCBI-BLAST.</p>
<p>The benchmark by NVIDIA does not fare much better. Their 10x speedup comes from comparing CUDA-BLAST to NCBI-BLAST using only a single CPU core. The moment one compares to NCBI-BLAST running with 4 threads on their quad-core Intel i7 CPU, the speedup drops to 3x. However, the CPU supports hyperthreading. To get the full performance out of it, one should thus presumably run NCBI-BLAST with 8 threads, which I estimate will reduce the speedup of CUDA-BLAST vs. NCBI-BLAST to 2.5x at best.</p>
<p>Even these numbers are not entirely fair. They are based on the 3.5x or 4x speedup that one gets by running a single instance of BLAST with 4 or 6 threads, respectively. The typical situation when the speed of BLAST becomes relevant, however, is when you have a large number of sequences that need to be searched against a database. This is an <a href="http://en.wikipedia.org/wiki/Embarrassingly_parallel">embarrassingly parallel</a> problem; by partitioning the query sequences and running multiple single-threaded instances of BLAST, you can get a 6x speedup on either platform (personal experience shows that running 8 simultaneous BLAST searches on a quad-core CPU with hyperthreading gives approximately 6x speedup).</p>
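<p>The partitioning step is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not the script I actually used; the function names <code>read_fasta</code>, <code>partition</code> and <code>to_fasta</code> are made up for the example, and the queries are assumed to be in plain FASTA format.</p>

```python
def read_fasta(text):
    """Split FASTA-formatted text into a list of (header, sequence) records."""
    records = []
    header, seq_lines = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(seq_lines)))
            header, seq_lines = line[1:], []
        else:
            seq_lines.append(line)
    if header is not None:
        records.append((header, "".join(seq_lines)))
    return records


def partition(records, n_chunks):
    """Deal records round-robin into n_chunks lists of near-equal size."""
    chunks = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return chunks


def to_fasta(records):
    """Turn (header, sequence) records back into FASTA text."""
    return "".join(">{}\n{}\n".format(h, s) for h, s in records)
```

<p>Each chunk would then be written to its own file and handed to a separate single-threaded BLAST process, for example 8 chunks on a quad-core CPU with hyperthreading. Round-robin dealing keeps the chunks roughly balanced in number of sequences, although not necessarily in total sequence length.</p>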
<p><strong>It is not just BLAST</strong></p>
<p>Optimists could argue that perhaps BLAST is just one of the few bioinformatics problems that do not benefit from GPU computing. However, reading the recent literature, I think that GPU-BLAST is a representative example. Most publications about GPU acceleration of algorithms relevant to bioinformatics report speedups of at most 10x. Typically, this performance number represents the speedup that can be attained relative to a single-threaded version of the program running on the CPU, hence leaving most of the CPU cores standing idle. Not exactly a fair comparison.</p>
<p>Davis et al. recently published <a href="http://dx.doi.org/10.1093/bioinformatics/btq638">a sobering paper in Bioinformatics</a> in which they made exactly that point:</p>
<blockquote><p><strong>Real-world comparison of CPU and GPU implementations of SNPrank: a network analysis tool for GWAS</strong></p>
<p><strong>Motivation:</strong> Bioinformatics researchers have a variety of programming languages and architectures at their disposal, and recent advances in graphics processing unit (GPU) computing have added a promising new option. However, many performance comparisons inflate the actual advantages of GPU technology. In this study, we carry out a realistic performance evaluation of SNPrank, a network centrality algorithm that ranks single nucleotide polymorhisms (SNPs) based on their importance in the context of a phenotype-specific interaction network. Our goal is to identify the best computational engine for the SNPrank web application and to provide a variety of well-tested implementations of SNPrank for Bioinformaticists to integrate into their research.</p>
<p><strong>Results:</strong> Using SNP data from the Wellcome Trust Case Control Consortium genome-wide association study of Bipolar Disorder, we compare multiple SNPrank implementations, including Python, Matlab and Java as well as CPU versus GPU implementations. When compared with naïve, single-threaded CPU implementations, the GPU yields a large improvement in the execution time. However, with comparable effort, multi-threaded CPU implementations negate the apparent advantage of GPU implementations.</p></blockquote>
<p>Kudos for that. They could have published yet another paper with the title &#8220;N-fold speedup of algorithm X by GPU computing&#8221;. Instead they honestly reported that if one puts the same effort into parallelizing the CPU implementation as it takes to write a massively parallel GPU implementation, one gets about the same speedup.</p>
<p><strong>GPUs cost money</strong></p>
<p>It gets worse. Almost all papers on GPU computing ignore the detail that powerful GPU cards are expensive. It is not surprising that you can make an algorithm run faster by buying a piece of hardware that costs as much as, if not more than, the computer itself. You could have spent that money buying a second computer instead. What matters is not the performance but the price/performance ratio. You do not see anyone publishing papers with titles like &#8220;N-fold speedup of algorithm X by using N computers&#8221;.</p>
<p>Let us have a quick look at the hardware platforms used for benchmarking the two GPU-accelerated implementations of BLAST. Vouzis and Sahinidis used a server with an Intel Xeon X5650 CPU, which I was able to find for under $3000. For acceleration they used a Tesla C2050 GPU card, which costs another $2500. The hardware necessary to make BLAST ~1.5x faster made the computer ~1.8x more expensive. NVIDIA used a different setup consisting of a server equipped with an Intel i7-920, which I could find for $1500, and two Tesla C1060 GPU cards costing $1300 each. In other words, they used a 2.7x more expensive computer to make BLAST 2.5x faster at best. The bottom line is that the increase in hardware costs outstripped the speed increase in both cases.</p>
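<p>For those who want to redo the arithmetic, here is a small Python sketch of the price/performance comparison. The prices and speedups are the ones quoted above; the helper names <code>cost_ratio</code> and <code>speedup_per_cost</code> are made up for the example, and treating a value of 1.0 as "no better than buying more computers" assumes that the workload scales linearly with the number of machines.</p>

```python
def cost_ratio(base_cost, gpu_costs):
    """Price of the GPU-equipped machine relative to the CPU-only machine."""
    return (base_cost + sum(gpu_costs)) / base_cost


def speedup_per_cost(speedup, base_cost, gpu_costs):
    """Speedup divided by cost ratio; 1.0 would match simply buying more computers."""
    return speedup / cost_ratio(base_cost, gpu_costs)


# (speedup, base machine price in USD, list of GPU card prices in USD),
# all taken from the figures quoted in the text.
setups = {
    "GPU-BLAST: Xeon X5650 + Tesla C2050": (1.5, 3000, [2500]),
    "CUDA-BLAST: i7-920 + 2x Tesla C1060": (2.5, 1500, [1300, 1300]),
}

for name, (speedup, base, gpus) in setups.items():
    print("{}: {:.1f}x the cost, {:.2f} speedup per unit cost".format(
        name, cost_ratio(base, gpus), speedup_per_cost(speedup, base, gpus)))
```

<p>Both setups come out below 1.0, meaning that each dollar spent on the GPU-equipped machine buys less BLAST throughput than a dollar spent on the CPU-only machine.</p>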
<p><strong>But what about the energy savings?</strong></p>
<p>&#8230; I hear the die-hard GPU-computing enthusiasts cry. One of the selling arguments for GPU computing is that GPUs are much more energy efficient than CPUs. I will not question the fact that the peak Gflops delivered by a GPU exceeds that of CPUs using the same amount of energy. But does this theoretical number translate into higher energy efficiency when applied to a real-world problem such as BLAST?</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2011/01/tesla_c1060_3qtr_low.png"><img class="aligncenter size-full wp-image-1027" title="Tesla_c1060_3qtr_low" src="http://larsjuhljensen.files.wordpress.com/2011/01/tesla_c1060_3qtr_low.png?w=380" alt=""   /></a></p>
<p><em>The big fan on an NVIDIA Tesla GPU card is not there for show. (<a href="http://www.nvidia.com/object/product_tesla_c1060_us.html">Picture from NVIDIA&#8217;s website</a>.)</em></p>
<p>As anyone who has built a gaming computer in recent years can testify, modern-day GPUs use as much electrical power as a CPU if not more. NVIDIA Tesla Computing Processors are no exception. The two Tesla C1060 cards in the machine used by NVIDIA to benchmark CUDA-BLAST use 187.8 Watts each, or 375.6 Watts in total. By comparison a basic Intel i7 system like the one used by NVIDIA uses less than 200 Watts. The two Tesla C1060 cards thus triple the power consumption while delivering at most 2.5 times the speed. Similarly, the single Tesla C2050 card used by Vouzis and Sahinidis uses 238 Watts, which is around the same as the power requirement of their base hexa-core Intel Xeon system, thereby doubling the power consumption for less than a 1.5-fold speedup. In other words, using either of the two GPU-accelerated versions of BLAST appears to be less energy efficient than using NCBI-BLAST.</p>
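<p>The energy argument can be written down the same way. Energy per job is power times run time, so the relative energy per BLAST job is the power ratio divided by the speedup. The sketch below rests on several assumptions: the base-system wattages are rounded from the text (200 W for the i7 system, 238 W for the Xeon system), both machines are assumed to draw their rated power for the whole run, and the function name <code>relative_energy_per_job</code> is made up for the example.</p>

```python
def relative_energy_per_job(base_watts, gpu_watts, speedup):
    """Energy per BLAST job with GPUs, relative to the CPU-only machine.

    Energy is power times run time, and the run time shrinks by the speedup.
    """
    power_ratio = (base_watts + gpu_watts) / base_watts
    return power_ratio / speedup


# Assumption: 200 W for the i7 system (the text says "less than 200 Watts").
cuda_blast = relative_energy_per_job(200.0, 375.6, 2.5)
# Assumption: 238 W for the Xeon system (the text says "around the same" as the card).
gpu_blast = relative_energy_per_job(238.0, 238.0, 1.5)

print("CUDA-BLAST: {:.2f}x the energy per job".format(cuda_blast))
print("GPU-BLAST: {:.2f}x the energy per job".format(gpu_blast))
```

<p>Any value above 1.0 means the GPU-equipped machine spends more energy per job than the CPU-only machine. Since the i7 system actually draws less than the assumed 200 W, the real ratio for CUDA-BLAST would be somewhat higher than this estimate.</p>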
<p><strong>Conclusions</strong></p>
<p>Many of the claims regarding speedup of various bioinformatics algorithms using GPU computing are based on faulty comparisons. Typically, the massively parallel GPU implementation of an algorithm is compared to a serial version that makes use of only a fraction of the CPU&#8217;s compute power. Also, the considerable costs associated with GPU computing processors, both in terms of initial investment and power consumption, are usually ignored. Once all of this has been corrected for, GPU computing presently looks like a very bad deal.</p>
<p>There is a silver lining, though. First, everyone uses very expensive Tesla boards in order to achieve the highest possible speedup over the CPU implementations, whereas high-end gaming graphics cards might provide better value for money. However, the evidence for this remains to be seen. Second, certain specific problems such as molecular dynamics probably benefit more from GPU acceleration than BLAST does. In that case, you should be aware that you are buying hardware to speed up one specific type of analysis rather than bioinformatics analyses in general. Third, it is difficult to make predictions &#8211; especially about the future. It is possible that future generations of GPUs will change the picture, but that is no reason for buying expensive GPU accelerators today.</p>
<p>The message then is clear. If you are a bioinformatician who likes to live on the bleeding edge while wasting money and electricity, get a GPU compute server. If on the other hand you want something generally useful and well tested and quite a lot faster than a GPU compute server … get yourself some computers.</p>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/larsjuhljensen.wordpress.com/1019/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/larsjuhljensen.wordpress.com/1019/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/larsjuhljensen.wordpress.com/1019/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/larsjuhljensen.wordpress.com/1019/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gofacebook/larsjuhljensen.wordpress.com/1019/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/facebook/larsjuhljensen.wordpress.com/1019/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gotwitter/larsjuhljensen.wordpress.com/1019/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/twitter/larsjuhljensen.wordpress.com/1019/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/larsjuhljensen.wordpress.com/1019/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/larsjuhljensen.wordpress.com/1019/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/larsjuhljensen.wordpress.com/1019/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/larsjuhljensen.wordpress.com/1019/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/larsjuhljensen.wordpress.com/1019/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/larsjuhljensen.wordpress.com/1019/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=1019&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2011/01/28/commentary-the-gpu-computing-fallacy/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/01/vouzis-fig4b.png" medium="image">
			<media:title type="html">Vouzis-Fig4B</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2011/01/tesla_c1060_3qtr_low.png" medium="image">
			<media:title type="html">Tesla_c1060_3qtr_low</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Three-dimensional DNA structure</title>
		<link>http://larsjuhljensen.wordpress.com/2010/11/02/analysis-three-dimensional-dna-structure/</link>
		<comments>http://larsjuhljensen.wordpress.com/2010/11/02/analysis-three-dimensional-dna-structure/#comments</comments>
		<pubDate>Tue, 02 Nov 2010 08:59:43 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[DNA structure]]></category>
		<category><![CDATA[expression]]></category>
		<category><![CDATA[pathways]]></category>
		<category><![CDATA[protein interactions]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=951</guid>
		<description><![CDATA[A few months ago Bill Noble&#8217;s lab at University of Washington published a letter in Nature on a three-dimensional model of the complete nuclear genome of budding yeast: A three-dimensional model of the yeast genome Layered on top of information conveyed by DNA sequence and chromatin are higher order structures that encompass portions of chromosomes, [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=951&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>A few months ago <a href="http://noble.gs.washington.edu/">Bill Noble&#8217;s lab</a> at <a href="http://www.washington.edu/">University of Washington</a> published <a href="http://dx.doi.org/10.1038/nature08973">a letter</a> in <a href="http://www.nature.com/nature/">Nature</a> on a three-dimensional model of the complete nuclear genome of budding yeast:</p>
<blockquote><p><strong>A three-dimensional model of the yeast genome</strong></p>
<p>Layered on top of information conveyed by DNA sequence and chromatin are higher order structures that encompass portions of chromosomes, entire chromosomes, and even whole genomes. Interphase chromosomes are not positioned randomly within the nucleus, but instead adopt preferred conformations. Disparate DNA elements co-localize into functionally defined aggregates or ‘factories’ for transcription and DNA replication. In budding yeast, Drosophila and many other eukaryotes, chromosomes adopt a Rabl configuration, with arms extending from centromeres adjacent to the spindle pole body to telomeres that abut the nuclear envelope. Nonetheless, the topologies and spatial relationships of chromosomes remain poorly understood. Here we developed a method to globally capture intra- and inter-chromosomal interactions, and applied it to generate a map at kilobase resolution of the haploid genome of Saccharomyces cerevisiae. The map recapitulates known features of genome organization, thereby validating the method, and identifies new features. Extensive regional and higher order folding of individual chromosomes is observed. Chromosome XII exhibits a striking conformation that implicates the nucleolus as a formidable barrier to interaction between DNA sequences at either end. Inter-chromosomal contacts are anchored by centromeres and include interactions among transfer RNA genes, among origins of early DNA replication and among sites where chromosomal breakpoints occur. Finally, we constructed a three-dimensional model of the yeast genome. Our findings provide a glimpse of the interface between the form and function of a eukaryotic genome.</p></blockquote>
<p>Having previously worked with predicted 3D structure of DNA, such as intrinsic curvature, I was intrigued by the availability of a 3D structure of a complete eukaryotic genome. Based on past analyses of 1D distances in DNA, I expected that the 3D distance between two genes in the genome would correlate with expression, protein interactions, and metabolic pathways.</p>
<p>To test if 3D neighborhood correlates with function and/or regulation, I collected three large sets of protein pairs, namely pairs of co-expressed genes from the <a href="http://string-db.org/">STRING</a> database (Pearson correlation coefficient &gt;0.7), interacting protein pairs from the <a href="http://thebiogrid.org/">BioGRID</a> database, and pairs of genes assigned to the same pathway by the <a href="http://www.genome.jp/kegg/">KEGG</a> database. I subsequently mapped these onto the set of 3D neighbors listed in the supplementary information of the paper, including only 3D neighbors on different chromosomes (in order to eliminate correlations caused by 1D rather than 3D distance). I also mapped the three sets of gene pairs onto a shuffled version of the 3D neighbors, in order to estimate the overlaps that can be expected at random. The results are summarized in the table below:</p>
<table>
<thead>
<tr>
<th></th>
<th>3D neighbors</th>
<th>Shuffled neighbors</th>
</tr>
</thead>
<tbody>
<tr>
<th>Coexpressed (STRING)</th>
<td style="text-align:right;">58</td>
<td style="text-align:right;">61</td>
</tr>
<tr>
<th>Interacting (BioGRID)</th>
<td style="text-align:right;">2151</td>
<td style="text-align:right;">2122</td>
</tr>
<tr>
<th>Same pathway (KEGG)</th>
<td style="text-align:right;">357</td>
<td style="text-align:right;">344</td>
</tr>
</tbody>
</table>
<p>To make a long story short, the numbers show that 3D genomic neighbors appear to be no more likely to be coexpressed, to interact, or to be involved in the same pathway than random pairs. It could be that the way I performed the analysis is too simplistic or that the data are too noisy to show a signal. However, it is also possible that the 3D structural organization of the genome simply doesn&#8217;t have much impact on gene regulation and function.</p>
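<p>For readers who want to try something similar, the comparison against shuffled neighbors can be sketched as follows. This is not the code behind the table above. In particular, the shuffling scheme (permuting the second gene of each neighbor pair) is just one simple choice that I am assuming for the example, and all function names are made up.</p>

```python
import random


def normalize(pairs):
    """Represent each unordered gene pair as a frozenset so (a, b) equals (b, a)."""
    return {frozenset(pair) for pair in pairs}


def overlap(functional_pairs, neighbor_pairs):
    """Count functional pairs (e.g. coexpressed genes) that are also 3D neighbors."""
    return len(normalize(functional_pairs).intersection(normalize(neighbor_pairs)))


def shuffle_neighbors(neighbor_pairs, rng):
    """Permute the second gene of each pair, keeping the number of pairs fixed.

    Assumed shuffling scheme; it can occasionally produce self-pairs or
    duplicates, which a more careful analysis would filter out or redraw.
    """
    firsts = [a for a, b in neighbor_pairs]
    seconds = [b for a, b in neighbor_pairs]
    rng.shuffle(seconds)
    return list(zip(firsts, seconds))


def random_expectation(functional_pairs, neighbor_pairs, n_shuffles, seed=0):
    """Average overlap over several shuffled versions of the neighbor list."""
    rng = random.Random(seed)
    counts = [overlap(functional_pairs, shuffle_neighbors(neighbor_pairs, rng))
              for _ in range(n_shuffles)]
    return sum(counts) / len(counts)
```

<p>Comparing <code>overlap</code> on the real neighbor list with <code>random_expectation</code> gives the two columns of a table like the one above. Averaging over many shuffles rather than using a single one would also give a feel for how much the random overlap fluctuates.</p>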
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/larsjuhljensen.wordpress.com/951/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/larsjuhljensen.wordpress.com/951/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/larsjuhljensen.wordpress.com/951/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/larsjuhljensen.wordpress.com/951/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gofacebook/larsjuhljensen.wordpress.com/951/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/facebook/larsjuhljensen.wordpress.com/951/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gotwitter/larsjuhljensen.wordpress.com/951/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/twitter/larsjuhljensen.wordpress.com/951/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/larsjuhljensen.wordpress.com/951/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/larsjuhljensen.wordpress.com/951/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/larsjuhljensen.wordpress.com/951/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/larsjuhljensen.wordpress.com/951/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/larsjuhljensen.wordpress.com/951/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/larsjuhljensen.wordpress.com/951/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=951&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2010/11/02/analysis-three-dimensional-dna-structure/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Half of published URLs are dysfunctional a decade later</title>
		<link>http://larsjuhljensen.wordpress.com/2010/08/22/analysis-half-of-published-urls-are-dysfunctional-a-decade-later/</link>
		<comments>http://larsjuhljensen.wordpress.com/2010/08/22/analysis-half-of-published-urls-are-dysfunctional-a-decade-later/#comments</comments>
		<pubDate>Sun, 22 Aug 2010 14:06:16 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=952</guid>
		<description><![CDATA[As a small aside when setting up a local mirror of Medline, I extracted 15,915 URLs that were mentioned in the abstracts. Checking them revealed that 12,354 of them (78%) were functional, which may not seem that bad. However, plotting the percentage of dysfunctional URLs as a function of publication year reveals a less pleasant [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=952&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>As a small aside when setting up a local mirror of Medline, I extracted 15,915 URLs that were mentioned in the abstracts. Checking them revealed that 12,354 of them (78%) were functional, which may not seem that bad. However, plotting the percentage of dysfunctional URLs as a function of publication year reveals a less pleasant trend:</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2010/08/dysfunctional_urls.png"><img class="aligncenter size-full wp-image-958" title="Dysfunctional URLs" src="http://larsjuhljensen.files.wordpress.com/2010/08/dysfunctional_urls.png?w=380" alt="Dysfunctional URLs"   /></a></p>
<p>After just 10 years, half of all published URLs are no longer functional, and do not redirect to the new location of the service (if one exists). The fairly high success rate overall is merely a consequence of most URLs having been published within the last few years. Unless the persistence of URLs is improving (which I see no sign of in the plot), we can thus expect to have thousands of URLs in the published literature that are no longer valid.</p>
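<p>A minimal Python sketch of this kind of tabulation, assuming each URL has already been paired with the publication year of its abstract. The function names and the toy records are hypothetical and are not the code or data behind the plot; the checker simply treats any URL that cannot be fetched as dysfunctional.</p>

```python
from collections import defaultdict
import http.client
import urllib.error
import urllib.request


def is_functional(url, timeout=10):
    """Return True if the URL can be fetched (redirects are followed)."""
    try:
        # urlopen raises HTTPError for error status codes, so reaching
        # the body of the with-block means the request succeeded.
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, http.client.HTTPException,
            ValueError, OSError):
        return False


def dysfunctional_percentage_by_year(records):
    """Tabulate the percentage of dysfunctional URLs per publication year.

    records: iterable of (publication_year, functional) pairs.
    """
    total = defaultdict(int)
    broken = defaultdict(int)
    for year, functional in records:
        total[year] += 1
        if not functional:
            broken[year] += 1
    return {year: 100.0 * broken[year] / total[year] for year in sorted(total)}


# Hypothetical toy records, not the real Medline data.
toy = [(2000, False), (2000, True), (2009, True), (2009, True)]
print(dysfunctional_percentage_by_year(toy))
```

<p>Because <code>urlopen</code> follows redirects, a URL that forwards to a new location counts as functional in this sketch.</p>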
<p><strong>Edit:</strong> <a href="http://webapps.oru.edu/facultyplace/view_profile.php?user_id=74">Andrew Lang</a> pointed out <a href="http://www.allacademic.com//meta/p_mla_apa_research_citation/0/1/1/6/5/pages11654/p11654-1.php">a similar study of URLs cited in communications journals</a>.</p>
<p><strong>Edit:</strong> <a href="http://duncan.hull.name/">Duncan Hull</a> pointed out <a href="http://dx.doi.org/10.1093/bioinformatics/btn127">a paper on URL decay in Medline</a> by <a href="http://www.omrf.org/OMRF/Research/09/WrenJ.asp">Jonathan Wren</a>, which reminded me of <a href="http://bioinformatics.oxfordjournals.org/cgi/content/abstract/20/5/668">an even earlier paper on the topic</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2010/08/22/analysis-half-of-published-urls-are-dysfunctional-a-decade-later/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2010/08/dysfunctional_urls.png" medium="image">
			<media:title type="html">Dysfunctional URLs</media:title>
		</media:content>
	</item>
		<item>
		<title>Job: Bioinformatics position at Intomics A/S</title>
		<link>http://larsjuhljensen.wordpress.com/2010/07/14/job-bioinformatics-position-at-intomics-as/</link>
		<comments>http://larsjuhljensen.wordpress.com/2010/07/14/job-bioinformatics-position-at-intomics-as/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 11:41:25 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Job]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=932</guid>
		<description><![CDATA[At Intomics A/S, we are looking for a bioinformatician to perform contract research and develop customized solutions. The job will primarily involve solving data analysis problems for clients in the pharmaceutical industry. For further details, please read the job advert below the fold. Bioinformatics position Would you like to work with some of the worlds [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=932&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>At <a href="http://www.intomics.com">Intomics A/S</a>, we are looking for a bioinformatician to perform contract research and develop customized solutions. The job will primarily involve solving data analysis problems for clients in the pharmaceutical industry. For further details, please read the job advert below the fold.</p>
<p><span id="more-932"></span></p>
<blockquote><p><strong>Bioinformatics position</strong></p>
<p>Would you like to work with some of the world&#8217;s leading experts in large-scale analysis of data from the life sciences and help our clients develop better products for the future?</p>
<p>Intomics A/S is seeking a bioinformatician to work with our existing team in providing customized solutions to our clients in the pharmaceutical industry. You will play an important role in designing and implementing solutions and carrying out innovative data analyses.</p>
<p>Our preferred candidate has the following profile:</p>
<ul>
<li>Bioinformatics or computational biology background</li>
<li>Experience with large-scale data mining</li>
<li>Strong experience with programming (e.g. Perl, Python, Java), databases (e.g. MySQL, PostgreSQL), and Unix</li>
<li>Excellent communication skills and the ability to work in a team</li>
<li>Strong reporting and documentation skills</li>
<li>A PhD in bioinformatics or systems biology is an advantage, but not a requirement</li>
<li>Experience with text mining is a further advantage</li>
</ul>
<p>We are offering an exciting and challenging job with a competitive salary in a friendly working environment. You will get the opportunity to develop your qualifications and skills, and participate in the development of innovative solutions. Some traveling must be expected.</p>
<p>Applications should be submitted before 15th of August 2010. You can either send your application electronically to applications@intomics.com marked with “application-1047” in the subject heading or by mail to Intomics A/S, Diplomvej 373, 2800 Lyngby, Denmark.</p>
<p>Enquiries about the position can be made to CEO Thomas S. Jensen, tel: +45 88807979 or skot@intomics.com. All interested candidates irrespective of age, gender, race, or religion are encouraged to apply.</p>
<p>Intomics A/S, located in Lyngby, north of Copenhagen, Denmark, is a contract research organization that specializes in providing tailor-made solutions within data mining, bioinformatics and systems biology for the pharmaceutical industry.</p>
<p><img class="size-full wp-image-933 alignnone" title="Intomics logo" src="http://larsjuhljensen.files.wordpress.com/2010/07/intomics-logo.png?w=380" alt="Intomics"   /></p></blockquote>
<p><em>Full disclosure: I am a co-founder of and scientific advisor for Intomics A/S.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2010/07/14/job-bioinformatics-position-at-intomics-as/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2010/07/intomics-logo.png" medium="image">
			<media:title type="html">Intomics logo</media:title>
		</media:content>
	</item>
		<item>
		<title>Commentary: When Open Access isn&#8217;t</title>
		<link>http://larsjuhljensen.wordpress.com/2010/06/27/commentary-when-open-access-isnt/</link>
		<comments>http://larsjuhljensen.wordpress.com/2010/06/27/commentary-when-open-access-isnt/#comments</comments>
		<pubDate>Sun, 27 Jun 2010 07:53:56 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Commentary]]></category>
		<category><![CDATA[open access]]></category>
		<category><![CDATA[PLoS]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=918</guid>
		<description><![CDATA[This week, PLoS ONE published an interesting paper by Bo-Christer Björk and coworkers on the free global availability of articles from scientific journals. One of the principal findings in this study is that 20.4% of articles published in 2008 are now available as Open Access (OA): Open Access to the Scientific Journal Literature: Situation 2009 [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=918&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>This week, <a href="http://www.plosone.org">PLoS ONE</a> published <a href="http://dx.doi.org/10.1371/journal.pone.0011273">an interesting paper</a> by Bo-Christer Björk and coworkers on the free global availability of articles from scientific journals. One of the principal findings in this study is that 20.4% of articles published in 2008 are now available as Open Access (OA):</p>
<blockquote><p><strong>Open Access to the Scientific Journal Literature: Situation 2009</strong></p>
<p><strong>Background:</strong> The Internet has recently made possible the free global availability of scientific journal articles. Open Access (OA) can occur either via OA scientific journals, or via authors posting manuscripts of articles published in subscription journals in open web repositories. So far there have been few systematic studies showing how big the extent of OA is, in particular studies covering all fields of science.</p>
<p><strong>Methodology/Principal Findings:</strong> The proportion of peer reviewed scholarly journal articles, which are available openly in full text on the web, was studied using a random sample of 1837 titles and a web search engine. Of articles published in 2008, 8,5% were freely available at the publishers’ sites. For an additional 11,9% free manuscript versions could be found using search engines, making the overall OA percentage 20,4%. Chemistry (13%) had the lowest overall share of OA, Earth Sciences (33%) the highest. In medicine, biochemistry and chemistry publishing in OA journals was more common. In all other fields author-posted manuscript copies dominated the picture.</p>
<p><strong>Conclusions/Significance:</strong> The results show that OA already has a significant positive impact on the availability of the scientific journal literature and that there are big differences between scientific disciplines in the uptake. Due to the lack of awareness of OA-publishing among scientists in most fields outside physics, the results should be of general interest to all scholars. The results should also interest academic publishers, who need to take into account OA in their business strategies and copyright policies, as well as research funders, who like the NIH are starting to require OA availability of results from research projects they fund. The method and search tools developed also offer a good basis for more in-depth studies as well as longitudinal studies.</p></blockquote>
<p>Having just set up <a href="http://reflect.cpr.ku.dk/pmc/">a mirror</a> of the OA subset of <a href="http://www.ncbi.nlm.nih.gov/pmc/">PubMed Central</a>, I know that it contains only ~10% of the articles deposited in PubMed Central and only ~1% of the articles indexed by <a href="http://www.ncbi.nlm.nih.gov/pubmed/">PubMed</a>. It was thus with equal doses of joy and scepticism that I read the numbers reported by Bo-Christer Björk and coworkers.</p>
<p>It soon became clear to me that the study did not adhere to the OA definition by the <a href="http://www.soros.org/openaccess/read.shtml">Budapest Open Access Initiative</a>, which is as follows:</p>
<blockquote><p>By &#8216;open access&#8217; to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.</p></blockquote>
<p>Bo-Christer Björk et al. do not define what exactly they mean by OA. However, from reading their paper it is pretty clear that any article for which they can get hold of free full text is counted as OA. The license under which the copy is distributed does not seem to matter, and they thus count the 90% of articles in PubMed Central that are published under non-OA licenses as OA. It does not even seem to matter if the free full text is legal or not, implying that any article of which an illegal copy can be found somewhere on the web is counted as OA.</p>
<p>I have heard of Gold OA and Green OA. It is tempting to call this Black OA. But I won&#8217;t. Because it just isn&#8217;t OA.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2010/06/27/commentary-when-open-access-isnt/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Markov clustering and the case of the unsupported protein complexes</title>
		<link>http://larsjuhljensen.wordpress.com/2010/03/03/analysis-markov-clustering-and-the-case-of-the-unsupported-protein-complexes/</link>
		<comments>http://larsjuhljensen.wordpress.com/2010/03/03/analysis-markov-clustering-and-the-case-of-the-unsupported-protein-complexes/#comments</comments>
		<pubDate>Wed, 03 Mar 2010 00:16:22 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[protein complexes]]></category>
		<category><![CDATA[protein interactions]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=839</guid>
		<description><![CDATA[In 2006, Krogan and coworkers published a paper in Nature describing a global analysis of protein complexes in budding yeast. This resulted in a network of 7,123 protein-protein interactions involving 2,708 proteins, which was organized into 547 protein complexes using the Markov clustering algorithm. Considering my previous two posts, it probably comes as a surprise [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=839&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>In 2006, Krogan and coworkers published <a href="http://dx.doi.org/10.1038/nature04670">a paper in Nature</a> describing a global analysis of protein complexes in budding yeast. This resulted in a network of 7,123 protein-protein interactions involving 2,708 proteins, which was organized into 547 protein complexes using the Markov clustering algorithm.</p>
<p>Considering my previous <a href="http://larsjuhljensen.wordpress.com/2010/03/01/analysis-markov-clustering-and-the-case-of-the-unnatural-clusters/">two</a> <a href="http://larsjuhljensen.wordpress.com/2010/03/02/analysis-markov-clustering-and-the-case-of-the-nonhomologous-orthologs/">posts</a>, it probably comes as a surprise to nobody that I wanted to check if the issue of unnatural clusters also affected this study. Albert Palleja, a postdoc in my group, thus extracted the 547 sub-networks corresponding to the protein complexes and applied single-linkage clustering to check if all clusters corresponded to connected sub-networks.</p>
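<p>A minimal Python sketch of such a connectivity check, assuming the interaction network is available as a list of protein pairs and the complexes as sets of protein names. Single-linkage clustering without a score cutoff amounts to finding connected components, which is done here with a union-find structure. All names and the toy data are hypothetical; this is not the code used for the analysis.</p>

```python
def find(parent, x):
    """Return the representative of x, compressing the path on the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def disconnected_clusters(clusters, interactions):
    """Return names of clusters whose induced sub-network is not connected.

    clusters: dict mapping cluster name to a set of proteins
    interactions: iterable of (protein_a, protein_b) pairs
    """
    interactions = list(interactions)
    result = []
    for name, members in clusters.items():
        parent = {protein: protein for protein in members}
        for a, b in interactions:
            # Only edges with both ends inside the cluster count.
            if a in parent and b in parent:
                parent[find(parent, a)] = find(parent, b)
        roots = {find(parent, protein) for protein in members}
        if len(roots) > 1:
            result.append(name)
    return result


# Hypothetical toy network: D has no interaction with A, B, or C.
interactions = [("A", "B"), ("B", "C"), ("D", "E")]
clusters = {"complex1": {"A", "B", "C", "D"}, "complex2": {"A", "B", "C"}}
print(disconnected_clusters(clusters, interactions))
```

<p>Interactions that leave the cluster are ignored on purpose: a subunit that is linked to the rest of the network only through proteins outside the complex still counts as disconnected.</p>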
<p>It turned out that 9 of the 547 protein complexes do not correspond to connected sub-networks in the protein interaction network that formed the basis for the clustering. Two complexes each contain two additional subunits that have no interactions with any of the other subunits of the proposed complex, five complexes contain one additional subunit with no interactions to other subunits, and two complexes are proposed hetero-dimers made up of subunits that do not interact according to the interaction network. These complexes are visualized in the figure below with the erroneous subunits highlighted in <span style="color:red;">red</span>:</p>
<p><img class="aligncenter size-full wp-image-842" title="Erroneous protein complexes produced by MCL" src="http://larsjuhljensen.files.wordpress.com/2010/02/mcl_krogan.png?w=380" alt=""   /></p>
<p>To check if these additional subunits are in any way supported by the experimental data presented in the paper, I downloaded the set of raw purifications from the <a href="http://interactome-cmp.ucsf.edu">Krogan Lab Interactome Database</a>. For 4 of the 9 complexes, the additional subunits are weakly supported by at least one purification. It should be noted, however, that this evidence was not judged to be sufficiently reliable by the authors themselves to include the interaction in the core network based on which the complexes were derived.</p>
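<p>One way to look for such support can be sketched in Python as follows, assuming each raw purification is represented as a bait protein plus the set of preys it pulled down. This data layout, the function name, and the toy purifications are assumptions made for illustration and do not reflect the actual format of the database.</p>

```python
def supporting_purifications(subunit, complex_members, purifications):
    """Return purifications in which subunit co-occurs with another member.

    purifications: iterable of (bait, preys) with preys a set of proteins.
    Each hit is reported as (bait, sorted list of co-purified members).
    """
    others = set(complex_members) - {subunit}
    hits = []
    for bait, preys in purifications:
        observed = {bait} | set(preys)
        shared = observed.intersection(others)
        if subunit in observed and shared:
            hits.append((bait, sorted(shared)))
    return hits


# Hypothetical toy purifications, not real data.
purifications = [
    ("A", {"B", "C"}),
    ("B", {"A", "D"}),
    ("E", {"D"}),
]
print(supporting_purifications("D", {"A", "B", "C", "D"}, purifications))
```

<p>In the toy data, D is seen together with members of the complex only in the purification with bait B, so that single purification is the only support reported.</p>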
<p>To make a long story short, this analysis shows that 9 of the 547 protein complexes published by Krogan and coworkers contain one or more subunits that are not supported by the interaction network from which the complexes were derived. Of these, 5 complexes contain subunits that have no support in the underlying experimental data, and which are purely artifacts of using the MCL algorithm without enforcing that clusters must correspond to connected sub-networks.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2010/03/03/analysis-markov-clustering-and-the-case-of-the-unsupported-protein-complexes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2010/02/mcl_krogan.png" medium="image">
			<media:title type="html">Erroneous protein complexes produced by MCL</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Markov clustering and the case of the nonhomologous orthologs</title>
		<link>http://larsjuhljensen.wordpress.com/2010/03/02/analysis-markov-clustering-and-the-case-of-the-nonhomologous-orthologs/</link>
		<comments>http://larsjuhljensen.wordpress.com/2010/03/02/analysis-markov-clustering-and-the-case-of-the-nonhomologous-orthologs/#comments</comments>
		<pubDate>Tue, 02 Mar 2010 14:03:37 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[alignment]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[orthologous groups]]></category>
		<category><![CDATA[orthologs]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=854</guid>
		<description><![CDATA[In the previous blog post I described how the MCL algorithm can sometimes produce unnatural clusters with disconnected parts. The C implementation of MCL has an option to suppress this behavior (--force-connected=y), but I suspect that it is rarely used. I have thus taken a closer look at some notable applications of MCL in bioinformatics [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=854&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://larsjuhljensen.wordpress.com/2010/03/01/analysis-markov-clustering-and-the-case-of-the-unnatural-clusters/">the previous blog post</a> I described how the MCL algorithm can sometimes produce unnatural clusters with disconnected parts. The C implementation of MCL has an option to suppress this behavior (<code>--force-connected=y</code>), but I suspect that it is rarely used. I have thus taken a closer look at some notable applications of MCL in bioinformatics to see if unnatural clusters arise in real data sets.</p>
<p>Here I will focus on <a href="http://www.orthomcl.org">OrthoMCL-DB</a>, which is a database of orthologous groups of protein sequences. These were constructed by applying the MCL algorithm to the normalized results of an all-against-all BLAST search of the protein sequences.</p>
<p>To check the connectivity of the resulting orthologous groups, I downloaded OrthoMCL version 4 including the 13+ GB of gzipped BLAST results that formed the basis for the MCL clustering. I wish to thank the OrthoMCL-DB team for being very helpful and making this large data set available to me.</p>
<p>A few Perl scripts and CPU hours later, Albert Palleja and I had extracted the BLAST network for each of the 116,536 orthologous groups and performed single-linkage clustering to check if any of them contained disconnected parts. We found that this was the case for the following 28 orthologous groups:</p>
<table width="100%">
<thead>
<tr>
<th>Orthologous group</th>
<th>Protein</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_10123">OG4_10123</a></td>
<td>tcru|Tc00.1047053448329.10</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_10133">OG4_10133</a></td>
<td>cmer|CMS291C</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_11608">OG4_11608</a></td>
<td>bmor|BGIBMGA011561</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_13082">OG4_13082</a></td>
<td>lbic|eu2.Lbscf0004g03370</td>
</tr>
<tr>
<td valign="top"><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_17434">OG4_17434</a></td>
<td><span style="color:red;">cint|ENSCINP00000028818<br />
nvec|e_gw.40.282.1</span></td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_20715">OG4_20715</a></td>
<td>mbre|fgenesh2_pg.scaffold_4000474</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_20953">OG4_20953</a></td>
<td>tpal|NP_218832</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_21182">OG4_21182</a></td>
<td>tvag|TVAG_333570</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_24433">OG4_24433</a></td>
<td>tmar|NP_229533</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_29163">OG4_29163</a></td>
<td>tcru|Tc00.1047053508221.76</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_32884">OG4_32884</a></td>
<td>gzea|FGST_11535</td>
</tr>
<tr>
<td valign="top"><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_36484">OG4_36484</a></td>
<td>cbri|WBGene00088730<br />
cjej|YP_002344482</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_39391">OG4_39391</a></td>
<td>ddis|DDB_G0279421</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_43780">OG4_43780</a></td>
<td>cpar|cgd3_1080</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_44179">OG4_44179</a></td>
<td>atha|NP_177880</td>
</tr>
<tr>
<td valign="top"><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_44684">OG4_44684</a></td>
<td>bmal|YP_104794<br />
rbal|NP_868387</td>
</tr>
<tr>
<td valign="top"><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_45409">OG4_45409</a></td>
<td><span style="color:red;">rcom|29647.m002000<br />
rcom|29848.m004679</span></td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_50671">OG4_50671</a></td>
<td>pram|C_scaffold_62000023</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_50712">OG4_50712</a></td>
<td>bpse|YP_331887.1</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_52326">OG4_52326</a></td>
<td>bmaa|14961.m05365</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_52455">OG4_52455</a></td>
<td>bmal|YP_338428</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_55725">OG4_55725</a></td>
<td>apis|XP_001952076</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_57272">OG4_57272</a></td>
<td>bbov|XP_001610684.1</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_58797">OG4_58797</a></td>
<td>hwal|YP_659316</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_61264">OG4_61264</a></td>
<td>crei|122343</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_68577">OG4_68577</a></td>
<td>bmor|BGIBMGA000864</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_71107">OG4_71107</a></td>
<td>cbur|NP_819756</td>
</tr>
<tr>
<td><a href="http://orthomcl.org/cgi-bin/OrthoMclWeb.cgi?rm=sequenceList&amp;groupac=OG4_84041">OG4_84041</a></td>
<td>tcru|Tc00.1047053479883.10</td>
</tr>
</tbody>
</table>
<p>For convenience, the orthologous groups are linked to the corresponding web pages in OrthoMCL-DB, which enable viewing of Pfam domain architectures and multiple sequence alignments. Cursory inspection suggests that the majority of the sequences listed in the table do not belong to the orthologous groups in question.</p>
<p>Of the 28 orthologous groups, 24 groups contain a single protein with no BLAST hits to other group members, 2 groups each contain 2 such singletons, and the remaining 2 groups each contain 2 proteins that show weak similarity to each other but not to any other group members. The latter proteins are highlighted in <span style="color:red;">red</span>.</p>
<p>In summary, this analysis shows that the unnatural clustering by MCL reported for a toy example in <a href="http://larsjuhljensen.wordpress.com/2010/03/01/analysis-markov-clustering-and-the-case-of-the-unnatural-clusters/">the previous post</a> also affects the results of real-world bioinformatics applications of the algorithm.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2010/03/02/analysis-markov-clustering-and-the-case-of-the-nonhomologous-orthologs/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Markov clustering and the case of the unnatural clusters</title>
		<link>http://larsjuhljensen.wordpress.com/2010/03/01/analysis-markov-clustering-and-the-case-of-the-unnatural-clusters/</link>
		<comments>http://larsjuhljensen.wordpress.com/2010/03/01/analysis-markov-clustering-and-the-case-of-the-unnatural-clusters/#comments</comments>
		<pubDate>Mon, 01 Mar 2010 12:33:30 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[networks]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=808</guid>
		<description><![CDATA[The MCL (Markov CLustering) algorithm was invented/discovered by Stijn van Dongen and was published in 2000. It has since become highly popular in bioinformatics and has proven to perform well on a variety of different problems. It was also the method of choice when my postdoc Albert Palleja needed to cluster the human interaction network [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=808&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.micans.org/mcl/">MCL</a> (Markov CLustering) algorithm was invented/discovered by Stijn van Dongen and was published in 2000. It has since become highly popular in bioinformatics and has proven to perform well on a variety of different problems.</p>
<p>It was also the method of choice when my postdoc Albert Palleja needed to cluster the human interaction network from the <a href="http://string-db.org">STRING</a> database. However, we got strange results. More specifically, we observed that some clusters contained proteins that had no interactions with any other proteins within the same cluster. I call these <em>unnatural clusters</em>; this should be seen as a contrast to <em>natural clusters</em>, which are characterized by the presence of many edges between the members of a cluster.</p>
<p>After we had spent a week unsuccessfully trying to find out what we were doing wrong, I finally asked myself if it could be that we were not doing anything wrong. Might it be that applying the MCL algorithm to a protein interaction network can result in clusters of non-interacting proteins?</p>
<p>To test this, I constructed the following toy network consisting of only 10 nodes and 12 edges:</p>
<p><img class="aligncenter size-full wp-image-806" title="Example network before MCL clustering" src="http://larsjuhljensen.files.wordpress.com/2010/02/mcl-1.png?w=380" alt=""   /></p>
<p>Assigning a weight of 1 to all edges and running this network through MCL using an inflation factor (the key parameter in the MCL algorithm) between 1.734 and 3.418 yields five clusters. In the figure below, the nodes are colored according to which cluster they belong to:</p>
<p><img class="aligncenter size-full wp-image-807" title="Example network after MCL clustering" src="http://larsjuhljensen.files.wordpress.com/2010/02/mcl-2.png?w=380" alt=""   /></p>
<p>Note the black cluster which consists of two proteins, X and Y, despite the two nodes only being connected via nodes that are not part of the same cluster. This example clearly shows that the MCL algorithm is indeed capable of producing unnatural clusters containing nodes with no direct edges to any other members in the cluster.</p>
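<p>A minimal sketch of how such a cluster can be detected automatically is shown below. It assumes the network is available as a list of edges and the clustering as a list of node lists; the function name and the small example graph are hypothetical and do not reproduce the network in the figures.</p>

```python
def unnatural_nodes(edges, clusters):
    """Return nodes lacking a direct edge to any other member of their cluster."""
    linked = {frozenset(edge) for edge in edges}
    flagged = []
    for members in clusters:
        if len(members) == 1:
            continue  # a singleton cluster has no partner to connect to
        for node in members:
            partners = [m for m in members if m != node]
            if not any(frozenset((node, m)) in linked for m in partners):
                flagged.append(node)
    return flagged


# Hypothetical graph, NOT the network in the figure: X and Y only
# reach each other through A and C, yet are placed in one cluster.
edges = [("X", "A"), ("A", "Y"), ("X", "C"), ("C", "Y"), ("A", "B"), ("C", "D")]
clusters = [["X", "Y"], ["A", "B"], ["C", "D"]]
print(unnatural_nodes(edges, clusters))
```

<p>Any node reported by such a check sits in a cluster without a single direct neighbor among the other members, which is the defining property of an unnatural cluster.</p>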
<p>In my view, this is not an error in the MCL algorithm as such. The algorithm is based on simulation of flow in the graph. The nodes X and Y are clustered due to the strong flow between them via nodes A, C, E, and G. However, I think it is fair to say that this behavior will catch many users by surprise and that it can give rise to misleading results when applying MCL to certain types of networks.</p>
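<p>To make the flow simulation concrete, the following is a bare-bones sketch of the two alternating MCL steps, expansion and inflation, in pure Python. It is an illustration only, not the MCL program itself, which adds pruning and other optimizations; the function names, tolerances, and example graph are assumptions made for this sketch.</p>

```python
import math


def normalize_columns(matrix):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    size = len(matrix)
    sums = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    return [[matrix[i][j] / sums[j] for j in range(size)] for i in range(size)]


def mcl(edges, nodes, inflation=2.0, max_rounds=100):
    """Cluster an unweighted, undirected graph with a bare-bones MCL loop."""
    index = {node: i for i, node in enumerate(nodes)}
    size = len(nodes)
    # Adjacency matrix with self-loops, as is customary for MCL.
    flow = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for a, b in edges:
        flow[index[a]][index[b]] = 1.0
        flow[index[b]][index[a]] = 1.0
    flow = normalize_columns(flow)

    for _ in range(max_rounds):
        # Expansion: two steps of the random walk (matrix squaring).
        expanded = [
            [sum(flow[i][k] * flow[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)
        ]
        # Inflation: raise entries to a power, then renormalize the columns.
        inflated = normalize_columns(
            [[value ** inflation for value in row] for row in expanded]
        )
        change = max(
            abs(inflated[i][j] - flow[i][j]) for i in range(size) for j in range(size)
        )
        flow = inflated
        if math.isclose(change, 0.0, abs_tol=1e-9):
            break

    # Rows with a nonzero diagonal entry belong to attractors; the nonzero
    # columns of such a row are the nodes drawn to that attractor.
    clusters = set()
    for i in range(size):
        if math.isclose(flow[i][i], 0.0, abs_tol=1e-6):
            continue
        members = frozenset(
            nodes[j] for j in range(size)
            if not math.isclose(flow[i][j], 0.0, abs_tol=1e-6)
        )
        clusters.add(members)
    return sorted(sorted(cluster) for cluster in clusters)


# Hypothetical example: two triangles with no edges between them.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("d", "f")]
print(mcl(edges, nodes))
```

<p>Nothing in the loop requires that two nodes ending up with the same attractor share an edge; membership is decided purely by where the simulated flow accumulates. For real analyses the MCL program should of course be used rather than a sketch like this.</p>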
<p><strong>Edit:</strong> I suspect that this is the same issue that was <a href="http://listserver.ebi.ac.uk/pipermail/mcl-users/2010-January/000072.html">reported on the Mcl-users mailing list by Sungwon Jung</a>. Using the <code>--force-connected=y</code> option prevents the undesirable clustering of X and Y.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2010/03/01/analysis-markov-clustering-and-the-case-of-the-unnatural-clusters/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2010/02/mcl-1.png" medium="image">
			<media:title type="html">Example network before MCL clustering</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2010/02/mcl-2.png" medium="image">
			<media:title type="html">Example network after MCL clustering</media:title>
		</media:content>
	</item>
		<item>
		<title>Resource: Real-time text mining in Second Life using the Reflect API</title>
		<link>http://larsjuhljensen.wordpress.com/2010/02/27/resource-real-time-text-mining-in-second-life-using-the-reflect-api/</link>
		<comments>http://larsjuhljensen.wordpress.com/2010/02/27/resource-real-time-text-mining-in-second-life-using-the-reflect-api/#comments</comments>
		<pubDate>Sat, 27 Feb 2010 08:18:34 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Resource]]></category>
		<category><![CDATA[Second Life]]></category>
		<category><![CDATA[text mining]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=815</guid>
		<description><![CDATA[Sometimes things just come together at the right time. The past few weeks Heiko Horn, Sune Frankild, and I have made much progress on the new version of Reflect, which we hope to put into production very soon. One of the major new features is that Reflect can now be accessed as REST and SOAP [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=815&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Sometimes things just come together at the right time. The past few weeks Heiko Horn, Sune Frankild, and I have made much progress on <a href="http://reflect.cbs.dtu.dk">the new version of Reflect</a>, which we hope to put into production very soon. One of the major new features is that Reflect can now be accessed as <a href="http://en.wikipedia.org/wiki/Representational_State_Transfer">REST</a> and <a href="http://en.wikipedia.org/wiki/SOAP">SOAP</a> web services. When Linden Lab made available <a href="http://secondlife.com/beta-viewer/">the beta version of Second Life viewer 2</a>, which enables you to place a web browser on a face of a 3D object, I simply had to try to put the two together to provide real-time text mining inside Second Life.</p>
<p>The system works as follows. The Reflect Second Life object contains an LSL script that listens to everything that is said in local chat. It sends any text that it picks up to the Reflect REST web service, which returns a simple XML document listing the entities (proteins and small molecules) that were mentioned in the text. The LSL script parses this XML, constructs a URL pointing to the Reflect popup that corresponds to the set of entities in question, and sets this as the shared media to be shown on the Reflect object in Second Life.</p>
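<p>The parse-and-redirect step can be sketched as follows. The actual script is written in LSL; this Python version only illustrates the logic. The XML layout (one <code>entity</code> element per hit with an <code>id</code> attribute), the query parameter name, and the popup address are all hypothetical placeholders, not the real Reflect formats.</p>

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical placeholder, not the real Reflect popup address.
POPUP_BASE = "http://example.org/popup"


def popup_url(xml_text):
    """Build a popup URL from an XML document that lists entities.

    Assumes (hypothetically) one `entity` element per hit, each with an
    `id` attribute; the real Reflect response format may differ.
    """
    root = ET.fromstring(xml_text)
    ids = [e.get("id") for e in root.iter("entity") if e.get("id")]
    if not ids:
        return None  # nothing recognized, so leave the display unchanged
    return POPUP_BASE + "?" + urlencode({"entities": ",".join(ids)})


# Build a made-up response in code instead of calling the web service.
response = ET.Element("entities")
ET.SubElement(response, "entity", id="protein-1")
ET.SubElement(response, "entity", id="protein-2")
print(popup_url(ET.tostring(response, encoding="unicode")))
```

<p>Returning nothing when no entities are found means that chat lines without any recognized names leave the currently displayed popup in place.</p>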
<p>The result is an information board that automatically pulls up possibly relevant information related to what people close to it are talking about. The picture below shows the result of me typing a sentence that mentioned human and mouse IL-5 (click for a larger version).</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2010/02/reflect_sl_high.png"><img class="aligncenter size-full wp-image-814" title="Reflect popup in Second Life" src="http://larsjuhljensen.files.wordpress.com/2010/02/reflect_sl_low.png?w=380" alt=""   /></a></p>
<p>I am well aware that this may not be particularly useful to very many people in Second Life. However, I think it is a nice technology demo of how much can be accomplished with the new Reflect API and just a few lines of code.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2010/02/27/resource-real-time-text-mining-in-second-life-using-the-reflect-api/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2010/02/reflect_sl_low.png" medium="image">
			<media:title type="html">Reflect popup in Second Life</media:title>
		</media:content>
	</item>
		<item>
		<title>Exercise: Using the STITCH database</title>
		<link>http://larsjuhljensen.wordpress.com/2010/01/26/exercise-using-the-stitch-database/</link>
		<comments>http://larsjuhljensen.wordpress.com/2010/01/26/exercise-using-the-stitch-database/#comments</comments>
		<pubDate>Tue, 26 Jan 2010 11:57:36 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Exercise]]></category>
		<category><![CDATA[networks]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=791</guid>
		<description><![CDATA[The STITCH database contains functional associations among proteins and small molecules. Try searching STITCH with the human thymidylate synthase (TYMS) protein as input. The resulting network includes several small molecules. Questions: Can you identify the products of thymidylate synthase among them? Are the reactants also present in the network? Sometimes the proteins or small molecules [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=791&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://stitch-db.org">STITCH</a> database contains functional associations among proteins and small molecules.</p>
<p>Try searching STITCH with the human <a href="http://en.wikipedia.org/wiki/Thymidylate_synthase">thymidylate synthase</a> (TYMS) protein as input. The resulting network includes several small molecules.</p>
<p><strong>Questions:</strong></p>
<ul>
<li>Can you identify the products of thymidylate synthase among them?</li>
<li>Are the reactants also present in the network?</li>
</ul>
<p>Sometimes the proteins or small molecules that you search for may not be immediately shown by STITCH. To find what you are looking for you may have to extend the network.</p>
<p><strong>Questions:</strong></p>
<ul>
<li>Do the small molecules that were missing in the questions above appear when clicking the Add nodes button?</li>
<li>Can you construct a clearer network with fewer interactions by changing the network parameters at the bottom of the page?</li>
</ul>
<p>Thymidine is required for DNA replication and repair to take place, and inhibition of thymidylate synthase is thus harmful to proliferating cells. Indeed, most of the small molecules in the network are drugs used for chemotherapy.</p>
<p><strong>Questions:</strong></p>
<ul>
<li>Are these drugs structurally similar to each other?</li>
<li>Are they similar to the substrate of thymidylate synthase?</li>
<li>Can you suggest a mechanism of action?</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2010/01/26/exercise-using-the-stitch-database/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Correlating the PLoS article level metrics</title>
		<link>http://larsjuhljensen.wordpress.com/2010/01/15/analysis-correlating-the-plos-article-level-metrics/</link>
		<comments>http://larsjuhljensen.wordpress.com/2010/01/15/analysis-correlating-the-plos-article-level-metrics/#comments</comments>
		<pubDate>Fri, 15 Jan 2010 11:03:05 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[article level metrics]]></category>
		<category><![CDATA[citations]]></category>
		<category><![CDATA[PLoS]]></category>
		<category><![CDATA[publishing]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=758</guid>
		<description><![CDATA[A few months ago, the Public Library of Science (PLoS) made available a spreadsheet with article level metrics. Although others have already analyzed these data (see posts by Mike Chelen), I decided to take a closer look at the PLoS article level metrics. The data set consists of 20 different article level metrics. However, some [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=758&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>A few months ago, the <a href="http://www.plos.org">Public Library of Science</a> (PLoS) made available a spreadsheet with <a href="http://article-level-metrics.plos.org">article level metrics</a>. Although others have already analyzed these data (see <a href="http://friendfeed.com/opensci-plos-alm">posts by Mike Chelen</a>), I decided to take a closer look at the PLoS article level metrics.</p>
<p>The data set consists of 20 different article level metrics. However, some of these are very sparse and some are partially redundant. I thus decided to filter/merge these to create a reduced set of only 6 metrics:</p>
<ol>
<li><strong>Blog posts.</strong> This value is the sum of <em>Blog Coverage &#8211; Postgenomic</em>, <em>Blog Coverage &#8211; Nature Blogs</em>, and <em>Blog Coverage &#8211; Bloglines</em>. A single blog post may obviously be picked up by several of these resources and hence be counted more than once. Being unable to count unique blog posts referring to a publication, I decided to aim for maximal coverage by using the sum rather than using data for only a single resource.</li>
<li><strong>Bookmarks.</strong> This value is the sum of <em>Social Bookmarking &#8211; CiteULike</em> and <em>Social Bookmarking &#8211; Connotea</em>. One cannot rule out that a single user bookmarks the same publication in both CiteULike and Connotea, but I would assume that most people use one or the other for bookmarking.</li>
<li><strong>Citations.</strong> This value is the sum of <em>Citations &#8211; CrossRef</em>, <em>Citations &#8211; PubMed Central</em>, and <em>Citations &#8211; Scopus</em>. I decided to use the sum to be consistent with the other metrics, but a single citation may obviously be picked up by more than one of these resources.</li>
<li><strong>Downloads.</strong> This value is called <em>Combined Usage (HTML + PDF + XML)</em> in the original data set and is the sum of <em>Total HTML Page Views</em>, <em>Total PDF Downloads</em>, and <em>Total XML Downloads</em>. Again the sum is used to be consistent.</li>
<li><strong>Ratings.</strong> This value is called <em>Number of Ratings</em> in the original data set. Because of the small number of articles with ratings, notes, and comments, I decided to discard the related values <em>Average Rating</em>, <em>Number of Note threads</em>, <em>Number of replies to Notes</em>, <em>Number of Comment threads</em>, <em>Number of replies to Comments</em>, and <em>Number of &#8216;Star Ratings&#8217; that also include a text comment</em>.</li>
<li><strong>Trackbacks.</strong> This value is called <em>Number of Trackbacks</em> in the original data set. I was greatly in doubt whether to merge this into the blog post metric, but in the end decided against doing so because trackbacks do not necessarily originate from blog posts.</li>
</ol>
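<p>The merging described in the list above can be sketched as a simple lookup table. The column names follow the list, but their exact spelling in the spreadsheet (for example the type of dash) is an assumption, as are the function name and the made-up counts.</p>

```python
# Source columns summed into each reduced metric. The names follow the list
# above; the exact spelling of the spreadsheet headers is an assumption.
REDUCED_METRICS = {
    "Blog posts": [
        "Blog Coverage - Postgenomic",
        "Blog Coverage - Nature Blogs",
        "Blog Coverage - Bloglines",
    ],
    "Bookmarks": [
        "Social Bookmarking - CiteULike",
        "Social Bookmarking - Connotea",
    ],
    "Citations": [
        "Citations - CrossRef",
        "Citations - PubMed Central",
        "Citations - Scopus",
    ],
    "Downloads": ["Combined Usage (HTML + PDF + XML)"],
    "Ratings": ["Number of Ratings"],
    "Trackbacks": ["Number of Trackbacks"],
}


def reduce_metrics(article):
    """Collapse one article's raw counts into the six reduced metrics."""
    return {
        name: sum(article.get(column, 0) for column in columns)
        for name, columns in REDUCED_METRICS.items()
    }


# Made-up counts for a single article; missing columns count as zero.
example = {
    "Blog Coverage - Postgenomic": 1,
    "Blog Coverage - Nature Blogs": 2,
    "Social Bookmarking - CiteULike": 5,
    "Citations - CrossRef": 3,
    "Citations - Scopus": 4,
    "Combined Usage (HTML + PDF + XML)": 1200,
}
print(reduce_metrics(example))
```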
<p>Calculating all pairwise correlations among these metrics is obviously trivial. However, one has to be careful when interpreting the correlations as there are at least two major confounding factors. First, it is important to keep in mind that the PLoS article level metrics have been collected across several journals. Some of these journals are high impact journals such as <a href="http://www.plosbiology.org">PLoS Biology</a> and <a href="http://www.plosmedicine.org">PLoS Medicine</a>, whereas others are lower impact journals such as <a href="http://www.plosone.org">PLoS ONE</a>. One would expect that papers published in the former two journals will on average have higher values for most metrics than papers published in the latter journal. Papers published in journals with a web-savvy readership, e.g. <a href="http://www.ploscompbiol.org">PLoS Computational Biology</a>, are more likely to receive blog posts and social bookmarks. Second, the age of a paper matters. Both downloads and in particular citations accumulate over time. To correct for these confounding factors, I constructed a normalized set of article level metrics, in which each metric for a given article was divided by the average for articles published the same year in the same journal.</p>
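<p>The normalization can be sketched in a few lines of Python. This is an illustration of the idea rather than the program used for the analysis; the dictionary keys <code>journal</code> and <code>year</code>, the function name, and the handling of groups with an average of zero are assumptions.</p>

```python
from collections import defaultdict


def normalize(articles, metrics):
    """Divide each metric by its mean over articles of the same journal and year."""
    groups = defaultdict(list)
    for article in articles:
        groups[(article["journal"], article["year"])].append(article)

    normalized = []
    for article in articles:
        peers = groups[(article["journal"], article["year"])]
        row = {"journal": article["journal"], "year": article["year"]}
        for metric in metrics:
            mean = sum(peer[metric] for peer in peers) / len(peers)
            # If every peer has a count of zero, keep 0 rather than divide by zero.
            row[metric] = article[metric] / mean if mean else 0.0
        normalized.append(row)
    return normalized


# Made-up citation counts for three articles.
articles = [
    {"journal": "PLoS Biology", "year": 2008, "Citations": 30},
    {"journal": "PLoS Biology", "year": 2008, "Citations": 10},
    {"journal": "PLoS ONE", "year": 2009, "Citations": 2},
]
for row in normalize(articles, ["Citations"]):
    print(row)
```

<p>After this step a value of 1 means that an article is exactly average for its journal and year, so articles from different journals and years become comparable.</p>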
<p>I next calculated all pairwise Pearson correlation coefficients among the reduced set of article level metrics. To see the effect of the normalization, I did this for both the raw and the normalized metrics. I visualized the correlation coefficients as a heat map, showing the results for the raw metrics above the diagonal and the results for the normalized metrics below the diagonal.</p>
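<p>As a rough sketch of this step, the snippet below computes Pearson correlation coefficients and arranges them the same way as the heat map, with raw values above the diagonal and normalized values below it. The function names and the input layout (one list of values per metric) are assumptions made for illustration.</p>

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences.

    Assumes neither sequence is constant, since that would give a zero
    denominator.
    """
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def combined_matrix(raw, normalized, metrics):
    """Raw correlations above the diagonal, normalized ones below it."""
    size = len(metrics)
    matrix = [[1.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            a, b = metrics[i], metrics[j]
            matrix[i][j] = pearson(raw[a], raw[b])
            matrix[j][i] = pearson(normalized[a], normalized[b])
    return matrix


# Made-up values for two metrics across three articles.
raw = {"Downloads": [1, 2, 3], "Citations": [2, 4, 6]}
normalized = {"Downloads": [1, 2, 3], "Citations": [3, 2, 1]}
for row in combined_matrix(raw, normalized, ["Downloads", "Citations"]):
    print(row)
```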
<p><img class="aligncenter size-full wp-image-775" title="Correlation plot for the PLoS Article Level Metrics" src="http://larsjuhljensen.files.wordpress.com/2010/01/plos-alm-corr3.png?w=380" alt=""   /></p>
<p>There are several interesting observations to be made from this figure:</p>
<ul>
<li><em>Downloads</em> correlate strongly with all the other metrics. This is hardly surprising, but it is reassuring to see that these correlations are not trivially explained by age and journal effects.</li>
<li><em>Bookmarks</em> is the metric that, apart from the number of downloads, correlates most strongly with <em>Citations</em>. This makes good sense since CiteULike and Connotea are commonly used as reference managers. If you add a paper to your bibliography database, you will likely cite it at some point.</li>
<li><em>Blog posts</em> and <em>Trackbacks</em> correlate well with <em>Downloads</em> but poorly with <em>Citations</em>. This may reflect that blog posts about research papers are often targeted towards a broad audience; if most of the readers of the blog posts are laymen or researchers from other fields, they will be unlikely to cite the papers covered in the blog posts.</li>
<li><em>Ratings</em> correlates fairly poorly with every other metric. Combined with the low number of ratings, this makes me wonder if the option to rate papers on the journal web sites is all that useful.</li>
</ul>
<p>Finally, I will point out one additional metric that I would very much like to see added in future versions of this data set, namely microblogging. I personally discover many papers through others mentioning them on <a href="http://twitter.com">Twitter</a> or <a href="http://friendfeed.com">FriendFeed</a>. Because of the much smaller effort involved in microblogging a paper as opposed to writing a full blog post about it, I suspect that the number of tweets that link to a paper would be a very informative metric.</p>
<p><strong>Edit:</strong> I made a mistake in the normalization program, which I have now corrected. I have updated the figure and the conclusions to reflect the changes. It should be noted that some comments to this post were made prior to this correction.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2010/01/15/analysis-correlating-the-plos-article-level-metrics/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2010/01/plos-alm-corr3.png" medium="image">
			<media:title type="html">Correlation plot for the PLoS Article Level Metrics</media:title>
		</media:content>
	</item>
		<item>
		<title>Job: Bioinformatics scientist in Protein Production Unit of the NNF Center for Protein Research</title>
		<link>http://larsjuhljensen.wordpress.com/2010/01/07/job-bioinformatics-scientist-in-protein-production-unit-of-the-nnf-center-for-protein-research/</link>
		<comments>http://larsjuhljensen.wordpress.com/2010/01/07/job-bioinformatics-scientist-in-protein-production-unit-of-the-nnf-center-for-protein-research/#comments</comments>
		<pubDate>Thu, 07 Jan 2010 13:40:00 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Job]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=728</guid>
		<description><![CDATA[At the Novo Nordisk Foundation Center for Protein Research we are looking for a scientist to provide bioinformatics support for the Protein Production Unit. For further details, please see the job advert below the fold. Scientist, Bioinformatics, Protein Production Unit Tenure: Until 31st Dec. 2012, with a potential extension of five years Employment Conditions Employment [...]]]></description>
			<content:encoded><![CDATA[<p>At the <a href="http://www.cpr.ku.dk">Novo Nordisk Foundation Center for Protein Research</a> we are looking for a scientist to provide bioinformatics support for the Protein Production Unit. For further details, please see the job advert below the fold.<br />
<span id="more-728"></span></p>
<blockquote><p><strong>Scientist, Bioinformatics, Protein Production Unit</strong></p>
<p>Tenure: Until 31st Dec. 2012, with a potential extension of five years</p>
<p><strong>Employment Conditions</strong><br />
Employment will be in accordance with the provisions of the collective agreement between the Danish Government and AC (the Danish Confederation of Professional Associations). The position will be at the level of postdoctoral fellow. A monthly contribution to a pension fund (17.1% of the salary) is added to the basic salary, and a supplement may be negotiated, depending on the candidate’s experience and qualifications. In all cases, ability to perform the job will be the primary consideration, and thus we encourage all &#8211; regardless of their personal background and status &#8211; to apply.</p>
<p><strong>Background</strong><br />
The Novo Nordisk Foundation Center for Protein Research (CPR) has recently been established at the Faculty of Health Sciences, University of Copenhagen, to promote basic and applied discovery research on human proteins of medical relevance. The Center comprises a wide range of expertise and resources, from in silico target identification to proteomics, high throughput protein production and characterization, chemical biology, disease mechanisms and protein therapeutics. To date, around 60 scientists have been recruited (out of an expected ~150). To be able to carry out these activities and to establish the CPR as an internationally competitive organization focused on medically relevant proteins, we are now emphasizing and adding resources to our Protein Production Unit.</p>
<p>The Protein Production Unit at the Center is responsible for establishing a systematic, efficient and cost-effective approach using high-throughput protein production, purification and characterization methods. The Unit should be able to produce and characterize a large number of medically relevant proteins. The Unit is expected to grow to around 15 staff members in the near future.</p>
<p>The Protein Production Unit is a key research group at the Center &#8211; facilitating collaborations internally and externally &#8211; to further our knowledge of medically relevant proteins, using a protein family and pathway approach. In addition to collaborative scientific projects, the Unit is expected to generate results of scientific significance and impact, primarily focused on methods development and optimization. Thus, to strengthen our capabilities in bioinformatics we are now seeking an excellent scientist in this field, to actively participate in and support our research projects.</p>
<p><strong>Relationships</strong></p>
<ul>
<li>The post holder will report to the Head of Protein Production Unit, and ultimately to the Managing Director of the Center.</li>
<li>The post holder is expected to interact with staff at all levels, both internally and externally, regarding relevant research topics.</li>
</ul>
<p><strong>Job Description</strong></p>
<ol>
<li>To take major responsibility for the development, implementation and maintenance of the CPR target and workflow bioinformatics databases and systems, with specific focus on our protein target list.</li>
<li>To develop novel approaches and methods for sequence searching (including entry clone identification), data warehousing, visualisation and mining in order to improve CPR staff efficiency and productivity.</li>
<li>To take main responsibility for protein domain analysis, including e.g. PFAM and SMART domain definitions, fold recognition, secondary structure prediction and construct design for protein expression.</li>
<li>To take responsibility for functional as well as structural analysis of prioritized protein targets and design of rationally modified variants of proteins, including both site directed mutagenesis and chemical modifications.</li>
<li>To assist in the development and implementation of methodologies and technologies for bio- and structural informatics relating to the target classes/biology areas pursued by the CPR, thus ensuring that our research units meet their targets and deliverables.</li>
<li>To participate in networks and collaborations with the international research community in general and the regional research community in particular.</li>
<li>To carry out any other relevant duties that may reasonably be associated with the post and which may be required from time to time.</li>
<li>To demonstrate international reputation through scientific publications and presentations at international symposia.</li>
</ol>
<p><strong>Experience/skills required</strong><br />
<em>Essential criteria</em></p>
<ul>
<li>International scientific reputation in relevant area.</li>
<li>Documented expertise in bioinformatics, including experience in computer programming for the analysis of biological sequences.</li>
<li>Experience with protein design for recombinant expression in E. coli and eukaryotic cells, including site-directed deletions/mutations.</li>
<li>Solid knowledge relating to handling of computers, databases, molecular modeling tools as well as programming.</li>
<li>Good scientific publication and methods development record.</li>
<li>Proven troubleshooting and analytical ability, excellent attention to detail.</li>
<li>A track record in devising innovative scientific or technical solutions.</li>
<li>Excellent communication skills, both oral and written.</li>
</ul>
<p><em>Desirable criteria</em></p>
<ul>
<li>Scientific knowledge of the process, from molecular biology to protein characterization and functional analysis.</li>
<li>Experience with protein structure determination and/or modelling.</li>
<li>Experience with high-throughput and parallel processing of protein targets, e.g. from Structural or Functional Genomics organizations.</li>
<li>Experience in analysis and visualisation of complex data.</li>
</ul>
<p><strong>Qualifications required</strong><br />
A PhD in Bioinformatics or other relevant area is required. Solid experience and excellent track record from leading laboratories in the research field will be considered a distinct advantage.</p>
<p>For further information, please contact Michael Sundström, Managing Director, CPR, michael.sundstrom@cpr.ku.dk</p>
<p>Your application, marked “211-0038/09-3850 Scientist, Bioinformatics” and including a CV, must be at the NNF Center for Protein Research no later than January 22nd 2010. Your application should be sent by e-mail to jobs@contact.cpr.ku.dk</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2010/01/07/job-bioinformatics-scientist-in-protein-production-unit-of-the-nnf-center-for-protein-research/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>
	</item>
		<item>
		<title>Editorial: What is the difference between Twitter and Second Life?</title>
		<link>http://larsjuhljensen.wordpress.com/2010/01/06/editorial-what-is-the-difference-between-twitter-and-second-life/</link>
		<comments>http://larsjuhljensen.wordpress.com/2010/01/06/editorial-what-is-the-difference-between-twitter-and-second-life/#comments</comments>
		<pubDate>Wed, 06 Jan 2010 08:38:03 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Editorial]]></category>
		<category><![CDATA[Second Life]]></category>
		<category><![CDATA[web interface]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=719</guid>
		<description><![CDATA[I admit that this may seem a strange question. One is a microblogging platform that allows you to read and write messages of at most 140 characters. The other is a 3D virtual world. They are, however, both communication tools, and I think there is a completely different reason why Twitter is so much more useful [...]]]></description>
			<content:encoded><![CDATA[<p>I admit that this may seem a strange question. One is a microblogging platform that allows you to read and write messages of at most 140 characters. The other is a 3D virtual world. They are, however, both communication tools, and I think there is a completely different reason why Twitter is so much more useful to me than Second Life is.</p>
<p>If I go to <a href="http://twitter.com">the Twitter web site</a> and log in, I see this:</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2010/01/web_twitter.png"><img class="aligncenter size-full wp-image-725" title="How the web interface of Twitter looks" src="http://larsjuhljensen.files.wordpress.com/2010/01/web_twitter_small.png?w=380" alt=""   /></a></p>
<p>It is Twitter. It immediately shows me the main content: tweets. It also allows me to create content, that is, to tweet.</p>
<p>By contrast, if I go to <a href="http://secondlife.com">the Second Life web site</a> and log in, I see this:</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2010/01/web_sl1.png"><img class="aligncenter size-full wp-image-721" title="How the web interface of Second Life looks" src="http://larsjuhljensen.files.wordpress.com/2010/01/web_sl1_small.png?w=380" alt=""   /></a></p>
<p>It is not Second Life. It is a complex web interface that gives me access to account administration tools and shows me lists of blog posts by Linden Lab, comments from Second Life users, items for sale on Xstreet SL, and video tutorials. In the lower left corner it shows me the only really useful information, namely which of my friends are online. That is, the friends that I would have been able to chat with, had I been in Second Life and not on the web site, which does not allow you to read or write messages.</p>
<p>Imagine if the web interface of Second Life instead showed me this:</p>
<p><a href="http://larsjuhljensen.files.wordpress.com/2010/01/web_sl2.png"><img class="aligncenter size-full wp-image-723" title="How the web interface of Second Life should look" src="http://larsjuhljensen.files.wordpress.com/2010/01/web_sl2_small.png?w=380" alt=""   /></a></p>
<p>It would be Second Life. It would immediately show me the main content: the virtual world. It would also allow me to interact with the content, that is, to move around, to chat with people, and even to create content. Do you think Second Life would have more users if it ran inside your web browser? I think so. Linden Lab is and has been focusing on improving the initial user experience in Second Life to improve the retention rate (i.e. the fraction of new users that continue to come back). I am not saying that this is not important, but I think that most of the potential users are lost long before they even get into the virtual world.</p>
<p>This is by no means a problem that is specific to Second Life. Today, asking users to install a piece of software on their computer will cause the majority of people to shy away before they have tried your product. Even just asking users to create an account will cause many to turn around and walk away. When it comes to social networks, the decisive factor is users. If your friends are not there, why should you be? Imagine a virtual world that ran in your web browser and which you could sign into using OpenID, Twitter Connect, or Facebook Connect. Would your friends be there? Would you?</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2010/01/06/editorial-what-is-the-difference-between-twitter-and-second-life/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2010/01/web_twitter_small.png" medium="image">
			<media:title type="html">How the web interface of Twitter looks</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2010/01/web_sl1_small.png" medium="image">
			<media:title type="html">How the web interface of Second Life looks</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2010/01/web_sl2_small.png" medium="image">
			<media:title type="html">How the web interface of Second Life should look</media:title>
		</media:content>
	</item>
		<item>
		<title>Job: Postdoctoral position in RNA bioinformatics and systems biology</title>
		<link>http://larsjuhljensen.wordpress.com/2009/12/27/job-postdoctoral-position-in-rna-bioinformatics-and-systems-biology/</link>
		<comments>http://larsjuhljensen.wordpress.com/2009/12/27/job-postdoctoral-position-in-rna-bioinformatics-and-systems-biology/#comments</comments>
		<pubDate>Sun, 27 Dec 2009 20:10:19 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Job]]></category>
		<category><![CDATA[RNA]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=707</guid>
		<description><![CDATA[In collaboration with Jan Gorodkin at the Faculty for Life Sciences at University of Copenhagen, I will be starting up a project related to non-coding RNAs and their interactions with mRNAs. We have secured funding for the project and are thus now searching for the right person to fill a postdoc position. For further details, please [...]]]></description>
			<content:encoded><![CDATA[<p>In collaboration with <a href="http://genome.ku.dk/~gorodkin/">Jan Gorodkin</a> at the <a href="http://www.life.ku.dk/">Faculty for Life Sciences</a> at <a href="http://www.ku.dk/">University of Copenhagen</a>, I will be starting up a project related to non-coding RNAs and their interactions with mRNAs. We have secured funding for the project and are thus now searching for the right person to fill a postdoc position. For further details, please read the job advert below the fold.<br />
<span id="more-707"></span></p>
<blockquote><p><strong>Post doc in RNA Bioinformatics and systems biology</strong><br />
Department of Basic Animal and Veterinary Sciences wishes to appoint a post doc within RNA bioinformatics and systems biology from February 1st, 2010 or soon thereafter. The appointment is for three years.</p>
<p><strong>Job description</strong><br />
Within the last few years non-coding RNAs have been shown to be essential molecular players, and in the mammalian genome there is room for a huge number of ncRNAs. Furthermore, <em>in silico</em> analysis predicts hundreds of thousands of structured RNAs in the genome. In the project, ncRNAs will be correlated with protein-coding genes with respect to putative physical interactions and possible co-expression patterns. RNA-RNA interactions will be applied to protein-coding genes and, in the context of resources of gene interactions (STITCH), further interactions of small molecules will be made. Literature mining will be applied as an additional approach to find information on interactions. The project will be carried out in collaboration with Lars Juhl Jensen at the Novo Nordisk Foundation Center for Protein Research, University of Copenhagen.</p>
<p><strong>Qualification requirements</strong><br />
The following requirements should be fulfilled:</p>
<ul>
<li> A PhD degree or similar in Bioinformatics, computational biology or a related area.</li>
<li> Algorithms for structural alignment of RNA sequences.</li>
<li> General knowledge about RNA folding and RNA gene search algorithms.</li>
<li> Perl, Python or similar.</li>
<li> One of: C, C++ or Java.</li>
<li> Fluency in English. Life generally encourages employees who do not speak Danish to acquire a working knowledge of the language.</li>
</ul>
<p>Experience with (development of) computational methods for RNA bioinformatics and/or literature mining is an advantage.</p>
<p>The post doc is also required to be enterprising and to possess good interpersonal skills.</p></blockquote>
<p>For details on how to apply, please refer to <a href="http://www.offentlige-stillinger.dk/sites/cfml/kbhuni/kbhuniVis.cfm?plugin=1&amp;englishJobs=Yes&amp;nJobNo=182806&amp;nLangNo=2">the official job announcement</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2009/12/27/job-postdoctoral-position-in-rna-bioinformatics-and-systems-biology/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>
	</item>
		<item>
		<title>Update: The BuzzCloud for 2009</title>
		<link>http://larsjuhljensen.wordpress.com/2009/12/22/update-the-buzzcloud-for-2009/</link>
		<comments>http://larsjuhljensen.wordpress.com/2009/12/22/update-the-buzzcloud-for-2009/#comments</comments>
		<pubDate>Tue, 22 Dec 2009 14:24:59 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Update]]></category>
		<category><![CDATA[text mining]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=678</guid>
		<description><![CDATA[It is that time of the year again: NCBI has rolled out the new PubMed baseline, and it is my pleasure to present you with the latest and greatest of biomedical buzzwords. I present to you the BuzzCloud 2009 (click for a larger interactive version): In case you have no idea what a BuzzCloud is, [...]]]></description>
			<content:encoded><![CDATA[<p>It is that time of the year again: NCBI has rolled out the new PubMed baseline, and it is my pleasure to present you with the latest and greatest of biomedical buzzwords. I present to you the BuzzCloud 2009 (click for a larger interactive version):</p>
<p><a href="http://www.bork.embl.de/~jensen/BuzzClouds/BuzzCloud2009.html"><img class="aligncenter size-full wp-image-679" title="BuzzCloud 2009" src="http://larsjuhljensen.files.wordpress.com/2009/12/buzzcloud2009.png?w=380" alt=""   /></a></p>
<p>In case you have no idea what a BuzzCloud is, it is a visualization of some of the most trendy words in PubMed. To make a long story short, the size of the word represents how many times it was mentioned in the past, whereas the brightness represents how much it was mentioned in the year compared to the previous ten years. For more details, please refer to <a href="http://larsjuhljensen.wordpress.com/2008/02/29/resource-the-buzzcloud-visualization-of-buzzwords/">the original blog post</a>.</p>
<p>The three largest words on the BuzzCloud 2009 are all reruns from earlier years: metagenomics and synthetic biology were both first seen on the BuzzCloud 2004, and click chemistry appeared in 2006. One can only conclude that these research areas continue to grow.</p>
<p>At the other end of the scale we have the small and bright words. These are the words that are rising most rapidly but have not appeared that many times in PubMed yet. Below are three selected examples that I think may be of particular interest to the readership of this blog.</p>
<ul>
<li><strong>Personal genomics.</strong> No surprise here except that I expected this word would have turned up much earlier considering the broad publicity of the <a href="http://www.1000genomes.org">1000 Genomes Project</a> and the <a href="http://www.personalgenomes.org">Personal Genome Project</a>.</li>
<li><strong>Proteogenomics.</strong> Why we need a separate word for referring to the combination of proteomics and genomics is beyond me. There is even <a href="http://dx.doi.org/10.1101/gr.074344.107">a paper on comparative proteogenomics</a> published in <a href="http://genome.cshlp.org">Genome Research</a>. One can only wonder when someone will compare metabolomics, proteomics, transcriptomics, and genomics data across environmental samples and coin the term <em>comparative metametaboproteotranscriptogenomics</em>.</li>
<li><strong>Translational bioinformatics.</strong> Where bioinformatics meets clinical medicine (see <a href="http://rbaltman.wordpress.com/2009/03/18/translational-bioinformatics/">blog post by Russ Altman</a>). I think that bioinformaticians are indeed increasingly working on medically relevant data, which in my view is a good thing. It just makes me wonder what happened to medical informatics?</li>
</ul>
<p>On a closing note, I am again pleasantly surprised by how well the words picked up by a completely automated procedure fit with the ongoing activities in my lab. It is almost eerie.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2009/12/22/update-the-buzzcloud-for-2009/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2009/12/buzzcloud2009.png" medium="image">
			<media:title type="html">BuzzCloud 2009</media:title>
		</media:content>
	</item>
		<item>
		<title>Poll: Do you publish in the NAR database issue, and do you read what others publish there?</title>
		<link>http://larsjuhljensen.wordpress.com/2009/12/21/poll-do-you-publish-in-the-nar-database-issue-and-do-you-read-what-others-publish-there/</link>
		<comments>http://larsjuhljensen.wordpress.com/2009/12/21/poll-do-you-publish-in-the-nar-database-issue-and-do-you-read-what-others-publish-there/#comments</comments>
		<pubDate>Mon, 21 Dec 2009 17:13:42 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Poll]]></category>
		<category><![CDATA[database]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=683</guid>
		<description><![CDATA[The annual database issue of Nucleic Acids Research is now online. This year it contains a staggering 135 papers, which should be enough to keep all bioinformaticians busy over Christmas. This makes me wonder how many of the readers of this blog have published in the NAR database issue (not necessarily this year), and how [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://nar.oxfordjournals.org/content/vol38/suppl_1/index.dtl">The annual database issue</a> of <a href="http://nar.oxfordjournals.org/">Nucleic Acids Research</a> is now online. This year it contains a staggering 135 papers, which should be enough to keep all bioinformaticians busy over Christmas.</p>
<p>This makes me wonder how many of the readers of this blog have published in the NAR database issue (not necessarily this year), and how many of you actually read what others publish there. I have thus set up a highly unscientific poll:</p>
<a href="http://polldaddy.com/poll/2414430/">View This Poll</a>
<p>The terms <em>many</em>, <em>some</em>, and <em>very few</em> are obviously somewhat fuzzy. As a rough guideline, I would define <em>many</em> as &gt;10 papers per issue, <em>some</em> as 5-10, and <em>very few</em> as &lt;5.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2009/12/21/poll-do-you-publish-in-the-nar-database-issue-and-do-you-read-what-others-publish-there/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Limited agreement among lists of Cdc28p substrates</title>
		<link>http://larsjuhljensen.wordpress.com/2009/11/03/analysis-limited-agreement-among-lists-of-cdc28p-substrates/</link>
		<comments>http://larsjuhljensen.wordpress.com/2009/11/03/analysis-limited-agreement-among-lists-of-cdc28p-substrates/#comments</comments>
		<pubDate>Tue, 03 Nov 2009 20:39:21 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[cell cycle]]></category>
		<category><![CDATA[phosphorylation]]></category>
		<category><![CDATA[proteomics]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=619</guid>
		<description><![CDATA[A collaboration between the Morgan lab at UCSF and the Gygi lab at Harvard has resulted in a paper by Holt et al. in Science, which reports the identification of several hundred substrates of the central cell-cycle kinase Cdc28p (also known as Cdk1) in the budding yeast Saccharomyces cerevisiae: Global analysis of Cdk1 substrate phosphorylation [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=619&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>A collaboration between the <a href="http://physio.ucsf.edu/morgan/">Morgan lab</a> at <a href="http://www.ucsf.edu">UCSF</a> and the <a href="http://gygi.med.harvard.edu">Gygi lab</a> at <a href="http://www.harvard.edu">Harvard</a> has resulted in <a href="http://dx.doi.org/10.1126/science.1172867">a paper by Holt et al.</a> in <a href="http://www.sciencemag.org">Science</a>, which reports the identification of several hundred substrates of the central cell-cycle kinase <a href="http://www.cyclebase.org/displaygene.action?geneName=YBR160W&amp;taxId=4932">Cdc28p</a> (also known as Cdk1) in the budding yeast <em>Saccharomyces cerevisiae</em>:</p>
<blockquote><p><strong>Global analysis of Cdk1 substrate phosphorylation sites provides insights into evolution.</strong></p>
<p>To explore the mechanisms and evolution of cell-cycle control, we analyzed the position and conservation of large numbers of phosphorylation sites for the cyclin-dependent kinase Cdk1 in the budding yeast Saccharomyces cerevisiae. We combined specific chemical inhibition of Cdk1 with quantitative mass spectrometry to identify the positions of 547 phosphorylation sites on 308 Cdk1 substrates in vivo. Comparisons of these substrates with orthologs throughout the ascomycete lineage revealed that the position of most phosphorylation sites is not conserved in evolution; instead, clusters of sites shift position in rapidly evolving disordered regions. We propose that the regulation of protein function by phosphorylation often depends on simple nonspecific mechanisms that disrupt or enhance protein-protein interactions. The gain or loss of phosphorylation sites in rapidly evolving regions could facilitate the evolution of kinase-signaling circuits.</p></blockquote>
<p>The paper makes several interesting analyses and observations. However, I found the comparison to the previous <a href="http://dx.doi.org/10.1038/nature02062">study of Cdc28p substrates by Ubersax et al.</a> from the Morgan lab to be less detailed than I had hoped for:</p>
<blockquote><p>Phosphorylation of Cdk1 consensus sites was observed on 67% (122 of 181) of proteins previously identified as Cdk1 substrates in vitro (4). Sixty-six percent (80 of 122) of these proteins contained sites at which phosphorylation decreased (log<sub>2</sub> H/L &lt; –1) after inhibition of Cdk1 (only 45 of 122 are expected if there is no correlation between the experiments in vitro and in vivo; χ<sup>2</sup> test, P &lt; 10<sup>-10</sup>).</p></blockquote>
<p>In other words, 44% (80 of 181) of Cdc28p substrates identified in the old study were confirmed by the new study, and only 26% (80 of 308) of the Cdc28p substrates identified in the new study are supported by the old study. There are several possible explanations for this discrepancy:</p>
<p><strong>Depth of the mass spectrometry</strong></p>
<p>It is notoriously difficult to identify peptides from low-abundance proteins in mass spectrometry. In the new mass spectrometry study, the authors were able to map 8710 precise phosphorylation sites on 1957 proteins. However, budding yeast is estimated to express in the order of 4500 distinct proteins during exponential growth (<a href="http://dx.doi.org/10.1038/nature04532">Gavin et al., 2006</a>). Assuming that the majority of these proteins contain sites that are phosphorylated during at least part of the mitotic cell cycle, it is likely that a considerable number of low-abundance Cdc28p substrates identified in the old study have been missed in the new study.</p>
<p><strong>Biases in phosphopeptide enrichment</strong></p>
<p>When doing phosphoproteomics, it is necessary to first enrich for phosphopeptides to improve the coverage. To this end, Holt et al. used immobilized metal affinity chromatography (IMAC). In 2007, the <a href="http://www.imsb.ethz.ch/researchgroup/rudolfa">Aebersold group</a> at <a href="http://www.ethz.ch">ETH</a> published <a href="http://dx.doi.org/10.1038/nmeth1005">a paper</a> showing that different purification methods lead to isolation of different, partially overlapping segments of the phosphoproteome. Specifically, they showed that IMAC enrichment biases the data towards isolation of multiply phosphorylated peptides. Given that only a single purification method was used, it is likely that <em>in vivo</em> Cdc28p substrates may have been missed in the new study, in particular if the peptides contain only a single phosphorylation site.</p>
<p><strong><em>In vitro</em> vs. <em>in vivo</em> conditions</strong></p>
<p>The old study by Ubersax et al. was performed on cell lysate, which is an <em>in vitro</em> strategy (although all other proteins expressed during the cell cycle are present). It is thus likely that some of the proteins that are phosphorylated by Cdc28p under these conditions are nonetheless not <em>in vivo</em> Cdc28p substrates.</p>
<p><strong>Can we do better?</strong></p>
<p>As always, it is easy to point out potential flaws in other people&#8217;s data sets; however, it is much more constructive to do something about the problems. The challenge is thus to construct a larger and more reliable set of Cdc28p substrates by combining the data from the two studies.</p>
<p>To check the feasibility of assigning confidence scores to different putative Cdc28p substrates, I tested if the fold change observed in the new study correlates with the chance that the substrate was also identified in the old study. To this end, I divided the 308 Cdc28p substrates from the new study into two groups, based on whether or not they were also identified in the old study, and constructed histograms of the fold changes for each group:</p>
<p><img class="aligncenter size-full wp-image-637" title="Phosphorylation ratios from Holt et al." src="http://larsjuhljensen.files.wordpress.com/2009/11/holt_ratios.png?w=380" alt="Phosphorylation ratios from Holt et al."   /></p>
<p>The fold changes are clearly skewed towards larger negative values for the Cdc28p substrates also identified by the old study relative to the proteins that were not previously identified as Cdc28p substrates. This difference is statistically significant at P &lt; 1% according to the <a href="http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test">Kolmogorov-Smirnov test</a>. This suggests that the observed fold changes in the new mass spectrometry study correlate with the likelihood that the proteins are true Cdc28p substrates.</p>
<p>The old study gave rise to so-called P-scores for the individual proteins (not to be confused with P-values). To test if these too can be used as quality scores, I constructed an equivalent histogram in which the Cdc28p substrates found in the old study were divided into two groups based on whether or not they were also found in the new study:</p>
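<p>As a rough sketch of this kind of comparison, the Python snippet below computes the two-sample Kolmogorov-Smirnov statistic, i.e. the largest gap between the empirical cumulative distributions of two samples. The function name and the log2 ratios are invented for illustration; they are not the values from Holt et al., and this is not necessarily how the test above was run.</p>

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so it is enough to compare them at those values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical log2 H/L ratios, made up for this example only.
also_in_old_study = [-4.1, -3.6, -3.2, -2.8, -2.5, -1.9, -1.4]
new_study_only = [-2.9, -2.1, -1.8, -1.5, -1.3, -1.2, -1.1, -1.05]

d = ks_statistic(also_in_old_study, new_study_only)
print(f"KS statistic D = {d:.3f}")
```

<p>Turning the statistic into a P-value requires the Kolmogorov distribution; in practice one would probably rely on an existing implementation such as <code>scipy.stats.ks_2samp</code> rather than code it by hand.</p>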
<p><img class="aligncenter size-full wp-image-638" title="P-scores from Ubersax et al." src="http://larsjuhljensen.files.wordpress.com/2009/11/ubersax_pscores.png?w=380" alt="P-scores from Ubersax et al."   /></p>
<p>In this case, no obvious trend is seen and a <a href="http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test">Kolmogorov-Smirnov test</a> indeed reveals no statistically significant difference between the two distributions. Surprisingly, the P-scores do thus not appear to be useful quality scores for the putative Cdc28p substrates.</p>
<p>Given the two sets of putative Cdc28p substrates, only one of which can be ranked by reliability, how can we create a better combined set? If one aims for high accuracy at the price of low coverage, one could obviously choose to trust only the substrates identified by both screens. However, given the caveats regarding depth of mass spectrometry and biases arising from the enrichment procedure, I would be hesitant to use this approach. Alternatively, one could aim for maximal coverage at the price of accuracy by trusting all substrates identified by either study. However, seeing the large fraction of novel substrates identified by Holt et al. with a log2-ratio only slightly below -1, I would personally tend to apply a more stringent threshold to the data from the new study by Holt et al., for example requiring a log2-ratio below -2, before merging the sets of substrates from the two studies.</p>
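<p>The merging strategy suggested above could be sketched in Python roughly as follows. The function name, the protein names, and the ratios are all hypothetical, and I assume here that each protein from the new study is represented by its most negative log2 ratio.</p>

```python
def merge_substrates(new_study_ratios, old_study_substrates, max_log2_ratio=-2.0):
    """Union of old-study substrates and new-study substrates passing a cutoff.

    new_study_ratios maps protein name -> most negative log2 H/L ratio
    (an assumption made for this sketch).
    """
    stringent_new = {
        protein
        for protein, ratio in new_study_ratios.items()
        if ratio < max_log2_ratio
    }
    return stringent_new | set(old_study_substrates)


# Hypothetical protein names and ratios, invented for illustration.
new_study = {"ProtA": -3.4, "ProtB": -1.2, "ProtC": -2.6, "ProtD": -1.1}
old_study = {"ProtB", "ProtE"}

merged = merge_substrates(new_study, old_study)
print(sorted(merged))
```

<p>Note that a protein with a weak ratio in the new study (ProtB in this toy example) is still kept if the old study supports it, whereas a weak ratio alone (ProtD) is not enough.</p>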
<p><a href="http://www.webcitation.org/archive?url=http://larsjuhljensen.wordpress.com/2009/11/03/analysis-limited-agreement-among-lists-of-cdc28p-substrates/&amp;title=Analysis:+Limited+agreement+among+lists+of+Cdc28p+substrates&amp;author=Jensen,+Lars+Juhl&amp;source=Buried+Treasure"><img src="http://www.webcitation.org/webcite.gif" alt="WebCite" />Cite this post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2009/11/03/analysis-limited-agreement-among-lists-of-cdc28p-substrates/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2009/11/holt_ratios.png" medium="image">
			<media:title type="html">Phosphorylation ratios from Holt et al.</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2009/11/ubersax_pscores.png" medium="image">
			<media:title type="html">P-scores from Ubersax et al.</media:title>
		</media:content>

		<media:content url="http://www.webcitation.org/webcite.gif" medium="image">
			<media:title type="html">WebCite</media:title>
		</media:content>
	</item>
		<item>
		<title>Editorial: Social network plumbing</title>
		<link>http://larsjuhljensen.wordpress.com/2009/08/12/editorial-social-network-plumbing/</link>
		<comments>http://larsjuhljensen.wordpress.com/2009/08/12/editorial-social-network-plumbing/#comments</comments>
		<pubDate>Wed, 12 Aug 2009 12:22:05 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Editorial]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=588</guid>
		<description><![CDATA[I guess it is no secret to anyone that Facebook has agreed to acquire FriendFeed. Several people seem puzzled why I left FriendFeed only 3 hours after learning this news. I can understand that this may look like a knee-jerk reaction, but there is logic behind the madness. The truth is that my existing setup [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=588&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I guess it is no secret to anyone that <a href="http://www.facebook.com/press/releases.php?p=116581">Facebook has agreed to acquire FriendFeed</a>. <a href="http://twitter.com/biocs/statuses/3240869683">Several</a> <a href="http://shirleywho.wordpress.com/2009/08/11/a-community-searching-for-a-home/">people</a> seem puzzled why <a href="http://twitter.com/larsjuhljensen/statuses/3235173989">I left FriendFeed only 3 hours after learning this news</a>. I can understand that this may look like a knee-jerk reaction, but there is logic behind the madness.</p>
<p>The truth is that my existing setup of Web 2.0 services was not working nearly as well as I would like. The sheer amount of content being shared on FriendFeed meant that it was easy to overlook a blog post from one of my favorite bloggers, for which reason I still subscribed to their blogs as RSS feeds. This caused me to waste time because the same posts appeared in two places, and I could not filter out the blogs on FriendFeed because most comments would be posted there and not on the blogs. Receiving everyone&#8217;s tweets on FriendFeed tended to create a background noise that would drown out all other conversation; however, I also could not filter out the Twitter streams on FriendFeed and follow people directly on Twitter instead because many cross-post all their FriendFeed &#8220;likes&#8221; and/or comments to Twitter!</p>
<p>Given the new situation, it was clear to me that the time had come to fix my broken social network setup and redo the plumbing in such a way that FriendFeed would no longer be responsible for gathering most of the content. Looking at FriendFeed, I discovered that most of the content of interest originated from just three sources: RSS feeds of blogs, Google Reader shared items, and Twitter. By following people directly on Google Reader and Twitter, both of which I was already using on a daily basis, I was thus able to relegate FriendFeed to a much less important role. I still feed my content from other sources into FriendFeed and I occasionally check for comments on my posts; however, it is no longer where I read content posted by others. Coincidentally, the new role of FriendFeed is almost identical to the role that Facebook has played all along.</p>
<p>To make a long story short, I&#8217;m not leaving the friendly community at FriendFeed in anger. I still read the content produced and shared by the same people as before. I have just fixed the plumbing.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2009/08/12/editorial-social-network-plumbing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Results from thermal stability shift and competition binding assays correlate well</title>
		<link>http://larsjuhljensen.wordpress.com/2009/07/31/analysis-results-from-thermal-stability-shift-and-competition-binding-assays-correlate-well/</link>
		<comments>http://larsjuhljensen.wordpress.com/2009/07/31/analysis-results-from-thermal-stability-shift-and-competition-binding-assays-correlate-well/#comments</comments>
		<pubDate>Fri, 31 Jul 2009 09:32:05 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[binding assays]]></category>
		<category><![CDATA[kinase inhibitors]]></category>
		<category><![CDATA[kinases]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=545</guid>
		<description><![CDATA[Several large kinase inhibitor screens have been published in recent years. Two of the largest come from Stefan Knapp&#8217;s lab and Ambit, respectively. The former group used a temperature shift assay to measure the change in thermal stability of 60 human serine/threonine kinases that is caused by the binding of each of 156 kinase inhibitors [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=545&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Several large kinase inhibitor screens have been published in recent years. Two of the largest come from <a href="http://www.sgc.ox.ac.uk/people/stefan/">Stefan Knapp&#8217;s lab</a> and <a href="http://www.ambitbio.com/">Ambit</a>, respectively. The former group used a temperature shift assay to measure the change in thermal stability of 60 human serine/threonine kinases that is caused by the binding of each of 156 kinase inhibitors (<a href="http://dx.doi.org/10.1073/pnas.0708800104">Fedorov et al., 2007</a>). The latter group used a competition binding assay to measure the dissociation constants (Kd) for 38 kinase inhibitors and 290 distinct kinases (<a href="http://dx.doi.org/10.1038/nbt1358">Karaman et al., 2008</a>).</p>
<p>The two screens are not directly comparable because one measures temperature shifts whereas the other measures dissociation constants. To see if it is possible to convert temperature shift values to Kd values, I asked Damian Szklarczyk (who is a Ph.D. student in my group) to map all data from both screens onto a common set of chemical and protein identifiers, extract all inhibitor-kinase pairs that were measured in both assays, and make a scatter plot of -log(Kd) as a function of temperature shift. The result was a set of 704 pairs of temperature shift and Kd values. In the plot below, inhibitor-kinase pairs for which binding was not observed in the competition binding assay were defined to have a Kd of 10 microM, and negative values from the temperature shift assay were treated as zero temperature shift.</p>
<p><img class="aligncenter size-full wp-image-553" title="Correlation between temperature shift and -log(Kd)" src="http://larsjuhljensen.files.wordpress.com/2009/07/temperature_shift_vs_ambit.png?w=380" alt="Correlation between temperature shift and -log(Kd)"   /></p>
<p>The plot shows that the two assays are in very good agreement, which is surprising considering that the assays are fundamentally very different and were run using different expression constructs for several of the kinases. The linear Pearson correlation coefficient is 0.92 when excluding the one obvious outlier shown in red (BIRB796 vs. MAPK11; this appears to be a false negative in the competition binding assay).</p>
<p>The linear fit gives an intercept with the y-axis of 4.9223, which implies that a temperature shift of zero (i.e. no binding according to the temperature shift assay) does not translate precisely into a Kd of 10 microM (i.e. no binding according to the competition binding assay). We thus did a second linear regression in which we forced the intercept with the y-axis to 5 (red regression line in the plot). We thereby arrived at the calibration function -log(Kd) = 5+0.244*Ts, which allows us to convert temperature shifts to Kd values. We have thereby managed to put the measurements from the two kinase inhibitor screens onto a common basis that facilitates direct comparison and integration.</p>
<p><em>Full disclosure: I have an on-going collaboration with Stefan Knapp&#8217;s lab related to screening of kinase inhibitors.</em></p>
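<p>To make the calibration concrete, here is a small Python sketch of a least-squares fit with the intercept fixed at 5, followed by the conversion from temperature shift to Kd. I assume that the logarithm is base 10 and that Kd is in molar, since that is what makes 10 microM correspond to an intercept of 5. The function names and the three data points are made up; only the coefficients 5 and 0.244 come from the fit described above.</p>

```python
def fit_slope_with_fixed_intercept(shifts, neg_log_kds, intercept=5.0):
    """Least-squares slope for y = intercept + slope * x with the intercept fixed."""
    numerator = sum(x * (y - intercept) for x, y in zip(shifts, neg_log_kds))
    denominator = sum(x * x for x in shifts)
    return numerator / denominator


def shift_to_kd(temperature_shift, slope=0.244, intercept=5.0):
    """Convert a temperature shift to Kd in molar via -log10(Kd) = 5 + 0.244 * Ts."""
    clipped = max(temperature_shift, 0.0)  # negative shifts are treated as zero
    return 10 ** -(intercept + slope * clipped)


# Hypothetical (temperature shift, -log10 Kd) pairs, invented for illustration.
example_shifts = [0.0, 4.0, 8.0]
example_neg_log_kds = [5.0, 6.0, 7.0]

slope = fit_slope_with_fixed_intercept(example_shifts, example_neg_log_kds)
print(f"fitted slope: {slope:.3f}")
print(f"Kd at zero shift: {shift_to_kd(0.0):.1e} M")
```

<p>With the intercept fixed, the slope has the closed form sum(x*(y-5))/sum(x*x), so no general-purpose regression routine is needed for this step.</p>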
<p><a href="http://www.webcitation.org/archive?url=http://larsjuhljensen.wordpress.com/2009/07/31/analysis-results-from-thermal-stability-shift-and-competition-binding-assays-correlate-well/&amp;title=Analysis:+Results+from+thermal+stability+shift+and+competition+binding+assays+correlate+well&amp;author=Jensen,+Lars+Juhl&amp;source=Buried+Treasure"><img src="http://www.webcitation.org/webcite.gif" alt="WebCite" />Cite this post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2009/07/31/analysis-results-from-thermal-stability-shift-and-competition-binding-assays-correlate-well/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2009/07/temperature_shift_vs_ambit.png" medium="image">
			<media:title type="html">Correlation between temperature shift and -log(Kd)</media:title>
		</media:content>

		<media:content url="http://www.webcitation.org/webcite.gif" medium="image">
			<media:title type="html">WebCite</media:title>
		</media:content>
	</item>
		<item>
		<title>Resource: Second Life Interactive Dendrogram Rezzer (SLIDR)</title>
		<link>http://larsjuhljensen.wordpress.com/2009/07/04/resource-second-life-interactive-dendogram-rezzer-slidr/</link>
		<comments>http://larsjuhljensen.wordpress.com/2009/07/04/resource-second-life-interactive-dendogram-rezzer-slidr/#comments</comments>
		<pubDate>Sat, 04 Jul 2009 19:27:29 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Resource]]></category>
		<category><![CDATA[evolution]]></category>
		<category><![CDATA[Second Life]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=510</guid>
		<description><![CDATA[About half a year ago, I began experimenting with Second Life as a tool for virtual conferences (I should add that my experiences have since improved). However, I believe that imitating real life in a virtual world is not necessarily the best way to use the technology &#8211; it may be better to use virtual [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=510&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>About half a year ago, I began experimenting with <a href="http://larsjuhljensen.wordpress.com/2009/01/18/editorial-virtual-conferences-in-second-life/">Second Life as a tool for virtual conferences</a> (I should add that my experiences have since improved). However, I believe that imitating real life in a virtual world is not necessarily the best way to use the technology &#8211; it may be better to use virtual reality for doing the things that are difficult to do in the real world. A good example of this is <a href="https://www.xstreetsl.com/modules.php?name=Marketplace&amp;file=item&amp;ItemID=822766">Hiro&#8217;s Molecule Rezzer</a>, which is one of the best known scientific tools in Second Life. It, and its much improved successor Orac, allows people to easily construct molecular models of small molecules in Second Life.</p>
<p>After speaking with several other researchers in Second Life, who, like me, are interested in evolution, I set out to build a similar tool for visualization of phylogenetic trees. The result is SLIDR (Second Life Interactive Dendrogram Rezzer), which constructs a dendrogram object from a tree in <a href="http://en.wikipedia.org/wiki/Newick_format">Newick format</a>. The first version of SLIDR can handle trees both with and without branch lengths; however, I have not yet implemented support for labels on internal nodes or for bootstrap values.</p>
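<p>For readers unfamiliar with the Newick format, the Python sketch below parses the same subset that is described above: leaf names with optional branch lengths, and no labels on internal nodes. It is only an illustration of the format; SLIDR itself runs inside Second Life, and the function names and the nested-dictionary representation used here are my own assumptions, not its actual code.</p>

```python
def parse_newick(text):
    """Parse a Newick string into nested dicts with name, length, and children.

    Supports leaf names and optional branch lengths; labels on internal
    nodes and bootstrap values are rejected.
    """
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        node = {"name": None, "length": None, "children": []}
        if text[pos] == "(":
            pos += 1  # skip "("
            node["children"].append(parse_node())
            while text[pos] == ",":
                pos += 1  # skip ","
                node["children"].append(parse_node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # skip ")"
        else:
            start = pos
            while text[pos] not in ",():;":
                pos += 1
            node["name"] = text[start:pos]
        if text[pos] == ":":
            pos += 1  # skip ":"
            start = pos
            while text[pos] not in ",();":
                pos += 1
            node["length"] = float(text[start:pos])
        return node

    root = parse_node()
    if text[pos] != ";":
        raise ValueError(f"unexpected character at position {pos}")
    return root


def leaf_names(node):
    """Collect leaf names from left to right."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in leaf_names(child)]


tree = parse_newick("((A:0.1,B:0.2):0.3,C:0.4);")
print(leaf_names(tree))
```

<p>A tree without branch lengths, such as <code>(A,B);</code>, is parsed the same way, with the length of each node left as <code>None</code>.</p>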
<p>The picture below shows an example of a dendrogram that was automatically generated by SLIDR based on a Newick tree:</p>
<p><img class="aligncenter size-full wp-image-516" title="SLIDR closeup" src="http://larsjuhljensen.files.wordpress.com/2009/07/slidr-closeup.png?w=380" alt="SLIDR closeup"   /></p>
<p>There is a bit more to SLIDR than this, though. After the dendrogram has been built, it can be loaded with a photo and/or a sound for each of the leaf nodes. When you click on a node, the corresponding sound will be played and the photo will be shown on the associated screen (the white box in front of which I stand):</p>
<p><img class="aligncenter size-full wp-image-517" title="SLIDR posing" src="http://larsjuhljensen.files.wordpress.com/2009/07/slidr-posing.png?w=380" alt="SLIDR posing"   /></p>
<p>I plan to work with collaborators in Second Life to construct dendrograms for evolution of bats (including their echolocation sounds and photos of the animals) and for the fully sequenced Drosophila genomes. Please do not hesitate to contact me if you would like to use SLIDR on another project. I intend to make SLIDR available as <a href="http://www.opensource.org/">open source software</a> once I have implemented support for the full Newick format.</p>
<p><a href="http://www.webcitation.org/archive?url=http://larsjuhljensen.wordpress.com/2009/07/04/resource-second-life-interactive-dendogram-rezzer-slidr/&amp;title=Resource:+Second+Life+Interactive+Dendrogram+Rezzer+(SLIDR)&amp;author=Jensen,+Lars+Juhl&amp;source=Buried+Treasure"><img src="http://www.webcitation.org/webcite.gif" alt="WebCite" />Cite this post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2009/07/04/resource-second-life-interactive-dendogram-rezzer-slidr/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2009/07/slidr-closeup.png" medium="image">
			<media:title type="html">SLIDR closeup</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2009/07/slidr-posing.png" medium="image">
			<media:title type="html">SLIDR posing</media:title>
		</media:content>

		<media:content url="http://www.webcitation.org/webcite.gif" medium="image">
			<media:title type="html">WebCite</media:title>
		</media:content>
	</item>
		<item>
		<title>Resource: STRING v8.1</title>
		<link>http://larsjuhljensen.wordpress.com/2009/06/25/resource-string-v8-1/</link>
		<comments>http://larsjuhljensen.wordpress.com/2009/06/25/resource-string-v8-1/#comments</comments>
		<pubDate>Thu, 25 Jun 2009 19:52:50 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Resource]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[expression]]></category>
		<category><![CDATA[networks]]></category>
		<category><![CDATA[protein interactions]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=495</guid>
		<description><![CDATA[After months of hard work from the entire STRING team &#8211; thanks everyone &#8211; I am pleased to say that STRING v8.1 has now been put into production. Here is a screenshot of the start page: This is a minor release of STRING, which means that the imported databases of microarray [...]]]></description>
			<content:encoded><![CDATA[<p>After months of hard work from the entire STRING team &#8211; thanks everyone &#8211; I am pleased to say that <a href="http://string.embl.de/version_8_1/">STRING v8.1</a> has now been put into production. Here is a screenshot of the start page:</p>
<p><img class="aligncenter size-full wp-image-496" title="STRING 8.1 start page" src="http://larsjuhljensen.files.wordpress.com/2009/06/string81_start.png?w=380" alt="STRING 8.1 start page"   /></p>
<p>This is a minor release of STRING, which means that the imported databases of microarray expression data, protein interactions, genetic interactions, and pathways as well as text-mining evidence have all been updated. We have also fixed a bug that affected the minority of bacteria that have multiple chromosomes.</p>
<p>Another notable feature of STRING v8.1 is the new interactive network viewer that is implemented in Adobe Flash:</p>
<p><img class="aligncenter size-full wp-image-497" title="STRING 8.1 network viewer" src="http://larsjuhljensen.files.wordpress.com/2009/06/string81_network.png?w=380" alt="STRING 8.1 network viewer"   /></p>
<p>For further details please see <a href="http://string-stitch.blogspot.com/2009/06/new-release-of-string-81.html">the post</a> on <a href="http://string-stitch.blogspot.com/">the official STRING/STITCH blog</a>.</p>
<p><a href="http://www.webcitation.org/archive?url=http://larsjuhljensen.wordpress.com/2009/06/25/resource-string-v8-1/&amp;title=Resource:+STRING+v8.1&amp;author=Jensen,+Lars+Juhl&amp;source=Buried+Treasure"><img src="http://www.webcitation.org/webcite.gif" alt="WebCite" />Cite this post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2009/06/25/resource-string-v8-1/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2009/06/string81_start.png" medium="image">
			<media:title type="html">STRING 8.1 start page</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2009/06/string81_network.png" medium="image">
			<media:title type="html">STRING 8.1 network viewer</media:title>
		</media:content>

		<media:content url="http://www.webcitation.org/webcite.gif" medium="image">
			<media:title type="html">WebCite</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: On the evolution of protein length and phosphorylation sites</title>
		<link>http://larsjuhljensen.wordpress.com/2009/06/25/analysis-on-the-evolution-of-protein-length-and-phosphorylation-sites/</link>
		<comments>http://larsjuhljensen.wordpress.com/2009/06/25/analysis-on-the-evolution-of-protein-length-and-phosphorylation-sites/#comments</comments>
		<pubDate>Thu, 25 Jun 2009 16:30:46 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[evolution]]></category>
		<category><![CDATA[phosphorylation]]></category>
		<category><![CDATA[proteomics]]></category>
		<category><![CDATA[regulation]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=479</guid>
		<description><![CDATA[It has been much too long since I last wrote a blog post. Part of the reason is that I have been busy moving back to Denmark, starting up a research group, and co-founding a company. More on that in other blog posts. The main reason, however, has been a lack of papers [...]]]></description>
			<content:encoded><![CDATA[<p>It has been much too long since I last wrote a blog post. Part of the reason is that I have been busy moving back to Denmark, starting up a research group, and co-founding a company. More on that in other blog posts. The main reason, however, has been a lack of papers that inspired me to do the simple follow-up analyses that I usually blog about.</p>
<p>This has thankfully changed now. <a href="http://pbeltrao.blogspot.com/">Pedro Beltrao</a> and coworkers recently published <a href="http://dx.doi.org/10.1371/journal.pbio.1000134">an interesting paper</a> in <a href="http://www.plosbiology.org">PLoS Biology</a> on the evolution of regulation through protein phosphorylation. The paper presents several interesting analyses and comparisons of phosphoproteomics data from three yeast species; the abstract summarizes the findings better than I can:</p>
<blockquote><p><strong>Evolution of Phosphoregulation: Comparison of Phosphorylation Patterns across Yeast Species<br />
</strong>The extent by which different cellular components generate phenotypic diversity is an ongoing debate in evolutionary biology that is yet to be addressed by quantitative comparative studies. We conducted an in vivo mass-spectrometry study of the phosphoproteomes of three yeast species (Saccharomyces cerevisiae, Candida albicans, and Schizosaccharomyces pombe) in order to quantify the evolutionary rate of change of phosphorylation. We estimate that kinase–substrate interactions change, at most, two orders of magnitude more slowly than transcription factor (TF)–promoter interactions. Our computational analysis linking kinases to putative substrates recapitulates known phosphoregulation events and provides putative evolutionary histories for the kinase regulation of protein complexes across 11 yeast species. To validate these trends, we used the E-MAP approach to analyze over 2,000 quantitative genetic interactions in S. cerevisiae and Sc. pombe, which demonstrated that protein kinases, and to a greater extent TFs, show lower than average conservation of genetic interactions. We propose therefore that protein kinases are an important source of phenotypic diversity.</p></blockquote>
<p>Figure 1a in the paper shows the intriguing observation that, despite rapid evolution of individual phosphorylation sites, the relative number of phosphorylation sites within proteins from different functional classes (<a href="http://geneontology.org">Gene Ontology</a> categories) remains remarkably constant between species:</p>
<p><a href="http://dx.doi.org/10.1371/journal.pbio.1000134.g001"><img class="aligncenter size-full wp-image-480" title="Beltrao et al., PLoS Biology, 2009, Figure 1a" src="http://larsjuhljensen.files.wordpress.com/2009/06/bel09plosbiol_fig1a.png?w=380" alt="Beltrao et al., PLoS Biology, 2009, Figure 1a"   /></a></p>
<p>However, it occurred to me that this could potentially be a consequence of longer proteins having more phosphorylation sites, and protein length being conserved through evolution. I thus counted the number of unique phosphorylation sites identified in each protein (thanks to Pedro Beltrao for providing the data) and correlated it with the length of the proteins. In the two plots below, I have pooled the proteins so that each dot corresponds to 100 proteins. The upper and lower panels show the results for <em>S. cerevisiae</em> and <em>S. pombe</em>, respectively:</p>
<p><img class="aligncenter size-full wp-image-481" title="Number of phosphorylation sites vs. protein length for S. cerevisiae" src="http://larsjuhljensen.files.wordpress.com/2009/06/cerevisiae_psites_vs_length.png?w=380" alt="Number of phosphorylation sites vs. protein length for S. cerevisiae"   /></p>
<p><img class="aligncenter size-full wp-image-482" title="Number of phosphorylation sites vs. protein length for S. pombe" src="http://larsjuhljensen.files.wordpress.com/2009/06/pombe_psites_vs_length.png?w=380" alt="Number of phosphorylation sites vs. protein length for S. pombe"   /></p>
<p>As should be evident from the plots, the average number of phosphorylation sites in a protein correlates strongly with its length, which is by no means surprising. It is unclear to me why the intercept with the y-axis appears to differ from zero in both plots; suggestions are welcome.</p>
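<p>For readers who want to try this kind of pooling themselves, here is a minimal Python sketch of one way to do it. It assumes that the proteins are sorted by length and averaged in consecutive groups of 100, which is only one possible reading of the pooling described above. The function names and the toy data are hypothetical; they are not the scripts or the data behind the plots.</p>

```python
from statistics import mean


def pool_by_length(proteins, bin_size=100):
    """Sort (length, n_sites) pairs by length and average each consecutive bin.

    Assumption: pooling is done on length-sorted proteins; the final bin may
    hold fewer than bin_size proteins.
    """
    ordered = sorted(proteins)
    pooled = []
    for start in range(0, len(ordered), bin_size):
        chunk = ordered[start:start + bin_size]
        pooled.append((mean(p[0] for p in chunk), mean(p[1] for p in chunk)))
    return pooled


def pearson(points):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Made-up toy data (NOT the real phosphoproteomics data):
# 400 proteins whose site count grows stepwise with length.
toy = [(100 + 10 * i, 1 + i // 50) for i in range(400)]
pooled = pool_by_length(toy)
print(len(pooled), "pooled points, r =", pearson(pooled))
```

<p>Pooling averages out the protein-to-protein noise, so a correlation computed on the pooled points will generally look stronger than one computed on the individual proteins.</p>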
<p>The next question was whether the Gene Ontology terms that correspond to proteins with many phosphorylation sites are indeed assigned to proteins that are longer than average. I thus examined the terms &#8220;Cell budding&#8221;, &#8220;Morphogenesis&#8221;, and &#8220;Signal transduction&#8221;.</p>
<p>The average <em>S. cerevisiae</em> protein is 450 aa long. Proteins annotated with &#8220;Cell budding&#8221;, &#8220;Morphogenesis&#8221;, and &#8220;Signal transduction&#8221; are on average 1.6 (739 aa), 2.1 (945 aa), and 1.5 (679 aa) times as long, respectively. By comparison, the corresponding ratios observed for phosphorylation sites are approximately 2.3, 2.6, and 2.4. It would thus appear that differences in protein length between functional classes of proteins account for much, but not all, of the signal that was observed by Beltrao et al. when comparing the number of phosphorylation sites.</p>
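<p>The arithmetic behind this comparison can be sketched in a few lines of Python. The numbers are the ones quoted above; the "residual enrichment" (site ratio divided by length ratio) is a hypothetical back-of-the-envelope measure introduced here for illustration. It assumes that the number of sites scales proportionally with length, which the non-zero intercepts in the plots suggest is only roughly true.</p>

```python
# Numbers quoted in the text above (S. cerevisiae).
MEAN_LENGTH_AA = 450
go_terms = {
    "Cell budding": {"mean_length_aa": 739, "site_ratio": 2.3},
    "Morphogenesis": {"mean_length_aa": 945, "site_ratio": 2.6},
    "Signal transduction": {"mean_length_aa": 679, "site_ratio": 2.4},
}


def length_ratio(mean_length_aa):
    """How many times as long as the average protein a class is."""
    return mean_length_aa / MEAN_LENGTH_AA


def residual_enrichment(term):
    """Site ratio left over after dividing out the length ratio.

    Assumption: site counts scale proportionally with protein length.
    A value above 1 means length alone does not explain the enrichment.
    """
    return term["site_ratio"] / length_ratio(term["mean_length_aa"])


for name, term in go_terms.items():
    print(f"{name}: length ratio {length_ratio(term['mean_length_aa']):.1f}, "
          f"residual enrichment {residual_enrichment(term):.2f}")
```

<p>Under that assumption, all three residual values come out above 1, which is consistent with length explaining much, but not all, of the signal.</p>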
<p><strong>Edit:</strong> Make sure to read <a href="http://pbeltrao.blogspot.com/2009/06/reply-on-evolution-of-protein-length.html">Pedro Beltrao&#8217;s follow-up blog post</a>, which nicely confirms that whereas protein length does play a role, it is not the full story.</p>
<p><a href="http://www.webcitation.org/archive?url=http://larsjuhljensen.wordpress.com/2009/06/25/analysis-on-the-evolution-of-protein-length-and-phosphorylation-sites/&amp;title=Analysis:+On+the+evolution+of+protein+length+and+phosphorylation+sites&amp;author=Jensen,+Lars+Juhl&amp;source=Buried+Treasure"><img src="http://www.webcitation.org/webcite.gif" alt="WebCite" />Cite this post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2009/06/25/analysis-on-the-evolution-of-protein-length-and-phosphorylation-sites/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2009/06/bel09plosbiol_fig1a.png" medium="image">
			<media:title type="html">Beltrao et al., PLoS Biology, 2009, Figure 1a</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2009/06/cerevisiae_psites_vs_length.png" medium="image">
			<media:title type="html">Number of phosphorylation sites vs. protein length for S. cerevisiae</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2009/06/pombe_psites_vs_length.png" medium="image">
			<media:title type="html">Number of phosphorylation sites vs. protein length for S. pombe</media:title>
		</media:content>

		<media:content url="http://www.webcitation.org/webcite.gif" medium="image">
			<media:title type="html">WebCite</media:title>
		</media:content>
	</item>
		<item>
		<title>Update: The BuzzCloud for 2008</title>
		<link>http://larsjuhljensen.wordpress.com/2009/01/19/update-the-buzzcloud-for-2008/</link>
		<comments>http://larsjuhljensen.wordpress.com/2009/01/19/update-the-buzzcloud-for-2008/#comments</comments>
		<pubDate>Mon, 19 Jan 2009 06:28:03 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Update]]></category>
		<category><![CDATA[text mining]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=183</guid>
		<description><![CDATA[Yes, it is that time of the year again &#8211; we are now almost three weeks into 2009, most papers published in 2008 have hopefully made it into Medline, and it is time to reveal the words of 2008. In other words, I have updated the BuzzCloud resource and here is the result for 2008 [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align:left;">Yes, it is that time of the year again &#8211; we are now almost three weeks into 2009, most papers published in 2008 have hopefully made it into Medline, and it is time to reveal the words of 2008. In other words, I have updated the BuzzCloud resource and here is the result for 2008 (click on the image to go to the web resource):</p>
<p><a href="http://www.bork.embl.de/~jensen/BuzzClouds/BuzzCloud2008.html"><img class="aligncenter size-full wp-image-398" style="border:0 none;" title="BuzzCloud 2008" src="http://larsjuhljensen.files.wordpress.com/2009/01/buzzcloud2008.png?w=380" alt="BuzzCloud 2008"   /></a></p>
<p>I am thrilled to see the outcome. Without any cheating or tweaking, several buzzwords related to proteomics make it onto the list, with &#8220;phosphoproteomics&#8221; and &#8220;quantitative phosphoproteomics&#8221; being the two most prominent of them. This is nice to see, considering that my new research group at the <a href="http://www.cpr.ku.dk">Novo Nordisk Foundation Center for Protein Research</a> will focus heavily on improving and applying the <a href="http://networkin.info">NetworKIN</a> and <a href="http://netphorest.info">NetPhorest</a> resources for analysis of phosphoproteomics data.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2009/01/19/update-the-buzzcloud-for-2008/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2009/01/buzzcloud2008.png" medium="image">
			<media:title type="html">BuzzCloud 2008</media:title>
		</media:content>
	</item>
		<item>
		<title>Editorial: Virtual conferences in Second Life</title>
		<link>http://larsjuhljensen.wordpress.com/2009/01/18/editorial-virtual-conferences-in-second-life/</link>
		<comments>http://larsjuhljensen.wordpress.com/2009/01/18/editorial-virtual-conferences-in-second-life/#comments</comments>
		<pubDate>Sun, 18 Jan 2009 16:10:53 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Editorial]]></category>
		<category><![CDATA[Second Life]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=355</guid>
		<description><![CDATA[This blog has been very quiet for a long time. There are several reasons for this, most of which are positive: I have not had many boring or negative results to write blog posts about, I have been busy writing manuscripts about the positive results instead, and I have moved to Copenhagen where I am [...]]]></description>
			<content:encoded><![CDATA[<p>This blog has been very quiet for a long time. There are several reasons for this, most of which are positive: I have not had many boring or negative results to write blog posts about, I have been busy writing manuscripts about the positive results instead, and I have moved to Copenhagen where I am busy starting my own research group at the <a href="http://www.cpr.ku.dk">Novo Nordisk Foundation Center for Protein Research</a>. There is also one more reason for the absence of blog posts from me: I have spent a lot of time experimenting with <a href="http://secondlife.com/">Second Life</a>, and that is the topic of this blog post.</p>
<p>I first got interested in Second Life when I heard that Nature Publishing Group was setting up a virtual conference center called <a href="http://www.nature.com/secondnature/index.html">Elucian Islands</a>. In the beginning I felt very alone on Elucian Islands. There was a good reason for that &#8211; I was alone most of the time. My view on Second Life was thus that it was pretty (see images below) but rather useless.</p>
<p><a href="http://picasaweb.google.com/Lars.Juhl.Jensen/BuriedTreasure#5257636226144109826"><img class="aligncenter size-full wp-image-353" title="Elucian Islands 1" src="http://larsjuhljensen.files.wordpress.com/2008/10/elucian11.png?w=380" alt=""   /></a></p>
<p><a href="http://picasaweb.google.com/Lars.Juhl.Jensen/BuriedTreasure#5257638034062678194"><img class="aligncenter size-full wp-image-354" title="Elucian Islands 2" src="http://larsjuhljensen.files.wordpress.com/2008/10/elucian2.png?w=380" alt=""   /></a></p>
<p>I obviously took a look at the SciFoo presentations (seen in the background of the image above) and the other scientific displays at Elucian Islands and elsewhere in Second Life. However, these mostly reinforced my negative view of Second Life being fairly useless, since almost everything I saw was already being served better by dedicated resources. For example, slide shows are much more conveniently viewed and shared in <a href="http://www.slideshare.net">SlideShare</a> than in Second Life, and 3D protein structures can be examined and analyzed better in programs such as <a href="http://pymol.sourceforge.net">PyMOL</a>.</p>
<p>Over at <a href="http://friendfeed.com">FriendFeed</a>, Jean-Claude Bradley fought a brave fight trying to convince me that Second Life is in fact useful for science. His key point was that Second Life is all about interacting with people, so I should try to go to some scientific events in Second Life. Sadly, there are still not many such events, and although they have changed my view on Second Life, they have also shown that there are many problems that remain to be solved.</p>
<p>The first virtual seminar I went to was &#8220;<a href="http://biomedicine.ning.com/events/event/show?id=2219265%3AEvent%3A362">Cancer, Cell Cycle, and Check Points</a>&#8221; organized by Digi S Lab. This was a perfect match since I work on cell-cycle regulation myself. The seminar consisted of two excellent presentations given by Letizia Cito from Sbarro Health Research Organization and Fayamdria Foley from the American Cancer Society.</p>
<p><img class="aligncenter size-full wp-image-374" title="Meeting on Cancer and Cell Cycle 1" src="http://larsjuhljensen.files.wordpress.com/2008/10/sl_cancer_and_cell_cycle_1.png?w=380" alt="Meeting on Cancer and Cell Cycle 1"   /></p>
<p><img class="aligncenter size-full wp-image-375" title="Meeting on Cancer and Cell Cycle 2" src="http://larsjuhljensen.files.wordpress.com/2008/10/sl_cancer_and_cell_cycle_2.png?w=380" alt="Meeting on Cancer and Cell Cycle 2"   /></p>
<p>Whereas the presentations were great, the seminar also illustrated several of the problems that need to be overcome before virtual conferences in Second Life are ready for prime time. When the first talk started, I could not see any of the slides. Restarting my Second Life client did not solve the problem, nor did rebooting my computer. Just after I had given up on solving the problem, the entire region in which the seminar took place suddenly crashed, logging out all speakers and participants. When it came back online after some minutes and everyone had found their way back, I could suddenly see the slides. Even then, however, they took so long to appear on my screen that the presenter had typically explained half of what was on a slide by the time I could see it. I see this as a major problem that must be solved before Second Life conferences can work properly &#8211; it must be possible to change slides without a noticeable delay.</p>
<p>The second event I went to was the &#8220;<a href="http://network.nature.com/people/joannascott/blog/2008/11/10/esrc-complexity-research-seminar-in-second-life">ESRC Complexity Research Seminar in Second Life</a>&#8221; that took place at Elucian Islands. This seminar was very different from the one described above in that it was not a purely virtual seminar; instead it was a video feed from a real-world seminar that was being transmitted into Second Life. Think of it as a virtual overflow room &#8211; the image below shows the people who had gathered shortly before the event started.</p>
<p><img class="aligncenter size-full wp-image-378" title="ESRC Complexity Research Seminar" src="http://larsjuhljensen.files.wordpress.com/2008/12/sl_esrc1.png?w=380" alt="ESRC Complexity Research Seminar"   /></p>
<p>Sadly, this event was marred by technical problems. The sound stream was of such poor quality that the Second Life participants could barely understand a word of what the speakers were saying, and the video stream was too low in quality for the slides to be readable. I do not want to dwell on this but just note that good-quality microphones and cameras are a prerequisite for streaming events into Second Life.</p>
<p>The third event I went to was the &#8220;<a href="http://www.nature.com/secondnature/archive_pages/2008_12_03.html">Virtual Conference on Climate Change and CO2 Storage</a>&#8221;, which also took place at Elucian Islands. Like the previous seminar, this was a mixed event taking place both in the real world and in Second Life. The presentations were excellent, and important lessons had been learned from the previous events. The microphones worked perfectly this time, and the video feed had been abandoned in favor of showing a copy of the actual slides in Second Life, which greatly improved the readability.</p>
<p>In addition to these events, Elucian Islands now also runs regular events such as the weekly Nature Podcast event where a fairly large group of people gather to listen to the latest podcast shortly after it has been released (image from <a href="http://network.nature.com/people/joannascott/blog/">Joanna Scott&#8217;s blog</a>).</p>
<p><img class="aligncenter size-full wp-image-376" title="Nature Podcast at Elucian Islands" src="http://larsjuhljensen.files.wordpress.com/2008/12/nature_podcast.png?w=380" alt="Nature Podcast at Elucian Islands"   /></p>
<p>Regular events are crucial in Second Life because they bring people together in the same place at the same time. The need for people to be online at the same time is in my view one of the major drawbacks of Second Life compared to other tools that researchers can use for social networking. Second Life should thus not be seen as competing with tools like <a href="http://friendfeed.com/">FriendFeed</a> or <a href="https://twitter.com/">Twitter</a>, which you can read when you feel like it, but rather as a virtual-reality alternative to video conferences. I think that Nature Publishing Group is on the right track with this, and I hope that the few remaining technical hurdles will be overcome in the near future.</p>
<p><em>Full disclosure: I have been working with the staff from Nature Publishing Group trying to solve technical challenges on Elucian Islands.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2009/01/18/editorial-virtual-conferences-in-second-life/feed/</wfw:commentRss>
		<slash:comments>25</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/10/elucian11.png" medium="image">
			<media:title type="html">Elucian Islands 1</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/10/elucian2.png" medium="image">
			<media:title type="html">Elucian Islands 2</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/10/sl_cancer_and_cell_cycle_1.png" medium="image">
			<media:title type="html">Meeting on Cancer and Cell Cycle 1</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/10/sl_cancer_and_cell_cycle_2.png" medium="image">
			<media:title type="html">Meeting on Cancer and Cell Cycle 2</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/12/sl_esrc1.png" medium="image">
			<media:title type="html">ESRC Complexity Research Seminar</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/12/nature_podcast.png" medium="image">
			<media:title type="html">Nature Podcast at Elucian Islands</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Four complementary yeast interactomes</title>
		<link>http://larsjuhljensen.wordpress.com/2008/10/04/analysis-four-complementary-yeast-interactomes/</link>
		<comments>http://larsjuhljensen.wordpress.com/2008/10/04/analysis-four-complementary-yeast-interactomes/#comments</comments>
		<pubDate>Sat, 04 Oct 2008 19:16:12 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[protein complexes]]></category>
		<category><![CDATA[protein interactions]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=261</guid>
		<description><![CDATA[The latest issue of Science features a paper by Yu et al. in which they report the results of a comprehensive yeast two-hybrid (Y2H) screen for interactions between budding yeast proteins. Just a few months earlier, Science published a paper by Tarassov et al. that describes a similar screen performed using a novel protein fragment [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=261&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The latest issue of Science features <a href="http://dx.doi.org/10.1126/science.1158684">a paper by Yu et al.</a> in which they report the results of a comprehensive yeast two-hybrid (Y2H) screen for interactions between budding yeast proteins. Just a few months earlier, Science published <a href="http://dx.doi.org/10.1126/science.1153878">a paper by Tarassov et al.</a> that describes a similar screen performed using a novel protein fragment complementation assay (PCA). Peer Bork and I wrote <a href="http://dx.doi.org/10.1126/science.1164801">a Perspectives piece</a> on these two papers, showing that the different assays for detecting protein interactions are complementary in the sense that they capture interactions for different subsets of the proteome. For example, PCA detects many interactions for membrane proteins whereas Y2H detects many interactions for nuclear proteins.</p>
<p>As part of writing the Perspectives piece, I performed numerous analyses that were not included in the final publication, because they were either too technical for a broad audience, not interesting enough to spend valuable space on, or would involve additional figures. Thankfully, my blog imposes no limitations on the number of words or figures (nor is it required that the content is interesting, although that is desirable).</p>
<p>The comparison included, in addition to the two interactomes introduced above, a third interactome that consists of all the high-confidence interactions identified by <a href="http://dx.doi.org/10.1038/nature04532">Gavin et al.</a> and <a href="http://dx.doi.org/10.1038/nature04670">Krogan et al.</a> using the tandem affinity purification (TAP) method. Also included in the comparison (but not in the Perspectives piece) was the literature-curated (LC) set of interactions published by <a href="http://dx.doi.org/10.1186/jbiol36">Reguly et al.</a> in 2006.</p>
<p>The Venn diagram below shows the overlap of the four interactomes in terms of proteins; a protein is considered to belong to an interactome if the method in question suggested at least one interaction partner for it:</p>
<p style="text-align:center;"><img class="aligncenter size-full wp-image-260" title="Venn diagram of the protein overlap of four interactomes" src="http://larsjuhljensen.files.wordpress.com/2008/10/interactomes_venn_orfs.png?w=380" alt=""   /></p>
<p>The numbers outside the ellipses specify the total number of proteins for which a given method identified interactions. Notably, the PCA, Y2H, and TAP interactomes cover only approximately one sixth, one third, and half of the yeast proteome, respectively, despite all three assays having been tested on all yeast ORFs. This suggests that only a fraction of proteins can be targeted with a given assay.</p>
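<p>As a minimal sketch of the protein-level comparison, the snippet below derives the set of covered proteins for each interactome from its list of interacting pairs and then intersects the sets. The protein names and interaction lists are made-up placeholders, not the real data, and <code>proteins_covered</code> is a hypothetical helper name.</p>

```python
def proteins_covered(interactions):
    """Return the proteins that have at least one interaction partner."""
    covered = set()
    for a, b in interactions:
        covered.add(a)
        covered.add(b)
    return covered

# Toy, made-up interaction lists (NOT the published interactomes).
toy_interactomes = {
    "PCA": [("P1", "P2"), ("P2", "P3")],
    "Y2H": [("P2", "P4")],
    "TAP": [("P1", "P2"), ("P4", "P5"), ("P5", "P6")],
}

# One set of covered proteins per interactome.
coverage = {name: proteins_covered(pairs) for name, pairs in toy_interactomes.items()}

# Proteins covered by every method (the centre of the Venn diagram).
shared_by_all = set.intersection(*coverage.values())

for name, proteins in coverage.items():
    print(name, len(proteins), sorted(proteins))
print("covered by all methods:", sorted(shared_by_all))
```

<p>Dividing each set size by the number of ORFs tested would give the coverage fractions discussed above.</p>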
<p>A second way to compare the four interactomes is to count their overlaps in terms of pairs of interacting proteins. To provide additional detail, I distinguished between interactions that are not found in a given interactome because one or both proteins are not covered by the interactome in question (dashed lines in the diagrams), and interactions that were not found despite both proteins being covered (full lines in the diagrams). The Venn diagrams below show all twelve pairwise comparisons of the four interactomes:</p>
<p style="text-align:center;"><img class="aligncenter size-full wp-image-262" title="Venn diagrams of the overlaps in protein pairs" src="http://larsjuhljensen.files.wordpress.com/2008/10/interactomes_venn_pairs.png?w=380" alt=""   /></p>
<p>As expected, the largest overlap is observed when comparing the two largest interactomes (LC and TAP), whereas the smallest overlap is observed when comparing the two smallest interactomes (PCA and Y2H). Even when the differences in protein coverage are taken into account, however, the overlaps between the interactomes leave a lot to be desired.</p>
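<p>The pair-level bookkeeping described above can be sketched as follows. For every interaction in a reference interactome, the code checks whether the other interactome also reports it, misses it although both proteins are covered, or misses it because at least one protein is not covered. The function names and the toy pairs are hypothetical; this is an illustration of the idea, not the script used for the figure.</p>

```python
def normalise(pairs):
    """Treat interactions as unordered pairs."""
    return {frozenset(pair) for pair in pairs}

def compare(reference, other):
    """Classify each interaction in `reference` relative to `other`."""
    ref = normalise(reference)
    oth = normalise(other)
    # Proteins with at least one partner in the other interactome.
    other_proteins = set().union(*oth)
    counts = {"shared": 0, "missed_but_covered": 0, "not_covered": 0}
    for pair in ref:
        if pair in oth:
            counts["shared"] += 1
        elif pair.issubset(other_proteins):
            counts["missed_but_covered"] += 1   # full line in the diagrams
        else:
            counts["not_covered"] += 1          # dashed line in the diagrams
    return counts

# Toy, made-up data (NOT the published interactomes).
toy_tap = [("A", "B"), ("B", "C"), ("C", "D")]
toy_y2h = [("B", "A"), ("C", "E")]

tap_vs_y2h = compare(toy_tap, toy_y2h)
y2h_vs_tap = compare(toy_y2h, toy_tap)
print(tap_vs_y2h)
print(y2h_vs_tap)
```

<p>The classification depends on which interactome is taken as the reference, so each pair of interactomes can be compared in two directions.</p>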
<p>There are several reasons for the poor overlap at the level of pairwise interactions. One is that false positive interactions are unlikely to be reproducible by a different assay. A second is that the assays measure fundamentally different types of interactions: PCA and Y2H measure direct binary interactions between proteins, whereas TAP measures co-complex interactions, that is whether two proteins are part of the same complex or not. This is illustrated in the figure below, which shows the binary and co-complex networks for three different scenarios:</p>
<p style="text-align:center;"><img class="size-full wp-image-276 aligncenter" title="Binary interactions vs. co-complex interactions" src="http://larsjuhljensen.files.wordpress.com/2008/10/binary_vs_cocomplex.png?w=380" alt=""   /></p>
<p>The two types of assays have different strengths and weaknesses. Binary interaction assays can in principle distinguish between the first two complexes, which differ only in that the subunits B and C are in direct contact in the first complex but not in the second. However, binary assays are not able to distinguish between the second and the third scenario, that is, whether A, B, and C form a single complex (ABC) or two complexes (AB and AC). Conversely, data from co-complex assays are able to answer the latter question but are unable to distinguish between the first two scenarios. The different assays thus complement each other, not only because they are able to interrogate different subsets of the proteome, but also because they provide us with complementary information about the composition and topology of protein complexes.</p>
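<p>The argument can be made concrete with a small sketch that encodes the three scenarios and derives both network types from them. The encoding is my reading of the text (in particular, that A contacts both B and C in all three scenarios), and the idealised assays are assumed to be free of false positives and false negatives.</p>

```python
from itertools import combinations

def co_complex_pairs(complexes):
    """All protein pairs that share at least one complex."""
    pairs = set()
    for members in complexes:
        for a, b in combinations(sorted(members), 2):
            pairs.add((a, b))
    return pairs

# Direct physical contacts and complex memberships per scenario (assumed encoding).
scenarios = {
    1: {"contacts": {("A", "B"), ("A", "C"), ("B", "C")}, "complexes": [{"A", "B", "C"}]},
    2: {"contacts": {("A", "B"), ("A", "C")}, "complexes": [{"A", "B", "C"}]},
    3: {"contacts": {("A", "B"), ("A", "C")}, "complexes": [{"A", "B"}, {"A", "C"}]},
}

# What an ideal binary assay (PCA, Y2H) and an ideal co-complex assay (TAP) would report.
binary = {k: s["contacts"] for k, s in scenarios.items()}
cocomplex = {k: co_complex_pairs(s["complexes"]) for k, s in scenarios.items()}

print("binary separates 1 and 2:", binary[1] != binary[2])
print("binary separates 2 and 3:", binary[2] != binary[3])
print("co-complex separates 1 and 2:", cocomplex[1] != cocomplex[2])
print("co-complex separates 2 and 3:", cocomplex[2] != cocomplex[3])
```

<p>Only the combination of both network types tells all three scenarios apart.</p>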
<p><a href="http://www.webcitation.org/archive?url=http://larsjuhljensen.wordpress.com/2008/10/04/analysis-four-complementary-yeast-interactomes/&amp;title=Analysis:+Four+complementary+yeast+interactomes&amp;author=Jensen,+Lars+Juhl&amp;source=Buried+Treasure"><img src="http://www.webcitation.org/webcite.gif" alt="WebCite" />Cite this post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2008/10/04/analysis-four-complementary-yeast-interactomes/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/10/interactomes_venn_orfs.png" medium="image">
			<media:title type="html">Venn diagram of the protein overlap of four interactomes</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/10/interactomes_venn_pairs.png" medium="image">
			<media:title type="html">Venn diagrams of the overlaps in protein pairs</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/10/binary_vs_cocomplex.png" medium="image">
			<media:title type="html">Binary interactions vs. co-complex interactions</media:title>
		</media:content>

		<media:content url="http://www.webcitation.org/webcite.gif" medium="image">
			<media:title type="html">WebCite</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Cell-cycle-regulated proteins are more abundant in haploid relative to diploid cells</title>
		<link>http://larsjuhljensen.wordpress.com/2008/09/30/analysis-cell-cycle-regulated-proteins-are-more-abundant-in-haploid-relative-to-diploid-cells/</link>
		<comments>http://larsjuhljensen.wordpress.com/2008/09/30/analysis-cell-cycle-regulated-proteins-are-more-abundant-in-haploid-relative-to-diploid-cells/#comments</comments>
		<pubDate>Tue, 30 Sep 2008 20:15:57 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[cell cycle]]></category>
		<category><![CDATA[diploid]]></category>
		<category><![CDATA[haploid]]></category>
		<category><![CDATA[proteomics]]></category>
		<category><![CDATA[regulation]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=243</guid>
		<description><![CDATA[Two days ago, Matthias Mann&#8217;s group published a paper in Nature in which they compare the level of individual proteins in haploid relative to diploid budding yeast cells: Comprehensive mass-spectrometry-based proteome quantification of haploid versus diploid yeast Mass spectrometry is a powerful technology for the analysis of large numbers of endogenous proteins. However, the analytical [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=243&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p style="text-align:left;">Two days ago, <a href="http://www.biochem.mpg.de/mann/">Matthias Mann&#8217;s group</a> published <a href="http://doi.dx.org/10.1038/nature07341">a paper in Nature</a> in which they compare the level of individual proteins in haploid relative to diploid budding yeast cells:</p>
<blockquote><p><strong>Comprehensive mass-spectrometry-based proteome quantification of haploid versus diploid yeast<br />
</strong><br />
Mass spectrometry is a powerful technology for the analysis of large numbers of endogenous proteins. However, the analytical challenges associated with comprehensive identification and relative quantification of cellular proteomes have so far appeared to be insurmountable. Here, using advances in computational proteomics, instrument performance and sample preparation strategies, we compare protein levels of essentially all endogenous proteins in haploid yeast cells to their diploid counterparts. Our analysis spans more than four orders of magnitude in protein abundance with no discrimination against membrane or low level regulatory proteins. Stable-isotope labelling by amino acids in cell culture (SILAC) quantification was very accurate across the proteome, as demonstrated by one-to-one ratios of most yeast proteins. Key members of the pheromone pathway were specific to haploid yeast but others were unaltered, suggesting an efficient control mechanism of the mating response. Several retrotransposon-associated proteins were specific to haploid yeast. Gene ontology analysis pinpointed a significant change for cell wall components in agreement with geometrical considerations: diploid cells have twice the volume but not twice the surface area of haploid cells. Transcriptome levels agreed poorly with proteome changes overall. However, after filtering out low confidence microarray measurements, messenger RNA changes and SILAC ratios correlated very well for pheromone pathway components. Systems-wide, precise quantification directly at the protein level opens up new perspectives in post-genomics and systems biology.</p></blockquote>
<p style="text-align:left;">Although the paper focuses on the larger amount of cell-wall proteins and proteins involved in pheromone response in haploid cells, the supplementary tables reveal similar biases for many other functional classes, including nucleosomes and cyclin-dependent kinase inhibitors. As many of these proteins are regulated during the cell cycle, I suspected that cell-cycle-regulated proteins might be more abundant in haploid cells relative to diploid cells.</p>
<p style="text-align:left;">To test this hypothesis, I divided the proteins quantified by the Mann group into two classes: dynamic proteins, which are encoded by genes that are periodically expressed during the cell cycle, and static proteins, which are encoded by genes that are expressed at a constant level (<a href="http://dx.doi.org/10.1126/science.1105103">de Lichtenberg et al., 2005</a>). For each class, I plotted the log<sub>2</sub>-ratios of the protein levels in haploid and diploid cells:</p>
<p style="text-align:left;"><img class="size-full wp-image-246 aligncenter" title="Distribution of log2-ratios for dynamic and static proteins" src="http://larsjuhljensen.files.wordpress.com/2008/09/haploid_diploid.png?w=380" alt=""   /></p>
<p>The plot reveals a quite strong shift of dynamic proteins toward higher log-ratios; this difference is highly significant according to the <a href="http://en.wikipedia.org/wiki/Mann-Whitney_U">Mann-Whitney U test</a> (P &lt; 10<sup>-12</sup>). Proteins encoded by cell-cycle-regulated genes are thus in general more abundant in haploid budding yeast cells than in diploid cells.</p>
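<p>For readers who want to try this kind of comparison themselves, here is a rough standard-library sketch of a two-sided Mann-Whitney U test. It uses a normal approximation without tie or continuity correction, which is only reasonable for large samples, so treat it as an illustration rather than a replacement for a statistics package. The log-ratios below are made-up toy values, not the measurements from the paper.</p>

```python
import math

def mann_whitney_u(x, y):
    """Return (U statistic for x, approximate two-sided P value)."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(combined)
    ranks = [0.0] * n
    i = 0
    while i != n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 != n and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1   # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(rank for rank, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    # Normal approximation of the null distribution of U.
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided

# Toy, made-up log2 haploid/diploid ratios (NOT the published data).
toy_dynamic = [0.4, 0.9, 1.3, 0.7, 1.1]
toy_static = [-0.2, 0.1, 0.0, 0.3, -0.1]

u_stat, p_value = mann_whitney_u(toy_dynamic, toy_static)
print("U =", u_stat, "P =", p_value)
```

<p>Because the test works on ranks only, it makes no assumption about the shape of the two log-ratio distributions.</p>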
<p><em>Full disclosure: I currently collaborate with Matthias Mann and members of his group, and we will soon be colleagues at the Novo Nordisk Foundation Center for Protein Research.<br />
</em></p>
<p><a href="http://www.webcitation.org/archive?url=http://larsjuhljensen.wordpress.com/2008/09/30/analysis-cell-cycle-regulated-proteins-are-more-abundant-in-haploid-relative-to-diploid-cells/&amp;title=Analysis:+Cell-cycle-regulated+proteins+are+more+abundant+in+haploid+relative+to+diploid+cells&amp;author=Jensen,+Lars+Juhl&amp;source=Buried+Treasure"><img src="http://www.webcitation.org/webcite.gif" alt="WebCite" />Cite this post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2008/09/30/analysis-cell-cycle-regulated-proteins-are-more-abundant-in-haploid-relative-to-diploid-cells/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/09/haploid_diploid.png" medium="image">
			<media:title type="html">Distribution of log2-ratios for dynamic and static proteins</media:title>
		</media:content>

		<media:content url="http://www.webcitation.org/webcite.gif" medium="image">
			<media:title type="html">WebCite</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Transcriptional and posttranslational regulation of cell-cycle kinases</title>
		<link>http://larsjuhljensen.wordpress.com/2008/08/31/analysis-transcriptional-and-posttranslational-regulation-of-cell-cycle-kinases/</link>
		<comments>http://larsjuhljensen.wordpress.com/2008/08/31/analysis-transcriptional-and-posttranslational-regulation-of-cell-cycle-kinases/#comments</comments>
		<pubDate>Sun, 31 Aug 2008 16:46:24 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[cell cycle]]></category>
		<category><![CDATA[expression]]></category>
		<category><![CDATA[phosphorylation]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=227</guid>
		<description><![CDATA[Daub and coworkers from Matthias Mann&#8217;s group recently published a paper in Molecular Cell, describing a phosphoproteomics study of kinases during S and M phase of the mitotic cell cycle: Kinase-selective enrichment enables quantitative phosphoproteomics of the kinome across the cell cycle. Protein kinases are pivotal regulators of cell signaling that modulate each other&#8217;s functions [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=227&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Daub and coworkers from <a href="http://www.biochem.mpg.de/mann/">Matthias Mann&#8217;s group</a> recently published <a href="http://dx.doi.org/10.1016/j.molcel.2008.07.007">a paper in Molecular Cell</a>, describing a phosphoproteomics study of kinases during S and M phase of the mitotic cell cycle:</p>
<blockquote><p><strong>Kinase-selective enrichment enables quantitative phosphoproteomics of the kinome across the cell cycle.</strong></p>
<p>Protein kinases are pivotal regulators of cell signaling that modulate each other&#8217;s functions and activities through site-specific phosphorylation events. These key regulatory modifications have not been studied comprehensively, because low cellular abundance of kinases has resulted in their underrepresentation in previous phosphoproteome studies. Here, we combine kinase-selective affinity purification with quantitative mass spectrometry to analyze the cell-cycle regulation of protein kinases. This proteomics approach enabled us to quantify 219 protein kinases from S and M phase-arrested human cancer cells. We identified more than 1000 phosphorylation sites on protein kinases. Intriguingly, half of all kinase phosphopeptides were upregulated in mitosis. Our data reveal numerous unknown M phase-induced phosphorylation sites on kinases with established mitotic functions. We also find potential phosphorylation networks involving many protein kinases not previously implicated in mitotic progression. These results provide a vastly extended knowledge base for functional studies on kinases and their regulation through site-specific phosphorylation.</p></blockquote>
<p>In the study, they identified phosphorylation sites for 219 protein kinases, of which 159 showed differential phosphorylation (at least two-fold induction for at least one site) in S and/or M phase.</p>
<p>My collaborators at <a href="http://www.cbs.dtu.dk">CBS</a> and I have previously shown that transcriptional and posttranslational regulation (for example, phosphorylation by cyclin-dependent kinases) tend to target the same proteins (<a href="http://dx.doi.org/10.1126/science.1105103">de Lichtenberg et al., 2005</a>; <a href="http://dx.doi.org/10.1038/nature05186">Jensen et al., 2006</a>). One should thus expect that the differentially regulated kinases have a tendency to be encoded by periodically expressed genes.</p>
<p>To test this hypothesis, I compared the phosphoproteomics data of Daub et al. to the cell-cycle microarray expression study by Whitfield et al. (2002). I was able to map 132 of the 159 kinases to the microarrays and found that 17 of them are encoded by the top-600 cycling genes. This corresponds to a significant (P &lt; 0.001) two-fold overrepresentation of transcriptional cell-cycle regulation among the genes encoding kinases that are differentially phosphorylated during S and/or M phase.</p>
<p>One could imagine that this trend is not specific to kinases that are differentially phosphorylated during the cell cycle, but that it instead applies to kinases in general. To test this, I also mapped the 60 non-modulated kinases found by Daub et al. to the microarrays (<a href="http://www.molbiolcell.org/cgi/content/full/13/6/1977">Whitfield et al., 2002</a>). Of the 54 kinases that could be mapped, only 3 are encoded by periodically expressed genes, which is almost exactly what is expected by random chance.</p>
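<p>One common way to assess this kind of overrepresentation is a hypergeometric test, sketched below. I am not claiming it is the exact test behind the P value quoted above. The counts 17, 132, and 600 come from the text, whereas <code>background_genes</code> is a placeholder for the number of genes on the microarrays, which is not stated in this post; the printed values therefore depend on that assumption and are not meant to reproduce the figures above.</p>

```python
from math import comb

def hypergeom_tail(k, n_drawn, n_marked, n_total):
    """P(X >= k) when n_drawn genes are drawn without replacement from
    n_total genes, of which n_marked are marked (e.g. periodically expressed)."""
    upper = min(n_drawn, n_marked)
    total = comb(n_total, n_drawn)
    hits = sum(
        comb(n_marked, i) * comb(n_total - n_marked, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return hits / total

def fold_enrichment(k, n_drawn, n_marked, n_total):
    """Observed count divided by the count expected by chance."""
    expected = n_drawn * n_marked / n_total
    return k / expected

# Placeholder background size: an ASSUMPTION, not a number from the post.
background_genes = 10000

# 17 of 132 mapped, differentially phosphorylated kinases are among the top-600 cycling genes.
fold = fold_enrichment(17, 132, 600, background_genes)
p_value = hypergeom_tail(17, 132, 600, background_genes)
print("fold enrichment:", round(fold, 2), "P(X >= 17):", p_value)
```

<p>The same two functions can be applied to the non-modulated kinases (3 of 54) to check that they show no enrichment.</p>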
<p>I next examined whether the timing of phosphorylation correlates with the timing of expression of the 17 kinases mentioned above. The kinases can be divided into three classes: phosphorylated in S phase, phosphorylated in M phase, and phosphorylated in both S and M phase. Notably, 13 of the 17 kinases fall into the M phase class. Looking at the peak times of expression for these (that is, when in the cell cycle the corresponding mRNAs are most highly expressed) reveals that 8 of the 13 kinases are presumably synthesized in M phase only shortly before they become phosphorylated.</p>
<p>In summary, comparison of the phosphoproteomics data from <a href="http://dx.doi.org/10.1016/j.molcel.2008.07.007">Daub et al. (2008)</a> and the microarray expression data from <a href="http://www.molbiolcell.org/cgi/content/full/13/6/1977">Whitfield et al. (2002)</a> supports the view that transcriptional and posttranslational regulation tend to target the same proteins during the mitotic cell cycle. Moreover, it shows that for most of the kinases that are subject to such dual cell-cycle control, both expression and phosphorylation take place during M phase when the cyclin-dependent kinase activity is maximal.</p>
<p><em>Full disclosure: I currently collaborate with Matthias Mann and members of his group, and we will soon be colleagues at the Novo Nordisk Foundation Center for Protein Research.</em></p>
<p><a href="http://www.webcitation.org/archive?url=http://larsjuhljensen.wordpress.com/2008/08/31/analysis-transcriptional-and-posttranslational-regulation-of-cell-cycle-kinases/&amp;title=Analysis:+Transcriptional+and+posttranslational+regulation+of+cell-cycle+kinases&amp;author=Jensen,+Lars+Juhl&amp;source=Buried+Treasure"><img src="http://www.webcitation.org/webcite.gif" alt="WebCite" />Cite this post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2008/08/31/analysis-transcriptional-and-posttranslational-regulation-of-cell-cycle-kinases/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://www.webcitation.org/webcite.gif" medium="image">
			<media:title type="html">WebCite</media:title>
		</media:content>
	</item>
		<item>
		<title>Commentary: On large protein complexes and the essentiality of hubs</title>
		<link>http://larsjuhljensen.wordpress.com/2008/08/02/commentary-on-large-protein-complexes-and-the-essentiality-of-hubs/</link>
		<comments>http://larsjuhljensen.wordpress.com/2008/08/02/commentary-on-large-protein-complexes-and-the-essentiality-of-hubs/#comments</comments>
		<pubDate>Sat, 02 Aug 2008 21:41:38 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Commentary]]></category>
		<category><![CDATA[hubs]]></category>
		<category><![CDATA[networks]]></category>
		<category><![CDATA[protein complexes]]></category>
		<category><![CDATA[protein interactions]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=213</guid>
		<description><![CDATA[In 2001, Jeong and coworkers published a paper in Nature in which they showed that the central proteins in interaction networks, that is the proteins with the highest connectivity, are enriched for essential proteins. This publication has been highly influential as evidenced by the numerous subsequent publications on the importance of &#8220;hub&#8221; proteins. Several hypothesis [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=213&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>In 2001, Jeong and coworkers published <a href="http://dx.doi.org/10.1038/35075138">a paper in Nature</a> in which they showed that the central proteins in interaction networks, that is the proteins with the highest connectivity, are enriched for essential proteins. This publication has been highly influential as evidenced by the numerous subsequent publications on the importance of &#8220;hub&#8221; proteins. Several hypothesis have been published that try to explain why hubs are essential, for example that certain protein interactions are essential and that a protein with many interactions is thus more likely to be involved in at least one essential interaction (<a href="http://dx.doi.org/10.1371/journal.pgen.0020088">He and Zhang, 2006</a>).</p>
<p>Yesterday, Zotenko and coworkers published <a href="http://dx.doi.org/10.1371/journal.pcbi.1000140">a paper in PLoS Computational Biology</a> in which they take a closer look at the cause of this phenomenon:</p>
<blockquote><p><strong>Why Do Hubs in the Yeast Protein Interaction Network Tend To Be Essential: Reexamining the Connection between the Network Topology and Essentiality.<br />
</strong><br />
The centrality-lethality rule, which notes that high-degree nodes in a protein interaction network tend to correspond to proteins that are essential, suggests that the topological prominence of a protein in a protein interaction network may be a good predictor of its biological importance. Even though the correlation between degree and essentiality was confirmed by many independent studies, the reason for this correlation remains illusive. Several hypotheses about putative connections between essentiality of hubs and the topology of protein-protein interaction networks have been proposed, but as we demonstrate, these explanations are not supported by the properties of protein interaction networks. To identify the main topological determinant of essentiality and to provide a biological explanation for the connection between the network topology and essentiality, we performed a rigorous analysis of six variants of the genomewide protein interaction network for Saccharomyces cerevisiae obtained using different techniques. We demonstrated that the majority of hubs are essential due to their involvement in Essential Complex Biological Modules, a group of densely connected proteins with shared biological function that are enriched in essential proteins. Moreover, we rejected two previously proposed explanations for the centrality-lethality rule, one relating the essentiality of hubs to their role in the overall network connectivity and another relying on the recently published essential protein interactions model.</p></blockquote>
<p>What Zotenko et al. show is, in other words, that essential hubs tend to be highly connected with each other and hence form large &#8220;Essential Complex Biological Modules&#8221;. <a href="http://dx.doi.org/10.1371/journal.pcbi.1000140.t007">Table 7</a> in their paper lists the Gene Ontology terms associated with these modules; among the recurring themes are &#8220;rRNA metabolic process&#8221;, &#8220;mRNA metabolic process&#8221;, &#8220;RNA splicing&#8221;, &#8220;ribosome biogenesis and assembly&#8221;, and &#8220;proteolysis&#8221;. These Gene Ontology terms obviously correspond to well-known protein complexes, namely the RNA polymerases, the spliceosome, the ribosome, and the proteasome. The analysis of Zotenko et al. thus suggests that the much debated correlation between centrality and essentiality is simply a consequence of the fact that many of the large protein complexes in a eukaryotic cell are essential, which is hardly surprising considering that they have been conserved through more than two billion years of evolution (<a href="http://dx.doi.org/10.1126/science.285.5430.1033">Brocks et al., 1999</a>).</p>
<p><strong>Edit:</strong> For more views on the results of Zotenko et al. see <a href="http://friendfeed.com/e/b330e410-37ac-6eb7-b29a-a0b0c5d417b0/Why-Do-Hubs-in-the-Yeast-Protein-Interaction/">the discussion on FriendFeed</a>.</p>
<p><a href="http://www.webcitation.org/archive?url=http://larsjuhljensen.wordpress.com/2008/08/02/commentary-on-large-protein-complexes-and-the-essentiality-of-hubs/&amp;title=Commentary:+On+large+protein+complexes+and+the+essentiality+of+hubs&amp;author=Jensen,+Lars+Juhl&amp;source=Buried+Treasure"><img src="http://www.webcitation.org/webcite.gif" alt="WebCite" />Cite this post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2008/08/02/commentary-on-large-protein-complexes-and-the-essentiality-of-hubs/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://www.webcitation.org/webcite.gif" medium="image">
			<media:title type="html">WebCite</media:title>
		</media:content>
	</item>
		<item>
		<title>Live: ISMB 2008 coverage</title>
		<link>http://larsjuhljensen.wordpress.com/2008/07/18/live-ismb-2008-coverage/</link>
		<comments>http://larsjuhljensen.wordpress.com/2008/07/18/live-ismb-2008-coverage/#comments</comments>
		<pubDate>Fri, 18 Jul 2008 12:50:14 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Live]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=204</guid>
		<description><![CDATA[I am now at the ISMB conference from where I will attempt to provide live coverage of the events. To avoid flooding this blog with posts related to the conference, I have set up a separate blog on Tumblr for this purpose. All my posts there will also appear on my FriendFeed.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=204&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I am now at the <a href="http://www.iscb.org/ismb2008/">ISMB conference</a> from where I will attempt to provide live coverage of the events. To avoid flooding this blog with posts related to the conference, I have set up <a href="http://larsjuhljensen.tumblr.com/">a separate blog on Tumblr</a> for this purpose. All my posts there will also appear on <a href="http://friendfeed.com/larsjuhljensen">my FriendFeed</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2008/07/18/live-ismb-2008-coverage/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>
	</item>
		<item>
		<title>Commentary: Open access equals bulk publishing?</title>
		<link>http://larsjuhljensen.wordpress.com/2008/07/05/commentary-open-access-equals-bulk-publishing/</link>
		<comments>http://larsjuhljensen.wordpress.com/2008/07/05/commentary-open-access-equals-bulk-publishing/#comments</comments>
		<pubDate>Sat, 05 Jul 2008 08:44:36 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Commentary]]></category>
		<category><![CDATA[open access]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=202</guid>
		<description><![CDATA[This week Nature published a News piece by Declan Butler with the rather provocative title &#8220;PLoS stays afloat with bulk publishing&#8221;. Unsurprisingly, this caused a backlash from open-access advocates in general and science bloggers in particular. Jonathan Eisen posted the ironic response &#8220;Only Nature could turn the success of PLoS One into a model of [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=202&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>This week Nature published a News piece by <a href="http://www.nature.com/news/author/Declan+Butler/index.html">Declan Butler</a> with the rather provocative title <a href="http://dx.doi.org/10.1038/454011a">&#8220;PLoS stays afloat with bulk publishing&#8221;</a>. Unsurprisingly, this caused a backlash from open-access advocates in general and science bloggers in particular. Jonathan Eisen posted <a href="http://phylogenomics.blogspot.com/2008/07/only-nature-could-turn-success-of-plos.html">the ironic response</a> &#8220;Only Nature could turn the success of PLoS One into a model of failure&#8221;. For an overview of the many other responses from the blogosphere see <a href="http://scienceblogs.com/clock/2008/07/on_the_nature_of_plos.php">the summary by Coturnix</a> and <a href="http://friendfeed.com/e/e731c5a7-2a55-40bb-895d-75ee14101f9b/PLoS-stays-afloat-with-bulk-publishing/">the long debate</a> on <a href="http://friendfeed.com">FriendFeed</a>.</p>
<p>The core of the criticism by Declan Butler was directed against the business model of the Public Library of Science (PLoS), in particular that a large part of their total income is produced by &#8220;bulk publishing&#8221; in the &#8220;database&#8221; PLoS ONE with only &#8220;light&#8221; peer review. There is no point in denying that PLoS ONE is a major source of income for PLoS, that it publishes many papers, and that it is not a top-tier journal. Still, it is in my view an unnecessary provocation to refer to a journal from a competitor as a &#8220;database&#8221; and to suggest between the lines that they do not perform proper peer review.</p>
<p>I have nothing against Nature Publishing Group (NPG) &#8211; they are in my view one of the more progressive publishers, with initiatives such as <a href="http://www.connotea.org">Connotea</a> and <a href="http://network.nature.com">Nature Network</a>. However, I find the criticism by Declan Butler somewhat unfair, especially considering that NPG also has a considerable number of lower impact journals in their portfolio in addition to their lineup of Nature journals. To illustrate this point, I looked up the impact factors for all the PLoS and NPG journals that I could find (6 and 68, respectively) and plotted the distributions:</p>
<p style="text-align:center;"><img class="size-full wp-image-203 aligncenter" src="http://larsjuhljensen.files.wordpress.com/2008/07/plos_vs_npg.png?w=380" alt=""   /></p>
<p>The average impact factors of the two publishers are remarkably similar (9.19 for PLoS and 9.39 for NPG), but the underlying distributions are very different. Notably, the high average impact factor of NPG&#8217;s journals is due to a fairly small number of journals with impact factors over 20, which are sufficient to offset the large number of journals with impact factors below 5. Consequently, the median impact factors are 9.03 for PLoS and only 4.88 for NPG.</p>
<p>I want to be the first to point out the caveats of this analysis. First, the analysis above did not take into account that the journals publish different numbers of papers. However, weighting the journals by number of papers when calculating average impact factors shifts the balance in favor of PLoS (9.79 for PLoS vs. 9.46 for NPG). Second, the journal PLoS ONE does not have an impact factor yet and was thus not included in my analysis. Third, the criticism by Declan Butler was mainly targeting the fact that much of PLoS&#8217; revenue is due to PLoS ONE. However, until NPG chooses to make available detailed financial reports like PLoS does, it is impossible to tell how much of their revenue comes from lower-impact journals.</p>
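<p>For readers who want to see why the mean, the median, and the paper-weighted mean can tell such different stories, here is a minimal Python sketch. The journal list is entirely made up for illustration (it is not the PLoS or NPG data), and the function name <code>weighted_mean</code> is my own.</p>

```python
from statistics import mean, median

# Hypothetical (impact factor, papers per year) pairs for one publisher.
# These numbers are invented for illustration; they are not the real data.
journals = [
    (28.0, 900),
    (25.0, 200),
    (4.5, 150),
    (3.0, 120),
    (2.5, 100),
]

impact_factors = [impact for impact, _ in journals]


def weighted_mean(pairs):
    """Average impact factor, weighting each journal by its paper count."""
    total_papers = sum(papers for _, papers in pairs)
    return sum(impact * papers for impact, papers in pairs) / total_papers


print("mean:", round(mean(impact_factors), 2))
print("median:", round(median(impact_factors), 2))
print("weighted mean:", round(weighted_mean(journals), 2))
```

<p>With a skewed list like this one, a couple of high-impact journals pull the mean well above the median, and weighting by paper count moves the average towards whichever journals publish the most.</p>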
<p>That being said, the business models of PLoS and NPG do not look all that different based on bibliographic metrics alone.</p>
<p><em>Full disclosure: I am an associate editor of PLoS Computational Biology.</em></p>
<p><a href="http://www.webcitation.org/archive?url=http://larsjuhljensen.wordpress.com/2008/07/05/commentary-open-access-equals-bulk-publishing/&amp;title=Commentary:+Open+access+equals+bulk+publishing?&amp;author=Jensen,+Lars+Juhl&amp;source=Buried+Treasure"><img src="http://www.webcitation.org/webcite.gif" alt="WebCite" />Cite this post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2008/07/05/commentary-open-access-equals-bulk-publishing/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/07/plos_vs_npg.png" medium="image" />

		<media:content url="http://www.webcitation.org/webcite.gif" medium="image">
			<media:title type="html">WebCite</media:title>
		</media:content>
	</item>
		<item>
		<title>Commentary: Summarizing papers as word clouds</title>
		<link>http://larsjuhljensen.wordpress.com/2008/06/27/commentary-summarizing-papers-as-word-clouds/</link>
		<comments>http://larsjuhljensen.wordpress.com/2008/06/27/commentary-summarizing-papers-as-word-clouds/#comments</comments>
		<pubDate>Fri, 27 Jun 2008 11:50:06 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Commentary]]></category>
		<category><![CDATA[text mining]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=201</guid>
		<description><![CDATA[For use in presentations on literature mining, I did a back-of-the-envelope calculation of how much time I would be able to spend on each new biomedical paper that is published. Assuming that all papers were indexed in PubMed (which they are not) and that I could read papers 24 hours per day all year round [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=201&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>For use in presentations on literature mining, I did a back-of-the-envelope calculation of how much time I would be able to spend on each new biomedical paper that is published. Assuming that all papers were indexed in PubMed (which they are not) and that I could read papers 24 hours per day all year round (which I cannot), the result is that I could allocate approximately 50 seconds per paper. This nicely illustrates the point that no one can keep up with the complete biomedical literature.</p>
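<p>The arithmetic behind such an estimate is simple enough to sketch in a few lines of Python. Note that the yearly paper count below is an assumption of mine, picked only so that the result lands near the 50 seconds quoted above; it is not a figure taken from PubMed.</p>

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # seconds of nonstop reading in a year

# Assumed number of new PubMed records per year. This value is hypothetical
# and was chosen only to reproduce the roughly 50 seconds quoted in the post.
ASSUMED_PAPERS_PER_YEAR = 630_000


def seconds_per_paper(papers_per_year):
    """Reading time available per paper when reading around the clock."""
    return SECONDS_PER_YEAR / papers_per_year


print(round(seconds_per_paper(ASSUMED_PAPERS_PER_YEAR), 1))
```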
<p>When I discovered <a href="http://wordle.net/">Wordle</a>, which can turn any text into a beautiful word cloud, I thus wondered if this visualization method would be useful for summarizing a complete paper as a single figure. To test this, I extracted the complete text of three papers that I coauthored in the NAR database issue 2008. Submitting these to Wordle resulted in the three figures below (click for larger versions):<br />
<a href="http://larsjuhljensen.files.wordpress.com/2008/06/kuh08nar_large.png"><img class="alignnone size-full wp-image-196" src="http://larsjuhljensen.files.wordpress.com/2008/06/kuh08nar_small.png?w=380" alt=""   /></a><br />
<a href="http://larsjuhljensen.files.wordpress.com/2008/06/lin08nar_large.png"><img class="alignnone size-full wp-image-200" src="http://larsjuhljensen.files.wordpress.com/2008/06/lin08nar_small.png?w=380" alt=""   /></a><br />
<a href="http://larsjuhljensen.files.wordpress.com/2008/06/gau08nar_large.png"><img class="alignnone size-full wp-image-194" src="http://larsjuhljensen.files.wordpress.com/2008/06/gau08nar_small.png?w=380" alt=""   /></a></p>
<p>All in all, I think that Wordle does a pretty good job at capturing the essence of each paper: the first cloud shows that <a href="http://stitch.embl.de">STITCH</a> is a database of interactions between proteins and chemicals, the second cloud shows that <a href="http://networkin.info">NetworKIN</a> is a database of predictions related to kinases and phosphorylation, and the third cloud shows that <a href="http://www.cyclebase.org">Cyclebase.org</a> is a database of experiments on gene expression during the cell cycle. However, a paper describing a database might be easier to summarize than a typical research paper.</p>
<p>As a final test, I therefore submitted the complete text from my paper <a href="http://www.landesbioscience.com/journals/cc/article/4537">&#8220;Evolution of Cell Cycle Control &#8211; Same molecular machines, different regulation&#8221;</a>, which describes the somewhat complex concept of <em>just-in-time assembly</em>, to Wordle (click for larger version):<br />
<a href="http://larsjuhljensen.files.wordpress.com/2008/06/lic07cellcycle_large.png"><img class="alignnone size-full wp-image-198" src="http://larsjuhljensen.files.wordpress.com/2008/06/lic07cellcycle_small.png?w=380" alt=""   /></a></p>
<p>The result is rather less impressive than for the papers from the NAR database issue. Although the word cloud does contain a good selection of words, it fails to convey the main message. I think a large part of the problem is the splitting of multiwords; for example, &#8220;cell cycle&#8221; becomes two separate terms &#8220;cell&#8221; and &#8220;cycle&#8221;. Another problem is that words from different sections of the paper are mixed, which blurs the messages. These two issues could be solved by 1) detecting multiwords and considering them as single tokens, and 2) sorting the terms according to where in the paper they are mainly used.</p>
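<p>The first of these two fixes can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the list of multiwords is given by hand (in practice it would have to be detected, for example from frequent adjacent word pairs), and the names <code>MULTIWORDS</code> and <code>count_terms</code> are hypothetical, not part of Wordle.</p>

```python
import re
from collections import Counter

# Hypothetical, hand-picked list of multiwords to keep together.
MULTIWORDS = ["cell cycle", "just-in-time assembly"]


def count_terms(text, multiwords=MULTIWORDS):
    """Count terms, treating each listed multiword as a single token."""
    text = text.lower()
    for phrase in multiwords:
        # Join the words of the phrase with underscores so that the
        # tokenizer below cannot split them apart.
        text = text.replace(phrase, phrase.replace(" ", "_"))
    tokens = re.findall(r"[a-z_][a-z_-]*", text)
    # Turn the underscores back into spaces for display.
    return Counter(token.replace("_", " ") for token in tokens)


sample = "The cell cycle is regulated. Cell cycle genes peak once per cycle."
print(count_terms(sample).most_common(3))
```

<p>The resulting counts could then be fed to a word-cloud tool in place of the raw text, so that &#8220;cell cycle&#8221; is sized as one term rather than two.</p>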
<p><a href="http://www.webcitation.org/archive?url=http://larsjuhljensen.wordpress.com/2008/06/27/commentary-summarizing-papers-as-word-clouds/&amp;title=Commentary:+Summarizing+papers+as+word+clouds&amp;author=Jensen,+Lars+Juhl&amp;source=Buried+Treasure"><img src="http://www.webcitation.org/webcite.gif" alt="WebCite" />Cite this post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2008/06/27/commentary-summarizing-papers-as-word-clouds/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/06/kuh08nar_small.png" medium="image" />

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/06/lin08nar_small.png" medium="image" />

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/06/gau08nar_small.png" medium="image" />

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/06/lic07cellcycle_small.png" medium="image" />

		<media:content url="http://www.webcitation.org/webcite.gif" medium="image">
			<media:title type="html">WebCite</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Degradation signals correlate with protein half-life</title>
		<link>http://larsjuhljensen.wordpress.com/2008/06/16/analysis-degradation-signals-correlate-with-protein-half-life/</link>
		<comments>http://larsjuhljensen.wordpress.com/2008/06/16/analysis-degradation-signals-correlate-with-protein-half-life/#comments</comments>
		<pubDate>Mon, 16 Jun 2008 17:55:56 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[cell cycle]]></category>
		<category><![CDATA[degradation]]></category>
		<category><![CDATA[half-life]]></category>
		<category><![CDATA[regulation]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=189</guid>
		<description><![CDATA[Yesterday I blogged about how the protein half-life data from the O&#8217;Shea lab fit well with my earlier analyses of transcriptional regulation during the budding yeast cell cycle and with the just-in-time assembly hypothesis. However, I have now realized that the same data set can be used to test the validity of the sequence-based predictions [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=189&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://larsjuhljensen.wordpress.com/2008/06/15/analysis-cell-cycle-regulated-genes-encode-short-lived-proteins/">Yesterday I blogged</a> about how the <a href="http://dx.doi.org/10.1073/pnas.0605420103">protein half-life data</a> from the <a href="http://www.mcb.harvard.edu/O%27Shea/">O&#8217;Shea lab</a> fit well with my earlier analyses of transcriptional regulation during the budding yeast cell cycle and with the <a href="http://dx.doi.org/10.1126/science.1105103">just-in-time</a> <a href="http://dx.doi.org/10.1038/nature05186">assembly</a> <a href="http://www.landesbioscience.com/journals/delichtenbergCC6-15.pdf">hypothesis</a>. However, I have now realized that the same data set can be used to test the validity of the sequence-based predictions of protein degradation signals that I relied on for the cell-cycle study.</p>
<p>To this end, I divided the budding yeast proteome into six groups: proteins with a D-box, proteins without a D-box, proteins with a KEN-box, proteins without a KEN-box, proteins with a PEST region, and proteins without a PEST region. For each of these six groups of proteins, I simply plotted the distribution of protein half-lives as a histogram:</p>
<p style="text-align:center;"><img class="alignnone size-full wp-image-191 aligncenter" src="http://larsjuhljensen.files.wordpress.com/2008/06/degradation_signals.png?w=380" alt=""   /></p>
<p>The figure shows that for all three degradation signals, proteins with the sequence motif tend to have shorter half-lives than proteins without the motif. These differences are all statistically significant according to the <a href="http://en.wikipedia.org/wiki/Mann-Whitney_U">Mann-Whitney U test</a> (D-box, P &lt; 10<sup>-6</sup>; KEN-box, P &lt; 0.02; PEST region, P &lt; 10<sup>-15</sup>). It is noteworthy that the KEN-box motif gives a far weaker correlation with protein half-life than the two other degradation signals, as it was also the only degradation signal that did not correlate with transcriptional cell-cycle regulation in budding yeast (see supplementary information of <a href="http://dx.doi.org/10.1038/nature05186">Jensen et al., 2006</a>).</p>
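<p>For readers unfamiliar with the Mann-Whitney U test, the following Python sketch shows the idea: pool the two groups, rank all values, and ask whether one group's rank sum is unusually low or high. It is a simplified version that uses the normal approximation without a tie correction, so it is not necessarily the exact procedure behind the P values above. The half-lives in the example are invented, and the function name is my own.</p>

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Return (U for sample_a, two-sided P) using the normal approximation.

    Tied values get average ranks. The variance is not tie-corrected, so
    the P value should be treated as approximate.
    """
    pooled = sorted((value, label)
                    for label, sample in enumerate((sample_a, sample_b))
                    for value in sample)
    rank_sum_a = 0.0
    i = 0
    while i != len(pooled):
        j = i
        # Find the end of the run of tied values starting at position i.
        while j != len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of the ranks i+1 .. j
        in_a = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_a += average_rank * in_a
        i = j
    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_a, p_two_sided


# Invented half-lives (in minutes) for proteins with and without a motif.
with_motif = [12, 18, 25, 31, 40]
without_motif = [38, 45, 60, 72, 95, 120]
u, p = mann_whitney_u(with_motif, without_motif)
print(u, round(p, 4))
```

<p>A small U for the first group means that its values mostly rank below those of the second group, which is the pattern expected if proteins with a degradation signal have shorter half-lives.</p>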
<p>In summary, proteins that contain putative degradation signals have significantly shorter half-lives than proteins that do not contain such signals. The only caveat is that long sequences are more likely to match the sequence motifs, and that O&#8217;Shea and colleagues found a negative correlation between sequence length and protein half-life. The correlations described here could thus be a secondary effect; however, it is also possible that the presence of degradation signals in long sequences is the missing explanation for their short half-lives.</p>
<p><a href="http://www.webcitation.org/archive?url=http://larsjuhljensen.wordpress.com/2008/06/16/analysis-degradation-signals-correlate-with-protein-half-life/&amp;title=Analysis:+Degradation+signals+correlate+with+protein+half-life&amp;author=Jensen,+Lars+Juhl&amp;source=Buried+Treasure"><img src="http://www.webcitation.org/webcite.gif" alt="WebCite" />Cite this post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2008/06/16/analysis-degradation-signals-correlate-with-protein-half-life/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/06/degradation_signals.png" medium="image" />

		<media:content url="http://www.webcitation.org/webcite.gif" medium="image">
			<media:title type="html">WebCite</media:title>
		</media:content>
	</item>
		<item>
		<title>Analysis: Cell-cycle-regulated genes encode short-lived proteins</title>
		<link>http://larsjuhljensen.wordpress.com/2008/06/15/analysis-cell-cycle-regulated-genes-encode-short-lived-proteins/</link>
		<comments>http://larsjuhljensen.wordpress.com/2008/06/15/analysis-cell-cycle-regulated-genes-encode-short-lived-proteins/#comments</comments>
		<pubDate>Sun, 15 Jun 2008 18:56:53 +0000</pubDate>
		<dc:creator>Lars Juhl Jensen</dc:creator>
				<category><![CDATA[Analysis]]></category>
		<category><![CDATA[cell cycle]]></category>
		<category><![CDATA[degradation]]></category>
		<category><![CDATA[half-life]]></category>
		<category><![CDATA[regulation]]></category>

		<guid isPermaLink="false">http://larsjuhljensen.wordpress.com/?p=185</guid>
		<description><![CDATA[In relation to an entirely different analysis than the one I will describe here, I downloaded the protein half-life data for budding yeast that was published in PNAS by the O&#8217;Shea lab about two years ago: Quantification of protein half-lives in the budding yeast proteome A complete description of protein metabolism requires knowledge of the [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=larsjuhljensen.wordpress.com&amp;blog=2753346&amp;post=185&amp;subd=larsjuhljensen&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>In relation to an entirely different analysis than the one I will describe here, I downloaded the protein half-life data for budding yeast that was <a href="http://dx.doi.org/10.1073/pnas.0605420103">published in PNAS</a> by the <a href="http://www.mcb.harvard.edu/O%27Shea/">O&#8217;Shea lab</a> about two years ago:</p>
<blockquote><p><strong>Quantification of protein half-lives in the budding yeast proteome<br />
</strong></p>
<p>A complete description of protein metabolism requires knowledge of the rates of protein production and destruction within cells. Using an epitope-tagged strain collection, we measured the half-life of &gt;3,750 proteins in the yeast proteome after inhibition of translation. By integrating our data with previous measurements of protein and mRNA abundance and translation rate, we provide evidence that many proteins partition into one of two regimes for protein metabolism: one optimized for efficient production or a second optimized for regulatory efficiency. Incorporation of protein half-life information into a simple quantitative model for protein production improves our ability to predict steady-state protein abundance values. Analysis of a simple dynamic protein production model reveals a remarkable correlation between transcriptional regulation and protein half-life within some groups of coregulated genes, suggesting that cells coordinate these two processes to achieve uniform effects on protein abundances. Our experimental data and theoretical analysis underscore the importance of an integrative approach to the complex interplay between protein degradation, transcriptional regulation, and other determinants of protein metabolism.</p></blockquote>
<p>The idea that transcriptional regulation goes hand-in-hand with protein degradation is fully consistent with the <a href="http://dx.doi.org/10.1126/science.1105103">just-in-time</a> <a href="http://dx.doi.org/10.1038/nature05186">assembly</a> <a href="http://www.landesbioscience.com/journals/delichtenbergCC6-15.pdf">hypothesis</a>. I thus examined the distributions of protein half-lives for dynamic (i.e. periodically expressed) and static (i.e. not periodically expressed) proteins:</p>
<p style="text-align:center;"><img class="alignnone size-full wp-image-186 aligncenter" src="http://larsjuhljensen.files.wordpress.com/2008/06/halflife_histogram.png?w=380" alt=""   /></p>
<p>The histogram suggests that dynamic proteins are shifted towards shorter half-lives relative to static proteins. The difference is indeed statistically significant according to <a href="http://en.wikipedia.org/wiki/Mann-Whitney_U">the Mann-Whitney U test</a> (P &lt; 10<sup>-4</sup>). This result supports the sequence-based observation that dynamic proteins contain more D-box, KEN-box, and PEST degradation signals than static proteins.</p>
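<p>A comparison of this kind could be run roughly as follows. This is a minimal sketch, not the code used for the analysis above: the function name <code>mann_whitney_u</code> and the example half-lives are made up for illustration, and the P-value uses the normal approximation with tie and continuity corrections, which is only reasonable for samples that are not too small.</p>

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, two-sided P-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        count_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += avg_rank * count_x
        tie_term += t ** 3 - t
        i = j

    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a shift

    # Continuity-corrected z-score; clamp at zero so P never exceeds 1.
    z = max((abs(u_x - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    p_two_sided = math.erfc(z / math.sqrt(2.0))
    return u_x, p_two_sided


# Made-up half-lives in minutes, for illustration only (not the published data).
dynamic_half_lives = [12, 18, 25, 31, 40]
static_half_lives = [35, 48, 60, 75, 90, 120]
u, p = mann_whitney_u(dynamic_half_lives, static_half_lives)
print(f"U = {u:.1f}, two-sided P = {p:.4f}")
```

<p>A rank-based test is a natural choice here because half-life distributions tend to be strongly skewed, so a test that assumes normality would be hard to justify.</p>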
<p>I next tested whether the half-life of the dynamic proteins varies during the cell cycle by making a scatter plot of the protein half-life as a function of the time of peak expression for the corresponding mRNA:</p>
<p style="text-align:center;"><img class="alignnone size-full wp-image-187 aligncenter" src="http://larsjuhljensen.files.wordpress.com/2008/06/peaktime_halflife.png?w=380" alt=""   /></p>
<p>There appears to be no correlation between half-life and time of peak expression. Together, these analyses indicate that dynamic proteins have shorter half-lives than static proteins, irrespective of when in the cell cycle they are expressed.</p>
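<p>The visual impression from a scatter plot like this could be backed up with a rank correlation. The sketch below computes Spearman's rho in plain Python; the function names and the example numbers are hypothetical and are not taken from the analysis above. Note that peak time is cyclic (the end of the cell cycle is adjacent to its start), so a rank correlation only detects monotonic trends and is a simplification.</p>

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        avg = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[order[k]] = avg
        i = j
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (sx * sy)


# Made-up values, for illustration only: peak time as percent of the
# cell cycle and protein half-life in minutes.
peak_time = [5, 20, 35, 50, 65, 80, 95]
half_life = [30, 12, 45, 18, 40, 15, 33]
print(f"Spearman rho = {spearman_rho(peak_time, half_life):.2f}")
```

<p>A rho close to zero would be consistent with the absence of a monotonic trend, but it would not rule out a periodic dependence on peak time.</p>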
<p><a href="http://www.webcitation.org/archive?url=http://larsjuhljensen.wordpress.com/2008/06/15/analysis-cell-cycle-regulated-genes-encode-short-lived-proteins/&amp;title=Analysis:+Cell-cycle-regulated+genes+encode+short-lived+proteins&amp;author=Jensen,+Lars+Juhl&amp;source=Buried+Treasure"><img src="http://www.webcitation.org/webcite.gif" alt="WebCite" />Cite this post</a></p>
]]></content:encoded>
			<wfw:commentRss>http://larsjuhljensen.wordpress.com/2008/06/15/analysis-cell-cycle-regulated-genes-encode-short-lived-proteins/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Lars</media:title>
		</media:content>

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/06/halflife_histogram.png" medium="image" />

		<media:content url="http://larsjuhljensen.files.wordpress.com/2008/06/peaktime_halflife.png" medium="image" />

		<media:content url="http://www.webcitation.org/webcite.gif" medium="image">
			<media:title type="html">WebCite</media:title>
		</media:content>
	</item>
	</channel>
</rss>
