With about a month's delay relative to the release of the new PubMed baseline, here is the updated BuzzCloud visualization of what was hot and up-and-coming in 2010 (click image for larger interactive version):
Here is a quick overview of some of the trends that I found interesting:
- Geoepidemiology. A bit of searching in PubMed reveals this to be a buzzword primarily due to the journal Autoimmunity Reviews, which for some reason decided to publish 19 papers with this word in the title in 2010.
- Network pharmacology and systems pharmacology. Due to my personal interests, these buzzwords caught my eye although they were mentioned in only 7 and 11 papers from 2010, respectively. I would have been more pleased if one of those had not been in a journal with a history of publishing pseudoscience.
- Metatranscriptomics and viral metagenomics. With metagenomics becoming reality rather than mere buzz, related and derived terms are predictably following suit.
- Orbitrap technology and iTRAQ proteomics. Like metagenomics, large-scale proteomics has become an established field. This is well reflected by two of the best-known proteomics technologies appearing in the 2010 BuzzCloud.
- Astrology. This one falls firmly in the “dislike” category; I can only hope that it will be gone from next year’s BuzzCloud.
Sometimes things just come together at the right time. The past few weeks Heiko Horn, Sune Frankild, and I have made much progress on the new version of Reflect, which we hope to put into production very soon. One of the major new features is that Reflect can now be accessed as REST and SOAP web services. When Linden Lab made available the beta version of Second Life viewer 2, which enables you to place a web browser on a face of a 3D object, I simply had to try to put the two together to provide real-time text mining inside Second Life.
The system works as follows. The Reflect Second Life object contains an LSL script that listens to everything that is said in local chat. It sends any text that it picks up to the Reflect REST web service, which returns a simple XML document listing the entities (proteins and small molecules) that were mentioned in the text. The LSL script parses this XML, constructs a URL pointing to the Reflect popup that corresponds to the set of entities in question, and sets this as the shared media to be shown on the Reflect object in Second Life.
The result is an information board that automatically pulls up possibly relevant information related to what people close to it are talking about. The picture below shows the result of me typing a sentence that mentioned human and mouse IL-5 (click for a larger version).
I am well aware that this may not be particularly useful to very many people in Second Life. However, I think it is a nice technology demo of how much can be accomplished with the new Reflect API and just a few lines of code.
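To give a feel for the parse-and-build-URL step of the pipeline described above, here is a rough sketch in Python (the Second Life object itself uses an LSL script, which is not shown here). The XML layout, the `entity` element with its `id` attribute, the placeholder identifiers, and the popup base URL are all assumptions made up for illustration; the actual Reflect response format and popup URLs may look quite different.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical response: the real Reflect XML schema may differ.
SAMPLE_RESPONSE = """<entities>
  <entity type="protein" id="human-il5" name="IL-5"/>
  <entity type="protein" id="mouse-il5" name="Il5"/>
</entities>"""

# Placeholder address, not the real Reflect popup URL.
POPUP_BASE_URL = "http://reflect.example.org/popup"


def extract_entity_ids(xml_text):
    """Return the id attribute of every <entity> element, in document order."""
    root = ET.fromstring(xml_text)
    return [e.get("id") for e in root.iter("entity") if e.get("id")]


def build_popup_url(entity_ids, base_url=POPUP_BASE_URL):
    """Build one URL that refers to the whole set of entities."""
    return base_url + "?" + urlencode({"entities": ",".join(entity_ids)})


ids = extract_entity_ids(SAMPLE_RESPONSE)
url = build_popup_url(ids)
print(url)
```

In the real object, the resulting URL would then be set as the shared media on the face of the Reflect object.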
At the Novo Nordisk Foundation Center for Protein Research we are looking for a scientist to provide bioinformatics support for the Protein Production Unit. For further details, please see the job advert below the fold.
It is that time of the year again: NCBI has rolled out the new PubMed baseline, and it is my pleasure to present you with the latest and greatest of biomedical buzzwords. I present to you the BuzzCloud 2009 (click for a larger interactive version):
In case you have no idea what a BuzzCloud is, it is a visualization of some of the most trendy words in PubMed. To make a long story short, the size of the word represents how many times it was mentioned in the past, whereas the brightness represents how much it was mentioned in the year compared to the previous ten years. For more details, please refer to the original blog post.
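As a rough illustration of the two quantities, here is a small Python sketch. It is not the actual BuzzCloud scoring, which is described in the original post; the logarithmic size scale and the particular brightness ratio are assumptions chosen only to make the idea concrete.

```python
import math


def buzz_scores(counts_by_year, year, window=10):
    """Return (size, brightness) for one term.

    counts_by_year maps a publication year to the number of mentions.
    Both formulas are illustrative assumptions, not the BuzzCloud ones.
    """
    # Size: grows with the total number of mentions up to and including `year`.
    total = sum(n for y, n in counts_by_year.items() if y <= year)
    size = math.log1p(total)

    # Brightness: mentions in `year` relative to the yearly average
    # over the preceding `window` years, squeezed into the range [0, 1).
    past_total = sum(counts_by_year.get(y, 0) for y in range(year - window, year))
    past_mean = past_total / window
    current = counts_by_year.get(year, 0)
    denominator = current + past_mean
    brightness = current / denominator if denominator > 0 else 0.0
    return size, brightness


# A term mentioned steadily versus one that took off only recently.
steady = {y: 5 for y in range(1999, 2010)}
rising = {2008: 10, 2009: 30}
print(buzz_scores(steady, 2009))
print(buzz_scores(rising, 2009))
```

With these made-up formulas, a term with a constant number of mentions per year gets a brightness of 0.5, whereas a term whose mentions are concentrated in the latest year approaches 1.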
The three largest words on the BuzzCloud 2009 are all reruns from earlier years: metagenomics and synthetic biology were both first seen on the BuzzCloud 2004, and click chemistry appeared in 2006. One can only conclude that these research areas continue to grow.
At the other end of the scale we have the small and bright words. These are the words that are rising most rapidly but have not appeared that many times in PubMed yet. Below are three selected examples that I think may be of particular interest to the readership of this blog.
- Personal genomics. No surprise here except that I expected this word would have turned up much earlier considering the broad publicity of the 1000 Genomes Project and the Personal Genome Project.
- Proteogenomics. Why we need a separate word for referring to the combination of proteomics and genomics is beyond me. There is even a paper on comparative proteogenomics published in Genome Research. One can only wonder when someone will compare metabolomics, proteomics, transcriptomics, and genomics data across environmental samples and coin the term comparative metametaboproteotranscriptogenomics.
- Translational bioinformatics. Where bioinformatics meets clinical medicine (see blog post by Russ Altman). I think that bioinformaticians are indeed increasingly working on medically relevant data, which in my view is a good thing. It just makes me wonder what happened to medical informatics?
On a closing note, I am again pleasantly surprised how well the words picked up by a completely automated procedure fit with the ongoing activities in my lab. It is almost eerie.
About half a year ago, I began experimenting with Second Life as a tool for virtual conferences (I should add that my experiences have since improved). However, I believe that imitating real life in a virtual world is not necessarily the best way to use the technology – it may be better to use virtual reality for doing the things that are difficult to do in the real world. A good example of this is Hiro’s Molecule Rezzer, which is one of the best known scientific tools in Second Life. It, and its much improved successor Orac, allows people to easily construct molecular models of small molecules in Second Life.
After speaking with several other researchers in Second Life who, like me, are interested in evolution, I set out to build a similar tool for visualization of phylogenetic trees. The result is SLIDR (Second Life Interactive Dendrogram Rezzer), which constructs a dendrogram object from a tree in Newick format. The first version of SLIDR can handle trees both with and without branch lengths; however, I have not yet implemented support for labels on internal nodes or for bootstrap values.
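For readers unfamiliar with the Newick format, the following Python sketch shows how such a tree string can be read into a nested structure. It is not the SLIDR code, which runs inside Second Life; the function names and the dictionary layout are my own choices for illustration. It handles leaf names and optional branch lengths, and it does not interpret bootstrap values or quoted labels.

```python
def parse_newick(text):
    """Parse a Newick string into nested dicts with name, length, children."""
    text = text.strip()
    pos = 0

    def parse_node():
        nonlocal pos
        node = {"name": None, "length": None, "children": []}
        if pos < len(text) and text[pos] == "(":
            pos += 1  # skip the opening parenthesis
            while True:
                node["children"].append(parse_node())
                if pos >= len(text):
                    raise ValueError("unexpected end of input")
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # end of this group of children
                    break
                else:
                    raise ValueError(f"unexpected {text[pos]!r} at position {pos}")
        # Optional label (for leaves, the taxon name).
        start = pos
        while pos < len(text) and text[pos] not in ",():;":
            pos += 1
        label = text[start:pos].strip()
        if label:
            node["name"] = label
        # Optional branch length after a colon.
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",();":
                pos += 1
            node["length"] = float(text[start:pos])
        return node

    root = parse_node()
    if pos >= len(text) or text[pos] != ";":
        raise ValueError("a Newick tree must end with ';'")
    return root


def leaf_names(node):
    """List the leaf names from left to right."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in leaf_names(child)]


tree = parse_newick("((A:0.1,B:0.2):0.3,C:0.4);")
print(leaf_names(tree))
```

The same function accepts trees without branch lengths, such as `(A,(B,C));`, in which case every `length` is simply left as `None`.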
The picture below shows an example of a dendrogram that was automatically generated by SLIDR based on a Newick tree:
There is a bit more to SLIDR than this, though. After the dendrogram has been built, it can be loaded with a photo and/or a sound for each of the leaf nodes. When you click on a node, the corresponding sound will be played and the photo will be shown on the associated screen (the white box in front of which I stand):
I plan to work with collaborators in Second Life to construct dendrograms for evolution of bats (including their echolocation sounds and photos of the animals) and for the fully sequenced Drosophila genomes. Please do not hesitate to contact me if you would like to use SLIDR on another project. I intend to make SLIDR available as open source software once I have implemented support for the full Newick format.
After months of hard work from the entire STRING team – thanks everyone – I am pleased to be able to say that STRING v8.1 has now been put into production. Here is a screen shot of the start page:
This is a minor release of STRING, which means that the imported databases of microarray expression data, protein interactions, genetic interactions, and pathways as well as text-mining evidence have all been updated. We have also fixed a bug that affected the minority of bacteria that have multiple chromosomes.
Another notable feature of STRING v8.1 is the new interactive network viewer that is implemented in Adobe Flash:
For further details please see the post on the official STRING/STITCH blog.
Yes, it is that time of the year again – we are now almost three weeks into 2009, most papers published in 2008 have hopefully made it into Medline, and it is time to reveal the words of 2008. In other words, I have updated the BuzzCloud resource and here is the result for 2008 (click on the image to go to the web resource):
I am thrilled to see the outcome. Without any cheating or tweaking, several buzzwords related to proteomics made it onto the list, with “phosphoproteomics” and “quantitative phosphoproteomics” being the two most prominent of them. This is nice for me to see, considering that my new research group at the Novo Nordisk Foundation Center for Protein Research will focus heavily on improving and applying the NetworKIN and NetPhorest resources for analysis of phosphoproteomics data.