Tag Archives: statistics

Commentary: The 99% of scientific publishing

Last week, John P. A. Ioannidis from Stanford University and Kevin W. Boyack and Richard Klavans from SciTech Strategies, Inc published an interesting analysis of scientific authorships. In the PLOS ONE paper “Estimates of the Continuously Publishing Core in the Scientific Workforce” they describe a small influential core of <1% of researchers who publish each and every year. This analysis appears to have caught the attention of many, including Erik Stokstad from Science Magazine who wrote the short news story “The 1% of scientific publishing”.

You could be forgiven for thinking that I belong to the 1%. I published my first paper in 1998 and have published at least one paper every single year since then. However, it turns out that the 1% was defined as the researchers who had published at least one paper every year in the period 1996-2011. Since I published my first paper in 1998, I belong to the other 99% together with everyone else who started their publishing career after 1996 or stopped their career before 2011.

Although the number 1% is making the headlines, the authors seem to be aware of the issue. Of the 15,153,100 researchers with publications in the period 1996-2011, only 150,608 published in all 16 years; however, the authors estimate that an additional 16,877 scientists published every year in the period 1997-2012. A similar number of continuously publishing scientists will have started their careers in each of the other years from 1998-2011. Similarly, they estimate that 9,673 researchers stopped their long continuous publishing career in 2010, and presumably a similar number did so in each of the other years in the period 1996-2009. In my opinion, a better estimate is thus that 150,608 + 15*16,877 + 15*9,673 = 548,858 of the 15,153,100 authors have had or will have a 16-year unbroken chain of publications. That amounts to something in the 3-4% range.
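For readers who want to check the arithmetic, here is a minimal sketch that redoes the calculation with the figures quoted above. The variable names are my own, and the multiplier of 15 is simply the one used in the estimate, not something taken from the paper.

```python
# Figures quoted in the post; variable names are illustrative only.
total_authors = 15_153_100        # authors with publications in 1996-2011
core_1996_2011 = 150_608          # published in all 16 years, 1996-2011
later_starters_per_year = 16_877  # estimated per later starting year
early_stoppers_per_year = 9_673   # estimated per earlier stopping year
shifted_years = 15                # multiplier used in the estimate above

estimate = (
    core_1996_2011
    + shifted_years * later_starters_per_year
    + shifted_years * early_stoppers_per_year
)
headline_share = core_1996_2011 / total_authors
revised_share = estimate / total_authors

print(estimate)                 # 548858
print(f"{headline_share:.2%}")  # 0.99%
print(f"{revised_share:.2%}")   # 3.62%
```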

That number may still not sound impressive; however, this in no way implies that most researchers do not publish on a regular basis. To have a 16-year unbroken chain of publications, one almost has to stay in academia and become a principal investigator. Most people who publish at least one article and subsequently pursue a career in industry or teaching will count towards the 96-97%. And that is no matter how good a job they do, mind you.


Commentary: Are other women a woman’s worst enemies in science?

It is clear that in science, we have a gender bias among leaders. It is my impression that most people think this is due to a combination of men and women having different priorities in life and high-ranking male professors favoring their own gender. Conversely, I have never heard anyone dare to suggest that women may be their own worst enemies in this context.

Benenson and coworkers from Emmanuel College have just published an interesting study in Current Biology on collaborations between full professors and assistant professors entitled “Rank influences human sex differences in dyadic cooperation”.

By tabulating the joint publications, they found 76 same-sex publications from male full professors, which should be compared to a random expectation of 61 such publications. By contrast they found only 14 same-sex publications from female full professors with the random expectation being 29. In other words, whereas male full professors collaborated 25% more with male assistant professors than expected, female full professors collaborated more than 50% less with female assistant professors than expected. The authors conclude:

Our results are consistent with observations suggesting that social structure takes differing forms for human males and females. Males’ tendency to interact in same-gender groups makes them more prone to cooperation with asymmetrically ranked males. In contrast, females’ tendency to restrict their same-gender interactions to equally ranked individuals make them more reluctant to cooperate with asymmetrically ranked females.

There is, in other words, a tendency for high-ranking professors of both genders to preferentially collaborate with lower-ranking male professors as opposed to lower-ranking female professors. If anything, that bias appears to be stronger in the case of high-ranking female professors than high-ranking male professors.
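The percentages quoted above follow directly from the observed and expected counts. A minimal sketch of that calculation, using a helper name of my own choosing (`relative_deviation` is not from the paper):

```python
def relative_deviation(observed, expected):
    """Fractional difference between an observed count and its random expectation."""
    return (observed - expected) / expected

# Same-sex publications of full professors with assistant professors,
# as quoted in the post: (observed, expected).
male = relative_deviation(76, 61)
female = relative_deviation(14, 29)

print(f"male full professors:   {male:+.0%}")    # +25%
print(f"female full professors: {female:+.0%}")  # -52%
```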

Analysis: Christmas no longer in vogue!

I have just made an alarming discovery: judging from the biomedical literature, researchers appear to increasingly ignore Christmas.

My plan was to make a funny Christmas post looking at trivialities such as when during the year Christmas-related papers are published. To this end, I did a trivial text-mining analysis that pulled out all papers mentioning “Christmas”, “Xmas”, or “X-mas” in the title or abstract. As a first check of the data, I looked at how many papers were published each year and was surprised to find only 20-30 in a typical year. To eliminate random fluctuations due to the low counts, I thus binned the data into decades before plotting the temporal trend (black dots are actual data points, red curve is a quadratic trendline):

The shocking result is that the frequency of Christmas-related papers has steadily dropped to less than half of what it was in the 1950s!
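The matching and decade binning described above could look roughly like the sketch below. It is not the script I used: the record format (year, title, abstract), the function names, and the choice to normalize by the total number of papers per decade are all assumptions made for illustration.

```python
import re
from collections import Counter

# Matches "Christmas", "Xmas", or "X-mas" as whole words, ignoring case.
CHRISTMAS = re.compile(r"\b(?:christmas|x-?mas)\b", re.IGNORECASE)

def decade(year):
    """Map a year to the first year of its decade, e.g. 1957 -> 1950."""
    return year // 10 * 10

def christmas_fraction_by_decade(records):
    """records: iterable of (year, title, abstract) tuples (hypothetical format).

    Returns {decade: fraction of that decade's papers mentioning Christmas}.
    """
    total = Counter()
    hits = Counter()
    for year, title, abstract in records:
        d = decade(year)
        total[d] += 1
        if CHRISTMAS.search(f"{title} {abstract}"):
            hits[d] += 1
    return {d: hits[d] / total[d] for d in sorted(total)}
```

Binning by decade simply pools the 20-30 matches per year into counts large enough that the trend is not dominated by noise.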

How can this be? I can think of several possibilities, and you are welcome to come up with more in the comments:

  • We are running out of new funny things to say about Christmas.
  • An increasing proportion of researchers come from countries in which Christmas is not widely celebrated.
  • Researchers have collectively stopped believing in Santa, as funding has dried up.

Merry Christmas Everyone!

Analysis: Toward doing science

Yesterday, Rangarajan and coworkers published a paper in BMC Bioinformatics entitled “Toward an interactive article: integrating journals and biological databases”. Not many hours later Neil Saunders made the following tweet commenting on it:

Can we ban use of "toward(s)" in article titles?

This reminded me of a draft blog post that I wrote in 2008 on the use of the word “toward(s)” in article titles, and I decided that it was time to update the plot and finally publish it. The background was that I had the gut feeling that there was a somewhat disturbing trend, namely that more and more papers use these words in the title. I thus went to Medline and counted the fraction of papers from each year having a title starting with “toward” or “towards” (I also included them if towards appeared inside the title following a colon, semicolon, or dash):

The plot shows that the fraction of articles with “toward(s)” in the title is rapidly rising; it has more than tripled over the past two decades. There is thus no doubt that the use of “toward(s)” in article titles is a trend in biomedical publishing.
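The counting rule described above (a title starting with “toward(s)”, or having it right after a colon, semicolon, or dash) can be approximated with a regular expression. This is a sketch rather than the original script: the exact pattern, the requirement of whitespace after the punctuation, and the function names are my assumptions.

```python
import re
from collections import Counter

# "toward" or "towards" at the very start of the title, or directly after
# a colon, semicolon, hyphen, en dash, or em dash followed by whitespace.
TOWARD = re.compile(r"(?:^\s*|[:;\u2013\u2014-]\s+)towards?\b", re.IGNORECASE)

def is_toward_title(title):
    """True if the title opens (or re-opens after punctuation) with toward(s)."""
    return TOWARD.search(title) is not None

def toward_fraction_by_year(papers):
    """papers: iterable of (year, title) pairs (hypothetical format)."""
    total = Counter()
    hits = Counter()
    for year, title in papers:
        total[year] += 1
        hits[year] += is_toward_title(title)
    return {year: hits[year] / total[year] for year in sorted(total)}

print(is_toward_title(
    "Toward an interactive article: integrating journals and biological databases"
))  # True
```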

As is often the case with statistics, though, this analysis answers only one question but leads to several new ones. Are we increasingly selling our papers on what we hope to do soon rather than on what we have actually done? Or have we just become more honest by now adding the word “toward(s)” where we might have left it out in the past?

Analysis: Half of published URLs are dysfunctional a decade later

As a small aside when setting up a local mirror of Medline, I extracted 15,915 URLs that were mentioned in the abstracts. Checking them revealed that 12,354 of them (78%) were functional, which may not seem that bad. However, plotting the percentage of dysfunctional URLs as a function of publication year reveals a less pleasant trend:

Dysfunctional URLs

After just 10 years, half of all published URLs are no longer functional, and do not redirect to the new location of the service (if one exists). The fairly high success rate overall is merely a consequence of most URLs having been published within the last few years. Unless the persistence of URLs is improving (which I see no sign of in the plot), we can thus expect to have thousands of URLs in the published literature that are no longer valid.
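For anyone who wants to repeat this kind of check, here is a rough sketch using only the Python standard library. It is not the code behind the plot: the function names, the 10-second timeout, and the rule that any error counts as dysfunctional are assumptions. Since `urlopen` follows redirects, a URL that redirects to a working location counts as functional here.

```python
import urllib.request
from collections import Counter

def is_functional(url, timeout=10):
    """True if the URL can be fetched (redirects are followed by urlopen)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (OSError, ValueError):
        # URLError/HTTPError and timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        return False

def percent_dysfunctional_by_year(results):
    """results: iterable of (publication_year, functional) pairs (hypothetical format)."""
    total = Counter()
    broken = Counter()
    for year, functional in results:
        total[year] += 1
        broken[year] += not functional
    return {year: 100 * broken[year] / total[year] for year in sorted(total)}

# Overall success rate from the counts quoted above.
overall = 12_354 / 15_915
print(f"{overall:.0%}")  # 78%
```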

Edit: Andrew Lang pointed out a similar study of URLs cited in communications journals.

Edit: Duncan Hull pointed out a paper on URL decay in Medline by Jonathan Wren, which reminded me of an even earlier paper on the topic.