Wednesday, November 04, 2009

On the veracity and accuracy of published work.

I had blogged earlier about how I was slightly surprised and concerned by my P.I.'s attitude towards the literature, and his lack of faith in it. I am now beginning to see why.

While I am not naive, I admit to being slightly idealistic, and to expecting others to adhere to strong scientific standards as I would. Therefore, if a study says it has performed a certain analysis and found something, I expect that to be a carefully analysed and reported set of results, even if restricted to the data and conditions the authors used. There really should be no room for ambiguity once you start looking at the data in the context of that paper alone. And if there is, it should be pointed out in the paper. If the authors miss it, I expect peer review to catch it.

I spent the past few days trying to make sense of a huge study, published in 2006 in the journal PLoS Medicine. The list of authors has a handful of big names on it, and the paper itself is one of the first large-scale studies of its kind in its sub-field, with some very promising findings that could very well serve as a starting point for several other investigators such as me.

When I downloaded all the supplementary data (data that is not published in the main text of the journal, to save space) and tried to cross-compare different analyses within the paper, many of the findings did not hold up. This is not to say that the data was wrong or falsified, but that statements made did not match the data shown, and several inconsistencies spanned the paper.

Because of the scale of the study, badly organized and labelled figures, and a lot of discordance within the published paper, I spent days just making sense of it all and writing it down in a way I could understand and explain to my P.I. That's when things began popping up that didn't make sense or were confusing. I wrote to the first author, who responded that I was over-interpreting the data and that there were some caveats (which weren't spelled out in the paper). Even setting those aside, other discrepancies remained. When I pressed some more, leaving out all my interpretation and simply quoting the paper and his data, I was told that the individual experiments in the paper spanned a period of 3 to 4 years, during which genome builds and chip annotations had changed; hence the discordance. Some aspects were indeed confusing, and my best bet was to re-do the analysis to find what I needed, which is why the raw data files were provided with the paper.

While I understand the dynamic nature of sequence data and chip design, I would expect such discordance, when it occurs within a single paper, to be addressed by the authors at the time of publication. It should not be left to the reader to spend precious hours poring over the data trying to make sense of it, engage in correspondence with the author, and then, eventually, be told to download the raw data and re-analyse it in order to answer their questions.

Even granting that the authors presented their work carelessly and possibly wrongly (I won't know until I re-analyse, and I'll be damned if I waste any more of my time on that), what role does peer review serve? In the end, the scientific community suffers: time is spent following wrong leads that were not thoroughly researched in the first place, and high-profile, high-impact journals end up publishing what is essentially incorrect or invalidated information.


Sakshi said...

Ask me about the HTS data and how much BS passes as true hits sometime. It is a sore point with me.

A lot of BS passes into high-end journals too. It's a loss of time, money, and a lot of heartache to redo some of the experiments. Not to mention how no one believes you if you can't get the expt to work - coz the said group got published in N/S/C... Arggh.. I think I will stop here - still very angry about some things :(

P said...

They asked you to re-analyze their data???? Wow, that is just beyond ridiculous. There should be some place where we can report such misconduct.
It's really surprising how such papers get published, that too in reputed journals. I won't be surprised if one of those big-name authors also features on the editors' list or some such of that journal. Once someone is famous, hardly anyone dares to question them...peer review is just a formality then. When it comes to us newbies, they make us change the language of the abstract 3 times to make it 'more attractive to the general reader'!

TBIF said...

PLoS is an open-access journal and tries very hard to harness the power of Web 2.0.

They have a comment option on their articles to encourage discussion. By their own estimate it is an underused feature. Still, in a perfect world, if discrepancies or doubts or questions arose, reader comments would help start a discussion - like on a blog.

Maybe you could try going that route.

The_Girl_From_Ipanema said...

Yes, I think the problem is pronounced in HTS. Because of the very nature and scale of the data, people can slip BS in and it escapes notice or gets overlooked. Unfortunately, everything is HTS these days, and that only means we are all swimming in a sea of BS, trying to make sense of it and ending up reinventing the wheel for ourselves, in spite of a thousand papers out there reporting "candidate genes".

Exactly. I would be so ashamed to have to tell someone to reanalyse data that I published in a journal. The guy told me that in a very pedantic fashion too, as in "you know, you should re-analyze it and i can help u with it but u shd do it because this data is so dynamic".
And I was thinking the same thing- there has to be some redress, some platform where these things can be voiced.

I have begun the process of annotating the paper on the PLoS website. At the very least, someone else going through the paper will not waste their time like me, and will be aware of the discrepancies.

Ni said...

err....I actually read all of it, start to end. Of course there is nothing for me to add, because I understand none of it.

aequo animo said...

The small font is bad. Yeah, and most journals publish crap too.