MICRO WRITERS

About anything that your eyes cannot see… written by students to students

Apr 29 2010

Richard Roberts at BioVision Alexandria 2010: I give you the sequence and you give me the function!

Posted by: Mariam Rizkallah in Attended for you, Bioinformatics

I had the chance to attend the international conference BioVision Alexandria 2010, held at the Bibliotheca Alexandrina Conference Center in Alexandria, Egypt, from 12 to 15 April 2010. I really want to share with you the more than 50 talks that I attended, given by Nobel laureates and other remarkable scientists specializing in health-related topics.

Dr. Richard J. Roberts

I will start with the talk by Dr. Richard Roberts, who received the Nobel Prize in Physiology or Medicine in 1993 for the 1977 discovery of split genes and mRNA splicing. He is now joint Research Director at New England Biolabs. Dr. Roberts titled his talk “Collaborating to bridge the gap between computation and experimentation”. I will try to sum it up for you.

I. Let’s start with the fact that genomics is rapidly taking over the field of biology, at least at the research level.

Examples:

Sequencing of the human genome, or “The Human Genome Project”, provides the basis for the emerging field of “personalized medicine”.
Plant genomics is extremely important for food and possibly for energy production, mostly through unicellular plants.
Ocean organisms are very interesting, as they produce potential new antibiotics and many other useful substances.
Bacteria and archaea make up as much as 50% of the living biomass.

Bacteria are everywhere: they live in the oceans, in the soil (plants require them for nitrogen fixation), in animals and in us, in our gut, skin, nose and mouth. We know absolutely nothing about most of these bacteria because we can’t grow them in culture.

But this is about to change now thanks to DNA sequencing.

II. So, the core of today’s science is DNA sequencing… but unfortunately, DNA sequencing has its drawbacks.

1) DNA sequencing is getting faster and cheaper at a rate that exceeds our ability to understand the function or the biochemical pathway of every gene sequenced. If we’re really lucky, we can guess, based on sequence similarity, that a gene encodes, for example, a “hydrolase”, but just a hydrolase, with no clue about its substrate or the exact biochemical pathways it’s involved in.

Dr. Roberts offered an interesting analogy: getting more and more bacterial DNA sequences is like getting a car along with a list of all its parts, with no idea how they fit together or how they work. Biology is about understanding how life works. If we’re talking about synthesizing life today, we have to understand how life works first. His dream is that, before he dies, he will understand how a very simple bacterium actually works: what chemistry is going on in there?

So, the first problem is the very rapid growth in DNA sequencing without a matching growth in annotation, that is, in naming genes and finding their functions. Here’s a somewhat older graph showing the growth of sequence databases and annotations from 1982 to 2006; it is close to the one Dr. Roberts presented, which covered 1995 to 2009. If you can find a newer one, please do not hesitate to comment on the post and add its link.

The growth of sequence databases and annotations (1982-2006) - Argonne National Laboratory

2) The computer is not enough! Do the biochemistry in the lab! In spite of the large amount of money spent on sequencing different organisms, we are still not making any real progress in understanding them. One reason may be that once we get a DNA sequence and translate it into its corresponding amino acid sequence, our best shot is to compare it with the protein sequences already in the databases, see what it resembles, and predict its function from that. If two protein sequences look alike, there’s a chance, not a guarantee, that they have the same function: even a single amino acid difference may give them different substrates and thus different functions. How can we tell? The computer is not enough! Do the biochemistry in the lab! This leads us to the third problem.
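
To make the point about similarity concrete, here is a minimal Python sketch of the kind of comparison a computer can do. It is only an illustration: the two peptide sequences are made up, the function name `percent_identity` is my own, and real database searches use proper alignment tools rather than a position-by-position count.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Two made-up 20-residue peptides that differ only at the last position.
known_enzyme = "MKTAYIAKQRQISFVKSHFS"
new_protein = "MKTAYIAKQRQISFVKSHFA"

identity = percent_identity(known_enzyme, new_protein)
print(f"{identity:.1f}% identical")
```

A high score like this one is exactly the “chance, not a guarantee” situation: the number says nothing about whether the changed residue sits in the active site, so the substrate still has to be checked at the bench.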

3) Not all substrates are available to all labs all the time, so one lab can’t determine the function of every gene on earth. He gave this example: to determine the substrate of a specific disaccharide hydrolase, you need disaccharides made from every possible combination of sugars, and you have to test the enzyme on each of them.
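
A quick back-of-the-envelope sketch in Python shows how fast the number of substrates grows. The sugar list is just an example I picked, and the count below only pairs up sugars; it ignores the different linkage positions and anomeric forms, which would multiply the number further.

```python
from itertools import combinations_with_replacement


def count_sugar_pairs(n_sugars: int) -> int:
    """Number of unordered sugar pairs, including a sugar paired with itself."""
    return n_sugars * (n_sugars + 1) // 2


# Illustrative list only; a real assay panel would be chosen by the lab.
sugars = ["glucose", "galactose", "fructose", "mannose", "xylose"]

pairs = list(combinations_with_replacement(sugars, 2))
for first, second in pairs:
    print(f"{first} + {second}")

print(len(pairs), "candidate pairs from", len(sugars), "sugars")
print(count_sugar_pairs(20), "candidate pairs from 20 sugars")
```

Even with only five sugars there are already 15 pairs to obtain and test, which is why no single lab can keep every substrate on the shelf.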

4) Lack of good funding for biochemistry. Funding agencies think that biochemistry is an “old-fashioned” field! They prefer to fund the more appealing genome-wide studies, which are very superficial.

III. Dr. Roberts’ suggestions for a solution: “COMBREX”
Identifying Protein Function—A Call for Community Action.

Dr. Roberts and colleagues received an NIH grant in October 2009 to establish COMBREX (COMputational BRidges to EXperiments). The workflow will be very much like this:

Step1: Establishment of a database. Starting from 1200 complete bacterial and archaeal genome sequences, groups of computational biologists generate protein families and domains of unknown function (DUFs), predict their functions based on sequence similarity, and establish a database.

Step2: Coordination of the efforts between biochemistry labs. Experimentalists/biochemists (young grads, even technicians) submit a proposal to test those predictions; in return they gain exclusive access to the genes of interest for 6 months, plus a small grant (5,000-10,000 USD) to carry out single-gene studies. The idea is that if we know one protein’s function, we can infer the function of the whole protein family.

Step3: Making of a Wikipedia-type page for suggestions and predictions.

Step4: Establishment of a journal to publish the findings.
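
The steps above can be sketched as a tiny data model in Python. This is purely my own illustration of the workflow as I understood it from the talk, not how COMBREX is actually implemented: the class name `ProteinFamily`, its fields, the family and gene names, and the 182-day claim window are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional

CLAIM_DAYS = 182  # assumed length of the roughly 6-month exclusive window


@dataclass
class ProteinFamily:
    name: str
    members: List[str]        # gene identifiers in the family
    predicted_function: str   # computational prediction (Step1)
    status: str = "predicted"
    claimed_by: Optional[str] = None
    claim_expires: Optional[date] = None

    def claim(self, lab: str, today: date) -> bool:
        """Step2: give a lab exclusive access unless a claim is still active."""
        if self.claim_expires is not None and today < self.claim_expires:
            return False
        self.claimed_by = lab
        self.claim_expires = today + timedelta(days=CLAIM_DAYS)
        return True

    def record_result(self, lab: str, confirmed_function: str) -> None:
        """Store the experimental finding reported by the claiming lab."""
        if lab != self.claimed_by:
            raise PermissionError(f"{lab} has not claimed {self.name}")
        self.predicted_function = confirmed_function
        self.status = "experimentally validated"

    def annotations(self) -> Dict[str, str]:
        """Propagate the current function to every member of the family."""
        return {gene: self.predicted_function for gene in self.members}


# Hypothetical family and lab names, for illustration only.
family = ProteinFamily("DUF-example", ["geneA", "geneB", "geneC"], "putative hydrolase")
family.claim("Lab 1", date(2010, 4, 29))
family.record_result("Lab 1", "example disaccharide hydrolase")
print(family.status, family.annotations())
```

The `annotations` method is where the “one protein, whole family” idea shows up: a single validated result is copied to every member, which is powerful but only as reliable as the family grouping itself.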

IV. What genes should we focus on/start with?

Dr. Roberts suggested this list, in descending order of priority:

1) Genes found in many different organisms (humans, animals, bacteria, etc.). These are likely to have important conserved functions.

2) E. coli, the most widely used and supposedly “best studied” organism, so that we can characterize it fully.

3) Helicobacter pylori, to understand the biochemistry of such an important pathogen that we know nothing about.

4) Products of open reading frames (ORFs) that have already been cloned, translated and frozen.

V. Who can help?

Dr. Roberts said almost everybody: computational biologists to predict, biochemists to test, geneticists, university students as personnel (even high school students, for whom it could provide a genuine science project), retired professors to supervise and maybe get back to the lab, and funding agencies.

You can watch this talk and most of the conference’s talks via the Bibliotheca Alexandrina webcast.

Dr. Roberts' talk at BioVision Alexandria 2010

Richard Roberts with BioVision Alexandria 2010 attendees

Dec 17 2009

Web-based workflow tools and publishing: Article mining is just getting easier!

Posted by: Mariam Rizkallah in Attended for you, Bioinformatics

I had the chance to attend this interesting webinar hosted by Pubget, a new search engine for life-science PDFs. The webinar was held on Friday, December 11, 2009 (you can catch the recording here). There were 160 attendees and the GoToWebinar tool enabled live interaction with the speakers.

The webinar aimed to bring together speakers who are experts in their areas and to cover the different segments that deal with searching, analyzing, and reusing scientific articles. It was moderated by Ryan Jones, President of Pubget, and the speakers represented:

Publishers: Peter Binfield, Managing Editor, PLoS One
Libraries: Marcus Banks, Manager of Education and Research Services, UCSF
End Users: Ansuman Chattopadhyay, PhD, Bioinformatics, University of Pittsburgh
Tools: Ramy Arnaout, MD PhD, Chairman and CEO, Pubget

Peter Binfield talked about his experience with PLoS One, a journal established in the digital era whose content is entirely digital. He was mainly concerned with how to monitor the “reuse” of an article and with the tools built into PLoS to achieve that. PLoS uses multi-dimensional, article-level metrics rather than a monolithic measure like the impact factor. The PLoS metrics system lets everyone see the exact usage of an article, in downloads and views. PLoS also enables commenting, rating, discussing, selecting a passage or line and writing a note about it, sharing/bookmarking, and showing trackbacks to blogs and citations.

Marcus Banks said that digital “libraries” still need a librarian to analyze, organize and link publications. He also talked about the need for a tool that lets researchers highlight only the parts of a publication that they need, instead of spending time reading through the whole publication. He mentioned sharing tools like Zotero, Mendeley, Del.icio.us, RefShare, CiteULike, and Pubget.

Ansuman Chattopadhyay represented the end users. His presentation was entitled “Beyond PubMed: Next generation literature searching”. With PubMed, it’s difficult to narrow down a search and reduce the number of results/hits, but this can be achieved with newer Google-like tools such as:

GoPubMed, which gives the users suggestions as they are typing
Novoseek, which categorizes search results into: diseases, pharmacological substances, genes/proteins, procedures, organisms, etc.

and text-similarity tools like:

eTBLAST, a web server to identify expert reviewers, appropriate journals and similar publications (the paper)
JANE, Journal/Author Name Estimator
DeepDyve
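
To give a rough feel for what “text similarity” means, here is a toy Python sketch that ranks a few made-up abstracts against a query by word overlap (Jaccard similarity). This is only my illustration of the general idea; it is not how eTBLAST, JANE or DeepDyve actually work, and the function names and example texts are hypothetical.

```python
import re
from typing import Dict, List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(text_a: str, text_b: str) -> float:
    """Shared words divided by all distinct words in the two texts."""
    words_a, words_b = tokens(text_a), tokens(text_b)
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def rank_by_similarity(query: str, abstracts: Dict[str, str]) -> List[str]:
    """Return the abstract titles, most similar to the query first."""
    return sorted(abstracts, key=lambda title: jaccard(query, abstracts[title]), reverse=True)


# Made-up example texts, for illustration only.
abstracts = {
    "Paper A": "Predicting protein function from sequence similarity in bacteria",
    "Paper B": "Article-level metrics for open access journals",
    "Paper C": "Experimental validation of predicted protein function",
}
query = "protein function prediction in bacteria from sequence"

for title in rank_by_similarity(query, abstracts):
    print(title, round(jaccard(query, abstracts[title]), 2))
```

Instead of typing a few keywords, you paste in a whole abstract and let the tool find the publications, journals or authors whose text looks most like it.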

One point I didn’t get is the need for a “daily journal of negative results”.

Ramy Arnaout presented Pubget as a search tool that is:

like an on-the-web Acrobat Reader (the search results are the PDFs of the papers)
able to deliver science at speed
legal and free, as researchers use their institution’s license to get to all publications including the non-open-access ones
user-friendly, as a user chooses from a list of publications a paper that opens in the same window

The concerns that all four speakers expressed at the end of the webinar were mostly:

How to achieve the balance between delivering science and preserving copyrights, a problem that is being partly solved by Open-Access journals.
How to tell the end-user what is related to his/her field.
Although everything is “online”, the challenge is how to get to it and use it.
How to interact with the end-users and help them discover the tools/features of search engines; this can be addressed through workshops and tutorials.

I do thank Pubget for giving me the chance to attend this very informative webinar by making it freely available.

Edited on Dec 22, 2009 09:31 p.m. CLT