Posts Tagged “alignment”

Man or machine? Bioinformaticians at McGill University are betting on man. They want to put time that would otherwise be wasted on the internet to good use, and so Phylo was created. It is an online interactive game that aims to solve the problem of multiple sequence alignment, one that has been troubling researchers for some time now. The human mind has evolved in a way that computers supposedly can’t beat: we can recognize patterns and relate them to each other, a skill that many lines of code cannot easily replicate.

So what to do? Once you open the link, go ahead and sign up, although it is possible to play as a guest. But hey, if I am taking time off to contribute to science, I want to be able to brag about it later on. 🙂 The creators of the game have put together a very comprehensive tutorial explaining how the game works. They use down-to-earth terms and comparisons to simplify matters, so people from all walks of life can jump in as well.

The coloured blocks: These symbolize the nucleotides, so there are four of them: orange, green, blue and purple. I wasn’t able to find out exactly which colour stands for which nucleotide, something that particularly intrigued me, since purple blocks were scarce in my alignment.

Aim of the game: Our job is to align these blocks as well as possible, so that the colours of the blocks in the first line match those in the second line. Each matching pair of blocks earns 1 point and each mismatched pair costs 1 point. Preferably, this should be done WITHOUT creating gaps. The tutorial points out that gaps represent mutations (insertions or deletions) that the sequences have incurred during evolution. In the easier stages, the sequences are provided on two lines, representing two different species. As it gets more difficult, more lines are provided and related to each other through a mini phylogenetic tree, to help you set your priorities. Once you reach the score a computer previously achieved (the “par”), a star blinks to indicate that you are ready to move on, and the alignments are stored in a database for future use.
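The scoring rule can be sketched in a few lines of Python. This is only an illustration and not Phylo's actual code: the function name `score_rows`, the letter codes for the colours and the `gap_penalty` parameter are all assumptions, since the only rule given here is +1 for a match and -1 for a mismatch.

```python
def score_rows(row_a, row_b, gap="-", gap_penalty=0):
    """Score two aligned rows column by column.

    +1 for a matching pair of blocks, -1 for a mismatched pair.
    A column containing a gap costs `gap_penalty`; this parameter is an
    assumption, because the gap cost used by the game is not stated here.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    score = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            score -= gap_penalty
        elif a == b:
            score += 1
        else:
            score -= 1
    return score


# O, G, B, P stand for the orange, green, blue and purple blocks.
print(score_rows("OGBP-G", "OGBPPG"))  # five matching columns, one free gap
print(score_rows("OGBPG", "OGPBG"))    # three matches, two mismatches
```

Raising `gap_penalty` shows why the game nudges you to avoid gaps: the same first pair of rows scores lower as soon as the gap is no longer free.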

My experience: I stumbled upon a feature that lets you choose the type of sequences you want to work with. They are arranged by disease, by level ID, or simply at random. I chose the blood and immune system disorders and was given sequences related to essential thrombocytopenia.

Statistics: At the end, I was shown the following astonishing numbers. So far, 5344 users have submitted 70196 alignments for 2137 different levels. Personally, I find these numbers quite surprising, given that so many people have joined since the official launch on November 29th.

Interested in more: On the “about” page, the following sentence is provided: “For more information about any one of these topics, click here”.


What is bioinformatics?

It can simply be defined as a link between biology and computer science, in which biological data are processed by software to yield an output that is later interpreted in different ways.

Biological data here means nucleic acid or protein sequences, in simple or complicated forms, whereas the software is a computer program specially designed to process these data in a certain way, using a certain algorithm (a recipe for solving a problem). The output is usually numerical or visual (often graphical), but above all it needs to be well understood. This last point is the key to bioinformatics.

What is the need of bioinformatics?

In research, we need something to point us down a certain road, to help us choose one way or another, or to let us try many options until we define our research plan. Bioinformatics simply brings the solutions into your hands with a few mouse clicks.

One simple example to make it all clear is PCR (Polymerase Chain Reaction). We always need to design a primer to start the reaction. If we did this the ordinary way, we would have to try out many primers in practice, and this would surely take a tremendous amount of time. Now, what if you are computer- and internet-literate? You can simply use software to get many primer options for the piece of DNA under investigation; doesn’t this save time, effort and money?
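To give a feel for what such software does, here is a minimal sketch that scans a template for candidate primers by estimated melting temperature. It uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C), which is only a rough estimate for short oligos; real primer-design tools check far more (hairpins, dimers, specificity). The function names, the temperature window and the 20-base template are made up for illustration.

```python
def wallace_tm(primer):
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def candidate_primers(template, length=20, tm_min=55, tm_max=65):
    """List (position, primer, Tm) for every window whose Tm is in range."""
    hits = []
    for start in range(len(template) - length + 1):
        window = template[start:start + length].upper()
        tm = wallace_tm(window)
        if tm_min <= tm <= tm_max:
            hits.append((start, window, tm))
    return hits


template = "ATGCGTACCTGAAGCTTGCA"  # made-up sequence, not from a real gene
for position, primer, tm in candidate_primers(template):
    print(position, primer, tm)
```

A reverse primer would be picked the same way from `reverse_complement(template)`.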

Can bioinformatics be useful in different ways, other than the PCR example?

Some people may think that bioinformatics is limited to a few fields of biological research, and others might think it is only a matter of prediction, which always needs to be evaluated for accuracy, specificity and efficiency. In fact, bioinformatics can also be used in the analysis of nucleic acids and proteins.

Analysis?! That is a vague word. How can you analyze a protein using bioinformatics?

Now you’ll see what bioinformatics can do for protein analysis:

  1. Retrieving protein sequences from different databases, either specialized or general. This is not as easy a job as you might think.
  2. Computing a protein or amino acid sequence to obtain:
  • Many physicochemical properties of your sequence, such as the molecular weight, the isoelectric point, etc.
  • Hydrophilicity / hydrophobicity ratio

Both of the above can indicate how likely a protein is to act as a receptor on the cell surface, to be antigenic, or to be secreted outside the cell.
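As a small illustration of computing a property from a sequence, the sketch below averages per-residue hydropathy values into a single score (often called GRAVY). It assumes the Kyte-Doolittle hydropathy scale, which is one common choice and not necessarily what any particular tool uses; the function name and the two short peptides are made up.

```python
# Kyte-Doolittle hydropathy values: positive = hydrophobic, negative = hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein):
    """Average hydropathy of a protein sequence (one-letter codes)."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in protein) / len(protein)


# Two made-up peptides: one greasy, one charged.
print(gravy("AILV"))
print(gravy("KRDE"))
```

A clearly positive average hints at a hydrophobic, possibly membrane-embedded stretch, while a clearly negative one hints at a soluble, exposed region; either way it is a hint to follow up, not a verdict.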

3. On the prediction aspect, we can predict:

The last two points are applications of what is called structural bioinformatics, in which the computer predicts the secondary and tertiary (3-D) structure of your protein, using special programs with advanced algorithms and artificial intelligence. Amazingly, this may be useful in understanding receptor-substrate interactions.

4. Comparing sequences to obtain the best alignment (that is, comparing 2 or more sequences to find how they relate to each other, i.e. their similarities and differences). This helps in:

  • Classifying your protein and relating it to its protein family
  • Making evolutionary inferences about your protein, to determine whether it descends from another protein or not. This is called phylogenetic analysis, in which the proteins under investigation are studied to find out which protein is the mother of the others, which are the daughters, the granddaughters, and so on
  • Detecting common domains, which helps us understand the function of an unknown protein by comparing it to the sequences of other proteins with known functions
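To make "obtaining the best alignment" concrete, here is a minimal sketch of global pairwise alignment by dynamic programming (the Needleman-Wunsch approach). The scores of +1 for a match, -1 for a mismatch and -1 for a gap are assumptions chosen for simplicity; real protein alignment tools use substitution matrices such as BLOSUM and more refined gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the table: each cell is the best of diagonal, up and left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + step,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one best alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(top)), "".join(reversed(bottom))


# Two short made-up peptide fragments.
total, row_a, row_b = needleman_wunsch("HEAGAWGHEE", "PAWHEAE")
print(total)
print(row_a)
print(row_b)
```

Aligning more than two sequences, as in a protein family, builds on the same idea but needs heuristics, because the exact method becomes too slow very quickly.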

Then, what will we gain if we compute DNA? Or you can say, what can bioinformatics do for DNA research?

As with proteins, though with different applications, we can use it for:

  • Retrieving DNA sequences from different databases
  • Computing a sequence to obtain information about its properties (as with proteins), e.g. GC%, which can be used together with other properties to identify a gene
  • Assembling sequence fragments (DNA is usually sequenced as fragments that need to be assembled in the best way; bioinformatics does this faster and more accurately than manual assembly)
  • Designing a PCR primer
  • Predicting DNA and RNA secondary structures (e.g. predicting the stems and loops of tRNA)
  • Performing alignments between 2 or more sequences, which can lead to many applications (like those mentioned above for protein alignments)
  • Finding repeats, restriction sites, Single Nucleotide Polymorphisms (SNPs), and/or open reading frames, all of which have a huge range of applications in the medical and paramedical fields, and especially in research.
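A few of the simpler items in this list fit in a short sketch: GC%, restriction-site search and a naive open reading frame scan. The function names and the test sequence are made up, the default site is the EcoRI recognition sequence GAATTC, and the ORF scan looks only at the forward strand, so treat it as a toy rather than a gene finder.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def find_sites(seq, site="GAATTC"):
    """0-based start positions of a recognition site (EcoRI by default)."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def find_orfs(seq, min_codons=3):
    """Forward-strand ORFs as (start, end) pairs, ATG through the stop codon.

    `min_codons` counts codons before the stop; the threshold is arbitrary.
    """
    stops = {"TAA", "TAG", "TGA"}
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == "ATG":
                for j in range(i + 3, len(seq) - 2, 3):
                    if seq[j:j + 3] in stops:
                        if (j - i) // 3 >= min_codons:
                            orfs.append((i, j + 3))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return orfs


dna = "CCATGAAATTTGGGTAAGAATTC"  # made-up sequence
print(round(gc_percent(dna), 1))
print(find_sites(dna))
print(find_orfs(dna))
```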


Microbiology, Immunology & Biochemistry Dept.*

Faculty of Pharmacy

Cairo University

Bioinformatics Practical Exam – Winter 2010**
Time allowed: Lab computers will automatically hibernate after 2 hours.***

Target: Assigning a function to the uncharacterized protein O67940_AQUAE from Aquifex aeolicus ****

A suggested procedure:
1- Get the amino acid sequence of the protein from UniProtKB
— Run it through BLAST to find homologs (related sequences). Do not forget to choose blastp & PSI-BLAST
— Check the annotated hits (known function & solved crystal structure) that have the highest possible similarity (highest score / highest % identity) to your query.
2- Check the BLAST alignment obtained for those proteins against your query.
3- Check if the protein belongs to any protein family using PIRSF & COGs
— Check if the protein shares any conserved domain with assigned function using Pfam.
— Using PROSITE the functional site database, check if the protein shares any sequence motifs with other proteins
4- Check if the protein belongs to a superfamily using SCOP database, which provides structural and evolutionary relationships between proteins.
5- As you don’t have the crystal structure of your AQUAE protein but do have the structure of the closest annotated protein, use VAST to search for & align related protein structures.
6- Extract homologs.
7- Multiple alignment (structure-guided alignment) using Cn3D
— Neighbor-joining (NJ) phylogenetic analysis using CDTree
8- Use PDBSum to obtain an overview of the protein–ligand interactions available for your query.
9- Alignment of homologous sequences to identify conserved functional residues.
10- Evidence-based assignment of the biological function of the query O67940_AQUAE.
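For anyone who would rather script step 1 than click through it, here is a minimal sketch of fetching and parsing a FASTA record. It assumes that the accession is O67940 (the part of the entry name before the underscore) and that UniProt's REST service serves FASTA at `https://rest.uniprot.org/uniprotkb/<accession>.fasta`; both should be checked against the current UniProt documentation. The function names are made up, and the demo record below is invented so that nothing depends on a network connection.

```python
from urllib.request import urlopen


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fetch_uniprot_fasta(accession):
    """Download one entry as FASTA text (assumed endpoint, needs network)."""
    url = f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


# Invented record, used only to show the parser without going online.
demo = ">demo|X00000|DEMO_TEST made-up protein\nMKTAYIAKQR\nQISFVKSHFS\n"
for header, sequence in parse_fasta(demo):
    print(header, len(sequence))
```

With a connection available, `parse_fasta(fetch_uniprot_fasta("O67940"))` should return the query sequence, ready to paste into BLAST.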


* What have I got to lose?!
** I have faith.
*** I can provide that; I know a guy who knows a guy!
**** Frankly, I wanted to pick a different protein, but I hesitated.
