Berlin 2008 – Scientific Programme

BP: Biological Physics Division (Fachverband Biologische Physik)

BP 25: Protein Structure and Folding

BP 25.9: Talk

Thursday, February 28, 2008, 16:30–16:45, PC 203

Accurate sequence alignment statistics for different protein models
•Stefan Wolfsheimer¹, Inke Herms², Sven Rahmann³, and Alexander K Hartmann¹
¹Institut für Physik, Universität Oldenburg, Germany
²AG Genominformatik/COMET, Technische Fakultät, Universität Bielefeld, Germany
³Fachbereich Informatik, TU Dortmund, Germany

Searching for homologous sequences and identifying proteins are well-studied problems in bioinformatics. For these purposes, a large sequence database is searched with a query sequence using sequence alignment algorithms, of which the Smith-Waterman algorithm is a well-known representative. The resulting alignment score is interpreted through a p-value: the probability of obtaining a score at least as high under a chosen null model.
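As a minimal sketch of the kind of score the abstract refers to, the following computes a Smith-Waterman local alignment score. The scoring parameters (`match`, `mismatch`, and a linear `gap` penalty) are illustrative assumptions; database searches on proteins typically use a substitution matrix and affine gap costs, neither of which is specified in the abstract.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the optimal local alignment score of sequences a and b.

    The scoring scheme is a placeholder (assumption): identical letters
    score `match`, different letters `mismatch`, and each gap position
    costs `gap` (linear gap penalty).
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                 # start a new local alignment here
                prev[j - 1] + s,   # align a[i-1] with b[j-1]
                prev[j] + gap,     # gap in b
                curr[j - 1] + gap  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
```

Only the optimal score is returned, since the score alone enters the p-value; recovering the alignment itself would require keeping the full table and tracing back.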

Exact results are known only for gapless alignment of infinitely long, uncorrelated sequences whose amino acids are independent and identically distributed (i.i.d.). In this case the optimal score is expected to follow a Gumbel distribution. Real proteins do not satisfy these restrictions: first, they are finite, and second, the i.i.d. assumption may not be the best description. We therefore study more complex models that incorporate information from secondary structure annotation to obtain a more plausible null model.
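For illustration, the Gumbel form is commonly written in the Karlin-Altschul parametrization, P(S ≥ s) ≈ 1 − exp(−K·m·n·e^(−λs)), where m and n are the sequence lengths. The sketch below evaluates this expression; the function name and the numerical values of `lam` and `K` in the usage lines are hypothetical placeholders, not values from the abstract, since both parameters depend on the scoring system and the letter frequencies.

```python
import math


def gumbel_pvalue(score, m, n, lam, K):
    """Gumbel (Karlin-Altschul form) p-value: P(S >= score).

    m, n : lengths of the two sequences
    lam, K : scale parameters of the Gumbel law (must be supplied;
             they depend on the scoring system and null model)
    """
    expected_hits = K * m * n * math.exp(-lam * score)
    # 1 - exp(-x), computed with expm1 so tiny p-values do not round to 0
    return -math.expm1(-expected_hits)


if __name__ == "__main__":
    # Placeholder parameters (assumption), chosen only to show the trend.
    for s in (20, 40, 60, 80):
        print(s, gumbel_pvalue(s, m=300, n=300, lam=0.3, K=0.1))
```

For large scores the p-value reduces to K·m·n·e^(−λs), that is, a pure exponential tail; this is the extrapolation against which a simulated rare-event tail can be compared.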

Using generalized-ensemble Monte Carlo simulations, we obtain the score distributions down to very small probabilities (p ∼ 10⁻¹⁰⁰). We find strong deviations from the expected form in the rare-event tail. Our results indicate that p-values are overestimated in the high-scoring regime when a Gumbel extrapolation is assumed.
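To see why a specialized sampling scheme is needed for such small probabilities, consider plain (simple) sampling, sketched below. This is not the authors' method: it draws pairs of random i.i.d. sequences, scores them with a gapless local alignment, and builds a histogram. The uniform amino-acid frequencies, the ±1 scoring, and all function names are assumptions made for the example.

```python
import random
from collections import Counter


def gapless_local_score(a, b, match=1, mismatch=-1):
    """Best score over all gapless local alignments of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # extend along the diagonal only (no gaps), or restart at 0
            curr[j] = max(0, prev[j - 1] + s)
            best = max(best, curr[j])
        prev = curr
    return best


def sample_score_distribution(length, n_samples,
                              alphabet="ACDEFGHIKLMNPQRSTVWY", seed=0):
    """Estimate P(score) by simple sampling of i.i.d. sequence pairs."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        a = [rng.choice(alphabet) for _ in range(length)]
        b = [rng.choice(alphabet) for _ in range(length)]
        counts[gapless_local_score(a, b)] += 1
    return {s: c / n_samples for s, c in sorted(counts.items())}


if __name__ == "__main__":
    for score, prob in sample_score_distribution(50, 2000).items():
        print(score, prob)
```

The smallest nonzero probability such a histogram can resolve is 1/`n_samples`, so probabilities near 10⁻¹⁰⁰ are out of reach by simple sampling. Generalized-ensemble methods address this by biasing the sampling toward high-scoring sequence pairs and correcting for the bias afterwards.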
