Advanced Seminar (Hauptseminar) in Bioinformatics
| Course type | Seminar (2 SWS, i.e. 2 weekly semester hours) |
|---|---|
| ECTS | 4.0 |
| Lecturer | Stefan Kramer |
| Time | Monday, 12:00-14:00 |
| Schedule | weekly from 25.10.2004 to 31.01.2005 |
| Room | Seminar room MI 03.13.010 |
| Number of participants | 10 |
Preliminary meeting and allocation of places:
Monday, 26 July, 17:00, seminar room 03.13.010
Overview:
The seminar covers the theory behind various machine learning and data mining approaches and their practical application to typical problems in bioinformatics.
Supervisors:
| Prof. Dr. Stefan Kramer (SK) | kramer@in.tum.de |
| Dr. Lothar Richter (LR) | richter@arb-home.de |
| Dipl.-Inf. Ulrich Rückert (UR) | rueckert@in.tum.de |
Topics:
| Date | Presenter | Supervisor | Topic | Material |
|---|---|---|---|---|
| 25.10.04 | Nkemdilim Uwaje | UR | Hidden Markov Models | Click me! |
| 08.11.04 | Markus Bundschus | UR | Support Vector Machines - Theory | Click me! |
| 15.11.04 | Chad Davis | SK | Support Vector Machines - Applications | Click me! |
| 22.11.04 | Jörg Wicker | SK | Clustering and Applications | Click me! |
| 29.11.04 | Anika Tillich | SK | Frequent Pattern Discovery in Bioinformatics | Click me! |
| 06.12.04 | Timo Duchrow | UR | Information Extraction in Bioinformatics | Click me! |
| 10.01.05 | Kristina Kessler | SK | Relational Data Mining in Bioinformatics | Click me! |
| 17.01.05 | Diana Tanasescu | SK | Data Integration in Bioinformatics | Click me! |
| 24.01.05 | Alexander Platzer | UR, LR | Predicting Genetic Regulatory Response | Click me! |
| 31.01.05 | Stefan Förster | UR, SK | Suffix Arrays | Click me! |
Material:
In addition to the literature suggested for each topic, an advanced seminar also requires participants to search independently for further sources.
Hidden Markov Models
L.R. Rabiner. A Tutorial on Hidden Markov Models and Selected
Applications in Speech Recognition. Proceedings of the IEEE, 77(2):257-286, 1989.
R. Durbin. Biological Sequence Analysis: Probabilistic Models of
Proteins and Nucleic Acids. 1998.
S.R. Eddy. Profile Hidden Markov Models. Bioinformatics, 14(9):755-763,
1998.
A. Moore. Statistical Data Mining Tutorials.
http://www-2.cs.cmu.edu/~awm/tutorials/hmm.html
Support Vector Machines - Theory
C.J.C. Burges. A Tutorial on Support Vector Machines for Pattern
Recognition. Data Mining and Knowledge Discovery, 2(2):121-167, 1998.
K.-R. Müller, S. Mika, G. Rätsch, K. Tsuda, B. Schölkopf. An
Introduction to Kernel-Based Learning Algorithms. IEEE Transactions on Neural Networks,
12(2):181-201, 2001.
T. Hastie, R. Tibshirani, J. Friedman. The Elements of Statistical
Learning. Springer Verlag, 2001.
Support Vector Machines - Applications
I. Guyon, J. Weston, S. Barnhill, V. Vapnik. Gene Selection for Cancer
Classification using Support Vector Machines. Machine Learning, 46(1-3):
389-422, 2002.
T. Jaakkola, M. Diekhans, D. Haussler. A Discriminative Framework for
Detecting Remote Protein Homologies. Journal of Computational Biology,
7(1-2): 95-114, 2000.
C. Ding, I. Dubchak. Multi-Class Protein Fold Recognition Using Support
Vector Machines and Neural Networks. Bioinformatics, 17(4):349-358,
2001.
A. Ben-Hur, D. Brutlag. Remote Homology Detection: A Motif-Based
Approach.
In: Proceedings of the 11th International Conference on Intelligent
Systems for Molecular Biology, Bioinformatics 19(Suppl1):i26-i33,
2003.
K.-J. Park, M. Kanehisa. Prediction of Protein Subcellular Locations
by Support Vector Machines Using Compositions of Amino Acids and
Amino Acid Pairs. Bioinformatics, 19(13):1656-1663, 2003.
Clustering and Applications
G. Bejerano, D. Haussler, M. Blanchette. Into the Heart of Darkness:
Large-Scale Clustering of Human Non-Coding DNA. In: Proceedings
of the 12th International Conference on Intelligent Systems for
Molecular Biology, Bioinformatics, 20(Suppl1):i40-i48, 2004.
D. Hand, H. Mannila, P. Smyth. Principles of Data Mining. MIT Press,
2001.
T. Hastie, R. Tibshirani, J. Friedman. The Elements of Statistical
Learning. Springer Verlag, 2001.
S. Datta, S. Datta. Comparisons and Validation of Statistical Clustering
Techniques for Microarray Gene Expression Data. Bioinformatics,
19(4):459-466, 2003.
Frequent Pattern Discovery
S.-C. Chen, I. Bahar. Mining Frequent Patterns in Protein
Structures: A Study of Protease Families. In: Proceedings of
the 12th International Conference on Intelligent Systems for
Molecular Biology, Bioinformatics, 20(Suppl1):i77-i85, 2004.
M. Koyutürk, A. Grama, W. Szpankowski. An Efficient Algorithm
for Detecting Frequent Subgraphs in Biological Networks.
In: Proceedings of the 12th International Conference on Intelligent
Systems for Molecular Biology, Bioinformatics, 20(Suppl1):i200-i207,
2004.
C. Creighton, S. Hanash. Mining Gene Expression Databases for
Association Rules. Bioinformatics, 19:79-86, 2003.
C. Becquet, S. Blachon, B. Jeudy, J.-F. Boulicaut, O. Gandrillon.
Strong-Association-Rule Mining for Large-Scale Gene-Expression Data
Analysis: A Case Study on Human SAGE Data. Genome Biology, 3(12):1-16,
2002.
J. Li, L. Wong. Identifying Good Diagnostic Gene Groups from Gene
Expression Profiles using the Concept of Emerging Patterns.
Bioinformatics, 18:725-734, 2002.
Information Extraction in Bioinformatics
D.E. Appelt. Introduction to Information Extraction. AI Communications,
3:161-172, 1999.
M. Craven, J. Kumlien. Constructing Biological Knowledge Bases by
Extracting Information from Text Sources. In: Proceedings of the Seventh
International Conference on Intelligent Systems for Molecular Biology
(ISMB-99), 77-86, 1999.
T. Ono, H. Hishigaki, A. Tanigami, T. Takagi. Automated Extraction of
Information on Protein-Protein Interactions from the Biological
Literature. Bioinformatics, 17:155-161, 2001.
R. Gaizauskas, G. Demetriou, P. J. Artymiuk, P. Willett. Protein
Structures and Information Extraction from Biological Texts: The PASTA
System. Bioinformatics, 19:135-143, 2003.
Relational Data Mining in Bioinformatics
S. Dzeroski, N. Lavrac, (eds). Relational Data Mining. Springer Verlag,
2001.
T. Horváth, S. Wrobel, U. Bohnebeck. Relational Instance-Based Learning
with Lists and Terms. Machine Learning, 43(1/2):53-80, 2001.
U. Bohnebeck, T. Horváth, W. Sälter, S. Wrobel, D. Blohm. Measuring
Similarity of RNA Structures by Relational Instance-Based Learning: A
First Step toward Detecting RNA Signal Structures in Silico. In: O.
Zimmermann, D. Schomburg, (eds.), Proceedings of the German Conference
on Bioinformatics, GCB98, 1998.
Data Integration in Bioinformatics
U. Leser, P. Rieger. Integration molekularbiologischer Daten.
Datenbank Spektrum, 6:56-66, 2003.
U. Leser, H. Lehrach, H. Roest Crollius. Issues in Developing
Integrated Genomic Databases and Application to the
Human X Chromosome. Bioinformatics, 14(7):583-590, 1998.
L.D. Stein. Integrating Biological Databases. Nature Reviews Genetics,
4(5):337-345, 2003.
P. Karp. A Vision of DB Interoperation. Technical Report, SRI
International, 1995.
S. Buckingham. Bioinformatics: Data's Future Shock. Nature,
428(6984):774-777, 2004.
Z. Lacroix, T. Critchlow (eds.). Bioinformatics - Managing Scientific
Data. Morgan Kaufmann, 2003.
E.M. Zdobnov, R. Lopez, R. Apweiler, T. Etzold. The EBI SRS Server -
Recent Developments. Bioinformatics, 18(2):368-373, 2002.
T.A. Tatusova, I. Karsch-Mizrachi, J.A. Ostell. Complete Genomes in WWW
Entrez: Data Representation and Analysis. Bioinformatics,
15(7/8):536-543, 1999.
L.M. Haas, P.M. Schwarz, P. Kodali, E. Kotlar, J.E. Rice, W.C. Swope.
DiscoveryLink: A System for Integrated Access to Life Sciences Data
Sources. IBM Systems Journal, 40(2): 489-511, 2001.
L. Wong. Kleisli, a Functional Query System. Journal of Functional
Programming, 10(1):19-56, 2000.
P. Karp. A Strategy for Database Interoperation. Journal of
Computational Biology, 2(4):573-586, 1995.
Predicting Genetic Regulatory Response
M. Middendorf, A. Kundaje, C. Wiggins, Y. Freund, C. Leslie.
Predicting Genetic Regulatory Response Using Classification.
In: Proceedings of the 12th International Conference on Intelligent
Systems for Molecular Biology, Bioinformatics, 20(Suppl1):i232-i240,
2004.
Y. Freund, L. Mason. The Alternating Decision Tree Learning
Algorithm. In: Proceedings of the 16th International Conference
on Machine Learning, pp. 124-133, Morgan Kaufmann, 1999.
Suffix Arrays
M.I. Abouelhoda, E. Ohlebusch, S. Kurtz.
Optimal Exact String Matching Based on Suffix Arrays.
In: Proceedings of the Ninth International Symposium on String
Processing and Information Retrieval, Springer Verlag, Berlin,
2002.
S. Burkhardt, J. Kärkkäinen.
Fast Lightweight Suffix Array Construction and Checking.
In: Proceedings of the 14th Annual Symposium on Combinatorial
Pattern Matching (CPM'03), Springer Verlag, Berlin, 2003.
D. Gusfield. Algorithms on Strings, Trees and Sequences: Computer
Science and Computational Biology. Cambridge University Press, 1997.
W.-K. Hon, K. Sadakane, W.-K. Sung.
Breaking a Time-and-Space Barrier in Constructing Full-Text Indices.
In: Proceedings of the 44th Symposium on Foundations of Computer
Science (FOCS 2003), pages 251-260. IEEE Computer Society Press, 2003.
P. Ko, S. Aluru. Space Efficient Linear Time Construction of Suffix
Arrays. In: Proceedings of the 14th Annual Symposium on Combinatorial
Pattern Matching (CPM'03), Springer Verlag, Berlin, 2003.
T.P. Speed (ed.). Statistical Analysis of Gene Expression Microarray
Data. Chapman & Hall, 2003.
