Module details on 'Information Retrieval and Web Search Engines'

Category: Data & Information
Lecturer: Prof. Dr. Balke, Wolf-Tilo (Braunschweig)
Module Exam ID: 2075
Weekly Composition: 2L+1E
Required Hours of Work (presence / self-study): 125 (42 / 83)
Semester: periodically, according to student demand and staff specialisms
Teaching Methods: slide presentation, homework, discussions
Module Description: The module gives an introduction to Web Information Retrieval (IR), with particular emphasis on the algorithms and technologies used in modern search engines. It begins with traditional text IR, including Boolean retrieval, the vector space model, and tolerant retrieval. Afterwards, the technical basics of Web IR are discussed, starting with Web size estimation and duplicate detection, followed by link analysis and crawling. This leads to the study of modern search engine evaluation methods and various test collections. Finally, applications of classification and clustering in the IR domain are discussed. The theory is illustrated with examples from modern search systems such as Google, Bing, Yahoo Search, and Clusty.
Module Outcomes: On completion of this module, the student should be able to:
• Understand the principles used in the design of modern search engines, especially with respect to relevance ranking, indexing, and crawling.
• Discuss the differences between traditional text IR and Web IR.
• Compare the algorithms available for relevance ranking on the Web.
• Understand the differences between classification and clustering and discuss their applications in the IR domain.
• Explain query expansion and reformulation methods.
• Understand the principles used in the evaluation of search engines.
• Be aware of query optimization issues.
Recommended Literature: Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008. Because this is a rapidly changing field, newer literature may be announced during the lectures.
Prerequisites: Basic understanding of database technologies. However, the module is mostly self-contained and should be manageable even without prior database knowledge.
Exam: Written or oral exam, graded (written: 90 min / oral: 25 min)
Comments: E-teaching modes: video recording, live transmission to other universities; individual streaming might be possible in the future.

