Details of CS4201 (Spring 2025)
Level: 4 | Type: Theory | Credits: 4.0 |
Course Code | Course Name | Instructor(s) |
---|---|---|
CS4201 | Information Retrieval and Web Search | Dwaipayan Roy |
Preamble |
---|
Information Retrieval (IR) forms the foundation of modern search engines and is often called the science behind search. Although IR systems are mostly associated with Web search engines (e.g., Bing, Google, Yandex), they also have significant applications in digital library search, patent search, and automatic question answering, to name a few. Likewise, IR models (the algorithms underlying retrieval systems) are adopted to solve a wide range of problems, such as organizing documents into an ontology, recommending news stories to users, detecting spam, and efficiently addressing users' information needs. This course will provide an overview of the theory, implementation, and evaluation of IR techniques. In particular, we will explore how search engines work, how they interpret human language, what different users expect from them, how they are evaluated, why they sometimes fail, and how they might be improved in the future. For hands-on experience, we will use PyLucene, a Python wrapper around the robust, industry-standard Lucene search engine. |
Syllabus |
---|
Basic idea of Information Retrieval (IR)
Index structures
Retrieval models
Probabilistic model for IR
Language modeling for IR
IR model evaluation
Relevance feedback
Web search
Discussion on different corpora, forums
Practical with Lucene (Python wrapper) |
Prerequisite |
---|
Basic concepts of Computer Science and Data Structures (CS3101, CS3201).
Basic probability (conditional probability, Bayes' theorem, etc.).
Programming knowledge for practicals (programming in Python: knowledge of packages, modules, functions, etc.). |
References |
---|
Introduction to Information Retrieval
C. D. Manning, P. Raghavan and H. Schütze
ISBN: 978-0-521-86571-5
https://nlp.stanford.edu/IR-book/information-retrieval-book.html
Information Retrieval: Implementing and Evaluating Search Engines
S. Büttcher, C. L. A. Clarke and G. Cormack
ISBN: 978-0-262-02651-2
http://www.ir.uwaterloo.ca/book/ |
Course Credit Options
Sl. No. | Programme | Semester No | Course Choice |
---|---|---|---|
1 | IP | 2 | Elective |
2 | IP | 4 | Elective |
3 | IP | 6 | Not Allowed |
4 | MP | 2 | Not Allowed |
5 | MP | 4 | Not Allowed |
6 | MR | 2 | Not Allowed |
7 | MR | 4 | Not Allowed |
8 | MS | 10 | Elective |
9 | MS | 4 | Not Allowed |
10 | MS | 6 | Not Allowed |
11 | MS | 8 | Elective |
12 | RS | 1 | Elective |
13 | RS | 2 | Elective |