Module Code - Title:

MN5151 - INFORMATION RETRIEVAL

Year Last Offered:

2025/6

Hours Per Week:

Lecture

2

Lab

0

Tutorial

0

Other

2

Private

6

Credits

6

Grading Type:

N

Prerequisite Modules:

Rationale and Purpose of the Module:

This module introduces students to the fields of Information Retrieval, Information Extraction, and Semantic Web. The module will cover a blend of fundamental concepts and current tools, techniques, and technologies used in modern information retrieval systems.

Syllabus:

The module will cover a blend of fundamental concepts and current tools, techniques, and technologies used in modern information retrieval systems under the six headings below.

1. Information Retrieval concepts and models: such as structured vs. unstructured data, the classic search model, term-document incidence matrices, the inverted index, Query Processing with the Inverted Index, the Boolean Retrieval Model and Extended Boolean models, Phrase Queries and Positional Indexes.
2. Ranked Retrieval Systems: including Scoring with the Jaccard Coefficient, Term Frequency Weighting, Inverse Document Frequency Weighting, TF-IDF Weighting, Vector Space Model, TF-IDF Cosine Similarity, Evaluating Search Engines (precision-recall curve, MAP, MRR, NDCG).
3. Text Clustering Methods: such as K-means, K-means for text documents, Flat clustering, Hierarchical clustering.
4. Information Extraction Approaches: including Named Entity Recognition, Relation Extraction, Using Patterns to Extract Relations, Semi-Supervised and Unsupervised Relation Extraction.
5. Question Answering Approaches: such as Answer Types and Query Formulation, Passage Retrieval and Answer Extraction, Using Knowledge Bases in Question Answering, Answering Complex Questions (query-focused summarization).
6. Semantic Web and Linked Data Approaches: including Taxonomies, Ontologies, Knowledge Graphs, Ontology Querying, Ontology Reasoning, Data Quality and Interlinking.
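For illustration only, the inverted index and Boolean query processing named under heading 1 could be sketched as follows. This is a minimal sketch and not the module's own lab material: the function names (`build_inverted_index`, `intersect`, `boolean_and`), the whitespace tokenizer, and the four sample documents are all assumptions made for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted list of IDs of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: lowercase whitespace tokenization, no stemming or stop words.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def intersect(p1, p2):
    """Merge two sorted postings lists, keeping the IDs present in both."""
    i, j, result = 0, 0, []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result


def boolean_and(index, *terms):
    """Answer a conjunctive query by intersecting postings lists, shortest first."""
    lists = sorted((index.get(t.lower(), []) for t in terms), key=len)
    if not lists:
        return []
    result = lists[0]
    for postings in lists[1:]:
        result = intersect(result, postings)
    return result


# Hypothetical sample collection, keyed by document ID.
docs = {
    1: "new home sales top forecasts",
    2: "home sales rise in july",
    3: "increase in home sales in july",
    4: "july new home sales rise",
}
index = build_inverted_index(docs)
print(boolean_and(index, "home", "july"))
print(boolean_and(index, "new", "rise"))
```

Processing the shortest postings list first keeps the intermediate result small, which is the usual motivation for ordering terms by document frequency in a conjunctive query.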

Learning Outcomes:

Cognitive (Knowledge, Understanding, Application, Analysis, Evaluation, Synthesis)

On successful completion of this module, students will be able to:

1. Implement a simple inverted index and search through it.
2. Implement TF-IDF and cosine similarity to build a simple search engine.
3. Use local full-text search engines such as Apache Lucene, Solr, and Elasticsearch, and cloud-based options such as Meilisearch and Algolia.
4. Implement a simple K-means text clustering method.
5. Use document clustering engines such as Carrot2 and Weka.
6. Implement a simple Named Entity Recognition method (sentence segmentation, tokenization, part-of-speech tagging, entity detection, relation detection).
7. Store linked data in a triplestore such as Ontotext GraphDB.
8. Use SPARQL to query knowledge bases such as Wikidata.
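As a rough indication of the scale of outcome 2, a TF-IDF and cosine-similarity search engine could be sketched as below. This is a minimal sketch under stated assumptions, not a prescribed solution: it assumes log-scaled term frequency, idf = log10(N / df), and whitespace tokenization, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization only.
    return text.lower().split()


def weigh(tokens, idf):
    """TF-IDF weights for one token list; terms unseen in the collection are skipped."""
    tf = Counter(tokens)
    return {t: (1 + math.log10(c)) * idf[t] for t, c in tf.items() if t in idf}


def build_tfidf(docs):
    """Return (idf, doc_vectors) for a list of document strings."""
    n = len(docs)
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log10(n / count) for term, count in df.items()}
    return idf, [weigh(tokens, idf) for tokens in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query, idf, vectors):
    """Return (score, doc_index) pairs with non-zero score, best first."""
    q = weigh(tokenize(query), idf)
    scores = [(cosine(q, vec), doc_id) for doc_id, vec in enumerate(vectors)]
    return sorted(((s, d) for s, d in scores if s > 0), reverse=True)


# Hypothetical sample collection.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats make good pets",
]
idf, vectors = build_tfidf(docs)
for score, doc_id in search("dog chased", idf, vectors):
    print(f"{score:.3f}  {docs[doc_id]}")
```

Because no stemming is applied, "dog" and "dogs" are treated as different terms here, so only the second document matches the sample query.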

Affective (Attitudes and Values)

On successful completion of this module, students will be able to:

1. Appreciate the use of libraries and cloud-based services for Named Entity Recognition (e.g., GATE, OpenNLP, spaCy, NLTK, Azure Cognitive Services, Watson Natural Language Understanding, TextRazor).
2. Appreciate the use of semantic web technologies (RDF, RDFS, OWL, JSON-LD, RDFa, and schema.org) to create and publish Linked Data.

Psychomotor (Physical Skills)

On successful completion of this module, students will be able to:

How the Module will be Taught and what will be the Learning Experiences of the Students:

The module will be delivered fully online through lectures, labs, and tutorials.

Research Findings Incorporated into the Syllabus (If Relevant):

Prime Texts:

Manning, C.D., Raghavan, P. & Schütze, H. (2008) Introduction to Information Retrieval, Cambridge University Press

Other Relevant Texts:

Allemang, D., Hendler, J. & Gandon, F. (2020) Semantic Web for the Working Ontologist: Effective Modeling for Linked Data, RDFS, and OWL, Association for Computing Machinery

Jurafsky, D. & Martin, J.H. (2021) Speech and Language Processing (3rd Edition), Stanford

Programme(s) in which this Module is Offered:

MSARINTPA - ARTIFICIAL INTELLIGENCE

Semester(s) Module is Offered:

Autumn

Module Leader:

arash.joorabchi@ul.ie