Module Code - Title:
CS6502 - APPLIED BIG DATA AND VISUALIZATION
Year Last Offered:
2025/6
Hours Per Week:
Grading Type:
N
Prerequisite Modules:
Rationale and Purpose of the Module:
Introduce students to big data management and its associated issues. Topics include an overview of the Apache toolset (Hadoop, Spark, and others), distributed file systems, big data programming models, data warehousing, big data security and protection, and visualization tools and frameworks.
(Note: the module is to be offered on the MScSE (Data Science option), MScDA (KBS' Data Analytics) and ME (Computer Engineering) programmes.)
Syllabus:
1. "Big data": meaning and sources; the Vs of big data; data governance: accuracy, availability, usability and security; impacts of big data, industrial and societal.
2. Big data programming frameworks and systems: distributed file systems, scalable computing, the MapReduce programming model, the Spark programming and computing model, overview of the main components of the Hadoop ecosystem.
3. Data warehousing concepts: what is a data warehouse; role of a data warehouse in data management; architecture of a data warehouse; ETL: extraction, transformation, load process, data marts; operational systems vs. data warehouses.
4. Big data security and protection challenges and practices, such as privacy-preserving data composition, encryption, granular access control, user authentication models, and endpoint filtering and validation.
5. Relational information in a business context; visualization challenges; graph / network visualization frameworks: Sugiyama and force-directed layout methods.
Learning Outcomes:
Cognitive (Knowledge, Understanding, Application, Analysis, Evaluation, Synthesis)
1. Recognise the technological challenges in big data governance.
2. Summarise the Hadoop ecosystem.
3. Describe the MapReduce programming model.
4. Describe the architectural components of a data warehouse.
5. Discuss the big data security challenges and practices.
Affective (Attitudes and Values)
1. Discuss the impact of big data on industry and society.
2. Recognise the importance of big data security.
3. Show awareness of the ethical issues associated with the use and misuse of big data.
Psychomotor (Physical Skills)
N/A
How the Module will be Taught and what will be the Learning Experiences of the Students:
The module is taught through lectures and hands-on practice in a computer lab equipped with state-of-the-art software for big data management.
Research Findings Incorporated into the Syllabus (If Relevant):
Prime Texts:
Valliappa Lakshmanan (2018) Data Science on the Google Cloud Platform, O'Reilly Media
Other Relevant Texts:
Tom White (2015) Hadoop: The Definitive Guide: Storage and Analysis at Internet Scale, O'Reilly Media
Ben Spivey and Joey Echeverria (2015) Hadoop Security: Protecting Your Big Data Platform, O'Reilly Media
Vijay Srinivas Agneeswaran (2014) Big Data Analytics beyond Hadoop: Real-time Applications with Storm, Spark, and more Hadoop alternatives, Pearson Education
Programme(s) in which this Module is Offered:
MSSOENTFA - SOFTWARE ENGINEERING
MEECENTFA - ELECTRONIC AND COMPUTER ENGINEERING
MSBUANTFA - BUSINESS ANALYTICS
Semester(s) Module is Offered:
Spring
Module Leader:
sarmad.ali@ul.ie