Module Code - Title:

CS4337 - BIG DATA MANAGEMENT AND SECURITY

Year Last Offered:

2024/5

Hours Per Week:

Lecture: 2
Lab: 2
Tutorial: 0
Other: 0
Private: 6

Credits: 6

Grading Type:

N

Prerequisite Modules:

Rationale and Purpose of the Module:

To introduce students to the challenges and practices of big data management and governance. Topics include an overview of the Hadoop ecosystem, distributed file systems, big data programming models, scalable database system solutions, data warehousing, and big data security and protection.

Syllabus:

1. What makes big data "big"; sources of big data; the Vs of big data; data governance: accuracy, availability, usability and security; the impact of big data on industry and society.
2. Big data programming frameworks and systems: distributed file systems; scalable computing; the MapReduce programming model; the Spark programming and computing model; overview of the main components of the Hadoop ecosystem.
3. Database systems for big data:
   a. Scalable relational database systems: partitioning and sharding; example implementation in a current relational database system.
   b. NoSQL database systems for big data management: key-value, column-family, document-oriented and graph database systems; case study of a current NoSQL database system.
4. Data warehousing concepts: what is a data warehouse; role of a data warehouse in data management; architecture of a data warehouse; ETL (the extraction, transformation and load process); data marts; operational systems vs. data warehouses.
5. Big data security and protection challenges and practices, such as privacy-preserving data composition, encryption, granular access control, user authentication models, and endpoint filtering and validation.

Learning Outcomes:

Cognitive (Knowledge, Understanding, Application, Analysis, Evaluation, Synthesis)

1. Recognise the technological challenges in big data governance.
2. Summarise the Hadoop ecosystem.
3. Describe the MapReduce programming model.
4. Contrast relational database systems with the variety of NoSQL database systems in terms of scalability.
5. Describe the architectural components of a data warehouse.
6. Discuss big data security challenges and practices.

Affective (Attitudes and Values)

1. Discuss the impact of big data on industry and society.
2. Recognise the importance of big data security.

Psychomotor (Physical Skills)

N/A

How the Module will be Taught and what will be the Learning Experiences of the Students:

The module is taught through lectures and hands-on practice in a computer lab equipped with state-of-the-art software for big data management.

Research Findings Incorporated into the Syllabus (If Relevant):

Prime Texts:

Tom White (2015) Hadoop: The Definitive Guide: Storage and Analysis at Internet Scale, O'Reilly Media
Ben Spivey and Joey Echeverria (2015) Hadoop Security: Protecting Your Big Data Platform, O'Reilly Media
Pramod J. Sadalage and Martin Fowler (2012) NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence, Addison-Wesley Professional
Nenad Jukic, Susan Vrbsky, Svetlozar Nestorov (2016) Database Systems: Introduction to Databases and Data Warehouses, Prospect Press

Other Relevant Texts:

Vijay Srinivas Agneeswaran (2014) Big Data Analytics Beyond Hadoop: Real-Time Applications with Storm, Spark, and More Hadoop Alternatives, Pearson Education

Programme(s) in which this Module is Offered:

BSCOSYUFA - COMPUTER SYSTEMS

Semester(s) Module is Offered:

Autumn

Module Leader:

Andrew.ju@ul.ie