About Data Modeling and Management
Objective: The course emphasizes emerging data models and technologies suited to managing data of different types and characteristics. Students will develop skills in analyzing, evaluating, modeling and developing database applications with attention to both technical and business requirements.
Learning Outcomes: On successful completion of the course, students will be able to:
- Explain data modeling and management concepts.
- Design and organize various types of data using relational and non-relational data models.
- Analyze the characteristics and requirements of data and select an appropriate data model.
- Identify, implement and perform frequent data operations (CRUD: create, read, update and delete) on relational and NoSQL databases.
- Describe the concepts and the importance of big data, data security, privacy and governance.
- Describe the concepts and the importance of data engineering and data visualization.
Course Outline:
- Introduction to Data Modeling and Management
I. Recall: Relational Data Model and Management
- Relational Model Concepts
- Relational Database Management Systems (RDBMSs)
- Entity Relationship Model (ER Model)
- Relational Database Design and Normalization
II. NoSQL Data Modeling and Management
- NoSQL Concepts and Characteristics
- Major Categories of NoSQL Data Models
- NoSQL Database Design
- NoSQL Features and Operations
III. Data Distribution
- Data Sharding and Replication Models
- CAP Theorem
IV. Transaction Processing and Consistency Models
- Transaction Processing Concepts
- ACID Model
- BASE Model
V. Large Scale Data Handling
- Big Data Characteristics
- Big Data Modeling and Management
VI. Applications and Case Studies
VII. Data Engineering
- Business Understanding
- Data Acquisition and Understanding
- Data Cleansing
- Data Preparation, Transformation and Feature Engineering
VIII. Introduction to Related Topics
- Data Security
- Data Privacy and Legal Issues
- Data Cleansing
- Data Governance: Social and Ethical Issues, Bias (gender, religion, etc.)
Laboratory Session(s): 30 hours of laboratory sessions on NoSQL data stores, tools, CRUD operations, and API development/usage.
Textbooks and Reference Books:
- A. Meier and M. Kaufmann, SQL & NoSQL Databases: Models, Languages, Consistency Options and Architectures for Big Data Management, Springer, 2019, ISBN 978-3658245481
- M. Kleppmann, Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems, O'Reilly, 2017, ISBN 978-1449373320
- D. Sullivan, NoSQL for Mere Mortals, Addison-Wesley, 2015, ISBN 978-0-1340-2321-2
- P. J. Sadalage and M. Fowler, NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence, Addison-Wesley Professional, 2013, ISBN 978-0-3218-2662-6
- E. Redmond and J. R. Wilson, Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement, 2012, ISBN 978-1-93435-692-0
- G. Harrison, Next Generation Databases: NoSQL and Big Data, Apress, 2015, ISBN 978-1-4842-1329-2
- I. Robinson, J. Webber and E. Eifrem, Graph Databases: New Opportunities for Connected Data, 2/E, O'Reilly, 2015, ISBN 978-1-491-93200-1
- R. Elmasri and S. Navathe, Fundamentals of Database Systems, 7/E, Addison-Wesley, 2015
Journals and Magazines:
- IEEE Transactions on Knowledge and Data Engineering, IEEE
- ACM Transactions on Database Systems, ACM
- ACM Transactions on Information Systems, ACM
Teaching and Learning Methods:
- Discussion and case studies
- Laboratory sessions: Students will be required to perform a series of exercises in data analysis and submit a lab report.
- Homework: Several homework exercises requiring students to apply the knowledge acquired from lecture and discussion will be assigned and graded.
- Project: Working in groups, students will propose and carry out a plan for a significant data modeling project. Students should complete their projects independently under the guidance of the instructor and make a formal presentation of the results.
Time Distribution and Study Load:
- In-class lecture/discussion: 30 hours.
- Laboratory sessions: 45 hours.
- Self study: 30 hours.
- Homework: 30 hours.
- Project work: 30 hours.
Evaluation Scheme:
- Midterm exam: 20%
- Assignments and laboratory work: 30%
- Project: 25%
- Final examination: 25%
A grade of “A” indicates excellent and insightful understanding of the key concepts and ability to implement sophisticated systems; “B” indicates a good understanding of the key concepts and ability to implement basic techniques; “C” indicates barely acceptable understanding and implementation ability; and “D” indicates poor understanding and implementation ability.