EPL646: Advanced Topics in Databases

Instructor: Demetris Zeinalipour »
Type: Postgraduate (All Directions)
Prerequisite: EPL342 - DB I (or equivalent)
When: Tue., 15:00-18:00 in ΧΩΔ01-106
Recitation: Tue., 14:00-15:00 in ΘΕΕ01-202
Laboratory: Tue., 18:00-19:30 in ΘΕΕ01-201
Assistant: Christophoros Panayiotou »

Podcast (Course Overview) »


Overview

The main objective of this graduate-level course is to provide an in-depth understanding of advanced concepts and research directions in the field of databases. The course is organized in three parts: (i) Fundamentals of Database Systems Implementation, including Compressed Storage for AI Data Processing, IoT Databases and Vector Databases; (ii) Distributed, Web and Cloud Databases; and (iii) Spatio-temporal Data Management, Sensor Data Management, and other selected advanced topics from the recent scientific literature.


Content

• Fundamentals of modern Database Management Systems (DBMSs): storage, indexing, query optimization, transaction processing, concurrency and recovery. Row Layout (SQLite) vs. Column File Layout (Parquet/PAX/ORC/Delta Lake, Pandas/Arrow, Spatial/GeoParquet, DuckDB) vs. Time-Series DBs (LSM/TSM-based/AVRO/Kafka). Vector Databases: Embeddings and Similarity Search using Hierarchical Navigable Small World (HNSW), Chroma/DuckDB examples, LLMs, RAG and Vector Databases (L).
• Fundamentals of Distributed DBMSs, Web Databases and Cloud Databases (NoSQL / NewSQL): Semi-structured data management (XML/JSON, XPath and XQuery), Document data-stores (e.g., CouchDB, MongoDB, RavenDB), Key-Value data-stores (e.g., BerkeleyDB, Memcached), Introduction to Cloud Computing (NFS, GFS/Hadoop HDFS, Replication/Consistency Principles), Big-data processing/analytic frameworks (Apache MapReduce/Pig, Spark/Shark), Column-stores (e.g., Google's BigTable, Apache's HBase, Apache's Cassandra), Graph databases (e.g., Twitter’s FlockDB) and Overview of NewSQL (Google's Spanner/F1).
• Spatio-temporal data management (trajectories, privacy, analytics) and index structures (e.g., R-Trees, Grid Files), as well as other selected and advanced topics, including: Embedded Databases (SQLite), Sensor / Smartphone / Crowd data management, Energy-aware data management, Flash storage, Stream Data Management, etc. The last part of the course will feature both invited talks from external speakers and student presentations.
Syllabus (in Greek) »


News

  • DMAI 2025 We are organizing the EDBT 2025 Summer School on AI & Data Management, from the 7th to the 11th of July, 2025.
  • Final scheduled on Monday, May 15, 2025 @ 16:30-19:30 in ΧΩΔ01-001. Please always consult the schedule for updated details.
  • AS4 CouchDB assignment posted!
  • Research Papers (AS3) presentation and discussion scheduled for Thursday, May 1, 2025, 15:00-18:00 in ΘΕΕ01-#146. Presentations and papers are available under the reading list.
  • AS3 The technical research paper reading list has been posted. Please follow the instructions posted on Moodle!
  • Midterm scheduled on Tuesday, March 11, 2025, 15:00-16:15 in ΧΩΔ01-106 (A4 paper allowed). Please see the schedule for further details.
  • Overview Papers (AS1) presentation and discussion scheduled for Thursday, March 6, 2025, 15:00-18:00 in ΘΕΕ01-#146. Presentations and papers are available under the reading list.
  • AS2 Storage & DBs (SQLite, Parquet/Pandas and InfluxDB comparison) assignment has been posted!
  • AS1 The overview research paper reading list has been posted. Please follow the instructions posted on Moodle!
  • ACM SIGMOD Contest This programming contest is the premier worldwide competition in databases, co-organized by MIT. The highest rankings of UCY students in the contest over the years are as follows:
    • 2015: ranked 5th (Topic: Concurrent Transaction Processing and Validation System). Congrats to Lambros Petrou, George Koummeto and Marios Mintzis
    • 2014: ranked 9th (Topic: Social Network Analysis System). Congrats to Lambros Petrou, Marios Mintzis and George Koummeto.
    • 2013: ranked 10th (Topic: Parallel Document Matching with TRIES). Congrats to Lambros Petrou, George Koummetou, George Larkou.
    • 2011: ranked 6th (Topic: A Durable Main-Memory Index Using Flash). Congrats to George Constantinou, Marios Constantinides and Silouanos Nicolaou.
    • 2010: ranked 9th (Topic: Distributed Query Engine). Congrats to Fotos Fragkoudis, Andriani Stylianou, Onisiforos Onisiforou.
  • Welcome to EPL646 (Spring 2025)! Please sign up to our course management platform to access the course forum and assignment submission area.

Schedule »


Laboratory »


Readings »


Assignments

AS1 Overview Paper | Due: W8 | LIST

AS2 Storage Assignment (Pandas/Parquet, SQLite and InfluxDB) | Due: W5 | PDF | CSV

AS3 Research Paper | Due: W13 and W14 | LIST

AS4 CouchDB Assignment | Due: W12 | PDF

AS5 MapReduce (Apache Hadoop or Spark) Assignment | Due: W14 | PDF | ZIP


Bibliography

  • Slides Lecture slides and articles from the recent scientific literature.
  • Silberschatz Database System Concepts, 7th Edition, by Abraham Silberschatz, Henry Korth, S. Sudarshan, McGraw Hill, 1376 pages, ISBN-10: 0078022150, 2019.
  • Elmasri Fundamentals of Database Systems, 7th Edition, Ramez Elmasri, Shamkant B. Navathe, ISBN-10: 0133970779, ISBN-13: 9780133970, 2016.
  • Abiteboul Web Data Management, Serge Abiteboul, Ioana Manolescu, Philippe Rigaux, Marie-Christine Rousset, Pierre Senellart; ISBN-10: 1107012430, ISBN-13: 978-110701243, Cambridge University Press, 450 pages, (available online), 2011.
  • Özsu Principles of Distributed Database Systems, Özsu, M. Tamer, Valduriez, Patrick, 3rd Edition, 846 p., Springer Press, 2011.
  • Ramakrishnan Database Management Systems: Paperback Edition, 3rd Edition, Raghu Ramakrishnan and Johannes Gehrke, McGraw-Hill Publishers, Paper; 1065 pp, ISBN: 0-07-123057-2, 2003.