PhD thesis defense to be held on May 8, 2025, at 13:00 (online)


Thesis title: OPTIMISING DATA ANALYSIS THROUGH OFFLINE LARGE LANGUAGE MODELS AND SCALABLE DATA MANAGEMENT TECHNIQUES

Abstract: The importance of data across various sectors demands innovative approaches to data management and analytics. This PhD thesis investigates the integration of offline large language models (LLMs) for automated code generation, aiming to streamline data analysis processes and thus enhance the scalability and efficiency of data management systems. By leveraging offline LLMs, the proposed approach empowers users to perform data analyses without extensive programming skills, thereby democratizing data analytics. The research delves into the architecture and implementation of scalable data management systems that can efficiently handle datasets of varying volumes. Building on an efficient data management platform, the thesis examines the capability of offline LLMs to generate analytical code, showing how these models can transform user queries into executable scripts that facilitate data manipulation and interpretation. Experiments and case studies demonstrate the practical applications and benefits of the proposed approach. The results highlight the potential of offline LLMs in data science and analysis. This thesis contributes to the field by presenting a study that integrates AI-driven code generation with robust data management practices, ultimately paving the way for more efficient and user-friendly data analytics solutions.

Supervisor: Professor Theodora Varvarigou

PhD Student: Anastasios Nikolakopoulos