Multi-Modal Data Lake Analytics

This research project explores how Large Language Models (LLMs) can be used to analyze complex, multi-modal data lakes. It aims to bridge the gap between data modalities so that information can be retrieved and processed across them.

Research Objectives

Develop novel architectures for multi-modal data integration
Create efficient algorithms for cross-modal information retrieval
Design scalable solutions for large-scale data lake processing
Implement practical applications in real-world scenarios

Key Technologies

Large Language Models: GPT-based architectures, BERT variants
Computer Vision: ResNet, Vision Transformers, CLIP
Database Systems: Distributed data storage, query optimization
Machine Learning: Deep learning, transfer learning, multi-task learning

Current Progress

Current work focuses on a unified framework that processes text, images, and structured data together, enabling more comprehensive analytics and insight generation.
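As a rough illustration of what cross-modal retrieval over such a framework could look like, the sketch below ranks items of different modalities against a query by cosine similarity in a shared embedding space. It is not the project's implementation: the `LakeItem` type, the `search` function, the file names, and the hand-written three-dimensional embeddings are all hypothetical, and in practice the vectors would come from modality-specific encoders.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class LakeItem:
    """Hypothetical record for one object in the data lake."""
    item_id: str
    modality: str            # "text", "image", or "table"
    embedding: List[float]   # assumed output of a modality-specific encoder


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: List[float], items: List[LakeItem], top_k: int = 2) -> List[LakeItem]:
    """Return the top_k items most similar to the query, regardless of modality."""
    ranked = sorted(items, key=lambda it: cosine(query, it.embedding), reverse=True)
    return ranked[:top_k]


# Toy embeddings, made up for illustration only.
items = [
    LakeItem("report.txt", "text", [0.9, 0.1, 0.0]),
    LakeItem("chart.png", "image", [0.8, 0.2, 0.1]),
    LakeItem("sales.csv", "table", [0.0, 0.1, 0.9]),
]

results = search([1.0, 0.0, 0.0], items, top_k=2)
for item in results:
    print(item.item_id, item.modality)
```

Because every item lives in the same vector space, a single query can return a text document and an image side by side, which is the property a unified framework would rely on.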

Supervision

This research is conducted under the guidance of Prof. Nan Tang at HKUST(GZ) and focuses on advancing the state of the art in data science and analytics.

JIANG Ziyu