ID 2419: Decoding Foundation Models: Enhancing Understanding and Transparency

This work aims to investigate and analyze the intriguing properties exhibited by foundation models in deep learning, covering vision, natural language processing (NLP), and multimodal models. The goal is to develop a comprehensive understanding of these properties and thereby improve the reliability and transparency of foundation models across a range of applications.

Requirements

  • Strong background knowledge in deep learning, computer vision, natural language processing, and machine learning.
  • Proficiency in programming languages such as Python and familiarity with popular deep learning frameworks (e.g., PyTorch, TensorFlow).
  • Experience in training and evaluating deep learning models on high-performance computing (HPC) systems.

Tasks

  • Review the existing literature on foundation models, focusing on their properties, strengths, limitations, and emerging research trends in vision, NLP, and multimodal domains.
  • Design and implement gradient-based visualization and attribution methods for different types of foundation models (a minimal saliency-map sketch follows this list).
  • Conduct extensive experiments on diverse datasets, including benchmark datasets (e.g., ImageNet for vision, GLUE for NLP) and domain-specific datasets, to validate the findings and assess the generalizability of the observed properties (see the dataset-loading sketch after this list).
  • Document the research process, results, and insights in a comprehensive thesis report, adhering to the standard thesis writing guidelines.
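
To make the gradient-based attribution task concrete, the sketch below computes a vanilla saliency map with PyTorch. It is only an illustration under stated assumptions: the pretrained ResNet-50 from torchvision and the random input tensor stand in for an actual foundation model and a preprocessed image.

    import torch
    from torchvision.models import resnet50, ResNet50_Weights

    # Illustrative stand-in for a vision foundation model.
    model = resnet50(weights=ResNet50_Weights.DEFAULT).eval()

    # Placeholder for a preprocessed image batch (1 x 3 x 224 x 224).
    x = torch.rand(1, 3, 224, 224, requires_grad=True)

    logits = model(x)
    target_class = logits.argmax(dim=1).item()

    # Backpropagate the predicted-class score to the input pixels.
    logits[0, target_class].backward()

    # Saliency map: maximum absolute gradient across the color channels.
    saliency = x.grad.abs().max(dim=1).values.squeeze(0)  # shape: 224 x 224
    print(saliency.shape)

More elaborate attribution methods (e.g., integrated gradients or Grad-CAM) follow the same basic pattern of relating model outputs back to inputs or intermediate activations.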
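
For the experimental task, a typical starting point is loading a benchmark split and scoring a model on it; the sketch below does this for the GLUE SST-2 validation set. The Hugging Face `datasets` and `transformers` libraries, the SST-2 task, and the DistilBERT checkpoint are illustrative assumptions rather than requirements of the topic.

    from datasets import load_dataset
    from transformers import pipeline

    # Validation split of the SST-2 task from the GLUE benchmark.
    sst2 = load_dataset("glue", "sst2", split="validation")

    # Off-the-shelf sentiment classifier as a stand-in for a fine-tuned foundation model.
    classifier = pipeline(
        "text-classification",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )

    # Score a handful of validation sentences and print the predicted labels.
    sentences = sst2["sentence"][:8]
    for text, pred in zip(sentences, classifier(sentences)):
        print(f"{pred['label']:>8}  {text}")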

Supervisors

Please use the application form to apply for this topic. We will then get in touch with you.