Summer Study Group 2024 on LLM, RAG and Multi-Modality (31.07.2024 ~ 29.08.2024)

[AI-generated image by PromptPerfect]

Learning goals

Are you interested in learning about the latest topics in Large Language Models (LLM), Retrieval-Augmented Generation (RAG), and Multi-Modality? We are organizing this summer study group to keep you at the forefront of these cutting-edge technologies and to enhance your knowledge and skills in these rapidly evolving fields.

By attending this study group, you will:

  1. Gain an understanding of Large Language Models and their applications.

  2. Learn about Retrieval-Augmented Generation and how it enhances information retrieval and generation processes.

  3. Understand vector databases and the concept of similarity search, and how vector databases enable efficient nearest-neighbor search to find similar items based on vector representations.

  4. Explore the world of Multi-Modality and understand how combining different modes of data can lead to more powerful and versatile AI solutions.

Join us and take the next step in your AI learning journey!

Practical information

This online study group will be conducted via Zoom, with sessions held every Wednesday from 10:30 AM to 12:00 PM, starting July 31, 2024, and concluding on August 29, 2024. The final review and quiz session will take place on August 29th. In response to requests from students at other universities, we have decided to hold the final review meeting online as well.


Please register using the following link by July 15, 2024.

Registration Link

Cost: Attendance of this study group is free for everyone.

Call for volunteer speakers

In each session, we need a volunteer speaker to give a 30-minute talk summarizing the three papers listed in the references. Volunteer speakers can be Master's students, doctoral students, postdoctoral researchers, or senior researchers. If you are interested in volunteering as a speaker, please indicate your intention in the registration form. This is a great opportunity to enhance your presentation skills, share your insights, and contribute to the group's learning.

Schedule

    Each session below lists the topic, time, speakers, learning outcomes, and three references.
    Topic: Transformer
    Time: 10.30-12.00, 31.07.2024, Wednesday
    Speaker: Ziqi Kang. Download Slides. View recorded presentation.
    Learning outcomes:
    1. Understand the high-level structure of the Transformer model, including the encoder and decoder components.
    2. Grasp the purpose of self-attention mechanisms and how they differ from traditional recurrent neural networks (RNNs) and convolutional neural networks (CNNs).
    3. Explore advanced variants of the original Transformer architecture, such as BERT, GPT, and T5, and understand their specific improvements.
    Reference 1: Attention Is All You Need. This paper was the foundation for LLMs, introducing the Transformer architecture.
    Reference 2: The Annotated Transformer. This paper presents an annotated version of the above paper in the form of a line-by-line implementation; the document itself is a working notebook and should be a completely usable implementation.
    Reference 3: A Comprehensive Analysis of T5, BERT, and GPT. This article introduces early NLP techniques such as word embeddings and compares the T5, BERT, and GPT models.
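
As a companion to this session, here is a minimal NumPy sketch of single-head scaled dot-product self-attention, the core operation of the Transformer. It is not taken from the session materials; the dimensions, random weights, and single-head setup are illustrative assumptions only.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    X:  (seq_len, d_model) token embeddings
    Wq, Wk, Wv: (d_model, d_k) projection matrices (illustrative, randomly initialized below)
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise attention scores, scaled by sqrt(d_k)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V                        # weighted sum of value vectors

# Toy example: 4 tokens, model dim 8, head dim 4
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 4)
```
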
    Topic: Large Language Models
    Time: 10.30-12.00, 07.08.2024, Wednesday
    Speakers: Lauri Seppalainen, Guanghan Wu. Slides: Part 1, Part 2. View recorded presentation.
    Learning outcomes:
    1. Define Large Language Models (LLMs) and describe LLM use cases.
    2. Explain prompt engineering and describe the generative AI project lifecycle.
    3. Understand the scaling laws for Neural Language Models (a toy illustration follows this session's references).
    4. Understand the main categories of PEFT algorithms and the ideas behind them.
    Reference 1: Prompt Design and Engineering: Introduction and Advanced Methods. This paper introduces the core concepts of prompts and prompt engineering, advanced techniques such as Chain-of-Thought and Reflection, and the principles behind building LLM-based agents.
    Reference 2: Scaling Laws for Neural Language Models. This paper studies empirical scaling laws for language model performance on the cross-entropy loss.
    Reference 3: Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey. Parameter-efficient fine-tuning (PEFT) provides a practical solution by efficiently adapting large models to various downstream tasks while minimizing the number of additional parameters introduced or the computational resources required. This paper presents a comprehensive study of various PEFT algorithms, examining their performance and computational overhead.
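
To make the scaling-law outcome concrete, the sketch below evaluates the parameter-count power law L(N) = (N_c / N)^alpha_N from the Scaling Laws paper. The constants used are the approximate values reported there (N_c ≈ 8.8e13 non-embedding parameters, alpha_N ≈ 0.076); treat them as illustrative rather than exact, and note the law only describes the regime where model size is the bottleneck.

```python
def loss_from_params(n_params, n_c=8.8e13, alpha_n=0.076):
    """Approximate cross-entropy loss as a function of model size,
    L(N) = (N_c / N) ** alpha_N, in the data- and compute-unconstrained regime.
    Constants are approximate values from Kaplan et al. (2020)."""
    return (n_c / n_params) ** alpha_n

# Predicted loss keeps falling smoothly as parameter count grows by orders of magnitude
for n in (1e8, 1e9, 1e10, 1e11):
    print(f"{n:.0e} params -> predicted loss {loss_from_params(n):.3f}")
```
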
    Topic: Retrieval-Augmented Generation
    Time: 10.30-12.00, 14.08.2024, Wednesday
    Speaker: Harinda Samarasekara. Download slides. View recorded presentation.
    Learning outcomes:
    1. Understand the fundamental concept of Retrieval Augmented Generation, including how it combines retrieval-based methods with generative models to enhance information generation.
    2. Explain the stages at which RAG can be applied, mainly pre-training, fine-tuning, and inference.
    3. Identify the central technologies integral to the RAG process, specifically the aspects of retrieval, generation, and augmentation.
    4. Understand the different augmentation processes, including iterative retrieval, recursive retrieval, and adaptive retrieval.
    Reference 1: Retrieval-Augmented Generation for Large Language Models: A Survey. This survey offers a detailed examination of the progression of RAG paradigms: Naive RAG, Advanced RAG, and Modular RAG.
    Reference 2: Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts. This paper argues that similarity is not always the panacea for retrieval-augmented generation and that relying solely on similarity can sometimes degrade RAG performance. The authors propose MetRag, a Multi-layEred Thoughts enhanced Retrieval Augmented Generation framework.
    Reference 3: LangChain Library (GitHub). This library is aimed at assisting in the development of such applications, including simple LLM apps, chatbots, and other agents. You can read the documentation here.
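
A minimal sketch of the "Naive RAG" pattern discussed in this session: embed the question, retrieve the most similar documents, and prepend them to the prompt before generation. The `embed` and `generate` callables are placeholders for whatever embedding model and LLM you choose; they are not part of any specific library, and the prompt template is an illustrative assumption.

```python
import numpy as np

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def naive_rag(question, documents, embed, generate, top_k=2):
    """Naive RAG: retrieve the top-k most similar documents and ground the answer in them.

    embed(text) -> np.ndarray and generate(prompt) -> str are placeholders
    for an embedding model and an LLM of your choice.
    """
    q_vec = embed(question)
    ranked = sorted(documents, key=lambda d: cosine_sim(q_vec, embed(d)), reverse=True)
    context = "\n\n".join(ranked[:top_k])
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    return generate(prompt)
```

In a real system the linear scan over `documents` would be replaced by a vector database lookup, which is the topic of the following session.
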
    Topic: Vector databases
    Time: 10.30-12.00, 21.08.2024, Wednesday
    Speakers: Valter Uotila, Quy Anh Nguyen. Download Slides: Part 1, Part 2. View recorded presentation.
    Learning outcomes:
    1. Understand the concept of similarity search and how vector databases enable efficient nearest neighbor search to find similar items based on vector representations.
    2. Identify practical applications of vector databases, such as recommendation systems, voice recognition, image and video similarity search and Chatbots.
    3. Gain familiarity with generating and using vector embeddings with techniques such as Word2Vec, GloVe, BERT, and other deep learning models.
    4. Understand the indexing mechanisms used in vector databases, such as HNSW (Hierarchical Navigable Small World), LSH, and IVFFlat (Inverted File with Flat quantization).
    Reference 1: When Large Language Models Meet Vector Databases: A Survey. This survey explores the synergistic potential of Large Language Models and vector databases.
    Reference 2: Vector database management systems: Fundamental concepts, use-cases, and current challenges. This paper provides an introduction to the fundamental concepts, use cases, and current challenges associated with vector database management systems.
    Reference 3: Vector Database Management Techniques and Systems. This tutorial paper reviews existing vector database management techniques for queries, storage, and indexing.
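
The following sketch shows exact nearest-neighbor search by cosine similarity over an in-memory embedding matrix, the operation that vector databases accelerate with approximate indexes such as HNSW or IVFFlat. The random vectors stand in for real embeddings; this is an illustration of the idea, not a production index.

```python
import numpy as np

def nearest_neighbors(query, index, k=3):
    """Exact k-nearest-neighbor search by cosine similarity.

    query: (d,) vector; index: (n, d) matrix of stored embeddings.
    Vector databases replace this linear scan with approximate indexes
    (e.g. HNSW, LSH, IVFFlat) to stay fast at millions of vectors.
    """
    index_norm = index / np.linalg.norm(index, axis=1, keepdims=True)
    query_norm = query / np.linalg.norm(query)
    sims = index_norm @ query_norm           # cosine similarity to every stored item
    top = np.argsort(-sims)[:k]              # indices of the k best matches
    return top, sims[top]

rng = np.random.default_rng(42)
index = rng.normal(size=(1000, 64))          # 1000 stored embeddings of dimension 64
query = rng.normal(size=64)
ids, scores = nearest_neighbors(query, index)
print(ids, scores)
```
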
    Topic: Multi-Modality
    Time: 10.30-12.00, 28.08.2024, Wednesday
    Speakers: Lidia Pivovarova, Qinhan Hou, Huaiwu Zhang. Download Slides: Part 1, Part 2, Part 3.
    Learning outcomes:
    1. Learn how different modalities are represented and encoded in machine learning models, including techniques for embedding text, images, audio, and video.
    2. Understand the various techniques for the deep fusion of multimodal data, such as encoder-decoder methods, attention-mechanism methods, graph neural network methods, generative neural network methods, and other constraint-based methods (a minimal fusion sketch follows this session's references).
    3. Understand the basic principles of multi-modality data management, including the integration and processing of both structured data (e.g., relational databases, spreadsheets) and unstructured data (e.g., text, images, audio, video).
    Reference 1: Foundations and Recent Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions. This paper defines three key principles of modality heterogeneity, connections, and interactions, and proposes a taxonomy of six core technical challenges for multimodal machine learning: representation, alignment, reasoning, generation, transference, and quantification.
    Reference 2: Deep Multimodal Data Fusion. This paper proposes a fine-grained taxonomy grouping fusion models into five classes: encoder-decoder methods, attention-mechanism methods, graph neural network methods, generative neural network methods, and other constraint-based methods.
    Reference 3: LANISTR: Multimodal Learning from Structured and Unstructured Data. This paper proposes an attention-based framework to learn from both unstructured data (images, text) and structured data (tabular and time-series data). It learns a unified representation through joint pretraining on all available data, even with significant missing modalities.
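
A minimal PyTorch sketch of concatenation-based (late) fusion of a text embedding and an image embedding, the simplest of the fusion families listed in this session. The embedding dimensions and the two-layer MLP are arbitrary assumptions; attention- or graph-based fusion methods would replace the plain concatenation with a learned interaction between the modalities.

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Concatenation-based fusion of two modalities followed by a small MLP.

    Assumes text and image encoders (not shown) have already produced
    fixed-size embeddings for each example.
    """
    def __init__(self, text_dim=768, image_dim=512, hidden=256, n_classes=10):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(text_dim + image_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, text_emb, image_emb):
        fused = torch.cat([text_emb, image_emb], dim=-1)  # simplest possible fusion step
        return self.mlp(fused)

model = LateFusionClassifier()
logits = model(torch.randn(4, 768), torch.randn(4, 512))  # batch of 4 paired embeddings
print(logits.shape)  # torch.Size([4, 10])
```
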
    Topic: Final review and quiz
    Time: 9.30-11.00, 29.08.2024, Thursday
    Speakers: Chunbin Lin, Putian Zhou
    Two keynote presentations on industry applications of Large Language Models (LLMs), held via Zoom (see below).

Keynote Presentations on Industry Applications of Large Language Models (LLMs) (29.08.2024)

Talk 1:

Title: Visa GenAI Platform - Why, What and How

Speaker: Dr. Chunbin Lin

Time: 9:30-10:00 AM, 29.08.2024 (Helsinki time zone)

Abstract: The Visa GenAI Platform is a secure and scalable service that allows users to query various LLMs and build LLM applications on top of it. It integrates OpenAI models (e.g., GPT-4, GPT-4o, Text-embedding-ada-002), Anthropic models (e.g., Claude-3.5-Sonnet), and open-source models (e.g., Mistral 7B). It also offers RAG (Retrieval-Augmented Generation) as a service and Agent as a service. In this talk, I will introduce why we built it, what the challenges are, and how we resolve them.

Bio: Chunbin Lin is a senior staff software engineer at Visa, where he leads the Visa GenAI Platform team. Prior to joining Visa, Chunbin held key roles at Amazon AWS, IBM, and Informatica, where he contributed to various advanced projects, including leading the Amazon Redshift workload management team. He earned his Ph.D. in computer science from the University of California, San Diego (UCSD). Chunbin's research interests bridge the fields of distributed systems and machine learning, focusing on (i) applying machine learning techniques to optimize system performance and (ii) developing high-performance machine learning platforms. He has published over 40 papers in top conferences and journals such as SIGMOD, VLDB, and PVLDB, and has served on the program committees for numerous top conferences, including SIGMOD, VLDB, and ICDE.

Talk 2:

Title: LLM applications in a startup business

Speaker: Dr. Putian Zhou

Time: 10:00-10:30 AM, 29.08.2024 (Helsinki time zone)

Abstract: We will introduce two application cases: one on how to improve and extend a traditional service system with LLMs, and another on how to build a chatbot that matches the requirements of a business company.

Bio: Putian Zhou graduated from INAR (Institute for Atmospheric and Earth System Research) at the University of Helsinki in 2018 and then continued working in research and teaching at INAR. His research uses numerical models to simulate atmospheric chemistry and aerosol processes, applying them to the analysis of climate change in both paleo and future climates. In 2024, in his spare time, Putian co-founded an AI technology company, Aitomore, with a few friends. They are now collaborating with various businesses, mainly in the food service market and online retail.

Presentation Videos:

[Six video thumbnails of the recorded presentations]

This event is connected with the University Cooperation Initiative NordForsk NUEI project.

For more information and questions, please contact Prof. Jiaheng Lu, Department of Computer Science, University of Helsinki.

Email: Jiaheng.lu.at.helsinki.fi