Aleks Petrov: When Do Prompting and Prefix-Tuning Work? A Theory of Capabilities and Limitations
Speaker: Aleks Petrov
Date: Friday, December 08, 2023
Time: 11:00 AM to 12:00 PM (Note: all times are in the Eastern Time Zone)
Public: Yes
Location: G449 (KIVA)
Event Type:
Room Description:
Host: Aleksander Madry
Contact: Deborah Goodwin, dlehto@csail.mit.edu
Relevant URL:
Speaker URL: None
Speaker Photo:
Reminders to: seminars@csail.mit.edu
Reminder Subject: TALK: Aleks Petrov: When Do Prompting and Prefix-Tuning Work? A Theory of Capabilities and Limitations
Abstract:
Large Language Models (LLMs), typically based on the transformer architecture, have undergone extensive research aimed at enhancing their scalability and performance. Despite these technological advances, the resource-intensive nature of training state-of-the-art transformer models limits their accessibility to a broader spectrum of researchers and smaller enterprises. As a result, context-based fine-tuning methods, including prompting, in-context learning, soft prompting (also known as prompt tuning), and prefix-tuning, have gained popularity because they can often match the performance of full fine-tuning with a fraction of the parameters. However, despite their empirical successes, there is little theoretical understanding of how these techniques influence the model's internal computation, or of their expressiveness limitations. Our research presents the first theoretical framework for context-based fine-tuning methods. We establish that these methods cannot alter the attention pattern over the content; they can only skew the outputs of an attention layer in a predetermined direction. This inherent limitation restricts the scope of what can be achieved through context-based fine-tuning. While these methods can effectively prompt a model to manifest a skill it has already acquired, combining pre-existing skills presents a more complex challenge and is harder to optimise. Crucially, our analysis demonstrates that context-based fine-tuning methods, including prompting, cannot elicit entirely new behaviours from the model.
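To make the central claim concrete, below is a minimal single-head attention sketch in NumPy (purely illustrative; the single prefix token, the dimensions, and all variable names are assumptions, not material from the talk). It checks numerically that prepending a prefix token leaves the relative attention weights over the content tokens unchanged: the output becomes a convex combination of the content-only attention output and the prefix value vector, so the prefix can only bias the layer's output in a fixed direction.

import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 5                      # head dimension, number of content tokens (illustrative)

q  = rng.normal(size=d)          # query for one position
K  = rng.normal(size=(n, d))     # keys of the content tokens
V  = rng.normal(size=(n, d))     # values of the content tokens
kp = rng.normal(size=d)          # hypothetical learned prefix key
vp = rng.normal(size=d)          # hypothetical learned prefix value

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Attention output over the content alone.
a = softmax(K @ q / np.sqrt(d))
out_content = a @ V

# Attention output with a single prefix token prepended.
scores = np.concatenate(([kp @ q / np.sqrt(d)], K @ q / np.sqrt(d)))
a_full = softmax(scores)
out_prefixed = a_full[0] * vp + a_full[1:] @ V

# The prefix only rescales the content attention and adds a fixed direction:
# out_prefixed = (1 - alpha) * out_content + alpha * vp, where alpha is the
# attention mass absorbed by the prefix token.
alpha = a_full[0]
assert np.allclose(out_prefixed, (1 - alpha) * out_content + alpha * vp)
print("attention mass on prefix:", alpha)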
Research Areas:
Impact Areas:
Created by Deborah Goodwin on Thursday, November 30, 2023 at 1:26 PM.