Since their first appearance in 2017 in the paper "Attention Is All You Need" (Vaswani et al., 2017), Transformers have become the standard model for almost every task in deep learning. This course covers the components the Transformer is comprised of and the role each one plays in the overall function of the model, as well as advanced/state-of-the-art Transformer-based methods for various tasks. Topics such as reinforcement learning, multi-modality, vision, and in-context learning (ICL) will be covered from the viewpoint of Transformers, along with architecture-specific topics such as sparsity and scalability, architecture optimization, and encoder-decoder vs. encoder-only vs. decoder-only designs.

Learning Outcomes

At the end of the course the students will be able to:
1. Review the fundamentals of deep learning and explain the pros and cons of different models in the context of the Transformer and attention in general.
2. Train various deep neural networks and analyze the challenges that arise when using them.
3. Construct/modify Transformer-based networks and apply them to different domains.
4. Explain the reasons for success/failure of a network from a more theoretical point of view, utilizing fundamental concepts such as manifolds.
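As a taste of the material, the core building block mentioned above, scaled dot-product attention, can be sketched in a few lines of NumPy. This is a minimal single-head illustration for intuition, not the full multi-head mechanism from the paper; all shapes and names here are illustrative assumptions.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention (single head, no masking).

    Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v).
    Returns an (n_queries, d_v) weighted combination of the values.
    """
    d_k = Q.shape[-1]
    # Similarity of each query to each key, scaled by sqrt(d_k).
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable row-wise softmax over the keys.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Tiny example: 2 queries attending over 3 key/value pairs.
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 5))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (2, 5)
```

Each output row is a convex combination of the value rows, with mixing weights determined by query-key similarity; the course develops this into full multi-head attention and the complete Transformer block.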

Faculty: Computer Science
Undergraduate Studies | Graduate Studies

Prerequisite courses

(236299 - Introduction to Natural Language Processing and 236756 - Introduction to Machine Learning) or (236756 - Introduction to Machine Learning and 236781 - Deep Learning on Computation)


Semester Information