Video Introduction
This JR Academy open class, "Attention is All You Need: A Guided Introduction to the Transformer Architecture," is delivered by Emily, an AI Technical Lead at Swarco Austria with over 11 years of software engineering experience. The session gives a clear, structured explanation of the Transformer's principles, architecture, and central role in modern AI, the foundation behind systems such as ChatGPT, Claude, DeepSeek, and Qianwen, helping learners understand from the ground up how large language models (LLMs) work.

The lecture begins with an accessible introduction to tokenization, word embeddings, and positional encoding, then leads into the self-attention mechanism and multi-head attention design that changed how neural networks process sequences. Emily explains these concepts through intuitive analogies, such as comparing RNNs to a "telephone game," to illustrate the long-term dependency problem and how attention overcomes it while enabling efficient parallel computation.

Participants gain an end-to-end understanding of the encoder-decoder structure, residual connections, normalization layers, feed-forward networks, and the final Softmax output step. This walkthrough also prepares students for AI Engineer interviews, where clearly articulating the Transformer's core idea and contribution is a standard question.

Throughout the class, Emily connects theory with practical insights from real-world AI projects, emphasizing why mastering Transformer fundamentals matters for every AI developer and data professional. Whether you are a beginner exploring machine learning, a software engineer transitioning into AI roles, or a student preparing for technical interviews, this session is an excellent foundation for understanding how modern large language models think, learn, and generate responses.
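As a concrete preview of one step in the syllabus, here is a minimal NumPy sketch of the sinusoidal positional encoding from the original "Attention is All You Need" paper. The function name, shapes, and toy usage are illustrative assumptions, not material from the class itself.

```python
# A hedged sketch of sinusoidal positional encoding (assumed names/shapes):
#   PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
#   PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
import numpy as np

def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]                  # (seq_len, 1): token positions
    i = np.arange(d_model // 2)[None, :]               # (1, d_model/2): dimension pairs
    angles = pos / np.power(10000.0, 2 * i / d_model)  # (seq_len, d_model/2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                       # even dimensions use sine
    pe[:, 1::2] = np.cos(angles)                       # odd dimensions use cosine
    return pe

print(positional_encoding(4, 8).shape)  # (4, 8): one vector per position, added to embeddings
```

Because each position gets a unique, deterministic vector, the model can recover word order even though attention itself is order-agnostic.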
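Similarly, the "core idea" the class builds toward, scaled dot-product self-attention, can be sketched in a few lines. Again, this is a minimal single-head illustration under assumed names and shapes, not the course's code.

```python
# A minimal, illustrative sketch of single-head scaled dot-product self-attention.
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model) token vectors; Wq/Wk/Wv: (d_model, d_k) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv           # project into query/key/value spaces
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # every token scores every other token
    weights = softmax(scores, axis=-1)         # rows sum to 1: attention distribution
    return weights @ V                         # weighted mix of value vectors

# Toy usage: 4 tokens, model width 8, head width 4 (all assumed for demonstration).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 4)
```

Note that every token attends to every other token in one matrix multiplication, which is exactly why the Transformer parallelizes where an RNN's "telephone game" of step-by-step passing cannot.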