Sunday, February 09, 2025

AI: Deep Dive by Andrej Karpathy

Deep Dive into LLMs like ChatGPT - YouTube

This is a general audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It covers the full training stack of how the models are developed, along with mental models of how to think about their "psychology", and how to get the best use of them in practical applications.

Instructor Andrej was a founding member at OpenAI (2015) and then Sr. Director of AI at Tesla (2017-2022), and is now a founder at Eureka Labs, which is building an AI-native school. 

His goal in this video is to raise knowledge and understanding of the state of the art in AI, and empower people to effectively use the latest and greatest in their work. 




previous videos


Code created in the Neural Networks: Zero To Hero video lecture series, specifically in the first lecture on nanoGPT. Published here as a GitHub repo so people can easily hack it, walk through its git log history, etc.


podcast: Science, AI: Into The Impossible, by Brian Keating, Ph.D

Into The Impossible Podcast Episodes

Dr Brian Keating - YouTube


Dr. Brian Keating: Charting the Architecture of the Universe & Human Life - Huberman Lab

Brian Keating, Ph.D., is a cosmologist, a professor of physics at the University of California, San Diego, an author and a public science educator.

Brian Keating - Wikipedia

Brian Gregory Keating is an American cosmologist. He works on observations of the cosmic microwave background, leading the POLARBEAR2 and Simons Array experiments. He also conceived the first BICEP experiment. He received his PhD in 2000 and has been a distinguished professor of physics at the University of California, San Diego, since 2019. He is the author of two books, Losing The Nobel Prize and Into the Impossible.





Most Influential Papers in Computer Science

The 7 Most Influential Papers in Computer Science History – Terrible Software

“On Computable Numbers, with an Application to the Entscheidungsproblem” (1936)
Author: Alan Turing
=> defined computing

“A Mathematical Theory of Communication” (1948)
Author: Claude Shannon
=> defined information

“A Relational Model of Data for Large Shared Data Banks” (1970)
Author: Edgar F. Codd
=> defined databases

“The Complexity of Theorem-Proving Procedures” (1971)
Author: Stephen A. Cook

“A Protocol for Packet Network Intercommunication” (1974)
Authors: Vinton G. Cerf and Robert E. Kahn
=> internet

“Information Management: A Proposal” (1989)
Author: Tim Berners-Lee
=> web

“The Anatomy of a Large-Scale Hypertextual Web Search Engine” (1998)
Authors: Sergey Brin and Larry Page
=> search

“Recursive Functions of Symbolic Expressions and Their Computation by Machine” (1960)
John McCarthy

“Go To Statement Considered Harmful” (1968)
Edsger Dijkstra

“Time, Clocks, and the Ordering of Events in a Distributed System” (1978)
Leslie Lamport

“No Silver Bullet—Essence and Accident in Software Engineering” (1986)
Fred Brooks

“Attention Is All You Need” (2017)
Vaswani et al.
=> LLMs, AI