Sunday, November 26, 2023

AI: LLMs Intro by Andrej Karpathy, LLM "OS"

llama-2-70b model (open weights from Meta) = two files: ~140GB of parameters + ~500 lines of C code to run it (!!!)
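A quick sanity check on that 140GB figure, assuming the 70B parameters are stored in fp16 (2 bytes per parameter):

```python
# Back-of-the-envelope check of the ~140GB parameter file:
# 70 billion parameters, 2 bytes each in fp16/bfloat16.
params = 70_000_000_000
bytes_per_param = 2
size_gb = params * bytes_per_param / 1e9
print(f"{size_gb:.0f} GB")  # → 140 GB
```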

 [1hr Talk] Intro to Large Language Models - YouTube

Slides as PDF: (42MB)

The Unreasonable Effectiveness of Recurrent Neural Networks (from 2015)

Andrej Karpathy - Wikipedia

PhD from Stanford in AI/ML (CNNs, computer vision)
Led the self-driving (Autopilot) AI team at Tesla
Founding member of OpenAI

Llama 2 - Meta AI

LLaMA - Wikipedia

facebookresearch/llama: Inference code for LLaMA models @GitHub

book: Thinking, Fast and Slow by Kahneman, Daniel

system 1 vs system 2 "thinking" 

LLMs currently do only "system 1" thinking (fast, instinctive, one token at a time, unreliable)

future: attempt "system 2" thinking, i.e. "trade time for accuracy": spend more time/compute deliberating to get a better result.
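One way to picture "trade time for accuracy": sample an unreliable "system 1" answerer many times and take a majority vote (the self-consistency idea). The `noisy_answer` stub below is purely hypothetical, standing in for a model that is right only 60% of the time:

```python
# Toy sketch: majority voting turns a fast-but-unreliable guesser
# into a slower-but-more-reliable one. More samples = more time
# spent = higher accuracy.
import random
from collections import Counter

def noisy_answer(question: str) -> str:
    # Hypothetical stub: correct ("42") only 60% of the time.
    return "42" if random.random() < 0.6 else random.choice(["41", "43"])

def deliberate_answer(question: str, samples: int = 101) -> str:
    votes = Counter(noisy_answer(question) for _ in range(samples))
    return votes.most_common(1)[0][0]

random.seed(0)
print(deliberate_answer("what is 6 * 7?"))  # majority vote recovers "42"
```

With 101 samples the 60%-accurate guesser almost always wins the vote, at ~100x the cost of a single fast guess.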

"self improvement": hard in language domain, no "rules"

"AI apps store"
