Deep Dive into LLMs like ChatGPT with Andrej Karpathy
This is a general-audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It covers the full...
The video demonstrates running the large language models Qwen3-235B and GLM 4.5-Air-106B on AMD Ryzen AI MAX "Strix Halo" systems using Linux and up to 128GB of unified memory.
Detailed setup guidance, kernel configuration, and unified memory tuning are provided and tested primarily on the HP Z2 Mini G1a, but are applicable to similar Strix Halo systems. Benchmarks, setup scripts, and Fedora-based containers for these AI workloads are shared via a linked GitHub repository.
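As a rough illustration of the kind of unified memory tuning involved, the sketch below computes TTM page limits so the GPU can address most of a 128GB unified memory pool via kernel boot parameters. The 96 GiB figure and the specific `ttm.pages_limit` / `ttm.page_pool_size` parameters are assumptions based on common Strix Halo Linux setups, not details confirmed by the video; consult the linked repository for the tested configuration.

```shell
#!/bin/sh
# Sketch (assumed setup, not from the video): compute TTM page limits so
# the iGPU can allocate most of the unified memory for model weights.
GTT_GIB=96                                        # RAM to expose to the GPU (assumption)
PAGES=$(( GTT_GIB * 1024 * 1024 * 1024 / 4096 ))  # TTM pages are 4 KiB each
echo "ttm.pages_limit=${PAGES} ttm.page_pool_size=${PAGES}"
# Append the printed parameters to GRUB_CMDLINE_LINUX in /etc/default/grub,
# regenerate the grub config, and reboot for them to take effect.
```

The printed string would go on the kernel command line; the exact memory split worth choosing depends on how much RAM the rest of the system needs while a 100B+ parameter model is resident.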