
Andrej Karpathy: The Most Useful Teacher of How AI Actually Works

The clearest mind in modern machine learning teaches by building everything from scratch, and the discipline matters more than the code.

By xbard, 1 May 2026

Most public discourse about artificial intelligence is conducted by people who do not know how it works. The journalists writing about it, the politicians regulating it, the executives deploying it, the citizens being affected by it, the activists worried about it, are mostly working from analogies. The system is a black box. It does some things well, some things badly. Nobody quite knows why. The opinions vary by which analogies feel most plausible.

Andrej Karpathy is one of the very few public figures in this space who teaches the underlying machinery as machinery, slowly and carefully and from the bottom up, until it stops being a black box and becomes a series of operations a serious learner can follow. His tutorials have trained a generation of practitioners. His "Software 2.0" framing is one of the few attempts to explain to non-specialists why this technology is qualitatively different from what came before. He is not a public-policy thinker, not a structural critic, not a philosopher of technology. He is a teacher of the technology itself, and he is the best one I know.

For a country like Ireland, currently walking into an AI-shaped future without much in the way of public AI literacy, the model he provides matters. The substance matters less than the discipline.

Who he is

Andrej Karpathy was born in 1986 in Slovakia, was raised in Toronto, and took his doctorate in computer vision at Stanford under Fei-Fei Li. He created and taught Stanford's CS231n, the convolutional neural networks course, which became one of the most influential machine-learning courses ever offered. He was a founding member of OpenAI in 2015, then left to become Director of AI at Tesla in 2017, where he led the Autopilot vision system through its major architectural shifts. He returned to OpenAI for a year in 2023, left again in early 2024, and founded Eureka Labs as an AI-native education company.

Outside the institutional career, his public output has been extraordinary. The "Neural Networks: Zero to Hero" YouTube series builds neural networks from a single neuron up to a working transformer, in long-form video lectures that thousands of working practitioners have watched at least once. The "llm.c" and "llama2.c" repositories are minimal implementations of large language models in plain C, deliberately stripped down so that a serious learner can read the entire system in an evening. The "Software 2.0" essay from 2017 is one of the most cited short pieces on what neural networks are doing differently from conventional programs. His tweets, blog posts, and recent talks have continued the same pedagogical posture.

The pedagogical model

What Karpathy teaches that almost nobody else teaches is the discipline of building the system from scratch.

This is not the same as conventional programming education. The conventional approach to machine learning, even in good university programmes, is to give students a high-level framework (TensorFlow, PyTorch) and have them build models by configuring components that are themselves complicated and hidden. The student learns to use the technology without ever quite seeing what it is doing underneath. This is fine for getting things working. It does not produce understanding.

Karpathy's approach is the opposite. The "Zero to Hero" series begins by writing, by hand, a single neuron. Then a small network of neurons. Then a small network of small networks. Then back-propagation, by hand, with the gradients computed step-by-step in plain Python without any framework support. Then attention. Then a transformer. Then a small language model. By the end of the series the learner has built, from the level of multiplication and addition, the same architecture that powers modern AI systems. The learner cannot pretend any of it is magic. It is just a very large amount of arithmetic, organised in a particular way, learning patterns from data.
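To make "a single neuron, by hand" concrete, here is a minimal sketch in plain Python. It is my own illustration, not code from the series: the function names, the tanh activation, the squared-error loss, the starting weights, and the learning rate are all assumptions chosen to keep the example short.

```python
import math

def neuron(w, b, x):
    """One neuron: a weighted sum of the inputs, squashed by tanh."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return math.tanh(z)

def gradients(w, b, x, target):
    """Gradients of the loss (out - target)**2, derived with the chain rule."""
    out = neuron(w, b, x)
    dloss_dout = 2.0 * (out - target)   # derivative of the squared error
    dout_dz = 1.0 - out * out           # derivative of tanh
    dz = dloss_dout * dout_dz
    grad_w = [dz * xi for xi in x]      # dz/dw_i is simply x_i
    grad_b = dz                         # dz/db is 1
    return grad_w, grad_b

def train(x, target, steps=200, lr=0.1):
    """Plain gradient descent on one training example."""
    w, b = [0.5, -0.5], 0.0             # arbitrary starting values (assumed)
    for _ in range(steps):
        grad_w, grad_b = gradients(w, b, x, target)
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

x, target = [1.0, 2.0], 0.8
w, b = train(x, target)
print("output after training:", neuron(w, b, x))
```

Everything in it is multiplication, addition, and one call to tanh. A real network repeats this pattern across many neurons and many examples, but the operations are the same.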

This pedagogical posture is unfashionable in current AI culture, which leans heavily on framework-mediated abstraction. It is also more honest. A learner who has worked through Karpathy's series knows what an embedding actually is, what an attention head actually does, why training is so expensive, why model size matters, what the limits of the architecture are, and where the empirical magic comes from. The learner is no longer dependent on metaphors.
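As one illustration of what "an attention head actually does", the sketch below implements a single causal attention head in plain Python. It is a simplification under stated assumptions: the queries, keys, and values are supplied directly as toy vectors, whereas a real transformer produces them from learned projections of the token embeddings. The names `softmax` and `attention_head` are mine.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)                          # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(queries, keys, values):
    """Causal scaled dot-product attention over a short sequence."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for t, q in enumerate(queries):
        # Position t may only look at positions 0..t (causal masking).
        scores = [
            sum(qi * ki for qi, ki in zip(q, keys[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # The output is a weighted average of the visible value vectors.
        out = [
            sum(wt * values[s][j] for s, wt in enumerate(weights))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy three-token sequence with two-dimensional vectors (made-up numbers).
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys    = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values  = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

outputs, weights = attention_head(queries, keys, values)
for t, row in enumerate(weights):
    print("position", t, "attends with weights", row)
```

Each position scores every earlier position, converts the scores into weights, and takes a weighted average of the values. That is the whole mechanism; the learning happens in the projections this sketch leaves out.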

The "Software 2.0" framing extends the point at a higher level. Karpathy's argument is that conventional software (Software 1.0) is written by humans in code, with explicit logic that humans designed. The new software (Software 2.0) is grown by training neural networks on data, with implicit behaviour that emerges from the training process and that no human directly authored. This is, in his view, a genuinely new programming paradigm, not just a faster version of the old one. The implications for software engineering, for product design, for education, and for the workforce are significant.
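A toy contrast may help here. In the sketch below, which is my own illustration rather than anything from the essay, the "1.0" function states its rule explicitly, while the "2.0" function is given only examples and recovers its behaviour by gradient descent. The rule `3x + 1`, the example inputs, and the names `rule_1_0`, `rule_2_0`, and `train` are all hypothetical.

```python
# Software 1.0: a human writes the logic down explicitly.
def rule_1_0(x):
    return 3.0 * x + 1.0

# Software 2.0: nobody writes the logic. We supply examples only.
# Here rule_1_0 stands in for data that would normally be collected, not computed.
examples = [(x, rule_1_0(x)) for x in [-2.0, -1.0, 0.0, 1.0, 2.0]]

def train(examples, steps=200, lr=0.1):
    """Fit y = w*x + b to the examples by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(steps):
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in examples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w, b = train(examples)

def rule_2_0(x):
    # The behaviour lives in the learned numbers w and b, not in written logic.
    return w * x + b

print("learned parameters:", w, b)
print("1.0 says", rule_1_0(10.0), "and 2.0 says", rule_2_0(10.0))
```

With two parameters the learned rule is easy to read off. With billions of them it is not, which is the practical force of the distinction.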

The framing has limits. The 1.0/2.0 dichotomy is cleaner than the actual practice, and serious modern AI systems are typically hybrids. But the underlying claim, that we are in the middle of a paradigm shift in how computation is produced, is correct, and Karpathy was earlier and clearer about it than most.

Where he is right

Three places where the work lands strongly.

The pedagogical model is right and unfashionable. The current AI discourse is dominated by people who have learned to use AI tools without understanding them. The result is a public conversation in which the same speakers can claim, in adjacent paragraphs, that AI is about to transform civilisation and that they have no idea how it works. Karpathy's insistence that you can and should understand the systems you are talking about is the right standard. It is also achievable. The "Zero to Hero" series demonstrates that a motivated learner with reasonable mathematical preparation can develop genuine working understanding of the technology in weeks of focused study, not years.

The "Software 2.0" framing is correct in broad outline. The shift to neural-network-based systems is not a quantitative improvement on existing programming. It is qualitatively different. The systems are grown rather than designed. They behave in ways their builders did not specify. The engineering disciplines required to work with them are different from the engineering disciplines required to work with conventional software. This is a real and important point, and the implications are still being worked out.

The practice of building from scratch is the right epistemic discipline for any technology that matters. The amount of nonsense being written about AI in 2026 by people who have never trained a model, never read a paper, never written the code, is enormous. Karpathy's standard, that you should be able to build a small working version of the thing you are talking about, would, if widely adopted, raise the average quality of public AI conversation by several orders of magnitude.

Where he is vulnerable

Worth being honest about.

Karpathy is not a structural or policy thinker. The work is excellent on the technology itself and largely silent on the political-economic-social context in which the technology is developed and deployed. The audience that wants to know how AI works is well-served. The audience that wants to know what AI does to labour markets, democratic institutions, surveillance capacity, or geopolitical balance has to look elsewhere. This is not a criticism of what he does. It is a marker of what he does not do.

His worldview is recognisably Bay Area techno-rationalist. The progress is real, the technology is exciting, the capability gains will keep coming, and the implications are mostly positive. He is not naive about risks, but the framing is closer to optimism-with-caveats than to systemic-risk-with-some-upsides. Readers from a Schmachtenberger-style civilisational-systems frame, or from a Blakeley-style political-economic frame, will find the worldview limiting even when the technical content is excellent.

The Tesla Autopilot association is real and complicated. Tesla's driver-assistance system has been involved in a non-trivial number of serious accidents, and the gap between the marketing and the actual capability has been substantial. Karpathy was the technical lead on the vision system through the most controversial period of its development. He has been more measured in his public statements about the system than many of his Tesla colleagues, but the association is part of the picture. A reader weighing his current work should know about the earlier context.

Eureka Labs is recent and unproven. The bet that AI-native education can deliver on the promises that human-led education has not is plausible but unverified. The model is being built in real time. Anyone reading Karpathy on education should understand that they are reading the early-stage formulation of an experiment, not a settled position.

These are not fatal. They are the things to track if you want to use the work seriously rather than admire it.

Why he matters as a voice

For three reasons.

First, the technology actually matters. Whatever one thinks of AI hype, the underlying capability shift is real, is accelerating, and is going to reshape large fractions of the modern economy and political order over the next decade or two. Anyone trying to think seriously about modern political and economic problems has to develop some working understanding of what is actually happening at the technical level. Karpathy is the best available teacher of that. The audience he reaches is large enough that his pedagogical influence on how the next generation of practitioners thinks about AI is substantial.

Second, the discipline is rare. The combination of technical depth and public clarity is rare in any field, and is particularly rare in a field as fashionable as AI, where most public communicators either do not have the technical depth or are not interested in slow, careful exposition. Karpathy is one of a small handful of figures (Chris Olah, Yoshua Bengio in some moods, Stuart Russell on safety questions specifically) who reliably manage both.

Third, the model he demonstrates, that public-facing technical communication is possible at this level if anyone bothers to try, is itself a contribution. The standard journalistic AI coverage is too shallow to support serious public discussion. The standard academic AI publication is too technical for non-specialists. Karpathy occupies a middle position that has historically been treated as impossible. The fact that it is possible should change what we expect from the rest of the discourse.

How it lands in Ireland

Ireland is going to spend the next decade making decisions about AI without having developed the public literacy to make them well.

The Irish economic exposure to AI is enormous and concentrated. The major US AI firms have a substantial Irish presence: Microsoft, Google, Meta, OpenAI's European operations, NVIDIA's European centre, and large parts of the AI infrastructure of the other major tech firms. The corporate-tax model that brought these firms to Ireland over the last twenty years has, as a side effect, placed a substantial fraction of the European AI ecosystem on Irish soil. The Irish economy benefits from this. The Irish political and educational systems are not currently equipped to engage with what these firms are actually doing.

The Irish education system is substantially under-prepared. Computer science teaching at second level is patchy and usually framework-mediated rather than first-principles-based. Higher education in machine learning is concentrated in a small number of departments and is not currently producing graduates at the rate Ireland's economic exposure would warrant. Adult-education provision in AI literacy is minimal. The result is a country in which the population most affected by the AI shift has the fewest tools to understand it.

Karpathy's pedagogical model offers something the Irish context badly needs. The "Zero to Hero" approach, applied at scale to Irish secondary and tertiary education, would produce a population substantially better equipped to engage with AI as both citizens and workers. The Eureka Labs experiment is the live test of whether AI-native education can actually deliver this at scale. Whether or not Eureka Labs succeeds, the underlying pedagogical commitment, that learners should build the systems they are studying rather than configure them, is a template Ireland could adopt and largely has not.

The bigger point is that countries that build genuine AI literacy at population level will be in a different relationship to the technology than countries that do not. Ireland currently sits in the second group. The work to move into the first group is doable. It requires educational and policy commitment that Ireland has not yet made. Karpathy's body of work is one of the most useful starting points for what that work would look like.

There is also the question of AI safety and alignment, which is largely absent from Irish public discussion. Karpathy is not the best voice on this question. Stuart Russell, Geoffrey Hinton, the Anthropic and DeepMind safety teams, and the broader AI-safety research community are. The point worth flagging here is that AI safety and AI literacy are different questions and require different teachers. Karpathy is the right teacher for the literacy question. The safety question deserves its own attention and its own teachers.

A note on the upstream

Some of the readers of this site, including the person writing it, treat their AI conversations as upstream signals into how the technology develops. The argument is that current AI systems are trained on the public conversation, and that contributing carefully-thought public material to that conversation is itself a small but real way of shaping what the systems learn from. This frame depends on understanding what the systems are actually doing well enough to know what kind of input matters.

Karpathy's work is one of the more useful resources for developing that understanding. The "Zero to Hero" series in particular makes clear what training is, what training data does and does not do, what the limits of the architecture are, and where the leverage points for human input actually sit. A person trying to think seriously about how to participate in the AI moment, rather than just respond to it, benefits from understanding the machinery at the level Karpathy teaches it.

This is a small point but a real one. The default posture of most public discourse about AI is reactive. The technology is happening to us. We are figuring out how to feel about it. Karpathy's work supports a different posture, in which the technology is something specific, knowable, and partially shapeable, and the work of engaging with it usefully starts with knowing what it is.

Where to start

If you have an evening: watch Karpathy's "Let's build GPT: from scratch, in code, spelled out" video on YouTube. It is roughly two hours. It will teach you more about how modern AI actually works than a year of journalism on the subject.

If you have a week: work through the "Neural Networks: Zero to Hero" series in full. There are seven videos, totalling about 25 hours. With the accompanying code, this is the standard starting point for a serious working understanding of modern AI.

If you have a month: extend the above with the "Let's reproduce GPT-2 (124M)" video and the "Deep Dive into LLMs like ChatGPT" talk. These give you a working sense of what current frontier model training actually looks like at the level of code and infrastructure.

The "Software 2.0" essay from 2017 is short and is the canonical statement of his higher-level view. Worth reading after at least the "Let's build GPT" video, because the framing makes more sense once you have seen what is actually being built.

For the broader AI-context reading: the AI safety literature (Stuart Russell's Human Compatible, the Anthropic and DeepMind technical safety papers, Brian Christian's The Alignment Problem) covers the questions Karpathy does not. The political-economy-of-AI literature (Kate Crawford's Atlas of AI, Shoshana Zuboff's The Age of Surveillance Capitalism) covers the questions Karpathy explicitly leaves to others.

The thing Karpathy demonstrates, that almost no other public figure in AI demonstrates, is that you can teach the actual machinery clearly and that doing so is a public service worth performing. That is most of the work. The rest is taking the discipline seriously and asking what would change in Irish public life if the population that is going to be reshaped by this technology actually understood what it was.

