Resources and how-tos for learning about compilers
We all use compilers every day, but they can still seem like a mysterious black box at times. In this try! Swift talk, Samuel Giddins builds a tiny compiler for his made-up language 100% from scratch to get a feel for the basics of how compilers work.
We’re going to start by going over the basic structure of a compiler. Then we’re going to build a lexer and a parser for Kaleidoscope. Finally, we’re going to take the parsed output and compile it to LLVM Intermediate Representation (IR).
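To make the first stage concrete, here is a minimal sketch of a lexer in Python. It is not the code from the talk or from the LLVM tutorial: the token kinds, the regular expressions, and the names `Token`, `TOKEN_SPEC`, and `lex` are all assumptions chosen for illustration, loosely modeled on Kaleidoscope’s `def` and `extern` keywords.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Hypothetical token set for a Kaleidoscope-like language (an assumption,
# not the tutorial's actual definition).
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d*)?"),
    ("IDENT", r"[A-Za-z][A-Za-z0-9]*"),
    ("OP", r"[-+*/<(),;]"),
    ("SKIP", r"\s+"),
]
KEYWORDS = {"def", "extern"}

# One combined pattern; each alternative is a named group, so
# match.lastgroup tells us which kind of token matched.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(source: str) -> Iterator[Token]:
    """Turn source text into a stream of tokens, skipping whitespace."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        text = match.group()
        # Identifiers that spell a reserved word become keywords.
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"
        yield Token(kind, text)
```

Calling `list(lex("def add(x y) x + y"))` yields a flat list of `Token` values that a parser could then consume to build a syntax tree. Comments and error recovery are left out to keep the sketch short.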
These are things you can do on your own. I’ve arranged them roughly in order of difficulty and time commitment, although of course the language / environment you pick will affect things.
This is the “Kaleidoscope” Language tutorial, showing how to implement a simple language using LLVM components in C++.
This book contains everything you need to implement a full-featured, efficient scripting language. You’ll learn both high-level concepts around parsing and semantics and gritty details like bytecode representation and garbage collection. Your brain will light up with new ideas, and your hands will get dirty and calloused. It’s a blast.
A playground explaining how to create a tiny programming language (Mu).
In my Advanced Compilers course last fall we spent some time poking around in the LLVM source tree. A million lines of C++ is pretty daunting but I found this to be an interesting exercise and at least some of the students agreed, so I thought I’d try to write up something similar. We’ll be using LLVM 3.9, but the layout isn’t that different for previous (and probably subsequent) releases.
Now that a particular code name for Swift has been leaked from a Definitive Source, I’ll point out a little easter egg I put in: the “magic number” for swiftmodule files is E2 9C A8 0E. Those first three bytes are UTF-8 for ✨ (U+2728 SPARKLES).
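The byte claim is easy to check. The sketch below, in Python, decodes the first three bytes of the quoted magic number and compares a buffer’s first four bytes against it. The names `SWIFTMODULE_MAGIC` and `looks_like_swiftmodule` are hypothetical, and the check relies only on the four bytes quoted above; it does not parse or validate the rest of the file format.

```python
# The four magic bytes quoted in the text above.
SWIFTMODULE_MAGIC = bytes([0xE2, 0x9C, 0xA8, 0x0E])


def looks_like_swiftmodule(header: bytes) -> bool:
    """Hypothetical helper: True if the buffer starts with the magic bytes."""
    return header[:4] == SWIFTMODULE_MAGIC


# The first three bytes decode as UTF-8 to U+2728 SPARKLES.
sparkles = SWIFTMODULE_MAGIC[:3].decode("utf-8")
print(sparkles, f"U+{ord(sparkles):04X}")
```

To try it on a real file, read the first four bytes in binary mode (for example `open(path, "rb").read(4)`, where `path` is whatever file you want to inspect) and pass them to `looks_like_swiftmodule`.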