Learn how to build and use all parts of real-world compilers, including the frontend, optimization pipeline, and a new backend by leveraging the power of LLVM core libraries
Key Features
• Get to grips with effectively using LLVM libraries step-by-step
• Understand LLVM compiler high-level design and apply the same principles to your own compiler
• Add a new backend to target an unsupported CPU architecture

Book Description
LLVM was built to bridge the gap between compiler textbooks and actual compiler development. It provides a modular codebase and advanced tools which help developers to build compilers easily. This book provides a practical introduction to LLVM, gradually helping you navigate through complex scenarios with ease when it comes to building and working with compilers.
You'll start by configuring, building, and installing LLVM libraries, tools, and external projects. Next, the book will introduce you to LLVM's design, unraveling its practical applications in each compiler frontend, optimizer, and backend. Using a real programming language subset, you'll build a frontend, generate LLVM IR, optimize it through the pipeline, and generate machine code from it. Advanced chapters cover how to extend LLVM with a new pass, and how to use LLVM tools to debug and raise the quality of your code. You'll also focus on Just-in-Time compilation issues and the current state of JIT-compilation support that LLVM provides, before finally going on to understand how to develop a new backend for LLVM, which introduces you to the target description and how instruction selection works.
By the end of this book, you'll have practical, hands-on experience with the LLVM compiler development framework through real-world examples and source code snippets.
What you will learn
• Configure, compile, and install the LLVM framework
• Understand how the LLVM source is organized
• Discover what you need to do to use LLVM in your own projects
• Explore how a compiler is structured, and implement a tiny compiler
• Generate LLVM IR for common source language constructs
• Set up an optimization pipeline and tailor it for your own needs
• Extend LLVM with transformation passes and clang tooling
• Add new machine instructions and a complete backend

Who this book is for
This book is for compiler developers, enthusiasts, and engineers who are new to LLVM and are interested in learning about the LLVM framework. It is also useful for C++ software engineers looking to use compiler-based tools for code analysis and improvement, as well as casual users of LLVM libraries who want to gain more knowledge of LLVM essentials. Intermediate-level experience with C++ programming is mandatory to understand the concepts covered in this book more effectively.
Table of Contents
1. Installing LLVM
2. The structure of a compiler
3. Turning the source file into an abstract syntax tree
4. Basics of IR Code Generation
5. IR generation for high-level language constructs
6. Advanced IR generation
7. Optimizing IR
8. The TableGen language
9. JIT compilation
10. Debugging using LLVM tools
11. The target description
12. Instruction Selection
13. Beyond Instruction Selection
This book is above average quality for Packt, but substandard by the standards of reputable publishers. It’s newer, but substantially worse, than _LLVM Techniques, Tips, and Best Practices Clang and Middle-End Libraries_ by Min-Yih Hsu, which is the only book from Packt I’ve read worthy of a real publisher. While the copy-editing is, unusually for Packt, tolerable for both, the lack of a general editor is painfully apparent.
You may, regrettably, need to read this as well, as Hsu concentrates on the Clang front-end and middle-end; Nacke has considerable coverage of the back-end. (I skipped most of Nacke’s sections on the front-end, as it was already covered by Hsu.) Nacke’s organization is badly defective, and explanations are largely missing in favor of cookbook recipe steps; details below.
It was still worth reading parts, for me at least, for a unified presentation of some topics, and for what it omits. It is much nicer, for instance, to start with a _short_ explanation of datalayout. (Exceptionally, read Nacke’s section on TableGen instead of, or perhaps after, Hsu’s — which uses a nonsensical example, doughnut recipes.)
Nacke generally fails to explain new concepts; he just provides recipes. (Nor is there an architectural diagram.)
• His introduction of “unsigned” on page 149 does not mention how different its meaning is from higher-level languages.
• His first mention of Undef, on page 164, merely describes it as special, with no attempt to explain LLVM’s difficult concepts of Undef and Poison.
• The first substantive occurrence of metadata (p. 219) does not define it; I don’t think there is a real definition, but you can at least skip ahead to p. 240 to see an example.
• The existence of both an old and a new Pass Manager is not mentioned until the 17th occurrence (of 42) of the phrase, on p. 293. (The lack of definitional consistency means that a possible interpretation of the text is that the Legacy Pass Manager is distinct from the Old Pass Manager, which would make three in all.)
• The section on SelectionDAG never properly explains what a DAG is. (Hsu does.) The following section on its intended replacement, GlobalISel, merely presents a rival list of recipe steps, without explaining the difference.
Nacke’s organization is conspicuously incorrect in places, e.g.:
• The name of Part 2, “From Source to Machine Code Generation,” is incorrect; it doesn’t cover machine code.
• The name of Part 3, “Taking LLVM to the Next Level,” is both meaningless and incorrect.
• Chapter 8 “The TableGen Language” belongs where it is; it is used in what should be the following chapter, 13.1 “Adding a new machine function pass to LLVM.”
• The other two, Chapter 9 “JIT Compilation” and Chapter 10 “Debugging Using LLVM Tools,” do not, and Chapter 10 is not about debugging. (It includes libFuzzer, sanitizers, and the Clang Static Analyzer.)
• Chapter 13 “Beyond Instruction Selection” should actually be three distinct chapters:
  • 13.1 “Adding a new machine function pass to LLVM,”
  • 13.2 “Integrating a new target into the clang frontend,”
  • 13.3 “How to target a different CPU architecture.” The last is misnamed; it should be titled “Cross-Compilation.” (Much of the book is about targeting a different CPU architecture.)