Originally developed by Alfred Aho, Brian Kernighan, and Peter Weinberger in 1977, AWK is a pattern-matching language for writing short programs to perform common data-manipulation tasks. In 1985, a new version of the language was developed, incorporating additional features such as multiple input files, dynamic regular expressions, and user-defined functions. This new version is available for both UNIX and MS-DOS.
Most of my early programming was in Pascal. After discovering this book and the awk implementation from Thompson Automation Software (TAWK, a superset of traditional Awk; unfortunately TAS is now defunct), I have done all of my programming in AWK for the past 20 years. From small one-line scripts to five pages of code, all done with AWK (TAWK). It's been a great ride. I'm 80 now, and still at it.
Read for Perkeet: a programming guide published before 1990, which I picked because some 17 years ago a language technology course used a few individual awk commands, so I figured reading this might not be entirely pointless. Well, it turned out to be a complete waste of time even from that angle, because nowadays there are far more versatile tools for processing even large bodies of text than there were when the awk language was created. So I don't expect ever to use anything I read here, and nothing of it stuck in my head either. A thorough and systematic book, certainly, but not one that is needed in this decade any more.
A very valuable read even for those who have programmed in Awk for many years: if anything, it may show alternative, simpler ways of writing Awk scripts. It is a short, hands-on read that quickly covers the whole language.
Some of the internal details are explained in greater depth, such as how multi-dimensional arrays are actually mapped onto a single dimension. But most topics are covered only to the extent needed for proper usage.
While the authors frequently remind the reader that Awk scripts are useful for prototyping before migrating the code to another programming language, in my experience Awk (both the language and the implementation that I mostly use, Gawk) is very solid for production applications, with performance competitive with programs written in compiled languages. A classic case of the temporary becoming permanent.
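The mapping mentioned above can be seen directly from a script. A minimal sketch (the function name `show_subsep` and the array name `grid` are made up for illustration): a subscript list such as `2, 3` is joined into one string key with the built-in `SUBSEP` separator, so the array stays one-dimensional underneath.

```shell
# Illustration only: show_subsep and grid are hypothetical names.
show_subsep() {
    awk 'BEGIN {
        grid[2, 3] = "x"            # stored under the single key "2" SUBSEP "3"
        key = 2 SUBSEP 3            # build the same key by hand
        print ((key in grid) ? "same element" : "different element")
        for (k in grid) {           # recover the two subscripts from the key
            split(k, part, SUBSEP)
            print part[1], part[2]
        }
    }'
}

show_subsep
```

This should print `same element` followed by `2 3`.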
Wow, what a gem of a book! I read it for free in PDF form online (you can find it at archive.org), but loved it so much that I've ordered a paper copy on eBay for the grand sum of $4.
Despite its age, this remains a shining example of how to write the perfect programming book. It begins with a brief introduction to the language to get you going. Chapter Two is a dry reference which you should absolutely skim as the authors suggest, lest you die of boredom. Then all of the other chapters show increasingly virtuoso uses of the language: for text database queries, for report writing, for interactive domain-specific language interpretation (I thought the infix calculator example was particularly elegant in just 30 lines of code!), and then a full-on recursive descent parser!
The authors of the book are the authors of the language and it was wonderful to read about their experiences with the evolution and success of their creation in their own words.
As for the AWK language: I've certainly come away with a new respect for its capabilities. It's a full language with conditionals, functions, associative arrays, etc. The brevity of the automatic built-in record/field parsing is absolutely wonderful. Unfortunately, you are afforded such elegance for only one input and one output file/stream - after that, you must resort to C-style file reading/writing and (shudder) for loops with indexing, etc. Yuk! If AWK had a mechanism for multiple nested instances of its main record/field parsing, it could have been so much more...dare I wonder if successors such as Perl would even have occurred?
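To make the complaint about a second input concrete, here is a rough sketch of the explicit reading style the reviewer describes. Everything in it is an assumption for illustration: the function name `lookup_demo`, the temporary file `codes.txt`, and the sample data. The main input still gets automatic record and field splitting, while the second file has to be read by hand with `getline` in a loop.

```shell
# Illustration only: lookup_demo, codes.txt and the sample rows are hypothetical.
lookup_demo() {
    dir=$(mktemp -d)
    printf 'us United_States\nfi Finland\n' > "$dir/codes.txt"

    printf 'ann fi\nbob us\n' |
    awk -v codes="$dir/codes.txt" '
        BEGIN {
            # Second input: read line by line and split manually.
            while ((getline line < codes) > 0) {
                split(line, f, " ")
                name[f[1]] = f[2]
            }
            close(codes)
        }
        # Main input: records and fields are parsed automatically.
        { print $1, name[$2] }
    '

    rm -rf "$dir"
}

lookup_demo
```

With the sample data this prints `ann Finland` and `bob United_States`.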
In conclusion, the book is wonderful: by the end, you will absolutely know AWK inside and out. I cannot recommend this book highly enough. Likewise, the AWK language itself is also wonderful for its intended purpose: one-liners and small scripts for parsing single streams of input, especially if the input matches the record/field set paradigm. For longer scripts, you will find many worthy successors.
The 1988 book on awk by Aho, Weinberger, and Kernighan is as good as everyone says, and I highly recommend it for programmers and data science folks.
Specialized tools like cut and bc handle many simple tasks, and general languages like Python can do pretty complicated things. For a long time I avoided properly learning awk, perhaps because I didn't understand that it fills such a useful space between those two extremes, and that the learning curve is really reasonable.
Awk is good at processing lines of delimited text quickly and easily, with defaults that are often convenient. For example, here's adding a BMI column to this file I found:
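The reviewer's command and file are not reproduced here. As a stand-in, a minimal sketch of this kind of task, assuming a hypothetical whitespace-delimited file with a header row and columns for name, height in metres, and weight in kilograms (the function name `add_bmi` is also made up):

```shell
# Illustration only: the column layout and the name add_bmi are assumptions.
add_bmi() {
    awk '
        NR == 1 { print $0, "bmi"; next }             # extend the header row
        { printf "%s %.1f\n", $0, $3 / ($2 * $2) }    # BMI = kg / m^2
    '
}

printf 'name height weight\nann 1.60 64\nbob 2.00 80\n' | add_bmi
```

For the sample rows this appends `25.0` and `20.0` as the new column.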
That would probably take more effort with most other tools.
Unfortunately awk doesn't make it very easy to deal with some complications that can show up in CSV (quoting, newlines inside fields, etc.). A Python project called pawk looks like it could be the answer, if you don't mind installing it and using Python syntax.
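A small sketch of the quoting problem, using a made-up record and a hypothetical helper name `count_fields`: splitting on commas with `-F,` knows nothing about CSV quotes, so a quoted field that contains a comma is cut in two.

```shell
# Illustration only: count_fields and the sample record are hypothetical.
count_fields() {
    awk -F, '{ print NF }'    # naive split on every comma
}

# The record has two CSV fields, but the comma inside the quotes also splits.
printf '"Smith, J",42\n' | count_fields
```

This reports 3 fields instead of the 2 a CSV-aware parser would see.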
Here's the conclusion from the book:
> Awk is not a solution to every programming problem, but it's an indispensable part of a programmer's toolbox, especially on Unix, where easy connection of tools is a way of life. Although the larger examples in the book might give a different impression, most awk programs are short and simple and do tasks the language was originally meant for: counting things, converting data from one form to another, adding up numbers, extracting information for reports.
> For tasks like these, where program development time is more important than run time, awk is hard to beat. The implicit input loop and the pattern-action paradigm simplify and often entirely eliminate control flow. Field splitting parses the most common forms of input, while numbers and strings and the coercions between them handle the most common data types. Associative arrays provide both conventional array storage and the much richer possibilities of arbitrary subscripts. Regular expressions are a uniform notation for describing patterns of text. Default initialization and the absence of declarations shorten programs.
> What we did not anticipate were the less conventional applications. For example, the transition from "not programming" to "programming" is quite gradual: the absence of the syntactic baggage of conventional languages like C or Pascal makes awk easy enough to learn that it has been the first language for a surprising number of people.
> The features added in 1985, especially the ability to define functions, have led to a variety of unexpected applications, like small database systems and compilers for little languages. In many cases, awk is used for a prototype, an experiment to demonstrate feasibility and to play with features and user interfaces, although sometimes the awk program remains the production version. Awk has even been used for software engineering courses, because it's possible to experiment with designs much more readily than with larger languages.
> Of course, one must be wary of going too far — any tool can be pushed beyond its limits — but many people have found awk to be valuable for a wide range of problems. We hope we have suggested ways in which awk might be useful to you as well.
They take awk pretty far in the book, and it's delightful to see, even if you only end up using awk for simpler things like my example above.
At a remove of several decades, when maximal complexity and marketing juju (hadoop)
AWK’s purposes and perspective
[[[draft review. goodreads doesn't have a save function]]]
AWK is a small program WRITER. So it sits between "command-line tools" like cut, grep, par, col, lam, join† ----- and one-liners from perl/ruby. It’s more like making a dumb plot in gnu terminal and exporting this as a bash command, or calling r -pie 'as.numeric(stem(stdin()))' from littler to get a stem plot.
Are there *other* little-program WRITERs you could be making? How many arguments should your function take? What would computer world be like if everyone could understand and hack upon their lexer/parser/compiler rather than thinking only at the API edge of their language’s‡ abstractions?
If you wrestle with how to *architect* your functions, a look back onto the past might be worthwhile.
† join -- and thus it makes you think deeper about why we do databases the way we do
‡Be that heavy inheritance OOP; Rails; or something very very high-level like Macaulay2, in which case I think the result would be bad.
I wouldn’t recommend it to people who aren’t going to or haven’t already spent years on programming. It’s probably better for people whose thoughts have been too influenced by the node / HN set and want a break from ten-new-frameworks-open-source-github blah blah. The AWK book harkens back to before Cryptonomicon bleeding-edge blah blah, to the black-and-white days of classic Bell Labs.
The first two chapters are a clear, concise introduction to the language. After that it gets into use cases—Chapter 3 is called "Data Processing" and Chapter 4 is called "Reports and Databases." This is still pretty useful, though a lot of time is spent trying to get the reports to "look nice" (there's a hard ceiling on how nice monospace text can look). My Awk code is for me alone, so I don't need the output to have nice headings and line wrapping.
After that, it moves on to abuse cases—things that you should use a real programming language for. Chapter 6 is called "Little Languages" and begins with an example of an assembler and interpreter for "a hypothetical computer." These later chapters are nerds-only. Perhaps I will come back to them when I am wiser.
This is one of the best books on programming ever written. AWK is generally used for on-the-fly command-line scripts, although it has all the essential features of a complete programming language. The authors recognized that users were taking full advantage of the language for complex scripts. They do not disappoint when showing the possible applications: math expression parser, recursive descent parsing, an assembly code compiler and emulator, sorting algorithms, &c. All that, and a great tutorial on the basics of AWK.
Always heard about AWK but never looked into it before. I like it a lot! It’s not perfect for everything, and most of the time it’s probably better to use pandas, but sometimes you have text that isn’t well formatted and needs to be cleaned up and AWK is quite expressive when writing scripts for that purpose.
The text itself is pretty good—the exercises seem like a bit of an afterthought and the large program examples feel like page count filler, but the first few chapters and the appendix taught me enough to feel confident writing AWK one liners and slightly more complicated scripts.
A book about a programming language that I found interesting and an enjoyable read. I thought it would be fun to use for extracting the list of NuGets from several Visual Studio csproj files. A kind of quick and dirty SBOM (Software Bill of Materials). It worked as expected, but after reading the book, I think there are better ways using PowerShell and Python. AWK had its time in the past, so please read it for historical reference, but I don't think it fits in my tool bag.
read a few chapters from this, plus chapters from several other AWK programming manuals so i'm counting it. this one is extremely well written as kernighan's programming books tend to be, but doesn't include an overview of some of the more important features of GAWK (i was particularly interested in arrays of arrays).
An exemplar of technical writing. An incredibly thin volume at 200+ pages with remarkable print quality showing that the publisher (hey, it was the 80s) did not spare but actually printed on high-quality paper. Trashes all the print-on-demand crapfest/cheapfest everybody has to buy these days. The POD crap is estimated 2x (never less than 2x!) to 3.5x wider for the same number of pages.
I just read some parts of all the chapters; the first chapter pretty much covers whatever working knowledge you need to get started. The remaining chapters are more detailed, and probably serve more as a reference. I love Kernighan's writing style and this book didn't disappoint.
Why Why Why didn’t I read this 15 years before I did? I wasn’t an AWK respecter before I read this; I thought it was gimmicky. I was so wrong. A totally outstanding book. You will want to code in AWK well before the end!!
A well-done, professional revision of a classic programming title which illustrates how useful AWK can still be, even in 2024, as a text processing language.
AWK is a quirky little language. If you spend enough time working in a Unix environment you'll see it crop up once in a while. You can safely ignore it, of course, and go on writing shell scripts or use Python or Ruby to do quick data extraction and manipulation. I've wasted a lot of time writing Python scripts for what would have been a one line AWK program, but quick and dirty data manipulation is what AWK shines at.
This book is written by the "A", the "W" and the "K" in AWK. You couldn't get a more authoritative source even if you tried. But why would you? You'll learn all you need to know--and so much more. From one-liners you'll use often to programs that you'll skim through, and then promptly forget, because there are saner ways of doing it these days--at least Python and Ruby have proper local variables in functions (well, mostly). But that's beside the point. The authors' message is that little languages are fun, productive and that "good enough" is just good enough sometimes.
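For context on the local-variable remark: AWK has no declarations, and the usual convention is to list extra parameters that callers never pass, which then act as locals. A minimal sketch, with made-up names `sum_demo` and `sum_to`:

```shell
# Illustration only: sum_demo and sum_to are hypothetical names.
sum_demo() {
    awk '
        # i and total are extra parameters used as locals (by convention,
        # separated from the real parameter by extra spaces).
        function sum_to(n,    i, total) {
            for (i = 1; i <= n; i++)
                total += i
            return total
        }
        BEGIN {
            i = 99                 # global i is untouched by the call
            print sum_to(4), i
        }
    '
}

sum_demo
```

This prints `10 99`: the loop counter inside the function does not clobber the global `i`.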
Should you read this book? Dunno, I'm not you. But if you just want to use AWK, the first two chapters are enough. If, on the other hand, you want to get a taste of the "Unix philosophy" (whatever that is), you can read the whole thing.
Even though it's an artifact of its time, the book is entertaining, and for me it achieved its goal of teaching me how to use AWK.
This is another classic language manual in the same tradition as Kernighan and Ritchie's "The C Programming Language". As with "The C Programming Language", this book is a compact, lucid, tightly written guide to its subject. There's a lot of knowledge packed into a small number of pages here, not just about Awk, but about computing in general --- among the examples are toy or skeletal implementations of an awk-subset-to-C compiler, a column-oriented database, and the "make" utility.
Unfortunately, the book is mostly worth reading for its style and these examples, rather than as a guide to the language itself. I only rarely find myself reaching for awk: coreutils covers simpler record-manipulating tasks, while general-purpose languages like Python are almost as easy to use for more complex problems while offering significantly more return on investment.
All I wanted was to improve my shell text processing skills a little, but this book is so much more than that. Some of the examples go pretty deep and it could really be a pretty good introduction to the topic of programming for someone. It flows nicely and it isn't a long read, so I recommend you to check it out even if you aren't especially into AWK (which turns out to be a relatively pleasant language to use, actually.)
I like command line tools for their brevity, for their declarativeness and efficiency. Awk is one of those tools that you end up using every now and then, but only for printing out just that $n column. Awk is capable of doing much more than that. This is one of the best technical books I’ve read despite it being dated. The authors give a comprehensive overview of the practical capabilities of the language and of the extent to which it can be used in software engineering, not just devops. A very good read.
A minor classic that I managed to dig up. It covers a surprisingly large range of topics, and does so with style and eloquence. It's not just about awk -- it gives you a mini-CS education (e.g. topological sort in awk).
awk doesn't receive much credit these days, but it was the predecessor of Perl, which is known for starting this era of getting things done with dynamic languages.
A good start on the awk language. My brain and learning method prefer a lot of examples; this has some good ones but is a slender volume. I have used awk for a lot of statistical analysis on large text files. Handy once you get used to it.
I'm always amazed by the powerful little one-liners awk can shoulder, but have never given the language much study. It'd be useful, probably, to delve a little deeper with a book like this and sharpen my text-processing skills a bit more.
AWK is a wonderful language. Thanks to this great book, I was able to write countless utilities to tackle data transformation in text files. What would I have done, without AWK and this book! Million thanks to the illustrious authors!
First "language" I learned to make pretty output and manage text. Concepts are great and still a good reference if you are bash scripting and need some friendly, human readable output.