This is a script called m5.awk that I randomly found after reading a paper about it on the ACM website. The paper’s worth reading too but the script is just a piece of work. Commented to the brim, clean, and most importantly, useful as hell. It could be a replacement for m4. It allows you to embed AWK in text files, and besides that, it offers several preprocessing features via macros. Give it a try.
This is a strong example of the distinction between abundant comments and good documentation. Comments should describe why and how, not what. Comments on every single line stating what the line is doing are rarely a good sign. Despite all the comments, it’s not super clear what it actually does from a user perspective and why someone would want to use it. I assume it’s meant to be a templating system for text documents.
I forgot to mention, tools like Bison and Autoconf use m4. Andrew Appel calls languages that need preprocessing ‘weak’. The specs for D also call C ‘weak’ for needing to be preprocessed. Preprocessing is almost an abandoned practice, at least in the world of programming, because Scheme-like ‘hygienic macros’ have taken their place. Some languages like Rust and Zig are attempting to reinvent the wheel by re-inventing macros, but Scheme had it right in the 70s. Truly, C and Pascal’s preprocessor were out the door in the 70s, let alone now.
Here’s another preprocessor that is neat: GPP.
I have made my own preprocessor, it’s very nasty and has a lot of bugs because I was an idiot when I made it: https://github.com/Chubek/Ekipp
I have half a mind to remake Ekipp. I don’t know what purpose it would serve. But it’s just neat. I think preprocessors are neat, as useless as they are.
It does not really need to explain what it ‘is’ because
1- a paper has been released on ACM explaining it, you can download it for free from here: https://dl.acm.org/doi/pdf/10.1145/353474.353484
2- Its name is m5, and most people in the UNIX world know what m4 is. If you have Linux, bring up `info m4`.
Imagine this: when you use C, there is a preprocessor that handles all the `#include` and `#define`s. m4 is like that and m5 is like that too. Preprocessing used to be big, but now it’s not. If you wanna see an example of preprocessing, look at this preprocessor I made for C: https://gist.github.com/Chubek/b2846855e5bb71a67c7e3effc6beefd6
If only this post title had received descriptive text too
I am too dumb for this. Can somebody give an example of how this could be used in a real world scenario?
It’s a preprocessor. Like m4 and GPP. I thought that is clear from the name?
These words mean nothing to me. That is why I specifically asked for a real world example. By that I mean something like a user story.
You could start with:
“Imagine you are Frank Frankis and you have an office job. Your boss tells you to do […] . But […] would take a long time to do manually because […]. Frank uses m5 in the following way to solve the […] task.”
Followed by an example input, example command and example output that is a solution to the problem from the scenario.
Without that - I am too dumb to understand.
Thank you in advance.
So I am going to explain the concept of macro preprocessors to you. m5.awk is a macro preprocessor, and so are m4, GPP, C’s CPP, my Ekipp, and my AllocPP.pl. They all work like this:
1- Frank Frankis (hereby referred to as FF) creates a file called `my-file.randext`. The `randext` here means that macro preprocessors work on all kinds of files, be it C code, Python code, an HTML template, your tax files, your notes to your GF, your angry letter to your boss, etc.
2- There are dozens and dozens of uses for a macro preprocessor, but let’s say FF wants to obliterate two birds with one sniper shot: he wishes to write a manual for his new Car XAshtray ™ in both HTML and Markdown, but he wants it contained within one single file. He also wants certain fragments of text, certain HTML tags, and certain Markdown syntaxes to be ‘reusable’, just like a ‘function’ is a reusable piece of code in an imperative language (but not a functional language!) or how a template works in C++, etc.
3- FF enters the file with his favorite text editor. He defines these ‘functions’, which are the ‘macro’ part of a macro preprocessor. Think what ‘macro’ means. It means ‘big picture’ basically. I think the better term here is ‘meta’. These two words have a close relationship in the English language, don’t they?
Now let’s see what macro preprocessor FF uses. Since GPP is really similar in syntax to C’s preprocessor (at least with default settings) let’s use GPP. I would use my Ekipp here but I honestly have forgotten the syntax (believe it or not, creating a language does not mean you are good at it).
```
#ifdef __HTML__
#define TITLE <h1>My Car XAshtray Manual</h1>
#define SUBTITLE <h5>Throw your ash on me</h5>
#define BOLDEN(text) <b>text</b>
#elif __MARKDOWN__
#define TITLE \# My Car XAshtray Manual
#define SUBTITLE \#\#\#\#\# Throw your ash on me
#define BOLDEN(text) **text**
#else
#error "Must define a target language"
#endif
```
FF keeps defining these. Now comes writing the actual manual.
Keep in mind that GPP stands for ‘Generic Preprocessor’; it’s a text macro preprocessor and not a language preprocessor like CPP (C’s preprocessor) is. m4 and Ekipp are generic like that too. My AllocPP.pl is a language preprocessor: it preprocesses C. This means FF can freely treat `my-file.randext` as a text file, with a ‘language’, the macro preprocessor language, defining the output (I’ll talk about what I mean by ‘output’ soon). So he writes his manual:
```
TITLE
SUBTITLE
Hello! Are you okay? Why did you buy this ashtray? BOLDEN(ARE YOU OKAY?). In this manual I will teach you how to use my ashtray...
...
```
Replace the last ellipsis with the rest of the text, the Car Ashtray manual.
Now, FF needs to ‘preprocess’ this text file. This is, well, the ‘preprocessing’ part of a macro preprocessor. It’s like compiling a program with a compiler, except you are compiling text-to-text instead of text-to-binary.
```
gpp -D__MARKDOWN__ my-file.randext > my-manual.md
```
But what happened at `-D__MARKDOWN__`? I think you have already guessed. In the 'program' we asserted that if `__MARKDOWN__` is defined, those macros get their Markdown definitions, and if `__HTML__` is defined, they get their HTML ones. We can also define a macro with a value:
```
gpp -DMyMacro=MyValue my-file.randext > my-manual.md
```
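If it helps to see the mechanics behind `-D`, here is a minimal Python sketch of what a text preprocessor does with `#ifdef`, `#else`, `#endif` and a simple `#define`. This is not how GPP is implemented. The names `preprocess`, `expand` and the `predefined` parameter (standing in for `-D`) are made up for illustration, and the sketch ignores `#elif`, macro arguments and rescanning of expanded text.

```python
import re


def expand(line, macros):
    # Replace whole-word macro names with their bodies (single pass, no rescanning).
    return re.sub(
        r"\b[A-Za-z_]\w*\b",
        lambda m: macros.get(m.group(0), m.group(0)),
        line,
    )


def preprocess(text, predefined=None):
    """Expand a tiny subset of cpp/GPP-style directives.

    `predefined` plays the role of -DNAME=VALUE on the command line
    (a hypothetical parameter, not a GPP API).
    """
    macros = dict(predefined or {})
    active = [True]  # stack: is the current conditional branch being emitted?
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#ifdef "):
            name = stripped.split()[1]
            active.append(active[-1] and name in macros)
        elif stripped == "#else":
            taken = active.pop()
            active.append(active[-1] and not taken)
        elif stripped == "#endif":
            active.pop()
        elif not active[-1]:
            continue  # inside a branch that was not selected
        elif stripped.startswith("#define "):
            _, name, *body = stripped.split(None, 2)
            macros[name] = body[0] if body else ""
        else:
            out.append(expand(line, macros))
    return "\n".join(out)


SOURCE = """\
#ifdef __MARKDOWN__
#define TITLE # My Car XAshtray Manual
#else
#define TITLE <h1>My Car XAshtray Manual</h1>
#endif
TITLE
Hello! Are you okay?
"""

# Like `gpp -D__MARKDOWN__ ...`: the Markdown branch is selected.
print(preprocess(SOURCE, {"__MARKDOWN__": ""}))
# Nothing predefined: the #else branch (HTML) is selected.
print(preprocess(SOURCE))
```

The same input file yields a Markdown heading or an HTML heading depending only on what was predefined, which is the whole point of passing `-D` on the command line.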
Now, GPP has more built-in macros like `#ifdef`; they are called 'meta macros' (as opposed to the macros you yourself define). There's `#include`, which includes a file. There's `#exec`, which executes a shell command. Etc etc. You can read more about GPP on its GitHub. I was in touch with its maintainer, Tristan Miller, very recently when I showed him my Ekipp. He has made a new version of GPP, so don't install it from your package manager like apt; install it from source, because the release is very recent and these packages take ages to be updated. GPP is just one C file, very neat and clean. Read the man page (`man 1 gpp`) for more info. m4, m5, Ekipp, etc. are, as I said, generic text preprocessors too. My Ekipp has this feature where you can embed any program in the text, the way PHP works:
```
#! delimexec $ ‘‘awk “{ print $1; }”’’ | <== foo bar ==>
```
This will run the AWK program in the file. You can install my Ekipp using these commands:
```
sudo apt-get install libreadline-dev libgc-dev libunistring-dev
wget -qO- https://raw.githubusercontent.com/Chubek/Ekipp/master/install.sh | sudo sh
```
Bring up `man 1 ekipp` to learn about it. Keep in mind that Ekipp has some bugs. I will have to completely rewrite it, honestly, but I am busy making an implementation of ASDL (github.com/Chubek/asdl) and I am working on an implementation of AWK and later C, so a macro preprocessor does not bite me really. Thanks.
Thanks, I appreciate your effort a lot and I understand the use case now!
Languages like C have a preprocessor. The preprocessor preprocesses the source code before compiling it. The C preprocessor copy-pastes code from other files into the current file (`#include`s), erases code (`#if`, if the condition is false), and expands macros (e.g. if you have `#define MAX(x, y) ((x) < (y) ? (y) : (x))`, it replaces every use of that macro with the definition of that macro: `a += a * MAX(b, c)` → `a += a * ((b) < (c) ? (c) : (b))`). There are also general-purpose preprocessors (or macro processors) that are not tied to a specific programming language. m4 is one of them, and GNU autotools make extensive use of it to generate their configuration and make files. What preprocessors allow one to do is write a template, and then generate a result based on your needs.
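To show that macro expansion really is just text substitution, here is a rough Python sketch that expands the `MAX` example by hand. It is only an illustration, not what a real C preprocessor does internally: `expand_call` is a made-up helper, and it assumes the arguments contain no nested parentheses or commas, whereas a real preprocessor works on tokens.

```python
import re


def expand_call(source, name, params, body):
    """Textually expand calls like NAME(arg1, arg2) found in `source`.

    Hypothetical helper: assumes arguments contain no nested
    parentheses or commas.
    """
    call = re.compile(r"\b" + re.escape(name) + r"\(([^()]*)\)")

    def substitute(match):
        args = [a.strip() for a in match.group(1).split(",")]
        if len(args) != len(params):
            raise ValueError(f"{name} expects {len(params)} arguments")
        mapping = dict(zip(params, args))
        # Swap each parameter name in the body for the matching argument text.
        return re.sub(
            r"\b\w+\b",
            lambda m: mapping.get(m.group(0), m.group(0)),
            body,
        )

    return call.sub(substitute, source)


expanded = expand_call(
    "a += a * MAX(b, c)",
    name="MAX",
    params=["x", "y"],
    body="((x) < (y) ? (y) : (x))",
)
print(expanded)  # a += a * ((b) < (c) ? (c) : (b))
```

The arguments are pasted in as text and never evaluated, which is why the macro body wraps every parameter in parentheses.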
I mean, it’s two unique characters, why are people so confused?!