bookmark_borderCode Generators: Bad, worse & ugly

Code generators have been invented and forgotten at least four times in software history. They have an appeal to developers like the sun to Daedalus’ son. Let’s not be Icarus, let’s keep them generators at a distance and watch them carefully.

Whenever a language, framework or paradigm forces developers to do the same thing over and over and over and over again they will try to get rid of that repetition. One approach is to automate writing code. It is not pretty but it saves time to concentrate on more interesting and useful things. Seasoned and reasonable software developers have resorted to that solution and many inexperienced have followed. Outside very narrow use cases I think generated code should better not exist at all.

Valid use case

Generated code and other kinds of boilerplate code are valid where avoiding them is not practical. This is often true for entry points. Depending on language, it might not be trivial for a the first piece of running code to find

  • its configuration
  • its collaborators, base classes, dependencies
  • useful data from previous runs

I have written a long, long piece on two-stage autoloaders and other two-stage bootstrapping topics and I keep rewriting it, breaking out separate topics. It is still unreleased for a reason. Any two-stage process that splits automated detection or definition of artifacts from the production run that uses them is essentially code generation. Avoiding it might be possible but impractical. Some level of repetition cannot be avoided at all and is best automated.

Another valid use case is generating code from code or transpiling. Nothing is wrong with that.

Unfortunate Use Cases

There are other use cases that should be avoided. Your framework follows convention over configuration so making magic work requires having some artifacts in the right places. Even if they have no natural place in your specific solution they are needed for technical reasons so you copy/paste or auto-generate the minimum sufficient implementation and make it fit. This is something to look for. Often there are ways around it. Another case is limits of the underlying language. You evolved from using magic properties and methods to implementing type safe, explicit equivalents but now you have to re-invent the type specific list or container type and you automate it. Bonus points if your ORM tool requires it. If your language does not support generics or another templating method, you are stuck between repetitive, explicit code and weakly typed magic. You end up using a code generator. Hopefully at one point somebody is annoyed enough and ventures to bring generics into your language. That would be the better solution but it is likely out of scope for your day to day work.

Stinking unnecessary use cases

Beyond that you are likely in the land of fluff where things just happen while they can and lines of code are generated just because it is customary. This is a foolish thing best avoided. Granted, automating code is better than hand-writing it. It does however not mean the code should exist at all. If you have no specific reason to repeat code, it is likely a design smell. This is not new, the Cunningham wiki had this thought a decade or more ago. Likely they were not even the first to recognize it. Refactoring, abstraction, configuring the specifics can help reduce the necessity for repetitive code.

My programming tools were full of wizards. Little dialog boxes waiting for me to click “Next” and “Next” and “Finish.” Click and drag and shazzam! — thousands of lines of working code. No need to get into the “hassle” of remembering the language. No need to even learn it. It is a powerful siren-song lure: You can make your program do all these wonderful and complicated things, and you don’t really need to understand.

Ellen Ullman: The dumbing-down of programming, Salon, May 1998 https://www.salon.com/1998/05/12/feature_321/

Let us take the input to a code generator and make it the input to abstracted, ready to run code instead. We will know when it is not practical, not performant or not possible. Then code generation is a blessed thing. Otherwise it is a sin.