January 4, 2017

One of the core principles of modern object-oriented programming (OOP) says “Once and Only Once”. Having several places in a system where the same or similar code performs the same or similar tasks is usually bad. It makes code longer than necessary, it allows bugs to remain in one place that have already been fixed in another, and it increases the cost of optimization, feature development, and adapting code to new versions of frameworks and dependencies.

But what’s worse, it is a lazy, half-hearted approach to programming. We can do better. Imagine a cookbook of 200 pages which explains making fresh pasta. 200 pages, the ultimate book, containing everything you ever wanted to know about pasta, right? Wrong. The first twenty pages describe making dough. Then we have a page on how to form the first noodle out of the dough. Then we have a page on how to form the second noodle out of the dough. Several pages later, we have a chapter which explains short thick noodles. It begins with twenty pages on making dough. Then we have a page on how to form the first short thick noodle, then the second short thick noodle. The next chapter is about long thin noodles. It begins with twenty pages on making dough…

You wouldn’t buy that, right? Why should a customer accept such a product?

Good software is created by refactoring. First, you want to write the first thing that works. You might even copy-paste and adapt something similar that you wrote somewhere else, or that somebody else wrote, respecting the license and naming the original author. You finish the day a few hours late, glad you arrived at a point where your program somehow, for some reason, does the right thing, at least in your defined test case. You want to go to sleep.

But if you don’t get back to that code, look into it, reorganize the parts, and remove things you don’t need or already have somewhere else, you get software like this pasta book.

Recently, I was working on a component which implements multiple types of fields to hold and display data. Among those fields were an enumeration (choose one out of several predefined options), a relation (choose one out of a list of things stored somewhere else), a multienum (choose none, one or many out of several predefined options) and a multirelation (you can imagine, right?). They were somehow related, that’s obvious. It took a little time to figure out how the code needed to be structured to reduce the amount of duplication in the different classes. I ended up with the following:

  • A single-select enumeration is a special case of a multi-select enumeration, as the latter can do everything the former needs to do – it must only be restricted.

Now, for purely technical reasons, an important question was whether the different types of relation were more similar to each other than they were to the single and multiple enumeration respectively. It turned out they were. So the complete inheritance hierarchy looked like this:

  • A single-select enumeration is a special case of a multi-select enumeration, as the latter can do everything the former needs to do – it must only be restricted.
  • A multi-select relation is a special case of a multi-select enumeration.
  • A single-select relation is a special case of a multi-select relation, as the latter can do everything the former needs to do – it must only be restricted.
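
The hierarchy above could be sketched in Python roughly as follows. This is only an illustration, not the original component: the class names, the `max_selected` attribute and the `load_options` callback are assumptions made for this example.

```python
class MultiEnumField:
    """Choose none, one or many out of several predefined options."""

    max_selected = None  # hypothetical attribute: None means "no upper limit"

    def __init__(self, options):
        self.options = list(options)
        self.selected = []

    def available_options(self):
        return self.options

    def select(self, values):
        values = list(values)
        allowed = self.available_options()
        unknown = [v for v in values if v not in allowed]
        if unknown:
            raise ValueError(f"unknown option(s): {unknown}")
        if self.max_selected is not None and len(values) > self.max_selected:
            raise ValueError(f"at most {self.max_selected} value(s) allowed")
        self.selected = values


class EnumField(MultiEnumField):
    """Single-select enumeration: a multi-select that is only restricted."""

    max_selected = 1


class MultiRelationField(MultiEnumField):
    """Multi-select whose options are things stored somewhere else."""

    def __init__(self, load_options):
        super().__init__([])
        # hypothetical callback that fetches the related items on demand
        self._load_options = load_options

    def available_options(self):
        return list(self._load_options())


class RelationField(MultiRelationField):
    """Single-select relation: a multi-select relation that is only restricted."""

    max_selected = 1
```

In this sketch the only thing the single-select classes add is the restriction, which is exactly the piece of duplication that remains.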

The “it must only be restricted” part was still annoying duplication, but I needed to move on and implement other field types. A little later I decided once and for all that all those validations should be reusable parts, independent of the fields. Don’t repeat the same validation for similar fields; instead, let each field use validation items like “mandatory” (choose at least one value), “singleOption” (don’t allow multiple options) and “existingValue” (value must be from the option list). In the old Form class I wanted to replace, all of this was organized a little differently.
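
A minimal sketch of that idea in Python, under stated assumptions: the rule names follow the validation items mentioned above (in snake_case), while the `Field` class, the rule signature and the error messages are hypothetical and not taken from the original component.

```python
# Each rule takes the chosen values and the option list and returns
# an error message, or None if the values pass. (Assumed signature.)

def mandatory(values, options):
    """Choose at least one value."""
    return None if values else "at least one value is required"


def single_option(values, options):
    """Don't allow multiple options."""
    return None if len(values) <= 1 else "only one value is allowed"


def existing_value(values, options):
    """Every value must be from the option list."""
    unknown = [v for v in values if v not in options]
    return f"unknown value(s): {unknown}" if unknown else None


class Field:
    """Hypothetical field that holds options and a list of reusable rules."""

    def __init__(self, options, rules):
        self.options = list(options)
        self.rules = list(rules)

    def validate(self, values):
        values = list(values)
        errors = []
        for rule in self.rules:
            message = rule(values, self.options)
            if message is not None:
                errors.append(message)
        return errors


colors = ["red", "green", "blue"]
# A mandatory single-select enumeration and an optional multi-select one
# differ only in the rules they are given.
enum_field = Field(colors, [mandatory, single_option, existing_value])
multi_enum_field = Field(colors, [existing_value])
```

With rules as separate parts, the restriction no longer has to be repeated in each field class.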

The old Form class also coupled the presentation to the field type. I did not do this. I separated presentation type from field type just like I separated validation from storage.

The reason is simple. A list of radio buttons represents a single-choice enumeration just like a dropdown box or a clickable selection list. The data type cannot know which presentation fits best. For yes/no questions, even a checkbox might be an option. Or a multiselect could be served by a list of options with checkboxes. But if we have really huge lists of hundreds or tens of thousands of elements, a searchable box with autocompletion might be the best presentation, maybe combined with a short selection of recently or frequently used items. How could a data type know that? On the other hand, different presentations might mean the same thing and should not be different data types.
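
As a rough Python sketch of this separation: the field below only describes data, and the caller picks a presentation. `ChoiceField`, the presentation functions and the plain-text output format are all invented for this example.

```python
from dataclasses import dataclass


@dataclass
class ChoiceField:
    """Data type only: knows its options, not how it is displayed."""

    label: str
    options: list
    multiple: bool = False


# Presentations are plain functions that turn a field into display lines.
def radio_list(field):
    return [f"( ) {option}" for option in field.options]


def checkbox_list(field):
    return [f"[ ] {option}" for option in field.options]


def dropdown(field):
    return [f"{field.label}: [{' | '.join(field.options)}]"]


def render(field, presentation):
    # The form, not the field, decides which presentation to use.
    return "\n".join(presentation(field))


size = ChoiceField("Size", ["S", "M", "L"])
print(render(size, radio_list))
print(render(size, dropdown))
```

The same `size` field is shown once as radio buttons and once as a dropdown without becoming a different data type.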

Most experts find it important to strictly separate the process of feature development from the process of refactoring existing code. Otherwise you can end up in a complete mess. Or never finish. In different software projects, I have seen both. Some people have a talent for limiting themselves, for making minor improvements on the fly while moving around bits and pieces. I think this is fine; you just need to know when to stop messing around, get one thing done and leave the other for later.

I had already written this article some time before I published it. I decided to move some things around, to amend some parts, to delete what was said twice. At some point it was important to stop fiddling around. I thought “stop messing around, get the pasta done and leave the text for later”. Life is not so different from programming, after all.
