Object-Oriented Brain Damage
So, this is a post I’ve been thinking about for some time. I’m still surprised that it feels like swearing in church: it shouldn’t be this hard (and this rare) to criticize object-oriented programming.
What do I mean by OOP? Mainly the classical style, where you define classes with members and methods and use them to model the problem you’re dealing with. This way was supposed to be superior to the procedural style, where you write procedures that modify some global state. I believe the rage against procedural programming was really about that global state: in almost every domain we have to keep state somewhere, and keeping it globally makes procedures/functions nearly impossible to reuse.
I can understand how the reaction against global state led to something like Java’s everything-is-a-class creed. To contain the global state, you partition it into classes that each hold a piece of it and interact through methods. The idea is simple: if we forbid global state, keep partial state inside structs, and mutate it only through methods, we get reusable software. Once upon a time I believed this too.
In an ideal world where we could write our software in Smalltalk instead of C++, I probably wouldn’t be writing this post. Actually, this post isn’t about C++ or Java either. They have their places, they solve real problems, and they were probably necessary mistakes on the way to Go or Rust. We now have better ideas thanks to the everything-should-be-a-class world view.
The problem, in my experience, is applying these classical classes in interpreted languages. C++, although it has become a Leviathan with an arm for every paradigm, takes some care about the cost of abstraction; if you know the tool, you can get away with classes. Java also tries to make abstractions cost close to zero. So in these languages it’s a matter of modeling, habit, or domain suitability. I still think that over the long term OOP increases the maintenance burden, but like most ideas in our profession, this has never been tested.
However, when it comes to multi-paradigm interpreted languages like Python, objects begin to hurt. The problem, in my opinion, is that object-oriented design is in conflict with the Principle of Locality.
The Principle of Locality is probably the most important empirical idea in software design. It’s directly related to the Pareto Principle, Zipf’s Law, and other well-known notions. Our CPUs, disks, search engines, and content delivery networks all depend on it to cache artifacts for reuse. We know it works: a CPU with a larger L1 cache is faster, and we keep adding caches to CPUs and disks to speed them up.
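You can feel this even from Python, interpreter overhead and all. Below is a minimal, unscientific sketch (the list size and repeat count are arbitrary choices of mine, and exact numbers vary by machine): summing the same million floats in allocation order versus in a shuffled order. The bytecode and the arithmetic are identical; only the memory access pattern changes, and on most machines the in-order walk is noticeably faster.

    import random
    import timeit

    # A million boxed floats: the list holds pointers, the floats live on the heap.
    n = 1_000_000
    data = [random.random() for _ in range(n)]

    in_order = list(range(n))
    shuffled = in_order[:]
    random.shuffle(shuffled)

    def walk(order):
        # Same work either way; only the order of memory accesses differs.
        s = 0.0
        for i in order:
            s += data[i]
        return s

    print("sequential:", timeit.timeit(lambda: walk(in_order), number=10))
    print("shuffled:  ", timeit.timeit(lambda: walk(shuffled), number=10))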
The conflict between the PoL and OOP arises when classes include data that are not directly related to the problem at hand. The problem at hand might be finding the maximum of a million numbers, or transforming a few variables. But if those numbers are members of a class, they drag unrelated data into the working set. If the variable I’m transforming is a member of a class, then every time I access it through the object, all the other members, be they 3 or 30 or 300, come along for the ride. Looping over the objects pollutes the locality with every other member of the class.
In compiled languages, one advantage of iterators over plain loops is that they can overcome this. In interpreted languages, however, nobody writes dedicated iterators over their member variables. This means that when I have:
    for obj in my_objects:
        obj.do_something(a)
all the members of obj are now in the loop. This is against the PoL.
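To make this concrete, here is a minimal sketch; the class and its members are hypothetical, standing in for whatever your objects carry around. It finds the maximum of a million values first through object attributes, then through a flat array holding only the relevant field. In CPython the first version chases a million scattered heap objects just to read one float from each, while the second scans a single contiguous buffer of C doubles.

    import random
    from array import array

    class Measurement:
        # Hypothetical class: only "value" matters to the problem at hand,
        # but every access drags the rest of the members along.
        def __init__(self, value):
            self.value = value
            self.sensor_id = 0
            self.timestamp = 0.0
            self.label = ""

    objects = [Measurement(random.random()) for _ in range(1_000_000)]

    # Object-oriented: a million scattered heap objects, one field read each.
    max_oo = max(obj.value for obj in objects)

    # Locality-friendly: pull the one relevant field into a contiguous
    # block of C doubles and scan that instead.
    values = array("d", (obj.value for obj in objects))
    max_flat = max(values)

    assert max_oo == max_flat

Building the flat array costs one extra pass, which pays for itself as soon as the values are scanned more than once; better yet, data like this often never needed to live inside objects in the first place.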
Another problem I see with OOP is its alienation from the basic elements of software: when you begin to work with objects that don’t directly correspond to anything in the computer or the software architecture, what you build is more or less a castle in the clouds. You have to tie this castle, i.e. the class hierarchy, to the ground somehow, and the ground is the CPU, memory, disk, cache, or the like. Compilers may do a good job of this, or they may not, but this detachment from the basic machinery of computing is the cause of what I call Object-Oriented Brain Damage.
When you begin to believe that the objects in your program are real, you try to express your problem with ever more objects. Classes are like kipple: they proliferate all the time. You add classes, then you add classes to create classes, then you add some base class to derive classes from, then you try to fix the resulting problems with patterns, and so on.
But this whole enterprise has nothing to do with the real tools we have. Our tools are processors, memories, disks, screens, printers, and so on. Detaching the problem from these basics doesn’t make it more solvable. We only create an idealized version of our own understanding, one that then demands extra effort to teach others, to document, etc.
It’s true that we need some abstractions over the tools; we cannot just read bytes from disk, process them in the CPU, and print to the screen, so all of these must be abstracted. But in my experience OOP is not a good way to abstract these tools. Instead it abstracts the problem at hand in an arbitrary way that is supposed to solve it, and that solution then becomes a new problem: fitting it back onto the ground.
Databases and ORMs are a good example of this. Databases and Entity-Relationship theory are a well-understood abstraction: we know how to use them across multiple CPUs, multiple disks, and multiple machines on multiple continents. ORMs are not like this. They are supposed to correspond to databases while giving off that cozy object-oriented feeling, but anyone who depends on an ORM eventually learns that it is not identical to the database. It doesn’t have the same capabilities or performance, and in the long run depending on an ORM creates more problems than simply writing SQL queries would. Because the ORM ignores the tools we have and their limitations, it tries to fit an idealistic world view onto the database, and when that doesn’t scale, we tell ourselves the degradation is natural.
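Here is the difference in miniature, using nothing but the standard library’s sqlite3; the table, its columns, and the row count are all made up for the sketch. The first loop imitates the usual ORM pattern of materializing a full row to read one field, while the second query asks the database for exactly the aggregate it wants.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, "
                 "customer TEXT, address TEXT, notes TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
                     ((i, i * 0.5, "cust", "addr", "note")
                      for i in range(100_000)))

    # ORM-style: fetch whole rows, then read a single field from each.
    total = 0.0
    for row in conn.execute("SELECT * FROM orders"):
        total += row[1]  # every other column was fetched for nothing

    # Database-style: let the engine compute over the one column that matters.
    (total_sql,) = conn.execute("SELECT SUM(amount) FROM orders").fetchone()

    assert abs(total - total_sql) < 1e-3

A real ORM stacks per-row object construction on top of the full fetch, and over a network connection the unused columns cross the wire as well; the aggregate query avoids both.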
No, this degradation is not natural. When your models load whole rows from billions of records just to access a single field in a calculation, you deplete the cache space quickly, and it doesn’t scale. OOP might have value as an educational tool, but even in that regard, I believe it causes most of the brain damage we see in enterprise software.