Thanks for posting this and for the shout-out @milancurcic. I hope the idea was helpful. You might know the old saying that “The solution to every problem in computer science is an extra level of indirection.” That’s one way to think about what the addition of the new abstract type does in your hierarchy.
Most problems people find with OOP are actually problems with OOD as @FortranFan points out. If the problem is performance, then I’d point out that the Trilinos library is one of many examples of high-performing OOP code that runs at the exascale. Whether a code is OO and whether it’s fast are mostly orthogonal except that many times people choose an OOD that hurts performance.
@certik In several papers and my book, I’ve tried to demonstrate that many of these questions aren’t purely stylistic. There are ways that we can reason analytically about how various programming paradigms affect development. In my very first paper on software design in 2006, I put forth scaling arguments suggesting that encapsulation and information-hiding combine to attack a problem that scales quadratically with the size of the code, whereas polymorphism and inheritance attack problems that scale linearly with the size of the code. That suggests that the real power in OOP lies in encapsulation and information hiding. Also, I’d emphasize polymorphism over inheritance. In Fortran, the two are largely linked by the fact that we must use inheritance (type extension) to exploit our primary mechanism for runtime polymorphism (class). I’m expecting that once generic programming is supported, I’ll write a lot more code that uses compile-time polymorphism than runtime polymorphism.
Regarding inheritance, it’s important to remember that it’s a special case of a special case, and that’s the clue that even though it’s powerful, it is best used in limited circumstances. Think about the fact that the language aggregates a parent component into the child type, so inheritance is the special case of a one-to-one aggregation in which the encompassing object (the child) supports the entire interface (all the type-bound procedures) of the encompassed object (the parent component). Aggregation is much more general because it can support a one-to-many relationship (there can be multiple components of the parent type or even an array of them) and the encompassing object can choose which parts of the parent type’s interface to support. Those can be supported via delegation, wherein the encompassing object’s type-bound procedure invokes a type-bound procedure on the component object.
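To make the aggregation-with-delegation idea concrete, here is a minimal sketch. All names in it (engine_m, engine_t, airplane_m, airplane_t, thrust, throttle, total_thrust) are hypothetical and chosen only for illustration. The encompassing type holds an array of components (one-to-many) and chooses to support only part of the component's interface.

```fortran
module engine_m
  implicit none
  private
  public :: engine_t

  type :: engine_t
    private
    real :: thrust_ = 0.
  contains
    procedure :: thrust    ! read the current thrust
    procedure :: throttle  ! change the thrust
  end type

contains

  pure function thrust(self) result(t)
    class(engine_t), intent(in) :: self
    real :: t
    t = self%thrust_
  end function

  subroutine throttle(self, level)
    class(engine_t), intent(inout) :: self
    real, intent(in) :: level
    self%thrust_ = level
  end subroutine

end module engine_m

module airplane_m
  use engine_m, only : engine_t
  implicit none
  private
  public :: airplane_t

  type :: airplane_t
    private
    type(engine_t) :: engines_(2)  ! one-to-many aggregation, not inheritance
  contains
    procedure :: total_thrust      ! delegates to engine_t's thrust binding
    ! throttle is deliberately not part of airplane_t's interface
  end type

  interface airplane_t
    module procedure construct
  end interface

contains

  function construct(levels) result(airplane)
    real, intent(in) :: levels(2)
    type(airplane_t) :: airplane
    integer :: i
    do i = 1, size(airplane%engines_)
      call airplane%engines_(i)%throttle(levels(i))
    end do
  end function

  pure function total_thrust(self) result(total)
    class(airplane_t), intent(in) :: self
    real :: total
    integer :: i
    total = 0.
    do i = 1, size(self%engines_)
      total = total + self%engines_(i)%thrust()  ! delegation
    end do
  end function

end module airplane_m

program main
  use airplane_m, only : airplane_t
  implicit none
  type(airplane_t) :: airplane
  airplane = airplane_t([1., 2.])
  print *, airplane%total_thrust()
end program main
```

With type extension, by contrast, airplane_t would hold exactly one parent component and would have to expose both thrust and throttle.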
… and
3. The power of encapsulation is that it supports working with abstractions. We all demonstrate the power of abstractions in our daily lives. An airplane likely has hundreds of thousands or millions of components and millions of lines of code. If we had to list every component and line of code any time we referred to an airplane, all human communication would grind to a halt. Thus, the simple use of the “airplane” abstraction represents at least a 10^6 reduction in complexity. Fortunately, most derived types have far fewer components, but some of those components might be arrays with millions of elements so we’re also leveraging the “array” abstraction. The ability to layer abstractions is especially powerful.
4. OOD patterns typically encourage writing software that is highly cohesive but loosely coupled. The cohesive part implies that all the components and type-bound procedures are part of the type because they all do something related. Information-hiding in the form of private components is what gives us the loosely coupled property. Making any significant change to a public component (e.g., from a scalar to an array or from an intrinsic type to a derived type) necessitates hunting down and changing every place in the code that accesses that component. Making the same change to a private component might not impact any code outside the encompassing module as long as the same interface can be supported, i.e., the type-bound procedures can still accept and return variables in the format that the code outside the module expects. This is especially helpful to a newcomer to the project who wants to make a local change somewhere without breaking lots of code elsewhere.
To make my comments on information-hiding concrete, imagine one would like to refactor an existing project to exploit the units-tracking capabilities of the Quaff package developed by @everythingfunctional. I’ll let Brad correct me if I’m wrong, but I think this means that a derived type’s dimensional components that were previously of intrinsic type will become derived types, so real :: cylinder_radius will become type(length_t) :: cylinder_radius. With private components, one can make such a change one module at a time because if that component’s numerical value appears somewhere in the interface (e.g., as the result of a “getter” function), one can always extract that numerical value and return it so that the code calling the getter is not impacted by making the component a derived type. With public components, every changed component launches a hunting expedition visiting every part of the code that was broken by a change that is conceptually simple yet really powerful.
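A minimal sketch of that refactoring follows. It assumes a hypothetical stand-in length_t defined locally in length_m, since I don't want to misstate Quaff's actual API. The names cylinder_m, cylinder_t, radius, and in_meters are likewise illustrative only. The point is that the private component changes type while the getter still returns a plain real.

```fortran
module length_m
  ! Hypothetical stand-in for a units-tracking type; NOT Quaff's actual API.
  implicit none
  private
  public :: length_t

  type :: length_t
    private
    real :: meters_ = 0.
  contains
    procedure :: in_meters
  end type

  interface length_t
    module procedure from_meters
  end interface

contains

  pure function from_meters(meters) result(length)
    real, intent(in) :: meters
    type(length_t) :: length
    length%meters_ = meters
  end function

  pure function in_meters(self) result(meters)
    class(length_t), intent(in) :: self
    real :: meters
    meters = self%meters_
  end function

end module length_m

module cylinder_m
  use length_m, only : length_t
  implicit none
  private
  public :: cylinder_t

  type :: cylinder_t
    private
    type(length_t) :: cylinder_radius_  ! before the refactor: real :: cylinder_radius_
  contains
    procedure :: radius                 ! getter interface is unchanged
  end type

  interface cylinder_t
    module procedure construct
  end interface

contains

  pure function construct(radius_in_meters) result(cylinder)
    real, intent(in) :: radius_in_meters
    type(cylinder_t) :: cylinder
    cylinder%cylinder_radius_ = length_t(radius_in_meters)
  end function

  pure function radius(self) result(r)
    class(cylinder_t), intent(in) :: self
    real :: r
    ! Extract the numerical value so callers still receive a real
    r = self%cylinder_radius_%in_meters()
  end function

end module cylinder_m

program main
  use cylinder_m, only : cylinder_t
  implicit none
  type(cylinder_t) :: cylinder
  cylinder = cylinder_t(2.5)
  print *, cylinder%radius()  ! calling code is untouched by the refactor
end program main
```

Only cylinder_m changed. Had the component been public, every reference to it throughout the code base would have needed editing.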
This is a great example. The advantage of information hiding in the setter/getter approach is that indeed you can change how things are implemented internally. The disadvantage is that now you have the setter/getter everywhere.
The alternative design is that you expose real cylinder_radius to the whole code. The advantage is that things are in general simpler (no setters/getters). But it’s harder to refactor to change the type from real to type(length_t).
How much you want to hide things is a balance, and there is no clear answer. For performance, I personally believe you want to see through the whole code as much as you can, all the way from the top to the bottom (hardware), and make engineering design decisions that allow you to deliver the whole code and have it be performant. However, you then can’t easily change the design. So if the requirement is to be able to easily change the design, then you need to hide things more and design better interfaces.
For me the biggest issue with reliance on setter/getter routines to access data is the potential for a large performance hit. You are left with either a copy (and sometimes a deep one) or returning a pointer. I don’t find either option particularly attractive. Folks who push OOP on the premise that the cost of developing and maintaining code far exceeds the cost of running code, so performance can take a back seat, have probably never worked in an environment where human lives are at stake and you have to get correct results in as short a time as possible. No amount of savings in dollars compares with saving human life. Still, most of my issues with Fortran OOP are with how it is implemented in the language and not with the OOP paradigm. The issues with inheritance and run-time polymorphism were known at the time Fortran OOP was being developed, but instead of developing a facility that emphasized generics (templates etc.), we are stuck with a facility that is built around inheritance instead of focusing on the things that have been proven to be of the most benefit to scientific programming (generics, interfaces etc.). Hopefully, the introduction of templates in Fortran will save the language much like it did for C++ (but I’m not holding my breath until that happens).
@rwmsu excellent point. My collaborators and I have certainly worked on applications where performance matters a lot, including some safety-critical codes. I would never argue for ignoring performance altogether, but I’m reminded of Donald Knuth’s “Premature optimization is the root of all evil.” Most performance concerns I hear come without data. If one arms oneself with performance profiles and sees that a certain segment of code is the bottleneck, then by all means, do whatever is necessary to speed that code up.
@certik Another thing to consider is the extent to which one wants outside contributions. It can be discouraging for a new contributor to drop in to offer one contribution in one place and then have to touch lots of code in other parts of the project.
In my latest work, I almost never have setters because I try to construct whole objects and not leave the object in a partially defined state. Most of my getters are returning arrays or whole objects so at least there’s a lot of bang for the buck. I wouldn’t advocate using a getter to return a scalar that’s needed repeatedly in a loop, for example, but I’d also question an OOD that leads to that solution. Wherever possible, I focus on writing type-bound procedures that do more computation than just getting a component.
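As a rough sketch of that style, assuming a hypothetical sample_t type (all names here are illustrative): a user-defined constructor builds the whole object in one step, there are no setters, the one getter returns a whole array, and the other binding does real computation on the private data.

```fortran
module sample_m
  implicit none
  private
  public :: sample_t

  type :: sample_t
    private
    real, allocatable :: values_(:)
  contains
    procedure :: values  ! getter returning a whole array
    procedure :: mean    ! does computation rather than just exposing data
  end type

  interface sample_t
    module procedure construct
  end interface

contains

  pure function construct(measurements) result(sample)
    real, intent(in) :: measurements(:)
    type(sample_t) :: sample
    ! The object is fully defined on return; no setters are needed later
    sample%values_ = measurements
  end function

  pure function values(self) result(v)
    class(sample_t), intent(in) :: self
    real, allocatable :: v(:)
    v = self%values_
  end function

  pure function mean(self) result(average)
    class(sample_t), intent(in) :: self
    real :: average
    average = sum(self%values_) / size(self%values_)
  end function

end module sample_m

program main
  use sample_m, only : sample_t
  implicit none
  type(sample_t) :: sample
  sample = sample_t([1., 2., 3., 4.])
  print *, sample%mean()    ! mean of 1, 2, 3, 4
  print *, sample%values()  ! one call returns all the data
end program main
```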
Absolutely, this is part of the pros/cons engineering decision.
Here is a quote about this that I heard somewhere: “Software is written three times: first to understand the requirements, second to get the architecture correct, and third time for performance.”
So above I had mostly the case in mind where you already know the requirements and architecture, and just need performance.
@FortranFan I just checked 5 projects: 3 have no type extensions, 1 has it once, and 1 has it several times. These are all small projects so the total amount of code is probably < 10K lines.
These days, if I’m using type extension, it’s mostly to define abstract types: especially to support the Template Method pattern, in which a binding on the abstract type invokes deferred bindings to specify the steps in an algorithm but lets the child type implement each step.
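A minimal sketch of the Template Method pattern in Fortran, with hypothetical names throughout (solver_t, counter_t, run, initialize, advance, report): a non-overridable procedure on the abstract type fixes the order of the steps, and deferred bindings leave each step to the child type.

```fortran
module solver_m
  implicit none
  private
  public :: solver_t

  type, abstract :: solver_t
  contains
    procedure, non_overridable :: run          ! the template method
    procedure(step_i), deferred :: initialize  ! steps supplied by child types
    procedure(step_i), deferred :: advance
    procedure(step_i), deferred :: report
  end type

  abstract interface
    subroutine step_i(self)
      import :: solver_t
      class(solver_t), intent(inout) :: self
    end subroutine
  end interface

contains

  subroutine run(self, num_steps)
    class(solver_t), intent(inout) :: self
    integer, intent(in) :: num_steps
    integer :: step
    ! The algorithm's skeleton lives here; the steps are deferred
    call self%initialize()
    do step = 1, num_steps
      call self%advance()
    end do
    call self%report()
  end subroutine

end module solver_m

module counter_m
  use solver_m, only : solver_t
  implicit none
  private
  public :: counter_t

  type, extends(solver_t) :: counter_t
    private
    integer :: count_ = 0
  contains
    procedure :: initialize
    procedure :: advance
    procedure :: report
  end type

contains

  subroutine initialize(self)
    class(counter_t), intent(inout) :: self
    self%count_ = 0
  end subroutine

  subroutine advance(self)
    class(counter_t), intent(inout) :: self
    self%count_ = self%count_ + 1
  end subroutine

  subroutine report(self)
    class(counter_t), intent(inout) :: self
    print *, "steps taken:", self%count_
  end subroutine

end module counter_m

program main
  use counter_m, only : counter_t
  implicit none
  type(counter_t) :: counter
  call counter%run(num_steps=3)
end program main
```

Marking run as non_overridable is a design choice in this sketch: it keeps child types from changing the order of the steps.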
I just did a quick grep through 10 of my open source libraries and found 76 occurrences of the word extends. So as others have noted, it is quite valuable to a library author.
That said, I believe that templates will be a vast improvement over inheritance. I am excited to explore the designs possible when combining templates and run-time polymorphism. I suspect it will solve the multiple inheritance problem.
I grepped all my code and extends appears exactly zero times. Well, we do have it in 23 tests in LFortran, so you will be able to use it @everythingfunctional.
I agree that derived type structures are very beneficial. They are especially useful when developing new code by easily allowing changes to the data structures, without affecting other code.
OOP is not appropriate at the start of a new project.
By the time the data structure has been finalised, the problem is typically solved, so there is no funding for OOP adaptation.
I think it depends not only on the size of the problem, but also on whether it can be naturally mapped to objects (internet connections, files of a certain type, “natural” objects like stars in a larger-scale simulation), so that object-oriented programming makes one’s life easier and does not feel forced.
To be honest I have never used OO-Fortran in any serious project (although I might now, if I had to rewrite my codes), but where I work now (no longer Fortran- or science-related) an object-oriented programming style is mandatory and an intrinsic feature of the frameworks used.
@Rouson
You quoted “The solution to every problem in computer science is an extra level of indirection,” but that is incomplete: the saying goes on to say “except the problem of having too many levels of indirection”.