Thursday, June 1, 2017

Reusable Software

Ever since the invention of modular programming, reusable software has been a huge deal.  Nearly every programming language comes with built-in reusable parts or some kind of standard library.  A great number of articles have been written extolling the many virtues of reusable software components.  Enormous repositories of reusable parts have been created.  Most problem spaces have complex ecosystems of frameworks, libraries, and sometimes even just short snippets of code or individual reusable functions.  Anything this good has to come at a cost though, and the costs of reusable software are almost universally ignored.

First I want to say that reusable software is awesome.  I am not trying to bash it here.  Reusable software has many great benefits.  It saves a ton of programming time.  Popular reusable components tend to be fairly high quality, because they have already been tried, tested, fixed, and improved.  Reusable components are typically fairly easy to learn and use, because no one wants a component that costs more to learn or to use than writing the same thing fresh would.  Most of the time, reusable software makes it possible to build reasonably high quality software far faster than writing it from scratch.

Reusable software is a two-edged sword, and most programmers cannot see the edge that is facing them.  Reusable software is so good that it is often difficult to see the costs, and even when the costs are known, they are often ignored.  Most of the time, the costs of reusable software are acceptable, but occasionally they can cause serious problems.  For example, recently a large number of web sites stopped working because a trivial piece of reusable code was removed from a popular Node.js repository.  The time saved by using the code was probably between 1 and 5 minutes for a single developer.  The total cost in web site downtime was probably in the hundreds of thousands or millions of dollars.  This is not a common scenario, but it is one that could easily have been avoided by spending a few minutes to write a trivial piece of code, instead of relying on an external source to remain available forever.
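To give a sense of scale, here is a rough sketch of what "a trivial piece of code" can look like.  The post does not say what the removed code actually did, so the string-padding helper below, its name pad_left, and its parameters are hypothetical stand-ins chosen purely for illustration.

```python
def pad_left(text: str, width: int, fill: str = " ") -> str:
    """Pad text on the left with fill until it is at least width characters long.

    Hypothetical stand-in for a tiny utility that might otherwise be
    pulled in as an external dependency.
    """
    if len(fill) != 1:
        raise ValueError("fill must be exactly one character")
    missing = width - len(text)
    if missing <= 0:
        # Already wide enough: return the text unchanged.
        return text
    return fill * missing + text


print(pad_left("42", 5, "0"))  # prints 00042
```

A helper of this size takes a few minutes to write and test, and it cannot disappear from under the application.  In Python the built-in str.rjust already covers this particular case, so the example is only meant to show how small such a dependency can be.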

Perhaps the most obvious cost of reusable software is fitness for a particular purpose.  To be useful, reusable code must be usable for a variety of applications.  This means it must be generic.  One cost of making something generic is that it becomes less suitable for nearly every application, especially very specialized ones.  For flexible applications that are not critical, the application can generally be adapted to the reusable code.  This will almost certainly affect design, but it rarely affects the utility and usability of the application.  Sometimes, however, it does.  Some specialized applications have strict design requirements that a generic component would violate.  In these cases, reusable software just cannot be used.  Occasionally, reusable software will seriously limit the design choices of even more mundane software in ways that are unacceptable.  In these cases, reusable software should not be used.  If using a particular framework or library feels like trying to fit a square peg in a round hole, it may be time to consider alternatives, including writing the code from scratch.

Another fairly obvious but deliberately ignored cost of reusable software is performance.  Making something generic means making it suitable for a wide variety of use cases.  This means a lot of features and a lot of "just in case" elements.  These use memory and processing power, whether you actually need the features or not.  In many applications, this does not make a huge difference.  In some though, it can make an enormous difference.  In applications that are performance critical, reusable code in bottlenecks can be a major problem.  Even in applications where performance is not critical, bloated or slow reusable code can be a problem.  Modern computers never run just one application at a time.  The typical computer is running 10 to 50 processes at any given moment, and all of those processes have to share resources.  While it might seem fine to waste a few megabytes here and there, when there are 50 processes wasting a few megabytes each, memory use can become a serious problem.  Similarly, processor time also must be shared between processes, and a process that is wasteful can affect the performance of the entire machine.  For many kinds of applications, this is rarely a problem but for some (web pages, for example, where a user may have anywhere from 5 to 100 tabs open at a time), a little resource hogging can go a long way.  In general, it is a good idea to keep in mind that reusable code is consuming resources for all of its features, even those you are not using.  A good rule of thumb is that if you are not using more than one or two features of a framework, it might be time to consider looking for something lighter or writing the code from scratch.

Reusable software makes code less maintainable.  This is very counter-intuitive.  There is a general assumption that using a framework or library takes some of the maintenance burden off of the programmer, and this is true.  Reusable code can definitely reduce the burden of maintenance, but this is not about maintenance time spent or saved.  This is about being able to maintain the code in the first place.  Learning exactly how a piece of reusable code works takes enough time that it defeats the purpose of using it, so few programmers bother.  If the reusable code itself has a bug, however, the only options are to ditch the reusable code and home brew a replacement, to pay someone to spend hours, days, or weeks learning how the code works and fixing the bug, or to report the bug to the project and wait for it to get fixed upstream.  The second option is so expensive that it is hardly ever chosen.  The typical solution is to sacrifice quality, utility, and usability by working around the bug.  In addition to this, because reusable code is significantly harder to change (the programmers using it are not intimately familiar with its code), valuable design changes to a program may be impossible.  As with bugs, if a valuable design change is needed that does not work with the reusable software, the options are to ditch it and home brew a replacement, or to pay someone for a lot of labor to figure out how to adapt the reusable code.  Of course, once the reusable code has been altered, updates from the original source can no longer be applied directly, and the reusable code must be maintained entirely in-house, at additional expense.  Again, this is not necessarily a problem that will come up a lot.  Major design changes after delivery are not exactly common.  Reusable code also tends to be better tested, so bugs in it will be rarer.  When they do come up though, they can be very expensive.

Overall, reusable software is very valuable, and nearly every program uses it in some way.  It also comes with some costs though, and understanding those costs is important to get the most out of it.  Sometimes it is better to write the code from scratch.  Code written for the specific application will always fit that application better than generic reusable code.  Reusable code can ultimately be more expensive than coding from scratch.  Cases where reusable code costs more than writing code from scratch are generally uncommon, but determining this should be part of the cost analysis of any project.  In addition, there are different levels of reusable software that should be considered.  Using something like the C standard library or other built-in functions and features of a language is generally a given.  For any popular language, these have been tested and optimized more than anything else.  Libraries for hardware interaction are often necessary as well, though there may be multiple options.  These tend to have more testing behind them as well.  Convenience and aesthetics libraries (jQuery, for example) are typically far less critical, so they tend to have less rigorous testing, and the benefits they provide matter less.  Less essential libraries that have been around for less time deserve the most scrutiny.  Very small pieces of reusable code also deserve scrutiny, because they may end up costing more time and resources in the long run than just writing them from scratch.  Despite its high value, reusable code is not a silver bullet.  It comes with its own costs, and sometimes those costs can be much greater than any benefits.  Our industry needs to spend a bit more time considering the options and potential consequences than it currently does, and if it did, things would work a lot more smoothly for everyone.
