Future Proofing
The art of software design is a controlled exercise in trying to predict the future with very imperfect tools.
If you've been involved in the software business for any length of time, chances are pretty good that at some point you will find yourself trying to build something that's never existed before. If you're lucky, you're the one making the estimate about how feasible that thing is to build. If instead you have to implement someone else's vision, especially the vision of someone non-technical, you might as well pack your bags - you're going to be spending a lot of late nights at work.
There are things you can do in the design process that make software easier both to build and, more importantly, to maintain. Indeed, keep in mind whenever you're putting together software that development will typically account for only about 15-20% of its overall lifecycle; applications built on the assumption that a replacement would arrive within five years have remained in place 15-20 years later, primarily because people were afraid to touch them.
As a consequence, before the first line of code is ever written, a number of solid design principles should be fully and firmly understood:
- Extensibility is a hallmark of good design. All too often, software is built or deployed for one purpose, and one purpose only. As the business evolves, that purpose slips out of sync with the business's actual requirements and eventually falls into obsolescence. And because the software was tied into other processes, the obsolescence spreads, ultimately requiring a completely new solution.

To the extent possible, spend some time looking beyond the immediate requirements to see how the software can be adapted easily (without tearing down the building) to the likely requirements of the next few years. The solutions for those future requirements do not need to be built now, but mechanisms should exist to allow you to extend your application when they are needed (see the first sketch below).

- Use Open Standards. An open standard is one with a clearly defined set of protocols agreed upon not just within your company but globally. Businesses get acquired all the time, and if you build on XML, JSON or RDF standards, the odds of cleanly integrating your existing applications with the new ones rise dramatically. Open standards also prevent you from being dependent upon a vendor that may go belly up before the lifecycle of your application is over: by working with them you can at least control the input and output and keep them stable in the face of change.
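To make the extensibility point concrete, here is a minimal sketch of an extension point in Python. The registry and handler names are illustrative, not from any particular framework; the idea is simply that new behavior arrives by registration rather than by rewriting the core:

```python
from typing import Callable, Dict

# Hypothetical extension point: the core pipeline dispatches on document
# type, so new handlers can be registered years later without touching it.
_handlers: Dict[str, Callable[[dict], dict]] = {}

def register_handler(doc_type: str, handler: Callable[[dict], dict]) -> None:
    """Register a processor for a document type at any point in the app's life."""
    _handlers[doc_type] = handler

def process(doc: dict) -> dict:
    """Dispatch to whichever handler is registered; pass through if none is."""
    handler = _handlers.get(doc.get("type", ""), lambda d: d)
    return handler(doc)

# A requirement that arrives later is met by registration, not by a rewrite.
register_handler("invoice", lambda d: {**d, "processed": True})
print(process({"type": "invoice", "total": 120}))
```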
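And on the open-standards point, even the humblest version of the practice pays off. Data written with Python's standard json module can be round-tripped by any JSON implementation in any language, which is exactly the vendor-independence argument above:

```python
import json

# A record exported as standard JSON, rather than a proprietary format,
# can be read by any language, vendor, or successor system.
record = {"id": "cust-1001", "name": "Acme Corp", "active": True}

serialized = json.dumps(record, indent=2)  # open, self-describing text
restored = json.loads(serialized)          # round-trips with any JSON library
assert restored == record
```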
- Design for discoverability. Your application (and your programming philosophy) should be built around the notion that both data and services are discoverable. From a data perspective, this generally means not hard-coding assumptions about what collections of data are available, especially reference data or controlled vocabularies, but instead providing a way to survey the available data space, as sketched below, and letting users select the particular data set they want to work with. Configurations should be persistable and determined by the user, not the programmer.

When new services are added, they should be discoverable when a user starts or logs into an application, and deprecated services should degrade gracefully. New versions should install seamlessly, and never require that a user touch a configuration file or run code from a command line. Not only can this be intimidating to a user, but it's also quite possible to seriously break the underlying system if done badly.

- Incorporate exception handling early in the process, and when possible provide graceful failovers when an exception occurs. Effective exception handling cascades from least general to most general; in many cases you can use it to attempt to fix data before pushing it back to the client, and in sufficiently intelligent systems you can even suggest to the user how they might fix the data themselves.
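Here is a minimal sketch of data discoverability, assuming a hypothetical /catalog endpoint (the URL and the response shape are inventions for illustration). The client asks the server what exists instead of baking in the answer:

```python
import json
from urllib.request import urlopen

# Assumed endpoint for illustration: it returns {"datasets": [{"name": ...}]}.
CATALOG_URL = "https://example.com/api/catalog"

def discover_datasets() -> list:
    """Return whatever datasets the server currently advertises."""
    with urlopen(CATALOG_URL) as resp:
        return json.loads(resp.read())["datasets"]

def choose_dataset(name: str) -> dict:
    """Select a dataset by name from the live catalog, not a baked-in list."""
    available = {d["name"]: d for d in discover_datasets()}
    if name not in available:
        raise LookupError(f"dataset {name!r} is not offered by this server")
    return available[name]
```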
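And the exception-handling cascade just described might look like this in Python - the particular repair attempted and the suggestion sent back to the user are illustrative choices:

```python
import json
import logging

logger = logging.getLogger(__name__)

def parse_submission(raw: str) -> dict:
    """Catch the most specific failures first, repair what can be repaired,
    and degrade gracefully for anything unexpected."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as err:
        # Specific, known failure: try a cheap repair first - a stray
        # byte-order mark at the front of the text is a common culprit.
        try:
            return json.loads(raw.lstrip("\ufeff"))
        except json.JSONDecodeError:
            # Repair failed: suggest a fix to the user instead of crashing.
            raise ValueError(
                f"Submission is not valid JSON near line {err.lineno}; "
                "check for unquoted keys or trailing commas"
            ) from err
    except Exception:
        # Most general handler last: log it and fall back to an empty doc.
        logger.exception("Unexpected failure while parsing submission")
        return {}
```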
- Soft validations and transformations. When possible, perform validations using external schemas rather than hard-coded ones. Data structures evolve over time, and encoding a structure directly in code freezes the evolution of that data. The same holds for transformations: if your data is in XML or RDF, use soft technologies such as XSLT, XQuery or SPARQL UPDATE for ETL processing, as these are themselves data and are therefore easier to version. Java or C++ code, once compiled, usually requires rebuilding and rebooting an application, which can badly impact other running applications.
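The technologies named above are XML/RDF-side; an analogous pattern on the JSON side uses the third-party jsonschema package, keeping the rules in a schema file that can be revised without recompiling anything. The person.schema.json file name here is illustrative:

```python
import json
from pathlib import Path

from jsonschema import validate, ValidationError  # pip install jsonschema

def validate_record(record: dict, schema_path: str) -> list:
    """Validate against a schema loaded at runtime; editing the schema file
    changes the rules with no rebuild, redeploy, or reboot."""
    schema = json.loads(Path(schema_path).read_text())  # the schema is data
    try:
        validate(instance=record, schema=schema)
        return []
    except ValidationError as err:
        return [err.message]

# Usage (assumes person.schema.json exists alongside the application):
# errors = validate_record({"name": "Ada"}, "person.schema.json")
```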
- Store state independent of global metadata. Most applications today, even those such as games that may not appear to be data-driven, maintain both application state - where you are within the application, what things you have done and so forth - and global information - how your actions affect the environment the application shares with others. In general, these should be stored in separate places, with the emphasis on maintaining the global data preferentially.

Such global data is shared not only by other users running their own instances of the same application, but also by applications that may be only peripherally connected (such as data analysis tool-sets). This points to the re-emergence of global data hubs, which provide a shared context of information, while local storage holds either "scratch-pad" content (drafts of objects, for instance) or stateful information for the application itself (the IP address of a given application, for instance).
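A small sketch of that separation, with illustrative class names: session-local state is disposable, while shared data is touched only by explicit commits:

```python
from dataclasses import dataclass, field

@dataclass
class SessionState:
    """Where this user is in this app instance: safe to lose or reset."""
    current_view: str = "home"
    draft: dict = field(default_factory=dict)  # scratch-pad content

@dataclass
class GlobalStore:
    """Shared records that other users and other applications also see."""
    records: dict = field(default_factory=dict)

    def commit(self, key: str, value: dict) -> None:
        # Only deliberate commits touch shared data; drafts never leak in.
        self.records[key] = value

session, hub = SessionState(), GlobalStore()
session.draft = {"title": "WIP report"}  # local only, disposable
hub.commit("report-7", session.draft)    # explicit promotion to shared data
```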
- Build abstraction chains. When an XML, JSON or RDF document containing information is ingested, treat it first as an incoming document, then as a document utilizing a specific protocol, then as a document describing a specific type of entity, then as a document of that entity type with specific configurations and schema version associations, and so forth.

This allows you to process inbound information to some degree without everyone in the system always having the latest version of everything, and it means that when an application does receive data it can't handle, it can at a minimum fail gracefully and early, with as little negative impact as possible.
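A sketch of such an abstraction chain for a JSON document: each layer of interpretation is peeled separately, so a failure deep in the chain still leaves the shallower layers usable. The field names on the envelope are illustrative:

```python
import json

def ingest(raw: bytes) -> dict:
    """Interpret an inbound document one abstraction layer at a time."""
    doc = {"raw": raw, "format": None, "entity_type": None, "schema_version": None}

    # Layer 1: it is at least an incoming document.
    text = raw.decode("utf-8", errors="replace")

    # Layer 2: is it a document in a protocol we recognize?
    try:
        body = json.loads(text)
        doc["format"] = "json"
    except json.JSONDecodeError:
        return doc  # fail early and gracefully: we still know we got *something*

    # Layer 3: does it describe a known entity type?
    doc["entity_type"] = body.get("type")

    # Layer 4: specific schema version and configuration associations.
    doc["schema_version"] = body.get("schemaVersion")
    doc["body"] = body
    return doc
```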
- Plan for end of life. While you can extend the life of an application considerably, eventually conditions will have changed enough that migrating the application to a new environment becomes necessary - better storage technologies, different application tools, maybe different owners.

Just as people write wills to set out end-of-life conditions, designers should spend some time working out how the information contained in an application can most easily be transferred to a new environment. This may involve moving data backups, roles and permissions, data configurations, schema and local data state into open data formats. It should also make provision for encrypting personal or otherwise sensitive information as part of a data governance strategy. If the application is designed well, most of this information should already be secured elsewhere, but planning for it makes it easier to ensure that this actually happens.
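A sketch of what such an end-of-life export might look like, with illustrative file names and formats (the encryption of sensitive fields called for above is omitted here for brevity):

```python
import csv
import json
from pathlib import Path

def export_for_migration(records: list, roles: list, out_dir: str) -> None:
    """Write the application's essential state into open formats (JSON, CSV)
    so a successor system can import it without the original code."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Primary data as JSON: self-describing and broadly supported.
    # (Sensitive fields should be encrypted here per the governance strategy.)
    (out / "records.json").write_text(json.dumps(records, indent=2))

    # Roles and permissions as CSV, readable by nearly anything.
    with (out / "roles.csv").open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["user", "role"])
        writer.writeheader()
        writer.writerows(roles)

export_for_migration(
    [{"id": 1, "name": "sample"}],
    [{"user": "ada", "role": "admin"}],
    "export",
)
```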
As applications become increasingly data driven, it becomes increasingly important that data designers future-proof their applications - not so that they can run for decades, but so that they can follow a natural life-cycle arc without negatively impacting the data environment at any point along that arc.
Kurt Cagle is the founder and chief ontologist of Semantical, LLC. He has been building applications (and data systems) at all levels of the stack from client to deep data for Fortune 500 clients, governmental agencies and universities for a while now.