Unknown parameter in bagging area
I recently had an interesting situation when I was calling a webhook API endpoint using curl and happened to misspell one of the parameters: `duraton` rather than `duration`. The API kicks off an experiment that runs for the given duration. The only problem was that I wanted a 5-minute experiment and what I got was a 12-hour experiment!
No errors were reported by the API and the experiment went ahead since the duration parameter has a default value of 12 hours.
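To make the failure mode concrete, here is a minimal Python sketch of how a lenient handler can end up behaving this way. The function name, the dictionary standing in for the request, and the use of minutes as the unit are all my assumptions for illustration; this is not the real API.

```python
# Hypothetical default: 12 hours, expressed in minutes.
DEFAULT_DURATION_MINUTES = 12 * 60


def parse_lenient(params: dict) -> dict:
    """Read only the parameters we recognise and silently ignore the rest."""
    duration = int(params.get("duration", DEFAULT_DURATION_MINUTES))
    return {"duration": duration}


if __name__ == "__main__":
    # The misspelt key is never looked at, so the default wins.
    print(parse_lenient({"duraton": "5"}))   # {'duration': 720}
    print(parse_lenient({"duration": "5"}))  # {'duration': 5}
```

Nothing in this code is wrong in isolation. The surprise comes from combining "ignore what you don't recognise" with a default that is expensive.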
This experience made me wonder: should systems ignore unknown parameters, or should they enforce strict validation to avoid unintended consequences?
In favour of ignoring
Let's consider the arguments for ignoring unknown parameters, which was my first instinct.
I am reminded of the Robustness Principle from software architecture, which states "be conservative in what you send; be liberal in what you accept". This principle is also known as "Postel's Law" after Jon Postel, who wrote about it in RFC 793, the TCP specification.
This principle is intended to encourage flexibility and interoperability.
By flexibility, I mean that it gives APIs room to evolve over time: a client may send a request parameter unknown to the server, and the server won't freak out but will carry on with the information it does understand, which is exactly what happened in my case.
REST API evolution often needs to add parameters, and a common way I've seen this done is to make the new parameter optional. This allows older clients that don't know about the parameter to continue to interoperate with newer server implementations (backward compatibility on the server side). Newer clients would be aware of the optional parameter and use it appropriately. This is important when you have, say, a mobile app that is distributed on thousands of devices: having them all lose functionality after a new back-end release would be a poor user experience.
Having said that, this can only go so far. If the change in the API is significant enough, it probably warrants a change in the API/protocol version and requires a phase-out period after which older client implementations will no longer be supported.
The Robustness Principle also provides a way to tackle interoperability. This was the specific intent in the TCP specification where interoperability between different implementations of the TCP standard was expected.
In practical use, there is a common and necessary use of ignoring unknown or unrecognised parameters that I'm sure most people run into every day on the internet: analytics parameters. These are the UTM tracking codes, affiliate tags and so forth that are commonly added to URLs but don't have any meaning to the underlying servers.
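As a rough illustration of how a server might tolerate tracking parameters without giving up on noticing everything else, the sketch below sorts query parameters into three buckets. The parameter names, the `utm_` prefix rule and the example URL are assumptions of mine, not taken from any particular service.

```python
from urllib.parse import parse_qsl, urlsplit

KNOWN_PARAMS = {"duration"}      # hypothetical parameters the endpoint understands
IGNORABLE_PREFIXES = ("utm_",)   # assumed convention for tracking codes


def classify_query(url: str):
    """Split query parameters into accepted, deliberately ignored, and unknown."""
    accepted, ignored, unknown = {}, {}, {}
    for key, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        if key in KNOWN_PARAMS:
            accepted[key] = value
        elif key.startswith(IGNORABLE_PREFIXES):
            ignored[key] = value
        else:
            unknown[key] = value
    return accepted, ignored, unknown


if __name__ == "__main__":
    url = "https://example.com/run?duration=5&utm_source=newsletter&duraton=7"
    print(classify_query(url))
```

What the server then does with the `unknown` bucket (ignore, log, warn or reject) is the design decision this article is about.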
Counter-arguments to ignoring unknown parameters
The Wikipedia article on the Robustness Principle describes some counter-arguments, setting out how applying the principle can lead to problems.
The paper Dropping on the Edge: Flexibility and Traffic Confirmation in Onion Routing Protocols by Rochet & Pereira has this engaging abstract:
"The design of Tor includes a feature that is common to most distributed systems: the protocol is flexible. In particular, the Tor protocol requires nodes to ignore messages that are not understood, in order to guarantee the compatibility with future protocol versions. This paper shows how to exploit this flexibility by proposing two new active attacks..."
Ignoring unknown parameters could allow a bad actor to smuggle in a payload that can later be retrieved or acted on. Malformed parameters or parameter values could even be the attack vector themselves, causing stack overflows and other security problems. Here is one example in a JSON parser that came up when I was researching CVEs related to parsing JSON parameters.
Beyond security, there's also the risk of unintended consequences. Even if an unknown parameter isn't malicious, ignoring it can lead to costly mistakes, as my own experience illustrates. There was no security issue, but I committed the resources to run a 12-hour experiment when I intended a 5-minute one, and that had a cost implication.
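For comparison, a strict handler could look something like the following sketch. It rejects any parameter it does not recognise and uses Python's standard `difflib` module to suggest a likely intended name. The allowed parameter set, the exception class and the function name are hypothetical.

```python
import difflib

ALLOWED_PARAMS = {"duration"}  # hypothetical set of parameters the API accepts


class UnknownParameterError(ValueError):
    """Raised when a request contains a parameter the API does not recognise."""


def validate_strict(params: dict) -> dict:
    """Return the params unchanged, or raise on the first unknown name."""
    for name in params:
        if name not in ALLOWED_PARAMS:
            # Suggest the closest allowed name, if any is similar enough.
            hints = difflib.get_close_matches(name, sorted(ALLOWED_PARAMS), n=1)
            hint = f" (did you mean '{hints[0]}'?)" if hints else ""
            raise UnknownParameterError(f"unknown parameter '{name}'{hint}")
    return params


if __name__ == "__main__":
    try:
        validate_strict({"duraton": "5"})
    except UnknownParameterError as err:
        print(err)
```

With a check like this in front of the handler, my typo would have produced an error response instead of a 12-hour experiment.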
I notice that most command-line programs will reject unknown parameters:
% curl --hlep
curl: option --hlep: is unknown
curl: try 'curl --help' or 'curl --manual' for more information
This is quite reasonable in my opinion as there is no 'interoperability' or flexibility requirement for most command line programs.
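Python's standard `argparse` module follows the same convention by default, and also offers an opt-in lenient mode, which makes the two policies easy to compare. In this sketch the program name, the `--duration` option and its default are made up for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="experiment")  # hypothetical program
    parser.add_argument("--duration", type=int, default=720)
    return parser


def accepts_strictly(argv: list) -> bool:
    """parse_args() reports unrecognised options and exits, like curl does."""
    try:
        build_parser().parse_args(argv)
        return True
    except SystemExit:
        return False


def parse_leniently(argv: list):
    """parse_known_args() returns unrecognised arguments instead of failing."""
    return build_parser().parse_known_args(argv)


if __name__ == "__main__":
    print(accepts_strictly(["--duraton", "5"]))
    args, leftovers = parse_leniently(["--duraton", "5"])
    print(args.duration, leftovers)
```

In the lenient mode the misspelt option ends up in the leftovers list and the default duration is used, which mirrors what happened with the webhook.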
There are plenty of other examples of where unexpected items lead to a warning, if not an error -- the header image is a nod to the message anyone who has used a self-service checkout will have heard!
Conclusion
Coming back to the original question then: should you ignore unknown parameters?
Ultimately, the right approach depends on context:
What are your thoughts on whether to ignore unknown parameters? Have you had a similar experience to mine, where ignoring an unknown parameter led to unintended consequences?