Experimentation using feature toggles
Why experiment?
One of the core principles of being agile is the ability to inspect and adapt. When multiple, apparently viable options present themselves, teams have two approaches: select an option intuitively (through either collaborative or hierarchical decision making) or select one based on an “inspect-and-adapt” model.
Intuition is not a bad thing. In many situations where there is a constraint on evaluating all viable options, intuition based on experience (aka gut feeling) can be a better approach than selecting the first or the most popular option. However, the key phrase here is a "constraint on evaluating options". In a business context, this constraint may appear in the form of time (we need a solution fast), budget (we don’t have the money to evaluate options) or skill (we don’t have the skills to evaluate some of the options).
In an ideal world without constraints, experimentation would give us the hindsight to make every decision. In the real world, we need to anticipate the grey areas in our work (where decisions are not clear-cut and multiple equally viable options may exist) and plan to work around them. Machine learning has made some of this decision making simpler in certain areas, but for now I am going to explain how we benefited from experiments using feature toggles. For more information on why to experiment, read what experts have to say.
What is a feature toggle?
When introducing a new feature in a system, it is sometimes helpful to get feedback from real users under live operating conditions on whether the feature delivers its intended benefits or can be refined. You cannot get this type of feedback unless the feature is released in the live environment. At the same time, if the feature has a poor user experience, releasing it to all users may cause a significant backlash, which is not desirable: neither for the user who gets a poor experience nor for the provider whose reputation would be under fire.
A feature toggle aims for a middle ground: the feature is released in the live environment, but only a fraction of the users have access to it. Conceptually, the application code implements something like:
IF <some condition>
THEN “Enable feature X”
ELSE “Disable feature X”
It can be implemented:
- Hidden from the user, e.g. in an internet-based application, all traffic originating from a certain geographical area will see (or not see) the feature, OR
- With the consent of the user, e.g. “Click here to use the beta version of this site”.
If you want further details, please read what Martin Fowler has to say.
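The two options above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `User` fields, the `PILOT_REGIONS` values and the `render_home` function are hypothetical names invented for the example, not part of any particular toggle library.

```python
from dataclasses import dataclass

# Hypothetical pilot regions, for illustration only.
PILOT_REGIONS = {"NZ", "IE"}


@dataclass
class User:
    user_id: str
    region: str                    # e.g. derived from where the request originates
    opted_into_beta: bool = False  # set when the user clicks "use beta version"


def hidden_toggle(user: User) -> bool:
    """Hidden from the user: enable the feature for traffic from pilot regions."""
    return user.region in PILOT_REGIONS


def consent_toggle(user: User) -> bool:
    """With consent: enable the feature only for users who opted in."""
    return user.opted_into_beta


def render_home(user: User, toggle=hidden_toggle) -> str:
    # IF <some condition> THEN enable feature X ELSE disable feature X
    if toggle(user):
        return "home page with feature X"
    return "home page without feature X"
```

Passing the condition in as a function keeps the choice between the hidden and the consent-based variant separate from the page code that reacts to it.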
Things to consider
When implementing a feature toggle, there are a few things to consider:
- It is good practice to base the condition clause on an aspect that is relevant to the feature. Making this a habit will also make your results more meaningful. For example, if you depend on someone choosing to go to a beta version of the software, you can expect them not to be surprised by change, to have an open mind and to give you feedback. Use this approach when the user experience with the new feature can be significantly different, so that only users who knowingly choose it go through it. In this case, it is also better to allow them to revert to not using the feature.
- Don’t confuse feature toggles with A/B testing. A/B testing evaluates more than one option for implementing a feature, whereas feature toggling is about getting feedback on a feature in order to refine it. You may have done A/B testing first and then be refining further with a feature toggle, or you may be using a combination of the two to segment your experimentation.
- Feature toggles may introduce technical debt. Once you have worked out the kinks in the feature, you ideally want to remove the if-then-else constructs and enable the feature consistently. This requires reworking code without adding any new functionality, and is the debt associated with the experimentation. Plan the feature toggle such that the benefits of the experimentation far outweigh the associated debt. For strategies on how to do this, head over to the corresponding section in Martin Fowler's explanation.
- Feature toggles are not a replacement for functional testing or a good QA process before releasing to production. When a feature is released via a feature toggle, we are not testing whether the feature works as designed. We are testing whether it solves the problem in an optimal way in a real-world context.
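On the technical debt point above, one way to keep the later clean-up small is to consult the toggle in a single place and keep the old and new behaviour in separate functions. The sketch below assumes a hypothetical in-memory `TOGGLES` dictionary and made-up `legacy_checkout` / `new_checkout` functions; in a real system the toggle state would come from configuration or a toggle service.

```python
from typing import Dict

# Hypothetical toggle state, for illustration only.
TOGGLES: Dict[str, bool] = {"new_checkout": False}


def legacy_checkout(cart_total: float) -> str:
    return f"legacy checkout: {cart_total:.2f}"


def new_checkout(cart_total: float) -> str:
    return f"new checkout: {cart_total:.2f}"


def checkout(cart_total: float) -> str:
    # The only place in the code that consults the toggle.
    handler = new_checkout if TOGGLES["new_checkout"] else legacy_checkout
    return handler(cart_total)
```

With this shape, retiring the toggle means deleting `legacy_checkout`, the dictionary entry and one conditional, rather than hunting for if-then-else constructs scattered through the code.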
How we implemented it
We developed an application for a highly regulated entity. This meant that close control was required (for functional reasons) over who can see what and who can do what. This prompted us to build a highly efficient role-based access control (RBAC) framework. Efficient does not necessarily mean fast; in our context it means that actions can be controlled at a granular level (think data element) or at a coarse level (think system level) within the same framework. It also means access can be controlled at system level (a system has many types of entities), at entity level (each entity type has many entities associated), at group level (an entity has one or more groups) and at user level (a user may be associated with one or more groups). As a result, all aspects of our application's behaviour are controlled by our RBAC framework.
Based on this approach, we could integrate our feature toggle within the RBAC framework (i.e. a permission-based toggle), release features to specific target user groups and collect feedback. RBAC doubles as our toggle router. We could tweak the RBAC information to include more or fewer users of the same user group, expand the groups, or switch from one group to another without having to touch code. With this “canary” release we get the feedback required, tweak what we need and finally, when the feature is deemed stable, a quick RBAC update is all that is needed to make it available to all the right users. No technical debt is left behind in the code (as none was introduced specifically for the feature toggle).
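To make the idea of a permission-based toggle concrete, here is a minimal sketch. It is not our actual framework: the `AccessControl` class, the group names and the `feature_x.use` permission are hypothetical, and the real RBAC described above has more levels (system, entity, group, user) than this two-level example.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Hypothetical permission name guarding the new feature.
FEATURE_X = "feature_x.use"


@dataclass
class AccessControl:
    # group name -> permissions granted to that group
    group_permissions: Dict[str, Set[str]] = field(default_factory=dict)
    # user id -> groups the user belongs to
    user_groups: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, group: str, permission: str) -> None:
        self.group_permissions.setdefault(group, set()).add(permission)

    def add_user(self, user_id: str, group: str) -> None:
        self.user_groups.setdefault(user_id, set()).add(group)

    def is_allowed(self, user_id: str, permission: str) -> bool:
        # A user holds a permission if any of their groups has been granted it.
        return any(
            permission in self.group_permissions.get(group, set())
            for group in self.user_groups.get(user_id, set())
        )


def dashboard(rbac: AccessControl, user_id: str) -> str:
    # The application asks an ordinary permission question; it does not
    # know or care that the answer is being used as a feature toggle.
    if rbac.is_allowed(user_id, FEATURE_X):
        return "dashboard with feature X"
    return "dashboard"


rbac = AccessControl()
rbac.add_user("alice", "pilot_group")
rbac.add_user("bob", "all_staff")
rbac.grant("pilot_group", FEATURE_X)   # canary release to the pilot group only
```

Widening the release later is a data change, for example `rbac.grant("all_staff", FEATURE_X)`, with no change to `dashboard`.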
Find this useful? Did you try out feature toggle?