Constantly Id'ing problems...
Time for another developer rant!
I am constantly amazed by how foolish we developers can be. In this article I am going to talk about something that I have seen a lot of. It is something that upsets me because it makes life needlessly difficult. It's come up in both PHP and Java code, so it's not a language specific problem.
To illustrate the issue I will talk you through a made up example. As usual with made up examples it will be simplified and probably be "fixable" in some other clever or simple way - like using an enum column. So please do not fixate on the example, but instead concentrate on the point I am trying to make.
Our example is this. We have a table of things. Let's say it is products. These products have a type. This type is a foreign key into a product_type table. We all get this kind of set up and this is not the problem. The problem comes with the code.
All too often I am seeing classes that use constants whose value is the database ID. So in our example we might have a constant PROD_TYPE_ELECTRICAL = 1.
Why is this a problem? Well, it is tying the code to the primary key in the database. This is usually, but not always, an auto incrementing sequence. So you have to ensure that your development, test and production databases are all completely in synchronisation with each other. Where more than one developer is adding new types at the same time this becomes challenging because both could well use the same integer id for completely different types.
So - what do we do? Not use the autogenerated id as the primary key and insert our own made up ones? It is a possible solution but it leads to many other problems. Coordinating between your developers to make sure they pick numbers that are not already in use for starters.
No. The solution that I believe is workable is based on the assumption that the developers are already coordinating enough to be able to pick names for their constants.
We can not use text in our database as the foreign key. That is just inefficient.
What we can do is add a column to our product types table. That column will be a text column. It should have a unique key on it. The text in here is what goes into our code as a constant.
Now PROD_TYPE_ELECTRICAL = "ELECTRICAL"
We have done away with the reliance on the database id so our code will work regardless of the order migration or seeding scripts are run.
So, now to retrieve the data we simply join the 2 tables on the foreign key where our new column equals the value in the constant.
Is there a performance cost here? Yes there is. A very minor cost in having a join in the query instead of a direct where clause. It is a cost that is negligible compared to the cost and potential for bugs involved in attempting to keep all databases that this code needs to run on in sync.