Basic of good optimize database application programming -- FORGOTTEN
Basic of good optimize database application programming -- FORGOTTEN
In my recent encounters with many younger generation programmers, I notice many of the practice we were trained in to ensure our application is optimize have been forgotten, this have cause a number of avoidable errors....
For example, I was ask to look into a schema which was running rather slowly... and I saw to my horror, all the text column (regardless of whether there is a specific length item like say telephone number or not), the data type chosen is all VARCHAR. So out of curiosity, I ask the developer, why Varchar... and answer is "Oh convenient, no need to count byte". I followed on with a question whether he understood the implication of a VARCHAR versus a CHAR... I got a blank stare. When I explained that VARCHAR is a more expensive operation than CHAR as it have to do 2 read for every column, first to read the length of the column then read the number of bytes for the column... I can see the surprise in the programmer.
In another example, I was told a Select statement would not load... when I look at the statement, I notice the programmer have written the program which generated a Select statement joining a large multi-terabyte table to a small table... where the SQL tries to load the multi-terabyte table and then scan the small table. Switching the order, loading the small table first, solve the problem...
Or for that matter, I got a number of programmer telling me that the database engine is dumb, they have created an index, but the engine just do not use it and always goes back to a serial scan... When I investigated this issues, more often than not, the rule that the first predicate must equal the first column of the index is not followed. Many simply do not know that just having a column in the index is not enough reason for the database optimizer to use the index.
As I encounter quite a number of this issues... I started a quick survey to see what happen.. to my horror, I found that many of this programmer either do not know, do not follow or simply took the easy way out... quite a number of times, I even found programmer diving straight into programming and building the table without even looking at normalization needs, or building data flow diagram or entity relationship diagram. This often lead to challenging outcome.
For a long while, many such archaic practice from the ERA where we have limited resources, not so powerful processor needs to follow was erase off memory becuase we now have very powerful machine... I remembered, my first personal computer is an Apple II, running on MOS Technology 6502 processor with just 16KB memory running off a 5.25 in floppy drive.. In those days, we needed to count bytes to ensure our program runs... PC today run mostly with minimum of 8GB memory running on 64 bits chips so many have not needed to worry about programming in a world with limited resources for a while... However, as we evolve into the world of mobile computing, days of machine with limited resource have again surface ... so may be now Millenium have a reason to learn from the GEN X....