Entity Framework and implicit conversions
Very recently I was faced with an issue that rang a distant bell: it wasn't the first time I'd heard of SQL Server being "forced" to convert something to something else before it could do a comparison, with the cost of that conversion having a major performance impact. I think in that earlier case it was happening on a JOIN.
That's where the similarities ended. I'm going back over a decade here, which, for anyone that knows me, is a stretch; Dave remembering what he had for lunch last Tuesday is challenge enough (or maybe it isn't just me?). Anyway, I digress, which is something I'm really good at doing.
So, the code in question was using EF to do a FirstOrDefault() on an entity.
An example below:
var result = myEntity.Where(x => x.ThisStringField == "ThisValue" && x.AnotherStringField == "ThatValue").FirstOrDefault();
But it was killing the database.
The fields were indexed, but that didn't matter: it was the implicit conversion that was really hurting performance.
The problem was that the table in SQL Server was using VARCHAR as the data type for those columns. However, Entity Framework by default assumes you want NVARCHAR for string values. As such, the query that hit the table resembled the below:
SELECT TOP 1 *
FROM MyEntity
WHERE N'ThisValue' = ThisStringField
AND N'ThatValue' = AnotherStringField
Aside from me not liking how it put the value first in the comparison (why couldn't it be WHERE ThisStringField = N'ThisValue'?), the underlying effect was that SQL Server needed to convert ThisStringField and AnotherStringField to NVARCHAR before it could do the comparison.
Note: there is an order of precedence among data types (a handy Microsoft page here), so when an NVARCHAR is compared with a VARCHAR, the VARCHAR is converted to NVARCHAR before the comparison occurs.
Imagine that: every row being subject to that conversion, and to do that conversion SQL Server needs to consume additional resources. You can understand how it might have an effect.
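To see it for yourself, you can run the two shapes of query side by side in SSMS and compare the actual execution plans (a sketch against a hypothetical MyEntity table; the CONVERT_IMPLICIT shows up in the plan for the first query):

```sql
-- Hypothetical repro: ThisStringField is VARCHAR and indexed.
-- The NVARCHAR literal forces CONVERT_IMPLICIT on the column side,
-- which typically turns an index seek into a scan:
SELECT TOP 1 * FROM MyEntity WHERE ThisStringField = N'ThisValue';

-- The VARCHAR literal matches the column type, so no conversion is needed:
SELECT TOP 1 * FROM MyEntity WHERE ThisStringField = 'ThisValue';
```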
OK, so for the fix there are two choices.
Fix 1 - Change the table
Change the field in the table to be NVARCHAR. OK, it'll take up more storage, but it's an avenue. However, it's not feasible if you're in a 24x7 system and that table is one of the core ones, with not only millions of entries but reads and writes from many different processes every second. Modifying the table and causing a schema lock while that happens just ain't going to fly.
Fix 2 - Change the entity model
Add a simple attribute to the entity; EF will pick that up and use it when it eventually prepares the SQL.
So this involved a very simple addition to the entity, where we specify the TypeName:
using System.ComponentModel.DataAnnotations.Schema;

[Column(TypeName = "varchar")]
public string ThisStringField { get; set; }

[Column(TypeName = "varchar")]
public string AnotherStringField { get; set; }
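If you're on EF Core and prefer fluent configuration over attributes, there's an equivalent (a sketch; IsUnicode(false) maps a string property to a non-Unicode column type, which is varchar on SQL Server):

```csharp
// Equivalent fluent configuration in the DbContext (EF Core).
// IsUnicode(false) tells EF to map the string to a non-Unicode
// column type (varchar on SQL Server) instead of nvarchar.
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<MyEntity>()
        .Property(e => e.ThisStringField)
        .IsUnicode(false);

    modelBuilder.Entity<MyEntity>()
        .Property(e => e.AnotherStringField)
        .IsUnicode(false);
}
```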
After doing that and testing locally, the generated SQL changed to the below:
SELECT TOP 1 *
FROM MyEntity
WHERE 'ThisValue' = ThisStringField
AND 'ThatValue' = AnotherStringField
Now the literals were VARCHAR, which meant SQL Server didn't need to convert ThisStringField or AnotherStringField to NVARCHAR to do the comparison.
And the end result when used in anger: something that had been taking seconds to run was now completing in a fraction of a second.
ORMs give us some great flexibility out of the box, but they can also hide inefficiencies if things are not configured correctly, and those inefficiencies might only become apparent if you have access to performance profilers.
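Even without a full profiler, one easy way to catch this class of problem early is to log the SQL that EF generates (a sketch assuming EF Core 5+; LogTo is its simple logging API, and a mismatched N'...' literal against a varchar column is easy to spot in the output):

```csharp
// EF Core 5+ simple logging: dump the generated SQL to the console
// so mismatched literal types (N'...' vs '...') are easy to spot.
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder
        .LogTo(Console.WriteLine, LogLevel.Information)
        .EnableSensitiveDataLogging(); // shows parameter values; dev only
}
```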
Thanks to:
CP - My OB, for casting her eye over a WhatsApp version of this article before I put it up here
The amazing people I work with at Ensek
OB goals 🙏