Database or Code First Approach - Developer's Vision
Which of the two design workflows should I use? In this article I will explain the concepts of Database and Code First approaches, its advantages and disadvantages and reasons to choose Code First design.
At this moment, I have been working with Database and Code First approaches, and in the past I used Database First most frequently because it was the quickest path to get a system up and running, you can rapidly develop your database design in SQL Management Studio then generate the code model in just few clicks. But, in the last 5 years I have come to prefer Code First for some reasons that I will show below.
And, why this subject? because you still have an important decision to make when you start a new project or want to move a legacy project to a new architecture. At the end of your reading share your opinion and personal experience. Leave your comment.
Database First
The Database First workflow begins with a legacy database and leverages a wizard to reverse-engineer that database into a conceptual model (see Figure 1).
Figure 1 - With Database First, You Reverse-Engineer a Model from a Legacy Database
So, when you generate EDMX from an existing database it is called as Database First approach (see Figure 2).
Figure 2 - Database First
Almost always this approach seems to feel like the most logical way to work, you have already a database and you create a model from an existing database and you’ve probably seen this demonstrated many times. You open the Entity Data Model Wizard, point to an existing database, select which tables, views, stored procedures and user-defined functions you’d like to be represented in the model and click the finish button. Instantly, a model is born. Also, Entity Data Model can be updated whenever database schema changes.
So, with no more effort, developers will already benefit from the Entity Framework. You can write strongly typed queries against this model, and the Entity Framework will execute those queries for you and materialize strongly typed objects from the results. Then, as you work with the results, the Entity Framework tracks the changes and allows you to persist them back to the database simply by calling its SaveChanges command.
Possible scenarios in which Database-first are applied:
- Long lasting databases with stable schemas
- Changes to database and model are incremental
- You want to be able to see your model in one diagram and update from there
- You can divide your model to non-overlapping diagrams
- You have a database designed by DBAs, developed separately or if you have existing database
Advantages and Disadvantages
Advantages
- You can use an existing database and create your tables and associations there.
- Easy to avoid data loss on changes because you will work from the database perspective.
- Better integrated with Stored Procedures and function results (some improvements have been done in EF6)
- Query Performance (is a relative point that need to be measured).
Disadvantages
- Creating associations, foreign keys, constraints etc. from the database can be more difficult.
- Not easy to sync database changes. Let’s say you change your database on your local machine then you need external tools to sync (commit/rollback) your changes with a remote database. This can be the major disadvantage.
- When using code version control tools such as SVN or GIT, merging .edmx files can sometimes be a pain.
- When you need to add functionality to your generated models, like a calculated/computed read only property, you need to extend the model class.
- Database deployment process.
- Looking Ahead to Entity Framework 7, going forward EF will no longer support the EDMX based model.
Code First
The Code First workflow begins with classes that describe your conceptual model. There’s no visual model used with Code First. So, whether you have an existing database or not, you can code your own classes and properties that correspond to tables and columns and use them with Entity Framework without an .edmx file. In this approach Entity Framework does not leverage any kind of configuration file (.edmx file) to store the database schema, because the mapping API uses these conventions to generate the database schema dynamically at runtime.
Also it’s quite simply much easier and quicker to add a property to a class in the code than it is to add a column to a table, as you generally then also need to generate a database script or execute a database compare tool to be able to use the added column in production, as well as regenerating the model once the column is added, while all this is covered automatically by Code First and its "Migrations" for database changes.
For developers who follow the path of Domain-Driven Design (DDD) principles, prefer to begin by coding their domain classes first and then generating the database required to persist their data (see Figure 3).
Figure 3 - Code First
Let’s start playing around, first create a new Console Project, use Nuget package to include the Entity Framework reference in your project, and create a new file called Models.cs:
And then use the code below to test your database:
Next, add a connection string to App.config file:
And finally, let’s create our database from our code model. Open Package Manager Console and type the commands below:
- PM> Enable-Migrations
Checking if the context targets an existing database...
Code First Migrations enabled for project CompanyName.Product.ProjectName.
- PM> Add-Migration
cmdlet Add-Migration at command pipeline position 1
Supply values for the following parameters:
Name: Initial
Scaffolding migration 'Initial'.
The Designer Code for this migration file includes a snapshot of your current Code First model. This snapshot is used to calculate the changes to your model when you scaffold the next migration. If you make additional changes to your model that you want to include in this migration, then you can re-scaffold it by running 'Add-Migration Initial' again.
- PM> Update-Database
Specify the '-Verbose' flag to view the SQL statements being applied to the target database.
Applying explicit migrations: [201609011331147_Initial].
Applying explicit migration: 201609011331147_Initial.
Running Seed method.
That’s it. Our Code First is now working.
Now is possible add a new column to a table adding just a property to the model class, renaming a column, removing a column, and do common changes without losing any data. Also an important point, all database changes have to be done in your code!
Possible scenarios in which Code-first are applied:
- Development of model is done by programmers that are not interested in database
- General expectation is that you do not bother with database. Database is just a storage with no business logic
- Model classes contain logic
- Model classes have non-standard structures
- Model is divided between many assemblies that are not all known at design time (extensions)
- Databases are short lived (e.g. run-time of application)
- Database can be changed often
- Scalable applications
Advantages and Disadvantages
Advantages
- Supports database migrations which make it very easy to keep various databases in sync.
- This is so powerful. You can make changes to your model in code, and then the framework can keep track of schema changes, so you can seamlessly deploy upgrades, with schema versions automatically upgraded (and downgraded if required).
- Small model changes will not lead to any data loss.
- You have more customization options and full control over the code (without auto generated code).
- Manual changes to database will be most probably lost because your code defines the database.
- Easy attribute decoration - You can decorate fields with validation, require, etc. attributes, it's quite awkward with EF modelling.
- No weird modelling errors - EF modelling often has weird errors, such as when you try to rename an association property, it needs to match the underlying meta-data.
Disadvantages
- It is harder to maintain a database once you do not have a visual design tool and it must be from the code perspective (you can still visualize model via Power Tools, but the model is read-only).
- Knowledge of C# is required to create databases.
Common Misconceptions
There are a number of misconceptions in the community especially on a stack overflow, for example some developers say:
- Database First gives you more control over the database but this is actually not true at all with Code First you have full control over the database. Let’s say you want to create a trigger or modify the query or maybe create or remove a stored procedure, you can do all of these with Code First.
- Code First is for green field projects and if you have an existing database you should use database first, again it’s not right at all because you can reverse-engineer your existing database to create a Code First model and then use Code First Migrations for any subsequent changes afterwards.
Conclusion
It definitely depends on your situation which approach is most suitable for your project but I still prefer the Code First method. Why?
- Code First support database migrations.
- Full versioning of database - You can bring your database to any version with running only one command and this is extremely helpful if you are maintaining different versions of an application without Code First you either have to manually take care of versioning or use a tool like red gates source control, but that’s very expensive, with Code First you get these for free.
- Small changes to your code doesn’t lead to data loss. This also applies for database first approach of course.
- Much simpler to keep databases in sync between developers and different versions of your application.
- I think it’s the more natural way of working. You would say that the most natural way is using Database First. But remember that the database is just a tool to save your data. As a developer your point of view is your application. How it's saved shouldn't be my main concern as long as my models are clear to read. I don't want to adjust my code to be able to save it in the database. The database should adapt to my code and that is exactly how Code First works.
- You have much more control over what is created and how things work. For example, you can specify a character limit and a column size separately.
I have used Code First in many projects and even in production and have been very happy with it, but it can happen that where you work you might have a legacy database and for that reason you may not be able to use Code First because there are certain steps you need to follow to update the database then you have to use database first. If that’s the case you have no choice but if you have the chance to change the system and influence your team to embrace Code First, that’s great, go for it or at least give it a try, don’t just stick to the old ways of doing things, always try different approaches and see what works better for you.
Read more
3 reasons to use code first design with Entity Framework
Domain-Driven Design: Tackling Complexity in the Heart of Software by Eric Evans
Applying Domain-Driven Design and Patterns by Jimmy Nilsson
Programming Entity Framework: Code First by Julia Lerman, Rowan Miller
Here is a complete series for code first approach step by step:
Relationship in Entity Framework Using Code First Approach With Fluent API
Code First Migrations With Entity Framework
CRUD Operations Using Entity Framework 5.0 Code First Approach in MVC
CRUD Operations Using the Repository Pattern in MVC
CRUD Operations Using the Generic Repository Pattern and Unit of Work in MVC
CRUD Operations Using the Generic Repository Pattern and Dependency Injection in MVC
Jader, sei que escrever artigos aqui no Linkedin não é uma tarefa simples! Parabéns pela sua dedicação. E saiba que, por aqui, estou acompanhando e torcendo pelo seu sucesso!!! :) 👏
Thanks Jader, Nice article. But I have question to ask. If we follow Code first approach, then all the stuff we were doing in Database will be done in C#. Like Stored procedures, Function, Assemblies etc. Now even for a small change in a query, I need to edit my code base, build it entirely and deploy it completely. Ex: if there is a SP to fetch first 10 records, I can increase it to 20 records without compiling my code. The entire thing works without touching and/or re-deploying my c# code. I can do all T-SQL changes with disturbing c# code which working. I hope you see my problem. I am trying to avoid more builds and deployments which are more c# concentric.
Thanks for this nice article, but we have some limitations with EF6 code first. We sacrifice sometimes with EF code first, one scenario I have remember while adding new columns to the existing table after defining the column order by fluent API EF code first add the new column in the table as last column, so we can't control that. Is there any way to handle this ?
Thanks for your post Jader. If it is about either (1) redesigning a system or (2) designing something from scratch with required high performance (of course I am not indicating premature optimization) in mind then now a days I think about designing a system in the following manner. If we have some knowledge about the domain, possible use case frequency and user interaction/lookup pattern then my style would be to rely on "query driven" design. I would try to think about what lookup or interaction is most probable in the system, acceptable service level as well as what is going to be the data volume and then I would design the tables, primary keys, secondary index, decision to normalize or denormalize accordingly. And code would be based on that - basically to integrate all the pieces seamlessly. In a certain way it is database first design (even though we are not necessarily dealing with a legacy system where the database already exists). I got this style of thinking while exploring Cassandra NoSQL database and I believe there is always more for me to learn (and improve in areas) which makes my days so interesting!
Jader-well written!