Automatic Data Seeding in Django
This post is part of Proyek Perangkat Lunak (PPL), the Software Engineering Project course. It aims to explain the approach we use in the project and the benefits we get from applying software engineering practices. #PPLpenuhmakna
Imagine one day you’re building a cool feature. You create a beautiful layout in the frontend, build its backend, and, as is often the case, introduce a new table in the database to store new data. Great, now you add some data to see the feature working and how beautiful it is. You add one record and it looks good! But what if the frontend breaks when it shows more than one record? So you add two more, giving you three records to display. It still looks great! Now you wonder if the pagination actually works, so you add seven more records and end up with ten. Cool, your pagination is working!
Seeing you smiling at your monitor, your friend comes over and sees your feature as well. “Wow, your feature looks great! Can I have it on my local machine?” Of course you say yes. Who doesn’t want their feature to be used by their friends, right? “Great, it’s working on my machine, but where is the data to be displayed?” your friend says. Now you remember what you were doing for the last hour. It is at this moment you realize this does not scale. A whole department needs to install your feature on their local machines, along with the data. Can you do better than entering the data manually on each machine? Fortunately, you can.
There are many ways to fill your database. In this article, let’s focus on Django’s custom management commands. Let me walk you through an example from my project: adding data to the schedule table (or model, in Django terms). This is the model definition:
This model is simplified for the sake of the example. Now, what does good data look like? Let’s say I want data from the previous ten days up to the next 90 days, so counting today I have 101 days to fill. Good seed data needs to be random, so that bugs are not hidden by always using the same data. Let’s use Python’s built-in random module here. How can we fill the address field? Just a random string? Well, we can do better thanks to Faker. It fills the field with fake data that looks like the real thing. Django only discovers custom commands inside a management/commands directory of an app, so we now have this command in app/management/commands/seed_jadwal_donor.py
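The data-generation part of such a command can be sketched in plain Python. This is a minimal sketch and not the project's actual code: the function names, the dictionary keys, and the quota field are assumptions made for illustration. Faker is a third-party package, so the sketch falls back to a simple stand-in address when it is not installed.

```python
import random
import string
from datetime import date, timedelta

try:
    from faker import Faker  # third-party package: pip install Faker
    _fake = Faker()
except ImportError:
    _fake = None  # the sketch still runs without Faker


def fake_address(rng):
    """Return a realistic-looking address, or a simple stand-in without Faker."""
    if _fake is not None:
        return _fake.address()
    street = "".join(rng.choices(string.ascii_lowercase, k=8)).title()
    return f"{rng.randint(1, 200)} {street} Street"


def build_schedule_rows(today, rng):
    """Build one row per day, from 10 days before `today` to 90 days after."""
    rows = []
    # 10 past days + today + 90 future days = 101 days in total
    for offset in range(-10, 91):
        rows.append({
            "date": today + timedelta(days=offset),
            "address": fake_address(rng),
            "quota": rng.randint(10, 100),  # hypothetical field, for illustration
        })
    return rows


rows = build_schedule_rows(date.today(), random.Random())
print(f"Generated {len(rows)} rows, from {rows[0]['date']} to {rows[-1]['date']}")
```

Inside the command's handle() method, each of these rows would then be saved through the model, for example with the model manager's create() or bulk_create(), followed by a success message written to self.stdout.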
Looks cool. Now let’s run it with python manage.py seed_jadwal_donor.
Wow, so simple! Now let’s see if the data are actually there
And there it is!
Now, as professional engineers, we should maintain the quality of our code, right? One significant way to maintain quality is to write tests. Can we do that with this custom command? Yes, we can. Nothing is missing in Django, right? ;)
This test simply runs the command and makes sure it completes successfully by asserting that the substring “Success” appears in the output. It seems too simple to add much value, right? Let me tell you, this test saved us from shipping a bug when we changed the model. Python is dynamically typed, so mistakes that a compiler would catch in a statically typed language, such as a command that still refers to a renamed or removed field, only surface at runtime. That is bad and we want to avoid it, so just write this simple test and let it guard you against bugs in the future.
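The shape of that test can be sketched as follows. In a Django project it would typically subclass django.test.TestCase and run the command with call_command("seed_jadwal_donor", stdout=out) from django.core.management. To keep the sketch runnable without a project, a stand-in function (run_seed_command, a hypothetical name) takes the place of the real command here, and the success message text is an assumption as well.

```python
import io
import unittest


def run_seed_command(stdout):
    """Stand-in for the real command: do the work, then report success."""
    # The real command would create the schedule rows here. If the model
    # changed in an incompatible way, this is where it would raise.
    stdout.write("Successfully seeded schedule data\n")


class SeedCommandTest(unittest.TestCase):
    def test_command_reports_success(self):
        out = io.StringIO()  # capture what the command prints
        run_seed_command(stdout=out)
        self.assertIn("Success", out.getvalue())


# Run the test case explicitly so the sketch works as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SeedCommandTest)
result = unittest.TextTestRunner().run(suite)
```

The value comes from actually executing the command: the assertion is trivial, but any exception raised while seeding fails the test before the assertion is even reached.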
That’s all for data seeding, see you in the next post!