Performance benchmarking for Cloudflare D1 SQLite database
The main goal is to compare execution times for the users (1M records) and posts (10M records) tables with the id column stored as an integer, a UUID v4, and a ULID.
Prehistory
As a frontend developer, I mostly worked with MongoDB in my pet projects. Currently, I'm working on a bigger product, which is why I decided to rethink my infrastructure decisions, including the database. I'm migrating to Cloudflare infrastructure. In short, because it has friendly DX, transparent pricing, generous free tiers, the services I need, etc. So, the first step is migrating from my usual MongoDB to the Cloudflare D1 database, which is based on SQLite.

As a frontend developer, I have no idea how to write good SQL queries. During my research, I found posts about Prisma and how slow it is. That's funny. My search brought me to Drizzle. It has pretty nice documentation, it's free and open source, it's faster than Prisma according to the published benchmarks, and I found a lot of positive feedback about it.

When I started writing my first schemas, I began thinking about the architecture, security, and performance aspects right from the beginning, just because I usually do that. The first point: I'm using a Nuxt application. Drizzle derives a TypeScript interface from each table schema, and that interface and data object type are shared across the server, the API, and the storefront components. It means all the fields will be visible in the Network tab once a request like GET /api/users/1 is fetched. I want to avoid exposing the explicit id field, for both aesthetic and security reasons (yes, I know about session tokens).

From my time with MongoDB, I remember that the auto-generated _id field typically uses a UUID-like format. I'm not sure about the exact version, but I'm sure it's not just an auto-incremented integer. So I was curious whether such ids could negatively affect the performance of a SQLite database.
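To illustrate the shared-interface point, here is a minimal sketch of a Drizzle table and its inferred row type. The table and fields are hypothetical, not the exact schema from my repository:

```typescript
import { sqliteTable, integer, text } from "drizzle-orm/sqlite-core";

// A hypothetical users table definition.
export const users = sqliteTable("users", {
  id: integer("id").primaryKey({ autoIncrement: true }),
  email: text("email").notNull(),
});

// The row type Drizzle infers from the schema, shared by the server,
// the API, and the storefront components. Note that `id` is part of it,
// so by default it ends up visible in API responses too.
export type User = typeof users.$inferSelect; // { id: number; email: string }
```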
Preparation for performance benchmarking
I'm using a Nuxt 3 application (with Nitro included), nitro-dev-cloudflare, wrangler, Miniflare for local D1 development, Drizzle ORM, Drizzle Kit, Drizzle Studio, and the Drizzle Seed functionality. I decided to seed the database with 1 million users and 10 million posts, but found out that D1 has a pretty strict limit on bound variables per request. In my case, only around 10-15 users and posts made it in before I got this error:
too many SQL variables at offset 425: SQLITE_ERROR
So, instead of using the drizzle-seed package, I manually seeded the tables in a for loop. I also had to increase Node's memory limit to 10 GB (the default is 2 GB) in my ~/.zshrc:
export NODE_OPTIONS="--max-old-space-size=10000"
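The seeding loop itself can be sketched like this. The variable limit and column count below are assumptions (the classic SQLite default is 999 bound variables per statement, but your build of D1/Miniflare may differ), and `db`/`users` stand in for your Drizzle instance and schema:

```typescript
// Assumption: 999 bound variables per statement (classic SQLite default).
const VARIABLE_LIMIT = 999;
// Assumption: a 4-column table, e.g. id, name, email, createdAt.
const COLUMNS_PER_ROW = 4;

// Each inserted row consumes COLUMNS_PER_ROW bound variables,
// so this is the largest batch size that stays under the limit.
const rowsPerBatch = Math.floor(VARIABLE_LIMIT / COLUMNS_PER_ROW);

// Split a large array of rows into insert-sized batches.
function* batches<T>(rows: T[], size: number): Generator<T[]> {
  for (let i = 0; i < rows.length; i += size) {
    yield rows.slice(i, i + size);
  }
}

// Usage sketch (db and users are placeholders for your app's setup):
// for (const batch of batches(allUsers, rowsPerBatch)) {
//   await db.insert(users).values(batch);
// }
```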
Request example
It's pretty similar for all cases:
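Roughly, the endpoint looks like this. This is a hedged sketch, not the exact code from the repository: the route path is illustrative, and `db` and `users` come from the app's Drizzle setup (omitted here):

```typescript
// server/api/users/[id].get.ts — a Nitro event handler in the Nuxt app
import { eq } from "drizzle-orm";

export default defineEventHandler(async (event) => {
  const id = getRouterParam(event, "id");

  // Fetch a single row by primary key; the benchmark measures this lookup
  // with the id column typed as integer, UUID v4, or ULID.
  return await db.select().from(users).where(eq(users.id, id)).get();
});
```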
Performance benchmarking — Integer
Response time is ~480ms:
Performance benchmarking — UUID v4
Response time is ~1789ms:
Performance benchmarking — ULID
Response time is ~878ms.
Compared with UUID v4 (36 characters), a ULID has fewer characters (26) and is lexicographically sortable, since it's time-based. I guessed it would be better for performance and database size if I wanted to keep a UUID-like flow for the id column.
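For reference, the three id variants can be defined along these lines. This is a sketch: the UUID and ULID generators shown are common choices, not necessarily the exact ones used in my repository:

```typescript
import { integer, text } from "drizzle-orm/sqlite-core";
import { ulid } from "ulid";

// 1. Auto-incremented integer primary key:
const intId = integer("id").primaryKey({ autoIncrement: true });

// 2. UUID v4 stored as 36-character text, generated in JS:
const uuidId = text("id").primaryKey().$defaultFn(() => crypto.randomUUID());

// 3. ULID stored as 26-character text, time-based and lexicographically sortable:
const ulidId = text("id").primaryKey().$defaultFn(() => ulid());
```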
Conclusions
The best option
Andrii Sherman suggested the best approach to me. With Drizzle ORM, I can create a custom column type that still stores the id as a plain integer. So, I can just go forward with an auto-incremented integer primary key. But! The value can be encoded on select only (no encoding needed on insert) to hide the exact id when it's returned from the API.
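A sketch of what such a custom type could look like, assuming Drizzle's `customType` helper and the Hashids package; the salt, environment variable, and names here are illustrative:

```typescript
import { customType } from "drizzle-orm/sqlite-core";
import Hashids from "hashids";

// The salt must stay secret: knowing it is the only way to decode the ids.
const hashids = new Hashids(process.env.ID_SALT ?? "dev-only-salt", 8);

// Stored as a plain auto-incremented integer; encoded only when read back.
export const encodedId = customType<{ data: string; driverData: number }>({
  dataType: () => "integer",
  // select: integer from the database -> opaque string for the API
  fromDriver: (value) => hashids.encode(value),
  // insert / where: opaque string back -> integer for the database
  toDriver: (value) => hashids.decode(value)[0] as number,
});
```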
Screenshot explanation:
Database example:
API response:
So, it's just as fast as using a plain integer. In addition, I encoded the id via the Hashids NPM package, which means it can be decoded only if you know the exact value of the secret salt. It solves my original concern. Furthermore, it looks much better in the API response, while still being the fastest option given the performance requirements.
There's a related commit.
There's a related GitHub repository:
Subscribe to my GitHub and LinkedIn profiles to stay tuned for future investigations and insights about the product I'm working on.
Credits
Huge thanks to Andrii Sherman for his time, assistance, useful tips, and the detailed information that came in pretty handy for this investigation.