JavaScript and the C Bug
Recently I came across one of those image posts with tips for programmers, and it brought back a topic I had been meaning to write about. The post in question was about array deduplication, and three solutions were given (as can be seen ahead). I won't reference it here, since my point isn't to pick on anyone; I just want to share some things I know, in the hope that they complement someone else's knowledge the same way other people's posts complement mine on a daily basis.
The moment I saw it I was bitten by a very annoying bug: the C bug. Let me explain: one of the first programming languages I learnt was C (in fact, it was the second one), almost 15 years ago. It has influenced the way I program ever since. The constant need to manage heap allocations, and the crude polymorphism built out of pointers to void and unions, has followed me like a ghost up the abstraction hill. When I'm short on time, I just ignore it (hey, we've all been there: a single line of expressive code for a lifetime of pain); when I have some time to spare, I make the cheap optimization it keeps asking for.
Then comes the deduplication post. The first alternative: fine, I'd write it myself to get things going and never look back as long as it stayed quiet. On solutions two and three, however, the first thing I saw was O(n²).
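The snippets in the post were shared as an image, so I won't reproduce them verbatim; roughly, though, the three alternatives had this shape (my reconstruction — the names are mine and the details may differ from the original):

// 1) Set-based: a Set keeps only unique values, then we spread it back into an array.
const dedupeSet = (arr) => [...new Set(arr)];

// 2) filter + indexOf: keep an element only when this index is its first occurrence.
//    indexOf rescans the array from the start on every callback call.
const dedupeFilter = (arr) =>
  arr.filter((item, index) => arr.indexOf(item) === index);

// 3) reduce + includes: rebuild the result one element at a time.
//    includes rescans the accumulator on every step, and the spread copies it again.
//    (The post's version reportedly also leaned on splice, which adds its own internal copying.)
const dedupeReduce = (arr) =>
  arr.reduce((acc, item) => (acc.includes(item) ? acc : [...acc, item]), []);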
We all know that Array.prototype.filter and Array.prototype.reduce must go through every element of the array, but on top of that Array.prototype.indexOf and Array.prototype.includes also have to scan, at worst, the whole array. So for each step of filter and reduce we check, at most, the entirety of the array. Also, in the third alternative, the splice itself hides a for loop that shallow-copies values from the source to the result (such is the power of abstractions). The functions were tested against an array of 100,000 random numbers; the measurements further down show how that plays out.
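To make that hidden loop concrete, here is a deliberately naive sketch of the work a splice-style removal has to do internally (an illustration only — the real engine implementation is far more involved):

// Naive model of arr.splice(start, deleteCount) with no insertions:
function naiveSplice(arr, start, deleteCount) {
  const removed = [];
  // first hidden loop: copy the removed elements out into the returned array
  for (let i = 0; i < deleteCount; i++) {
    removed.push(arr[start + i]);
  }
  // second hidden loop: shift the tail left to close the gap
  for (let i = start + deleteCount; i < arr.length; i++) {
    arr[i - deleteCount] = arr[i];
  }
  arr.length -= deleteCount;
  return removed;
}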
One could say they never have to deal with large datasets, so there is no need to worry about which solution gets implemented. Fair. However, it's worth pointing out that as code piles up, things like these can slip past simple benchmarks and end up hurting you.
Anyhow, let's not stop at the complexity analysis; let's take a look at the heap as well. Again, no deep dive into the V8 implementation or advanced algorithms, just what can be seen at first glance:
1) The solution using a set: besides the original array, we only allocate a Set holding the m unique values and the final array of length m that receives them;
2) The filter alternative: easy peasy filter squeezy. All we do is create a new array with the filtered data, which amounts to roughly 2n + m;
3) The splice solution is inspired by Pac-Man: once it starts, it will try to devour the whole RAM unless it's stopped. Each step of the reduce allocates a new array with length equal to the index of that step (plus the hidden for loop). Until the GC kicks in and, hopefully, cleans up our mess, the heap keeps piling up ever-larger partial copies of the result (see the counting sketch right below).
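To get a feel for that without a profiler, here's a small counting sketch, assuming the spread-based reduce variant from the reconstruction above (the counter and its name are mine):

// Instrumented version of the reduce + includes approach: count how many array slots
// get allocated across all the intermediate copies of the accumulator.
function dedupeReduceCounting(arr) {
  let copiedSlots = 0;
  const result = arr.reduce((acc, item) => {
    if (acc.includes(item)) return acc;
    copiedSlots += acc.length + 1; // the spread below copies the whole accumulator plus the new item
    return [...acc, item];
  }, []);
  console.log(`${copiedSlots} slots allocated to build ${result.length} unique values`);
  return result;
}

With m unique values that is 1 + 2 + … + m, roughly m²/2 slots, no matter how quickly the GC manages to reclaim them.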
Everything I said above can be seen just by looking at the code, but you can go a step further (not as far as profiling, at first) and use the tools you already have at hand to see how memory is being used. Check out the following snippet.
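The original snippet isn't reproduced here, but the idea was along these lines — time each function and peek at the heap with what Node.js already gives you (console.time and process.memoryUsage; the function names are the ones from the reconstruction above):

// Build a large input: 100,000 random integers with plenty of duplicates.
const input = Array.from({ length: 100_000 }, () => Math.floor(Math.random() * 1_000));

for (const dedupe of [dedupeSet, dedupeFilter, dedupeReduce]) {
  // heapUsed is noisy unless you force a GC between runs (e.g. node --expose-gc),
  // but the order of magnitude still comes through.
  const before = process.memoryUsage().heapUsed;
  console.time(dedupe.name);
  dedupe(input);
  console.timeEnd(dedupe.name);
  const after = process.memoryUsage().heapUsed;
  console.log(`${dedupe.name}: ~${((after - before) / 1024 / 1024).toFixed(2)} MB of heap growth`);
}

console.time only gives wall-clock magnitude and the heap delta is approximate, but that's enough to see the differences discussed next.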
If you run the lines above, don't get hung up on the numbers themselves; just pay attention to the order of magnitude of time and memory each solution takes. For instance, on my computer, the relative ordering of the three solutions was consistent across most runs.
I don't think you should optimize your code down to the last percentile. I just want to point out that you shouldn't pretend a computer is a box of limitless resources, and that by doing some simple analysis you can deliver better code, or at least earn some performance budget for the days you have to move fast or are too tired. Also, by publishing this, I hope others won't ignore these principles as I once did and later regretted.