99% of Node.js developers have one of these misconceptions

1. setTimeout

Many developers believe that setTimeout executes its callback at exactly the time given. In reality, the delay specifies a threshold after which the callback may be executed. Timer callbacks run as early as they can be scheduled once the specified amount of time has passed; however, operating-system scheduling or the running of other callbacks may delay them. To make asynchronous execution possible, the Node.js event loop cycles through a fixed order of operations (phases) to execute callbacks for timers, I/O, and network operations. setTimeout and setInterval callbacks are executed in the timers phase. The event loop gives priority to the poll phase, where Node.js executes I/O-related events and callbacks, rather than sitting idle waiting for a timer to complete.

Let’s see an example:

const fs = require('fs');

// schedule a timeout of 10ms
setTimeout(() => {
  console.log('time out');
}, 10);

fs.readFile('unexisting_file.txt', (err) => {
  // busy-wait for 4 seconds to simulate a long-running callback
  const timeScheduled = Date.now();
  while (Date.now() - timeScheduled < 4000) {
    // do nothing
  }
  if (err) {
    console.log("the file doesn't exist");
  } else {
    console.log('done reading the file');
  }
});

If you run this, you will wait about 4 seconds, then the file doesn't exist will be logged, and only after that will time out be logged.


Based on the common understanding of setTimeout, we should see time out first, after exactly 10ms, but that is not how it works. As I mentioned earlier, the event loop doesn't sit waiting for the 10ms to elapse. Instead, it schedules the timer and enters the poll phase, where it sees that readFile() has already completed (because we gave it a nonexistent file), so it executes the readFile callback first. Only when that callback finishes does the event loop see that the threshold of the soonest timer has been reached, and it wraps back to the timers phase to execute the timer's callback. But what if we passed an existing file? That depends on the size of the file and how long readFile() takes to finish reading it. If readFile() takes longer than the delay scheduled by setTimeout, the timer callback will run before the readFile callback.

2. Single thread

This is the most common misconception among developers: that Node.js uses a single thread to magically process all of this. There is some truth to the single-thread idea, but there is also a big misconception here. The truth is that Node.js uses a single thread to handle requests instead of assigning a thread to every request, so when a client sends a request to one of your APIs, no new thread is created.

There is the main thread (the event loop); you can think of it as a main door that every operation goes through to be processed. So where is the misconception? The event loop is the center of everything: from the moment you start your application, it loops continuously to run your code. It takes the given script, processes it, offloads work, and communicates with other parts of Node.js, the operating system, and the thread pool to deliver your task.

Even though Node.js doesn't create a thread for every request, and you don't have to deal with the difficulties of multi-threading yourself, it still uses multiple threads behind the scenes. It makes use of the operating system kernel and a thread pool to offload tasks. The thread pool, also known as the worker pool, has four threads by default (configurable through the UV_THREADPOOL_SIZE environment variable). When the event loop encounters time-intensive tasks such as file I/O operations, some network operations in the DNS module, and crypto functions, it assigns one of the worker-pool threads to deal with them. This frees the event loop to spend its time on other operations.



3. JSON.parse and JSON.stringify

Assuming it is safe to use these two functions in every scenario in your project is dangerous. You should understand how they work and know when to use them and when to avoid them.

What? But I have used them for many years and they are very important. Yes, I have used them for many years too, and they have served me well, but the problem starts when your input grows: parsing or stringifying can take a long time and potentially block the event loop. The time complexity of these functions is O(n), so as n grows, the time they take grows with it. If your application accepts and processes JSON objects from users, you should be cautious about the size of the objects or strings it accepts.

Part of the reason these functions are blocking is that they process the whole input in one go. That works fine for small to medium-sized data, but if you need to handle large data, you should consider alternatives that use a stream to work with the object or string chunk by chunk. There are npm packages that offer asynchronous, streaming JSON APIs.
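A quick way to see the linear, blocking cost is to time the two functions on a large input (the array size here is an arbitrary choice; scale it up or down and the timings scale with it):

```javascript
// Build a large array and measure how long stringify/parse block the thread.
const big = Array.from({ length: 500000 }, (_, i) => ({ id: i, ok: true }));

let t = Date.now();
const text = JSON.stringify(big);
console.log(`stringify: ${Date.now() - t}ms for ${text.length} chars`);

t = Date.now();
const parsed = JSON.parse(text);
console.log(`parse: ${Date.now() - t}ms`);
// While either call runs, the event loop can do nothing else:
// no timers fire, no requests are served.
```

If either number gets anywhere near your request-latency budget, that is the signal to move to a streaming parser.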

4. npm packages

Okay, what is the misunderstanding here? There is a widespread wrong assumption that npm packages are safe to use, and reaching for an npm package every time you have a problem to solve is not the right way to go. Here are three reasons you should be cautious with npm packages.

  • Security

As a Node.js developer, it is very common to pull from the npm registry, and not every package is secure. Many packages have security vulnerabilities, which may or may not be fixed in their latest version. You can run npm audit to scan your project's dependencies for known security issues. A report is generated with information about each vulnerability and the version in which it is fixed, along with a URL you can visit to learn more.



Another point worth mentioning is that Node.js programs can access all system calls, such as writing to disk and accessing the network. So when you download a package from npm, it has complete access to your computer and network, which is why you should be careful about what you install.

  • Performance

Sometimes it is better to reinvent the wheel than to work with a package that performs poorly. To judge whether a package performs well, check whether it contains blocking code, and run load tests to see its response time with different inputs. Checking the test coverage, the commit history, and the number of downloads can also give you pointers on how good the package is.

  • Application size

npm packages add to the size of your application. If all you need is a simple function, it is better to write it yourself and put it in your utility directory than to look for an npm package. A large application size can hurt performance, so you should aim to minimize it.

5. Require

In Node.js, require is used to import modules, and it is very useful for eliminating repetitive code and the need to reinvent the wheel. That said, I believe require is often misunderstood and misused because developers don't know how it works behind the scenes.

The require function can import three different extensions: .js, .json, and .node, and it has a different implementation for each. What all three implementations have in common is that they use the fs.readFileSync function to read the file. If you open the Node REPL by simply typing node and enter require.extensions['.js'].toString(), you can see the .js implementation.




You will also find fs.readFileSync in both the .json and .node implementations, so be careful about the size of the files you import. If you are working with a big JSON file, it is best to use a stream instead of requiring it directly, or it might be time to move that data into a database. Minifying your JS scripts is also a good idea when you are working with big files.

When you require('npm_package_name'), it doesn't just magically import the package: Node goes through the node_modules folders to find it, and the algorithm for resolving module names is fairly complex and can take time, since node_modules tends to be big. That is one more reason to write your own function when it is easy to do so, and to keep your dependencies minimal. I have seen npm packages like even or odd whose download numbers are crazy.

This might not matter much, honestly, but it doesn't hurt to know. In Node.js, you can require files without specifying the extension (.js, .json, .node). That might make you feel cool, and I do it most of the time too, but it adds a small amount of time and complexity while Node figures out the extension: when no extension is given, require tries .js files first, then .json files, and then .node files. You can save a bit of time by simply specifying the extension. You can learn more about the internals of require here.

