Napa.js: a multi-threaded JavaScript runtime
Napa.js is a multi-threaded JavaScript runtime built on V8. It was originally designed for developing highly iterative services in Bing without compromising performance. As it evolved, we found it useful as a complement to Node.js for CPU-bound tasks, with the capability of executing JavaScript in multiple V8 isolates and communicating between them. Napa.js is exposed as a Node.js module, and it can also be embedded in a host process without a Node.js dependency.
Why Napa.js?
On July 17, 2007, as a corollary to Tim Berners-Lee's Rule of Least Power, Jeff Atwood proposed Atwood's Law, which states that "Any application that can be written in JavaScript, will eventually be written in JavaScript."
In the following years, this prophecy has been fulfilled by Node.js, its vast module ecosystem on NPM, and projects like Angular.js and React. "What JavaScript can do" leads to "what JavaScript will eventually do".
At Microsoft, as we studied backend services in Bing, we found that many of them were computational and implemented in pure C++ for performance reasons. Yet they evolved quickly: algorithms were added, modified, and removed every week in Bing's serving stack. If we could find a way to accommodate both performance and agility, it would redefine our engineering process. We surveyed a few dynamic languages, and JavaScript stood out thanks to the blazing-fast V8 JavaScript engine and the large JavaScript module ecosystem on NPM.
We had four key requirements to satisfy: 1) the solution should provide a mechanism to iterate quickly on algorithms; 2) the solution should take advantage of multiple cores; 3) workers should be able to share structures in memory; and 4) fine-grained parallelism should be possible, which means we need to minimize the communication cost between threads.
At the time, there were two established ways to support CPU-bound tasks in Node: computational code in C++ exposed as asynchronous JavaScript functions, or Node cluster. The former cannot satisfy requirement #1, and the latter cannot satisfy requirement #3.
So we started Napa.js, with learning and inspiration from Node. We were convinced that a multi-threaded programming model in JavaScript is essential, because it lets developers strike a dynamic balance between JavaScript and C++ at an arbitrary ratio. Even for performance-critical projects, developers can start with everything in JavaScript, iterate, converge, and gradually convert logic into C++ when things are final. Not surprisingly, the 80/20 rule applies here: less than 20% of the lines of code account for 80% of the performance. We may end up with a happy ending: a mix of JavaScript and C++, with the most frequently changing parts (like control flow) in JavaScript and highly reusable, performance-critical code in C++, delivering services with close-to-native performance. This programming model can support virtually any service that is written in C++ today.
How Napa.js works
The following concepts are essential to understanding how Napa.js works.
Zone
In Napa.js, all work related to multi-threading is built around the concept of a Zone, which is the basic unit for defining policies and executing JavaScript code. A process may contain multiple zones, each consisting of multiple JavaScript workers.
Within a zone, all workers are symmetrical: they load the same code and serve broadcast and execute requests in an indistinguishable manner. In other words, you cannot ask a zone to execute code on a specific worker. Workers across different zones are asymmetrical: they may load different code, or load the same code but enforce different policies, such as heap size or security settings. Applications may need multiple zones for workloads with different purposes or different policies.
There are two types of zones:
- Napa zone - a zone that consists of JavaScript workers (V8 isolates) managed by Napa.js. A process can have multiple Napa zones, and each may contain multiple workers. Workers in a Napa zone support a subset of the Node.js APIs.
- Node zone - a 'virtual' zone that exposes the Node.js event loop and has access to the full Node.js capabilities.
This combination enables you to use Napa zones for heavy-lifting work and the Node zone for IO. The Node zone also compensates for the Napa zone's incomplete support of Node APIs.
The following code creates a Napa zone with 8 workers:
var napa = require('napajs');
var zone = napa.zone.create('sample-zone', { workers: 8 });
And the following snippet accesses the Node zone:
var zone = napa.zone.node;
Two operations can be performed on zones:
- Broadcast - run code that changes state on all workers, returning a promise for the pending operation. Through the promise, we can only know whether the operation succeeded or failed. Usually we use broadcast to bootstrap the application, pre-cache objects, or change application settings.
- Execute - run code that doesn't change worker state on an arbitrary worker, returning a promise for the result. Execute is designed for doing the real work.
Zone operations are served on a first-come, first-served basis, while broadcast takes priority over execute.
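To make the ordering rule concrete, here is a small toy model in plain JavaScript. It is not how Napa.js is implemented: ToyZone, its queues, and the round-robin choice standing in for "an arbitrary worker" are all assumptions made for illustration. It only shows that a broadcast touches every worker, an execute touches exactly one, and pending broadcasts are served before pending executes.

```javascript
// Toy model only: ToyZone and its methods are hypothetical, not part of napajs.
class ToyZone {
  constructor(workerCount) {
    // Each "worker" is just a private state object in this model.
    this.workers = Array.from({ length: workerCount }, () => ({ state: {} }));
    this.broadcasts = []; // pending broadcast tasks, in arrival order
    this.executes = [];   // pending execute tasks, in arrival order
    this.next = 0;        // round-robin cursor standing in for "arbitrary worker"
    this.log = [];        // order in which tasks actually ran
  }

  broadcast(name, task) {
    this.broadcasts.push({ name, task });
  }

  execute(name, task) {
    this.executes.push({ name, task });
  }

  // Run everything queued: broadcasts first, each queue first-come, first-served.
  drain() {
    const results = [];
    while (this.broadcasts.length > 0 || this.executes.length > 0) {
      if (this.broadcasts.length > 0) {
        const { name, task } = this.broadcasts.shift();
        this.workers.forEach((worker) => task(worker.state)); // every worker
        this.log.push(`broadcast:${name}`);
      } else {
        const { name, task } = this.executes.shift();
        const worker = this.workers[this.next];
        this.next = (this.next + 1) % this.workers.length;
        results.push(task(worker.state)); // exactly one worker
        this.log.push(`execute:${name}`);
      }
    }
    return results;
  }
}

const toyZone = new ToyZone(3);
toyZone.execute('read-1', (state) => state.greeting);
toyZone.broadcast('setup', (state) => { state.greeting = 'hi'; });
toyZone.execute('read-2', (state) => state.greeting);

const toyResults = toyZone.drain();
console.log(toyZone.log);
console.log(toyResults);
```

In this model the 'setup' broadcast runs before 'read-1' even though it was queued later, so both reads see the greeting no matter which worker serves them.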
The code below demonstrates how broadcast and execute collaborate to complete a simple task:
function foo() {
    console.log('hi');
}
// This sets up the definition of function foo in all workers in the zone.
zone.broadcast(foo.toString());
// This executes function foo on an arbitrary worker.
zone.execute(() => { global.foo(); });
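Why does the broadcast above send foo.toString() rather than foo itself? Each worker has its own global scope, so a function has to arrive as source text and be evaluated there. The sketch below imitates this with Node's built-in vm module, where each vm context plays the role of one worker. This is only an analogy: vm contexts are not V8 isolates, and the names greet, contexts, and fresh are ours.

```javascript
const vm = require('vm');

// Illustration only: vm contexts stand in for Napa workers here.
function greet() {
  return 'hi';
}

// Three separate global scopes, each playing the role of one worker.
const contexts = [vm.createContext({}), vm.createContext({}), vm.createContext({})];

// "Broadcast": ship the function's source text and evaluate it in every context.
const source = greet.toString();
contexts.forEach((context) => vm.runInContext(source, context));

// "Execute": evaluate a call in a single context; all of them now define greet.
const replies = contexts.map((context) => vm.runInContext('greet()', context));
console.log(replies);

// A context that never received the broadcast has no greet at all.
const fresh = vm.createContext({});
const typeInFresh = vm.runInContext('typeof greet', fresh);
console.log(typeInFresh);
```

Because every context received the same definition, it does not matter which one serves the call, which mirrors the symmetry of workers within a zone.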
You can find more details about zone here.
Transporting JavaScript values
V8 is not designed for running JavaScript across multiple isolates, which means every isolate manages its own heap. Values passed from one isolate to another have to be marshalled and unmarshalled, so the size of the payload and the complexity of the object greatly impact communication efficiency. In Napa, we try to work out a design pattern for efficient object sharing, based on the fact that all JavaScript isolates (exposed as workers) reside in the same process, and native objects can be wrapped and exposed as JavaScript objects.
The following concepts are introduced to implement this pattern:
Transportable types
Transportable types are JavaScript types that can be passed or shared transparently across workers. They are used as value types for passing arguments in broadcast and execute, as well as sharing objects in key/value pairs via set and get.
Transportable types are:
- JavaScript primitive types: null, boolean, number, string
- Objects (TypeScript classes) that implement the Transportable interface
- Arrays or plain JavaScript objects that are composites of the above
- The single JavaScript value undefined
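The list above can be read as a recursive rule, sketched below as a predicate. isTransportable is a hypothetical helper of ours, not a Napa.js API, and it assumes that an object "implements Transportable" when it exposes marshall and unmarshall methods; the real interface may require more. The sketch also does not guard against circular references.

```javascript
// Sketch only: isTransportable is a hypothetical helper, not a napajs API.
// Assumption: an object implements Transportable if it has marshall and unmarshall methods.
function isTransportable(value) {
  if (value === null || value === undefined) return true;
  const type = typeof value;
  if (type === 'boolean' || type === 'number' || type === 'string') return true;
  if (type !== 'object') return false; // functions, symbols, bigints
  if (typeof value.marshall === 'function' && typeof value.unmarshall === 'function') return true;
  if (Array.isArray(value)) return value.every((item) => isTransportable(item));
  // Only plain objects count as composites; class instances need the interface.
  const proto = Object.getPrototypeOf(value);
  const isPlain = proto === Object.prototype || proto === null;
  return isPlain && Object.values(value).every((item) => isTransportable(item));
}

// A hypothetical class that satisfies the assumed interface check.
class Point {
  constructor(x, y) { this.x = x; this.y = y; }
  marshall() { return { x: this.x, y: this.y }; }
  unmarshall(payload) { this.x = payload.x; this.y = payload.y; }
}

console.log(isTransportable({ a: 1, b: ['x', null, undefined] })); // true
console.log(isTransportable(new Point(1, 2)));                     // true
console.log(isTransportable({ callback: () => 1 }));               // false
console.log(isTransportable(new Map()));                           // false
```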
You can read more details about transport here.
Cross-worker Storage
The Store API is introduced as a necessary complement for sharing transportable types across JavaScript workers, on top of passing objects via arguments. During store.set, values are marshalled into JSON and stored in the process heap, so all threads can access them. They are unmarshalled when users retrieve them via store.get.
The following code demonstrates object sharing using a store:
var napa = require('napajs');
var zone = napa.zone.create('zone1');
var store = napa.store.create('store1');
// Set 'key1' in node.
store.set('key1', {
    a: 1,
    b: "2",
    c: napa.memory.crtAllocator // transportable complex type.
});
// Get 'key1' in another thread.
zone.execute(() => {
    var store = global.napa.store.get('store1');
    console.log(store.get('key1'));
});
Though very convenient, using a store to pass values within a transaction or request is not recommended, since its overhead is higher than passing objects by arguments (there is extra locking, for example). Besides, developers have the obligation to delete the key after use, whereas values passed as arguments are managed automatically by reference counting.
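The behavior described above (marshal on set, unmarshal on get, explicit cleanup) can be imitated with a toy store in plain JavaScript. ToyStore is hypothetical and single-threaded; it leaves out the locking and native-object support of the real store and only illustrates two consequences: every get returns an independent copy, and an entry lives until someone deletes it.

```javascript
// Toy model only: ToyStore is hypothetical and merely imitates the behavior described above.
class ToyStore {
  constructor() {
    this.payloads = new Map(); // key -> JSON text, standing in for the process heap
  }

  set(key, value) {
    this.payloads.set(key, JSON.stringify(value)); // marshall on write
  }

  get(key) {
    const payload = this.payloads.get(key);
    return payload === undefined ? undefined : JSON.parse(payload); // unmarshall on read
  }

  delete(key) {
    this.payloads.delete(key);
  }

  get size() {
    return this.payloads.size;
  }
}

const toyStore = new ToyStore();
toyStore.set('key1', { a: 1, b: '2' });

const firstRead = toyStore.get('key1');
firstRead.a = 42;                       // mutates only this copy
const secondRead = toyStore.get('key1');
console.log(secondRead.a);              // 1
console.log(firstRead === secondRead);  // false

// Nothing frees the entry automatically; the caller has to delete it.
toyStore.delete('key1');
console.log(toyStore.get('key1'));      // undefined
```

Each read pays for a full unmarshal, which is one reason to prefer arguments for per-request data and to reserve a store for values that are shared over a longer lifetime.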
You can find more details about store here.
Where we are
Napa.js is published to NPM as the package napajs, and its source code is available on GitHub. We are actively developing new features, fixing bugs, and exploring patterns that can best utilize Napa.js to achieve native performance with JavaScript productivity. In the coming months, we are focusing on minimizing cross-thread communication overhead and minimizing GC impact on long-running, latency-critical applications.
Stay tuned.