Pardon Tech - Promise Causality

Node’s the async_hooks API has been “experimental” since it was introducted in Node 8.

Stability: 1 - Experimental. Please migrate away from this API, if you can. We do not recommend using the createHook, AsyncHook, and executionAsyncResource APIs as they have usability issues, safety risks, and performance implications. Async context tracking use cases are better served by the stable AsyncLocalStorage API. If you have a use case for createHook, AsyncHook, or executionAsyncResource beyond the context tracking need solved by AsyncLocalStorage or diagnostics data currently provided by Diagnostics Channel, please open an issue at https://github.com/nodejs/node/issues describing your use case so we can create a more purpose-focused API.

We’re using it anyway.

Ordering Causality

Other http frameworks provide a single thread of execution, and a global environment object which is used to configure each request and response in turn.

With Pardon, we offer the same ergonomics of a “global object”, but the behavior is independent of the execution order: since only awaited assignments to the global object count.

The base of this is a tracking system that understands what happened to reach a certain point, by way of a series of tracked values.

In a single threaded usage, track(value) looks like it pushes a value onto a list and awaited() returns that list.

const {
  track,
  awaited
} = tracker();

track('a');
awaited(); // ['a']

track('b');
track('c');
awaited(); // ['a', 'b', 'c']

However if we have some asynchronous processes that track values, we won’t see them get tracked until we we await those promises.

const { track, awaited } = tracker();

async function f() { await randomDelay(); track('f'); };
async function g() { await randomDelay(); track('g'); };

const pf = f();
const pg = g();

awaited() // []

const ppg = randomDelay().then(async () => { await pg; awaited(); /* ['g'] */});
const ppgf = randomDelay().then(async () => { await pg; await pf; awaited(); /* ['g', 'f'] */});

await pf;

awaited(); // ['f']

await pg;

awaited(); // ['f', 'g']

// further awaits of ppg/ppgf here will not change the result.

Notice that awaited() above can return any combination of 'f' and/or 'g' in different orders: depending purely on the order they were awaited, rather than the order the track calls were actually run!

This might feel a bit weird at first, but it is stable, as the composition/order of the awaited() values will be based only on the (stable) lexical shape of the code, rather than the unstable order of execution.

In tests, we use this to track assignments to a global environment value. We also track dependent requests that might be executed via script helpers to surface dependent requests in favor.

How this Tracks

The tracked values are stored in nodes of an execution graph maintained with the async_hooks API NodeJS offers but tells us not to use.

The nodes of the execution graph have two links:

the init link is for the context in which the promise is created, and
the trigger link is to the promise that preceeds the execution.

The creation of this a node looks something like this:

function init(trigger: Promise<unknown>) {
  return promise = trigger.then(() => { ... });
}

Note that trigger is in general a still-pending operation, (as is promise, of course), while init is the context when the promise was created, so awaiting the promise will include tracked values from inside init (and wherever the function was called), and then the values that were tracked in the course of whatever trigger is waiting for.

Calling awaited() collects tracked values by searching this graph.

Another leak in the abstraction is that async functions are not run entirely async.

In the following kind of case:

async function hello() {
  track('world');
  ...
}

hello(); // not awaited

awaited(); // ['world']

Calling hello() results in the 'world' value being tracked, without awaiting the promise, because NodeJS runs the body of an async function synchronously as far as it can. (Similarly new Promise(() => { ... }) executes synchronously as well, while Promise.resolve().then(() => { ... }) is executed asynchronously).

Garbage

Like all non-trivial abstractions, this one leaks: Memory.

The awaited values are garbage-collected along with promises, as much as possible, but we have two helpers to cut the graph for correctness and better garbage collection, respectively.

shared(() => { ... }) drops awaited values going into the execution of ....
disconnected(() => { ... }) drops awaited values leaving the execution of ....

Both of these methods asynchronosuly start executing their function and return a promise for its result.

Shared Execution

shared should be used to wrap the creation of promises that are reused.

Consider the following,…

function getAuthToken() {
  return authTokenPromise ??= fetchAuthToken();
}

in this case the authTokenPromise will inherit the tracked values leading to the first call of fetchAuthToken() only. This introduces a race condition as the tracked results of getAuthToken() would depend on who called it first. Additionally the fetchAuthToken() function would have access to the awaited() values of its first caller.

To fix this, we use shared to disconnect the execution graph leading into the call.

function getAuthToken() {
  return authTokenPromise ??= shared(() => fetchAuthToken());
}

Inside fetchAuthToken() the awaited() list will be clear. Any values tracked inside fetchAuthToken() can still be awaited from the authTokenPromise.

Disconnected Execution

The disconnected helper does the opposite: it prevents awaited values from escaping the disconnected(() => { ... }) boundary. This is meant to be more of a garbage-collector hint than a behavioral mechanism.

For instance, when running hundreds of test cases we don’t want the async graphs to pile up in memory, and disconnected applied to the right places does the trick here.