How to implement a data structure in Typescript

Sandro Maglione

Sandro Maglione

Web development

I implemented a Trie data structure for Effect. During the process I learned all the details of what is required to build a usable and efficient data structure in Typescript:

  • What features to consider when implementing a data structure
    • Performance (time and space)
    • Stack safety
    • Immutability
  • How to make the data structure inspectable (JSON.stringify, toString)
  • How to compare equality by value
  • How to make the data structure Iterable

Data structure features

Some feature of a data structure will radically change its implementation.

While the algorithm behind the data structure may be straightforward, adding some requirements can make the actual implementation more challenging

For the Trie I implemented I considered 3 requirements:

  1. Performance
  2. Stack safety
  3. Immutability

Performance

This is the obvious (and crucial) feature.

The point of a new data structure is to be more performant for a specific usecase.

Therefore, big-O notation is the first aspect to consider.

Fortunately, most of the times you do not need to calculate this metric yourself, you can find plenty of analysis on the web 💁🏼‍♂️

⚠️ Important: The performance in the end depends on the internal implementation.

Stack safety

This instead is definitely not obvious (it was not obvious for me 💁🏼‍♂️).

Every time a recursive function is called some space on the stack (memory) is allocated.

If the recursion is too deep, we risk to reach the limit of the stack and the app will crash 🤯

I prevented this issue by not using recursion at all. This introduces some challenges and makes the code more complex.

It's always possible to convert a recursive implementation to non-recursive using an array (manual stack)

let stack: Node<V> = [];
stack.push(root); // Start from the root of the Trie
 
// Continue until there is something in the stack 👇
while (stack.length > 0) {
    // Pop off end of stack.
    const obj = stack.pop();
 
    // Do stuff.
    // Push other objects on the stack as needed.
    ...
}

Immutability

An immutable data structure cannot be modified in-place: methods like insert and delete will not update the original variable but return a new updated value.

This requirement can have an impact on performance.

You cannot create a full copy of all the nodes every time you update some value. Instead, I used a technique called Path Copying:

Path Copying: create a copy only of the nodes in the path from the root to the updated node, while keeping all the other nodes shared

Find the node to add, update or delete, trace the path from the root, and create only the updated nodesFind the node to add, update or delete, trace the path from the root, and create only the updated nodes

In practice this adds even more complexity to the implementation 💁🏼‍♂️

There is more.

Every week I build a new open source project, with a new language or library, and teach you how I did it, what I learned, and how you can do the same. Join me and other 600+ readers.

Data structure implementation

The first aspect to consider is the API. For a Trie these where the main methods:

  • insert
  • remove
  • get
  • longestPrefixOf
  • keysWithPrefix
  • size

Each of these methods require a specific and efficient algorithm.

Fortunately, most of the times you do not need to invent the algorithm yourself

You can search online and find plenty of algorithms with examples. You job is to implement the algorithm considering all the requirements.

Inspectable

Making a data structure inspectable in Typescript requires to implement the following interface:

export interface Inspectable {
  toString(): string
  toJSON(): unknown
}

In Effect the implementation is the following:

export const toJSON = (x: unknown): unknown => {
  if (
    hasProperty(x, "toJSON") && isFunction(x["toJSON"]) &&
    x["toJSON"].length === 0
  ) {
    return x.toJSON()
  } else if (Array.isArray(x)) {
    return x.map(toJSON)
  }
  return x
}
 
export const format = (x: unknown): string => JSON.stringify(x, null, 2)
 
const TrieProto: TR.Trie<unknown> = {
  // ...
  toString() {
    return format(this.toJSON())
  },
  toJSON() {
    return {
      _id: "Trie",
      values: Array.from(this).map(toJSON)
    }
  },
  /// ...
}

Iterable

By implementing the iterable protocol it is possible to allow using a for...of loop and Array.from to iterate the values of the data structure:

interface Iterable<T> {
  [Symbol.iterator](): Iterator<T>;
}

You can also implement the iterator protocol to allow using the data structure in generator functions (yield):

interface IterableIterator<T> extends Iterator<T> {
  [Symbol.iterator](): IterableIterator<T>;
}

I defined a TrieIterator that satisfies the iterable protocol:

The next() method returns one-by-one all the values when iterating over the data structure

class TrieIterator<in out V, out T> implements IterableIterator<T> {
  next(): IteratorResult<T> {
    // ...
  }
 
  [Symbol.iterator](): IterableIterator<T> {
    return new TrieIterator(this.trie)
  }
}

Equality

When you compare two instances you usually want the comparison to be based on their values (not by reference).

This requires a custom equality implementation. In Effect this is achieved using the Equal module:

export const symbolHash: unique symbol = Symbol.for("Hash")
export interface Hash {
  [symbolHash](): number
}
 
export const symbolEqual: unique symbol = Symbol.for("Equal")
export interface Equal extends Hash {
  [symbolEqual](that: Equal): boolean
}

For Trie we have the following:

const TrieProto: TR.Trie<unknown> = {
  // ...
  [Hash.symbol](): number {
    let hash = Hash.hash(TrieSymbolKey)
    for (const item of this) {
      // Compute the hash for each item in the Trie 👇
      hash ^= pipe(Hash.hash(item[0]), Hash.combine(Hash.hash(item[1])))
    }
    return hash
  },
  [Equal.symbol]<V>(this: TrieImpl<V>, that: unknown): boolean {
    if (isTrie(that)) {
      const entries = Array.from(that)
 
      // Compare each value inside the Trie 👇
      return Array.from(this).every((itemSelf, i) => {
        const itemThat = entries[i]
        return Equal.equals(itemSelf[0], itemThat[0]) && Equal.equals(itemSelf[1], itemThat[1])
      })
    }
    return false
  },
  // ...
}

Pipable

Data structures in Effect are also pipable: you have access to a pipe function to chain method calls:

const trie = Trie.empty<number>().pipe(Trie.insert("sandro", 500))

For reference, you can view here below the standard implementation of the Pipeable interface:

This can then be used by simply passing an instance of the data structure to pipe:

const TrieProto: TR.Trie<unknown> = {
  // ...
  pipe() {
    return pipeArguments(this, arguments)
  }
}

Refinement

You also want to check if a value is a valid instance of your data structure (similar to isArray).

We achieve this by assigning a Symbol parameter internal to the data structure:

const TrieSymbolKey = "effect/Trie"
export const TrieTypeId: TypeId = Symbol.for(TrieSymbolKey) as TypeId
 
const TypeId: unique symbol = TrieTypeId as TypeId
export type TypeId = typeof TypeId

We then assign TypeId to the data structure while also defining its type variance:

export interface Trie<out Value> extends Iterable<[string, Value]>, Equal, Pipeable, Inspectable {
  readonly [TypeId]: {
    readonly _Value: Types.Invariant<Value>
  }
}

You can read more about type variance here: Covariant, Contravariant, and Invariant in Typescript

This allows to define an isTrie refinement function that simply checks the existence of the TypeId property:

export const isTrie = (u: unknown): u is TR.Trie<unknown> => hasProperty(u, TrieTypeId)

Prototype and Object.create

Having done all the above we can finally define the complete prototype instance for the data structure:

const TrieProto: TR.Trie<unknown> = {
  [TrieTypeId]: trieVariance,
  [Symbol.iterator]<V>(this: TrieImpl<V>): Iterator<[string, V]> { /** */ },
  [Hash.symbol](): number { /** */ },
  [Equal.symbol]<V>(this: TrieImpl<V>, that: unknown): boolean { /** */ },
  toString() { /** */ },
  toJSON() { /** */ },
  [NodeInspectSymbol]() { /** */ },
  pipe() { /** */ }
}

We can then create a new instance of Trie by using Object.create:

Object.create() creates a new object using an existing object as the prototype

const makeImpl = <V>(root: Node.Node<V> | undefined): TrieImpl<V> => {
  const trie = Object.create(TrieProto)
  trie._root = root
  trie._count = root?.count ?? 0
  return trie
}

This is all!

Internal structure of the final implementation of Trie.Internal structure of the final implementation of Trie.

There are indeed a lot of aspect to consider when implementing your own data structure.

Nonetheless, remember that the hard part is then the actual implementation. I needed to make sure to satisfy all the requirements.

You can view and read the final implementation on Github:

If you are interested to learn more, every week I publish a new open source project and share notes and lessons learned in my newsletter. You can subscribe here below 👇

Thanks for reading.

👋・Interested in learning more, every week?

Every week I build a new open source project, with a new language or library, and teach you how I did it, what I learned, and how you can do the same. Join me and other 600+ readers.