Nested Threads and the Mastodon Context API

If you've ever been doomscrolling on a microblogging site like Mastodon (other brands are available), you'll have noticed that the conversations that result from a given message are presented as a flat list:

In this example, we can see some kind of conversation between Alice (the original poster) and Bob, with an interjection by Charlie; we also see two other comment threads. This is all derived from context, however, and there's no obvious structure to the threads. A nested presentation could perhaps be more conducive to understanding the flow of the conversation here:

[Figure: nested thread view, beginning with the original message, by Alice]

As it turns out, we can generate a tree-style view given the parent-child links from the first example; in this article I'll take a look at how one can use Mastodon's Context API to gather and produce the necessary data.

Context: Ancestors and Descendants

If one were to open a particular toot in the thread above (say, Charlie's interjection), a Mastodon client would fetch the toot's context: its direct ancestors up the tree, as well as any descendant threads down the tree. Presentationally, this might be shown as:

We see the opened toot emphasised, with Alice's reply below; importantly, Bob's reply to the OP appears only because it is a direct ancestor of the opened toot. If a Mastodon client were to present the full conversation, it would need to repeat this process at the top of the tree by fetching the context for Alice's message: every other toot in the conversation would then be a member of that context's descendants.

In Mastodon's JSON API, this is implemented by the context endpoint; a partial example response looks like:

Fetching context for a toot

$ curl https://mastodon.world/api/v1/statuses/111148878835275146/context | json_pp
{
   "ancestors" : [
      {
         ...
         "content" : "<p>This might have slipped under the radar these past few"...,
         "id" : "111148824053932160",
         "in_reply_to_id" : null,
         ...
      },
      {
         ...
         "content" : "<p><span class=\"h-card\" translate=\"no\"><a href=\"https://infose"...,
         "id" : "111148835195532183",
         "in_reply_to_id" : "111148824053932160",
         ...
      }
   ],
   "descendants" : [
      {
         ...
         "content" : "<p><span class=\"h-card\"><a href=\"https://mastodon.or"...,
         "id" : "111149026486499760",
         "in_reply_to_id" : "111148878835275146",
         ...
      }
   ]
}
 

Here we have a situation analogous to Charlie's conversation from earlier: the toot for which we requested context has a parent and grandparent, as well as a direct child. A few things should be noted:

No "this" element
The context endpoint returns ancestors and descendants of the given toot ID, but it doesn't return the content of that message; if we want the content of the toot in question, the separate statuses endpoint must be called.
Parent-child relations
Each ancestor and descendant returned follows a standard message format: along with the content (and other fields such as account, which describes the author), each message has an id and an in_reply_to_id. The latter is the ID of the parent message, that is, the message to which this one is a reply.
Root node
The top of the tree is the first ancestor in the list. We can see that it's the OP because the in_reply_to_id is null: it has no parent.
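
The root-node rule can be captured in a small helper. This is only a sketch: find_root_id is a hypothetical name, and it assumes the context response has already been decoded into a PHP array with json_decode(..., true). When there are no ancestors, the toot we asked about is taken to be the root itself.

```php
<?php
// Hypothetical helper: given a decoded context response and the ID of the
// toot it was requested for, return the ID of the conversation's root.
function find_root_id(array $context, string $toot_id): string {
  // The first ancestor is the top of the tree, if there are any ancestors.
  $ancestors = $context['ancestors'] ?? [];
  if (count($ancestors) === 0) {
    return $toot_id; // no ancestors: this toot is the root
  }
  return (string) $ancestors[0]['id'];
}

// Trimmed-down version of the sample response shown earlier.
$context = [
  'ancestors' => [
    ['id' => '111148824053932160', 'in_reply_to_id' => null],
    ['id' => '111148835195532183', 'in_reply_to_id' => '111148824053932160'],
  ],
  'descendants' => [
    ['id' => '111149026486499760', 'in_reply_to_id' => '111148878835275146'],
  ],
];

echo find_root_id($context, '111148878835275146'), "\n"; // 111148824053932160
```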

Having obtained the root node's ID from our context call, we can issue another context request to obtain parent-child links for the full conversation thread. This call will return no ancestors, and a batch of descendants:

Fetching context for the root of the tree

$ curl https://mastodon.world/api/v1/statuses/111148824053932160/context |
jq 'keys[] as $k | (.[$k] | (.[] | {id: .id, parent: .in_reply_to_id}))'
{
  "id": "111148835195532183",
  "parent": "111148824053932160"
}
{
  "id": "111148878835275146",
  "parent": "111148835195532183"
}
{
  "id": "111149026486499760",
  "parent": "111148878835275146"
}
{
  "id": "111148886753997430",
  "parent": "111148835195532183"
}
{
  "id": "111148912289735390",
  "parent": "111148824053932160"
}
{
  "id": "111149031550141696",
  "parent": "111148824053932160"
}
{
  "id": "111149299225622339",
  "parent": "111148824053932160"
}
{
  "id": "111149308672212244",
  "parent": "111149299225622339"
}
{
  "id": "111149319647286908",
  "parent": "111149308672212244"
}
{
  "id": "111149323173850002",
  "parent": "111148824053932160"
}
{
  "id": "111149387367883525",
  "parent": "111148824053932160"
}
{
  "id": "111149724821007295",
  "parent": "111148824053932160"
}
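
The two-step lookup (context for any toot, then context for its root) might be sketched in PHP along these lines. The function name fetch_thread_links and the injected $fetch_json callable are assumptions made for illustration; passing the fetcher in keeps the logic testable without touching the network.

```php
<?php
// Hypothetical sketch: find the root via one context call, then fetch the
// root's own context to obtain links for the whole conversation.
// $fetch_json is any callable that takes a URL and returns a decoded array.
function fetch_thread_links(string $base_url, string $toot_id, callable $fetch_json): array {
  $context_url = '%s/api/v1/statuses/%s/context';

  // Step 1: context for the toot we happen to be looking at.
  $ctx = $fetch_json(sprintf($context_url, $base_url, $toot_id));
  $root_id = !empty($ctx['ancestors'])
    ? (string) $ctx['ancestors'][0]['id']
    : $toot_id;

  // Step 2: context for the root, unless we were already at the root.
  if ($root_id !== $toot_id) {
    $ctx = $fetch_json(sprintf($context_url, $base_url, $root_id));
  }

  // Reduce each descendant to an id/parent pair, like the jq filter above.
  $links = [];
  foreach ($ctx['descendants'] ?? [] as $toot) {
    $links[] = ['id' => $toot['id'], 'parent' => $toot['in_reply_to_id']];
  }
  return ['root' => $root_id, 'links' => $links];
}

// A real fetcher could be as simple as this (assumes allow_url_fopen is on).
$http_fetch = function (string $url): array {
  return json_decode(file_get_contents($url), true);
};
```

Calling fetch_thread_links('https://mastodon.world', '111148878835275146', $http_fetch) would then issue both requests and return the root ID along with the list of links.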

Converting parent-child links to a tree

Once we have a list of parent-child relationships between nodes in the conversation, we can build a nested array of node IDs and their children: this will involve attaching a children array to each node of toot data, and filling the array with references to the direct children. In PHP, we could go about the production of the tree as follows:

PHP code to generate a conversation tree given a root toot ID

// Base URL of the instance; the scheme is required for file_get_contents()
$mastodon_inst = 'https://mastodon.world';
$root_id = '111148824053932160';

// First, we fetch the root toot...
$root = json_decode(file_get_contents(sprintf(
  '%s/api/v1/statuses/%s',
  $mastodon_inst,
  $root_id
)), true);

// Then context for the root
$ctx = json_decode(file_get_contents(sprintf(
  '%s/api/v1/statuses/%s/context',
  $mastodon_inst,
  $root_id
)), true);

// Initialise a map of toots by ID
$toots_by_id = [$root['id'] => ($root + ['children' => []])];

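// The root has no ancestors, so only the descendants need to be added
foreach ($ctx['descendants'] as &$child) {
  $child['children'] = [];
  $toots_by_id[$child['id']] = &$child;
}
unset($child); // drop the dangling reference left behind by the loop

// Finally, attach each toot to its parent's list of children
foreach ($toots_by_id as &$subtoot) {
  $parent_id = $subtoot['in_reply_to_id'];
  // Skip the root (no parent) and any toot whose parent isn't in the map
  if ($parent_id && isset($toots_by_id[$parent_id])) {
    $toots_by_id[$parent_id]['children'][] = &$subtoot;
  }
}
unset($subtoot);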
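
If the reference juggling feels fragile, one possible alternative is to build the tree by value. The sketch below is not the approach used above: build_tree and attach_children are hypothetical names, and it trades the references for an index of toots grouped by parent ID plus a recursive assembly step.

```php
<?php
// Hypothetical alternative: group toots under their parent's ID, then
// assemble each node recursively from that index (no references needed).
function build_tree(array $root, array $descendants): array {
  $by_parent = [];
  foreach ($descendants as $toot) {
    if ($toot['in_reply_to_id'] === null) {
      continue; // a descendant should always have a parent; skip if not
    }
    $by_parent[$toot['in_reply_to_id']][] = $toot;
  }
  return attach_children($root, $by_parent);
}

function attach_children(array $toot, array $by_parent): array {
  $toot['children'] = [];
  foreach ($by_parent[$toot['id']] ?? [] as $child) {
    $toot['children'][] = attach_children($child, $by_parent);
  }
  return $toot;
}
```

Called as build_tree($root, $ctx['descendants']), this should yield a nested array with the same children layout as the reference-based version, at the cost of copying each toot once.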

With the map filled in, a little light recursion is sufficient to generate a printed representation of the tree:

Printing the tree structure

function print_toot($toot, $level = 0) {
  printf(
    "%s[%s] (%s)\n",
    str_repeat('  ', $level),
    $toot['id'],
    $toot['account']['acct']
  );
  foreach ($toot['children'] as $child) {
    print_toot($child, $level + 1);
  }
}
print_toot($toots_by_id[$root_id]);

Our sample tree, as printed by the above

[111148824053932160] (briankrebs@infosec.exchange)
  [111148835195532183] (QuatermassTools@infosec.exchange)
    [111148878835275146] (penguin42@mastodon.org.uk)
      [111149026486499760] (mkoek@mastodon.nl)
    [111148886753997430] (ozdreaming@infosec.exchange)
  [111148912289735390] (ePD5qRxX@mastodon.online)
  [111149031550141696] (dwaites@infosec.exchange)
  [111149299225622339] (CyberLeech@cyberplace.social)
    [111149308672212244] (neurovagrant@masto.deoan.org)
      [111149319647286908] (CyberLeech@cyberplace.social)
  [111149323173850002] (VZ@fosstodon.org)
  [111149387367883525] (systemadminihater@cyberplace.social)
  [111149724821007295] (j@jaesharp.social)
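
The same recursion can drive a nested presentation for the web rather than plain text. As a rough sketch (render_toot_html is a hypothetical name, and only the ID and account are shown), each toot becomes a list item whose children sit in a nested list:

```php
<?php
// Hypothetical renderer: turn a toot (with its 'children' array) into a
// nested <li> fragment. Both values are escaped before going into markup.
function render_toot_html(array $toot): string {
  $html = sprintf(
    '<li>[%s] (%s)',
    htmlspecialchars($toot['id']),
    htmlspecialchars($toot['account']['acct'])
  );
  if ($toot['children']) {
    $html .= '<ul>';
    foreach ($toot['children'] as $child) {
      $html .= render_toot_html($child);
    }
    $html .= '</ul>';
  }
  return $html . '</li>';
}

// Tiny hand-built tree in the same shape as the one built earlier.
$tree = [
  'id' => '1', 'account' => ['acct' => 'alice@example.test'],
  'children' => [
    ['id' => '2', 'account' => ['acct' => 'bob@example.test'], 'children' => []],
  ],
];
echo '<ul>', render_toot_html($tree), "</ul>\n";
```

A real page would presumably also show each toot's content, which arrives as HTML from the API and would need its own sanitising step.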

And we're done

An implementation much like this has been used on ThreadTree, the page I wrote when the original annoyance came up. The only additional note regarding the code behind ThreadTree is that in my case, the annoyance started on Twitter, so the code still refers to tweets in many places even though the site functionality has been ported to Mastodon; the principle of nested threading and contexts applies in much the same fashion.

Thanks go to Terence Eden, whose article on this very topic helped with the building of ThreadTree, and to mapgie, who (re-)introduced me to Twitter and caused this whole mess.