Communicating through UUID conflicts

“This is definitely stupid, yet slightly applicable”

Part 1: Posing the problem

For one issue at work, we tossed around the idea of generating UUIDs client-side in order to ensure that we had a unique immutable id without having to go to the server to get a permanent id back. Obviously, if somebody’s maliciously creating ids, they can choose arbitrary values and create collisions. Additionally, one client could tell if another client had already chosen a certain value for an id by trying to commit a certain UUID to the DB. For example:
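A minimal in-memory sketch of that probe (the endpoint and the UUID value here are hypothetical stand-ins; a real client would issue an HTTP request):

```python
# Toy stand-in for the server's "create todo" endpoint: it only tracks
# which IDs have been used and returns HTTP-style status codes.
used_ids = set()

def create_todo(record_id):
    if record_id in used_ids:
        return 409  # Conflict: some client already claimed this ID
    used_ids.add(record_id)
    return 201  # Created: the ID was free, and is now claimed

# Client A claims an ID.
print(create_todo("123e4567-e89b-12d3-a456-426614174000"))  # 201
# Client B probes the same ID and learns somebody already chose it.
print(create_todo("123e4567-e89b-12d3-a456-426614174000"))  # 409
```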

I ended up posing the purely academic question of how we might be able to send arbitrary messages over that channel. As an example of the mechanism of communication:
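A sketch of one such exchange, using the same toy server model; the slot-naming scheme is an assumption of mine (any deterministic mapping both clients agree on works):

```python
used_ids = set()

def create_todo(record_id):
    if record_id in used_ids:
        return 409  # Conflict
    used_ids.add(record_id)
    return 201  # Created

# Hypothetical addressing scheme: bit i of the message lives at a
# deterministic UUID-shaped slot both clients can derive.
def slot(i):
    return f"00000000-0000-0000-0000-{i:012d}"

# Sender: claim the slots corresponding to every 1-bit of the message.
message = "10101"
for i, bit in enumerate(message):
    if bit == "1":
        create_todo(slot(i))

# Receiver: probe each slot; a 409 decodes to 1, a 201 decodes to 0.
received = "".join("1" if create_todo(slot(i)) == 409 else "0"
                   for i in range(len(message)))
print(received, int(received, 2))  # 10101 21
```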

If we interpret the 409s as 1s, and the 201s as 0s, we get the string 10101, or 21 in binary. One client has now communicated some information to the other, merely through conflicts in IDs!

Design Goals and Optimization Criteria

We’ve established that basic communication can occur between different clients, but designing a protocol requires having solid design guidelines. Here are a few possible goals for a system like this, where an asterisk marks the goals I’ll be addressing in this design:

  1. The protocol should be relatively robust to existing records in the database*
  2. Two communicators must not know when the other is sending a message*
  3. Each client should be able to send multiple messages over a period of time*
  4. The protocol should mitigate concurrency issues*
  5. The protocol should mitigate the impact of malicious actors
  6. The protocol should support more than two clients over the shared medium*
  7. The protocol should reduce the number of requests to the server
  8. The protocol should support client discovery

With these constraints and criteria, we can begin crafting a communication protocol and have a general notion of why we might prefer one protocol over another. To be perfectly clear: this is a toy problem with interesting constraints.

Modeling the Problem Domain

Each client can create a record (like a todo) with a certain UUID in this space:

Assume that if an ID hasn’t been used already, you get a 201 Created back from the server. If an ID has been used, you get a 409 Conflict back. All GETs will 404 Not Found, so in order to tell whether an ID has been used, you have to try to create a “todo” with that ID.

The interesting bit is that by reading whether or not another client has created a record with a UUID, you inevitably end up creating one if it doesn’t already exist. In other words, by reading whether a bit is set, you inherently overwrite it to be set. That is, the “create a record” operation can be viewed as a joint “read and write true” operation:
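A minimal Python sketch of this joint operation, using a dict as a stand-in for the UUID space:

```python
# Stand-in for the UUID space: address -> whether a record exists there.
memory = {}

def read_and_write_true(addr):
    """Model of "create a record": return whether the record already
    existed (the read), and ensure it exists afterwards (the write)."""
    existed = memory.get(addr, False)
    memory[addr] = True
    return existed

print(read_and_write_true(42))  # False: bit was unset (a 201)
print(read_and_write_true(42))  # True: bit is now permanently set (a 409)
```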

To Shared Memory

We can imagine a table of sequential UUIDs and whether they're taken:

We can model the space of taken vs. untaken UUIDs as shared memory, where the address is the sequential index of the UUID, and data is whether the record exists or not.

This address space will be the basis of the rest of the discussion.
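One way to realize this addressing in Python; mapping each address to `UUID(int=address)` is my own assumption, and any deterministic scheme both clients agree on would do:

```python
import uuid

def uuid_at(address):
    """Map a sequential memory address to a concrete UUID by treating
    the address as the UUID's 128-bit integer value. These aren't valid
    random (v4) UUIDs; they're just well-formed IDs any client can derive."""
    return uuid.UUID(int=address)

print(uuid_at(0))  # 00000000-0000-0000-0000-000000000000
print(uuid_at(1))  # 00000000-0000-0000-0000-000000000001
```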

Part 2: Tackling the Problem

From here on out, I’ll be trying to design a message protocol that conforms to the constraints above.

The first attempt I’ll construct consists of a sled of 1s at the bottom of the memory space, a data structure I’m calling a “shared message store” that’s just beyond the sled of 1s, and a data heap that starts at the top of the address space and grows downwards in address space.

The rough overview is that when a client polls for messages, it sequentially reads the sled of 1s to try to find the shared message store, reads the whole shared message store, performs manipulations to the shared message store, and rewrites the shared message store just beyond where the previous one ended (which is now filled with 1s due to the read.)

Prerequisite: The sled of ones

The purpose of the sled of ones is to enable a client to read and write values from the memory, even though all values are overwritten on read. I’ll use this mechanism extensively in the protocol. Following is a quick example to illustrate the mechanism:
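A quick sketch of the mechanism under destructive reads (the sled length and addresses are arbitrary):

```python
memory = {}

def read_bit(addr):
    """Destructive read: returns the value before the read, and sets the bit."""
    old = memory.get(addr, False)
    memory[addr] = True
    return old

def write_one(addr):
    memory[addr] = True

# Lay down a sled of 1s at addresses 0..4; address 5 holds a 0.
for a in range(5):
    write_one(a)

# A reader walks the sled until it hits a 0. Note the walk itself
# extends the sled: addresses 0..5 are all 1s afterwards.
addr = 0
while read_bit(addr):
    addr += 1
print(addr)  # 5: the first zero was found at address 5
```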

Initial Memory Layout for the Protocol

To make it easier on ourselves to start, we’re going to start with the assumption that each client can perform an entire routine atomically. This is, of course, a horrible assumption, but I’ll try to address it later. Here is the memory layout I’m starting with:

Glossary of fields

  • start: The start bit, indicating the beginning of the message store
  • msg_count: The number of messages in the message store
  • head_ptr: The pointer for the top of the message data heap. Gets updated when a new message is added to the data heap.
  • dest_addr: The client id of the destination for that message.
  • msg_ptr: The pointer to the message data for that message record
  • msg_size: The length of the message data for that message record

To read from the message store, read down the field of 1s until you find a zero, then read the subsequent 2 bytes to get the message count. Since message records are of a fixed size, we can now read the whole set of message records. Once we’ve read the message records, we can read the message data they point to.

Since after reading, the field of 1s now extends through where the shared message store was written, we rewrite the message store at the end of the new field (removing any messages addressed to that client.) As an example:

After reading, re-copy the message store just past the end of the newly invalidated block, exactly as described above.
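A rough sketch of the read path, assuming the 2-byte message count from above and an arbitrary fixed record width; the rewrite step is left to the caller:

```python
memory = {}  # shared UUID space: address -> bit (absent means 0)

def read_bit(addr):
    """Destructive read: returns the old bit and leaves it set."""
    old = memory.get(addr, False)
    memory[addr] = True
    return old

def read_bits(addr, n):
    return [read_bit(addr + i) for i in range(n)]

def to_int(bits):
    value = 0
    for b in bits:
        value = (value << 1) | int(b)
    return value

RECORD_SIZE = 8  # assumed fixed width of one message record, in bits

def poll_message_store():
    # Walk the sled of 1s until the 0 start bit.
    addr = 0
    while read_bit(addr):
        addr += 1
    # Read the 2-byte message count just past the start bit.
    count = to_int(read_bits(addr + 1, 16))
    # Read the fixed-size records. Everything read is now 1s, so the
    # sled has grown over the old store; the caller rewrites it at `end`.
    records = [read_bits(addr + 17 + i * RECORD_SIZE, RECORD_SIZE)
               for i in range(count)]
    end = addr + 17 + count * RECORD_SIZE
    return records, end
```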

Inserting a message into the Message Store

Inserting a message is equivalent to reading a message, but right before rewriting the message, perform these steps:

  1. Write the message data to just before the head_ptr address.
  2. Update head_ptr to the start of the newly written data blob.
  3. Add a new message record to the client’s representation of the message store.
  4. Update msg_count
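Those four steps, sketched against a client’s local copy of the store (field names follow the glossary below; the heap-top value is arbitrary):

```python
# A local, already-read representation of the shared message store.
store = {
    "head_ptr": 1024,  # current top of the data heap (heap grows downward)
    "msg_count": 0,
    "records": [],     # each record: {dest_addr, msg_ptr, msg_size}
}

def insert_message(store, dest_addr, data_bits):
    size = len(data_bits)
    # 1. Write the message data just before the head_ptr address
    #    (in the real protocol: set bits at new_head .. head_ptr - 1).
    new_head = store["head_ptr"] - size
    # 2. Update head_ptr to the start of the newly written blob.
    store["head_ptr"] = new_head
    # 3. Add a new message record pointing at the data.
    store["records"].append(
        {"dest_addr": dest_addr, "msg_ptr": new_head, "msg_size": size})
    # 4. Update msg_count.
    store["msg_count"] = len(store["records"])
    return store
```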


I’m going to attempt to tackle concurrency by whittling down from strong guarantees to weak ones. The concurrency models, from strongest to weakest, are:

  1. Clients can execute an entire routine atomically
  2. Clients can execute a read-write of k addresses atomically (for some limit k).
  3. Clients can execute a read-write of 1 address atomically.

The above protocol is robust under (1) but not under the weaker models, and thus will have to be amended.

Tackling k = O(log n)

Can we get atomic locks if we assume that, for an address space of size n, k = O(log n)? The answer seems to be yes, via the following mechanism. First, we split the shared message store into a static part and a dynamic part:

We move the residual part of the message store to just beyond the head_ptr:

Our whole memory space now looks like this:

With this memory layout, we can now lean heavily on k = O(log n) to construct a locking mechanism, as detailed below:

Detail of the Locking Mechanism

Let’s modify our original behavior, such that we read in chunks of k. In other words, we set our word size to be k.

We then issue a read for word 3, reading that byte sequence in a single atomic operation (as per our assumed atomicity constraint). The client reads 00101011 from Word 3, leaving the memory space as follows:

The interesting thing is that if we choose the sled of ones to be written in chunks like this, we can define a locking mechanism on top of it. For a client trying to acquire a read/write lock:

  1. A word of all ones is part of the ones sled, and the client should read the next word.
  2. A word of all zeros means that someone else has a lock.
  3. A word that starts with 01 is a valid head ptr, and the client has gotten the lock.

Choosing a start sequence of 01 allows us to disambiguate between these 3 possibilities, because a valid head pointer word will never be all 1s or all 0s. The convenient thing about this series of choices is that a lock is guaranteed to belong to at most one client at a time:

Hitting a series of all 1s does not change the state, nor affect the lock condition.

Hitting a series of all 0s does not affect the lock condition, per the following:

Hitting a valid head ptr atomically grabs a lock for the client reading:

This requires atomic manipulation of 2 + log2(n) bits where n is the size of the space of available UUIDs.
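The three word cases can be sketched as a small classifier; the 10-bit word size here (2 start bits plus an 8-bit pointer) is an arbitrary choice for illustration:

```python
WORD_SIZE = 10  # assumed: 2 start bits + an 8-bit head pointer

def classify_word(bits):
    """Interpret one atomically read word of the sled/lock region."""
    if all(bits):
        return "sled"      # part of the ones sled: keep reading forward
    if not any(bits):
        return "locked"    # someone else holds the lock right now
    if bits[0] == 0 and bits[1] == 1:
        return "head_ptr"  # valid head pointer: this client now holds the lock
    return "invalid"       # shouldn't occur in an uncorrupted space

print(classify_word([1] * WORD_SIZE))                  # sled
print(classify_word([0] * WORD_SIZE))                  # locked
print(classify_word([0, 1, 0, 0, 1, 0, 1, 0, 1, 1]))  # head_ptr
```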

Initializing the Memory Space

This protocol does not (yet) handle the initial coordination of the clients. If the memory space were blank when clients joined, then all clients would loop forever, each thinking that some other client must have pulled the lock.

To solve this, I’ll add in a quick asymmetry to the specification: a client that writes to the first word automatically gains the read/write lock and assumes the default head pointer state of the form [ 01 | top_of_address_space ]. Since a field of zeros is a valid shared message store, the client performs no additional setup beyond simply assuming the lock.

An extremely rough implementation of the protocol up to this point can be found in my scratchpad repo.

Tackling k=2

Really, the essence of the locking mechanism we need is the start sequence. We can replace the whole lock operation with just the sled of 1s, followed by the two-bit sequence 01. From a practical perspective, this makes it far less likely that two clients will conflict horrifically if atomicity can only be guaranteed to k=1, but that’s little consolation when the jitter between requests is by no means negligible.

To tackle k=2, I partition the memory into 3 sections and treat each section as a contiguous shared memory space. I arbitrarily assign:

First 1/3rd (M1): Locking mechanism (just the 01 sequence from the head_ptr data structure)
Second 1/3rd (M2): Shared Message Store (identical to the original construction)
Final 1/3rd (M3): Message Data

Since the locking mechanism can no longer store a pointer, we must still put in a sled of 1s on both M1 and M2. This is the primary modification to get us to k=2, and is the best construction I’ve come up with so far.
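The partitioning can be sketched as follows; splitting into equal thirds is my reading of “arbitrarily assign,” and any fixed split both clients agree on would do:

```python
def partition(n):
    """Split an n-address UUID space into the three k=2 regions."""
    third = n // 3
    return {
        "M1": range(0, third),          # locking mechanism (the 01 sequence)
        "M2": range(third, 2 * third),  # shared message store
        "M3": range(2 * third, n),      # message data
    }

sections = partition(300)
print(sections["M2"])  # range(100, 200)
```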

Tackling k = 1?

I have no idea how to do this, and strongly suspect it’s impossible. I don’t know how to prove it impossible either, so if anybody has suggestions, I’m all ears.

Error correction

Obviously, we’re not the only ones using this system; there will be existing entries in the UUID space. We’ll have to work around them and mitigate the (extremely rare) errors they incur.

We have 2 possible system behaviors to watch out for:

  1. The system allows records to be deleted and their IDs reused.
  2. The system does not allow for the reuse of IDs.

For generality, we’ll assume the former model. For messages, pointers, counts, and other pieces of data, a simple garden-variety Hamming code should be sufficient to detect and correct single errors.

For forward-correction in the start sequence and locking mechanisms, the 3 word states we need to distinguish are all 1s (for the ones sled), all 0s (for resource locked), and a valid start sequence of our choosing. To ensure a Hamming distance of at least 3 between any two of these cases, we need 6 bits. We’ll therefore expand the start sequence from 2 bits to 6 bits, and choose [ 000111 | head_ptr ] as the memory layout. On observing a value, we’ll correct to whichever valid 6-bit sequence has the minimum Hamming distance to the observed value.
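A sketch of that correction step, with the codewords straight from the text:

```python
# The three valid 6-bit words the decoder must distinguish. Pairwise
# Hamming distances are 3, 3, and 6, so any single bit flip is correctable.
VALID = {"sled": "111111", "locked": "000000", "start": "000111"}

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def correct(observed):
    """Snap an observed 6-bit word to the nearest valid codeword."""
    name, word = min(VALID.items(), key=lambda kv: hamming(observed, kv[1]))
    return name

print(correct("101111"))  # sled (one flip from 111111)
print(correct("000110"))  # start (one flip from 000111)
```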

And I think that solves just about all the problems I’m willing to tackle within this thought experiment.

Part 3: Further Explorations and Contextualization

Given that the constraints in this problem set are relatively new, there’s ample space for creative exploration. My proposed protocol is far from the only possible style of solution, and after a few conversations, I started seeing how different strategies might emerge.

For example, my solution does not utilize the time domain: I never assume that time provides a total ordering. However, on seeing this problem, my coworker proposed a concept we started calling a synchronization block: a region of memory shared asymmetrically by two clients, by which client 2 can notify client 1 of an arbitrary event. Client 1 sequentially reads position 1, position 2, position 3, and so on, until it reads a 1 (unless it reaches the end of the buffer).

If we cannot assume that time passes uniformly, this does not work. But under the constraints of real-world systems, we can often reasonably make such assumptions and get away with this kind of data block. The primary point of this illustration is that small variants yield interesting and novel constructs in the problem space.
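Here’s one way the synchronization block could be sketched under destructive reads; the cursor-tracking, and having client 2 claim the first still-free slot, are my own assumptions about how to make the idea workable:

```python
memory = {}  # shared UUID space: address -> bit

def read_bit(addr):
    """Destructive read: returns the old value and sets the bit."""
    old = memory.get(addr, False)
    memory[addr] = True
    return old

SYNC_BLOCK = list(range(0, 8))  # addresses reserved for the block (arbitrary)

def client2_notify():
    """Client 2 signals by claiming the first still-free slot (a 201)."""
    for addr in SYNC_BLOCK:
        if not read_bit(addr):
            return addr
    return None  # buffer exhausted

def client1_poll(cursor):
    """Client 1 scans forward from where it left off. Because it never
    revisits an address, its own destructive reads are never mistaken
    for a signal, but fast polling burns through the finite buffer."""
    while cursor < len(SYNC_BLOCK):
        if read_bit(SYNC_BLOCK[cursor]):
            return cursor, cursor + 1  # (event position, next cursor)
        cursor += 1
    return None, cursor

print(client2_notify())  # 0: client 2 claimed slot 0
print(client1_poll(0))   # (0, 1): client 1 sees the event at slot 0
```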

Why on earth did you spend time on this?

In my mind, the interesting part of this article is the problem more than the protocol. The protocol’s purpose is to show that the problem itself is worth investigating. Even though this is a toy problem, it has interesting consequences that yield nicely under some, but not a lot, of mental energy. I love those sorts of tasks.

As always, if you find anything wrong with this article, please reach out and tell me. Happy hacking, all.
