Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 6 min | 1090 words

I have a piece of code that has been living, rent-free, in my head for the past 30 years or so.

In middle school (I was 12 - 13 at the time), I was taught Pascal as the entry-level programming language. I found it to be a really fascinating topic, as you can imagine.

One of the things I did was try to read other people’s code and see what sense I could make out of it. That was way before the Internet was a thing, though. Even at that time, Pascal was mostly on its way out, and there wasn’t much code that a kid could access.

To give some context, at the time, if you wanted to move data from one place to the next, the only real option was to physically disassemble your computer, take out the hard disk, wrap it in a towel for protection, and carry it to your destination. I recall doing that and then spending some hours at a friend’s house, trying to figure out the right jumper configuration so the BIOS would recognize two drives at once.

Because of this, I had a great interest in disk parking, a technique that helps ensure that taking the hard drive from one location to the next wouldn’t ruin it. I remember running into a particular piece of code to handle disk parking in Pascal and being quite excited by this. Here I had something that I needed to do, and also in a language that I was familiar with.

I remember staring at the code and trying to make sense of it for a while. I couldn’t figure it out. In fact, I couldn’t even begin to understand what was going on there. I remember feeling both frustrated and stupid because I could understand the syntax but not what it was doing.

In fact, it was so far above my head that I was upset about it for days, to the point that decades later, it still pops into my head. When it came back for a visit this week, I decided to try to figure it out once and for all.

That led to the first problem, which is that I cannot actually recall the code. I remember it was in Pascal, that it dealt with parking the disk needles, and it was full of weird numbers that I couldn’t grasp. Turning to Google didn’t help much; I found some code samples, but nothing that really jibed with what I recalled. I ended up cobbling together something that more or less matches what I had in my head.


program DiskPark;

uses
  Dos; { provides the Registers record, Intr() and the FCarry mask }

function ParkHead(drive: char): Boolean;
var
  regs: Registers;
begin
  with regs do
  begin
    AH := $13;
    AL := Ord(drive) - Ord('A') + $80;
    DL := AL;
    Intr($13, regs);
    ParkHead := (Flags and FCarry) = 0;
  end;
end;

begin
  if ParkHead('A') then
    WriteLn('Success: parked')
  else
    WriteLn('Failure, no idea why!');
end.

The actual code that I remember was beefier, and I think that it handled a bunch of additional scenarios, but it has been 30 years and I can’t recall it.

Amusingly enough, I actually had to update my publishing software, since I never thought I would be publishing Pascal code snippets 🙂.

What is interesting is that, as I look at this code and try to view it through the glasses of my 12-year-old self, I can understand exactly the frustration.

Look at this code, it is filled with random numbers and letters. What is AH or the Intr() call? I mean, looking at this, there is no place to get started understanding this.

Let’s look at this line:


AH := $13;

There is no definition of AH anywhere, I can’t tell where it is coming from. And $13, what is that about? A bad luck omen because I can’t figure it out, I think…

The problem was that my 12-year-old self wrote programs like “Guess the number”, and really advanced stuff was bubble sort. I approached the code above with the same mindset. There is a logic to this that I can figure out if I can just stare at this long enough.

The problem is that the code above is basically arbitrary, there is nothing intrinsic about it.

The weird variables that come from nowhere (AL, DL, and AH) are registers (at the time, I don’t believe I was even aware that those existed). The weird numbers are just arguments that we pass to the BIOS when you invoke a command.

In the same sense, consider this code:


return syscall(437, ptr, 525376);

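This line will tell you absolutely nothing about the reasoning behind the code. While the numbers seem arbitrary, in this case they correspond to a system call from the openat family (on Linux, 437 is the number of SYS_openat2, while SYS_openat itself is 257 on x86-64) and to the flags O_APPEND | O_CREAT | O_CLOEXEC.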
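To make the point concrete, here is a small sketch that takes the magic number apart into named flags. The octal values are the ones Linux uses on x86-64; they are written out by hand here as an assumption (rather than read from the os module), so the result does not depend on the machine running it.

```python
# Linux open(2) flag values on x86-64 (assumed here; some architectures differ)
O_CREAT = 0o100        # create the file if it does not exist
O_APPEND = 0o2000      # every write goes to the end of the file
O_CLOEXEC = 0o2000000  # close the descriptor on exec()

flags = O_APPEND | O_CREAT | O_CLOEXEC
print(flags)  # 525376


def describe(value):
    """Return the names of the known flags that are set in value."""
    known = {"O_CREAT": O_CREAT, "O_APPEND": O_APPEND, "O_CLOEXEC": O_CLOEXEC}
    return sorted(name for name, bit in known.items() if value & bit)


print(describe(525376))  # ['O_APPEND', 'O_CLOEXEC', 'O_CREAT']
```

Once the names are attached, the number stops being mysterious, which is exactly the knowledge that was missing when staring at the Pascal code.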

The whole program above is just the way you have to go to invoke a system call (as we would call it today), and there is no logic or sense to it. Just something that you have to know. And once you do know, all of this can be packaged into a much simpler, higher-level concept: the system call. (Yes, it is an interrupt invocation, not the same thing. You may go stand in the pedantic corner, mind.)

Hopefully, I won’t ever have to think about this program again. I managed to not only understand what it does, but actually grok it.

time to read 33 min | 6568 words

I ran into this fascinating article (I wrote another blog post discussing it) and that got me thinking. How would I approach building a dead-drop implementation? For that matter, what do we need from a dead-drop system?

I think that the following are reasonable (loosely based on what SecureDrop aims for):

  • Completely anonymous:
      • No accounts, no registrations.
      • No server-side state about users.
  • Prevent metadata tracking:
      • It’s not just message contents that are hidden.
      • Cannot tell if A talks to B or C.
      • Accessible via Tor to protect against traffic analysis.
      • Cannot tell who is talking to me at all.
  • Assume that the server may be compromised by a malicious entity:
      • No special trust should be granted to the server.

Reasoning

(You can skip this part to read the actual implementation below)

Let’s consider an actual usage scenario. We have Whistleblower Will (WW from now on) who wants to send sensitive information to Journalist Jane (JJ from now on). Let’s assume that the adversary in this case is a Big Bad Behemoth. I believe that the current villain du jour is Boeing, which is a huge company and had a couple of strange incidents with whistleblowers recently.

If WW wants to send stuff to JJ, why can’t he just email jj@nice.journalist from his personal email ww1994@aol.com? Practically speaking, email today is sent over TLS anyway, so we can assume that no one can read the message in the middle. Moreover, even pretty sophisticated traffic analysis would find it difficult to track the email, as WW is talking to AOL (or Gmail, or whatever his email provider is), and then AOL is communicating with the nice.journalist server (which is likely hosted by Exchange, Gmail, etc.). In other words, any such message would likely be lost in the noise of regular traffic.

However, while that is likely to happen, it isn’t guaranteed. Given the risk of life & limb involved in our scenario, we would like to ensure that this is the case. I’m writing this post because it is an interesting scenario, not because I actually have a use case. As usual in my encryption posts, this is merely my musings on the matter, don’t take me as an authority on the subject. That said, I’m actually quite interested in realistic threat models here. Please provide any feedback, I would love to know more about this.

One thing to pay attention to with regards to this scenario, however, is that if my threat model is a Bad Company, that is one thing, but what if my threat model includes a nation state? I would point you to this wonderful article: This World of Ours by James Mickens, which manages to be both hilarious and informative. The level of capability that you face when your opponent is a nation state is high.

When thinking about nation-states and whistleblowers… Snowden is the first name that comes to mind. In this case, the issue isn’t whether there is some plaintext being sent over the wire for everyone who cares to listen. Assume you are sending such an email from AOL to Gmail. Both companies will provide any data they have, including the full contents of any messages, if provided with an appropriate court order.

Moving from legal (but dubious) actions to the other side, I’m fairly certain that it would take very little investigative work to find an appropriate person who:

  • Works at an email provider.
  • Has access to the email contents.
  • Is able to make use of an appropriate cash infusion in their life.

I would also further assume that the actual amount required is surprisingly low.

Another thing to consider is that our whistleblower may want to provide the information to the journalist, but may absolutely not want to be identified. That includes being identified by the journalist.

In short, the whole mindset when building something like a dead drop is extremely paranoid. With that in mind, let’s see how we can build such a system.

Implementation

The concept behind this system is that a journalist will publish their public key in some manner, probably in their newspaper. That is a fairly simple and obvious step, of course. The issue with needing a dead drop isn’t about being able to hide the contents of the messages. If that were the case, a PGP encrypted message would suffice. The real issue is that we want to hide the fact that we are even communicating at all.

Note that this isn’t about any form of instant messaging. This is about dropping a message (text, files, etc.) and having it sent securely to the other side. The key aspect is that not only are the contents hidden, but also the fact that we even sent it in the first place. And even if you are watching the source or the destination, you can’t tell who the other side is. For reference, think of spycraft techniques in the Cold War era.

Given a journalist’s public key, the whistleblower will package all the data they want to send in a zip file and encrypt that using the public key. That is easy enough. The problem now is that we need to push that file somewhere and then have the journalist get it. That is where our system has a role.

In the design of the system, I tried to intentionally reduce the amount of information that we provide to the server. That way, even if the server is malicious, there isn’t much that they can do with what they have.

I decided to use a serverless architecture here, for two primary reasons. The first reason is that I’m currently teaching a Cloud Development course and that was a nice project to play with. The second reason is that by running everything as a serverless function, I reduced the amount of data that I can easily aggregate since invocations are independent of one another.

From a client perspective, here is the manner in which I’m able to send my sensitive information to a journalist. We need two pieces of information: the address of the dead drop (in this case, I’m using a dummy .onion service) and the journalist’s public key. The public key is used to encrypt the information so only the journalist can see it.

Let’s look at the code first, and then discuss what is going on there:


import base64

from nacl.public import PublicKey, SealedBox
from torpy.http.requests import tor_requests_session

BASE_URL = "https://deaddrop0j22dp4vl2id.onion" # example only
JOURNALIST_PUB_KEY = base64.b64decode('GVT0GzjFRvMxcDh9c6jpmXkHoGB5KoIp9vyU3RozT2A=')

with open('secrets.zip', 'rb') as f: # file to send
    data = f.read()
sealer = SealedBox(PublicKey(JOURNALIST_PUB_KEY))
enc_file = sealer.encrypt(data)

with tor_requests_session() as s:
    res = s.get(BASE_URL + "/upload-url").json()
    # send the raw encrypted bytes as the object body
    s.put(res.get('url'), data=enc_file).raise_for_status()
    # the id arrives without base64 padding, add it back before decoding
    file_id = res.get('id').encode('ascii') + b'='
    enc_id = sealer.encrypt(base64.urlsafe_b64decode(file_id))
    s.put(BASE_URL + "/register-id", data=enc_id).raise_for_status()

The first step is to encrypt the data to send (in this case, the file secrets.zip) using SealedBox. That is the act of encrypting the data using a public key so only the corresponding private key can open it.

The next step is to use Tor to call GET /upload-url to get a JSON object back, with url and id properties. The output of this request looks something like this:


{
  "id": "ArMvWgDBEXyVap-O7VbD-ELzDJ0ZB_2ir9E51RVv9-4",
  "url": "https://cloud-dead-drop.s3.amazonaws.com/uploads/ArMvWgDBEXyVap-O7VbD-ELzDJ0ZB_2ir9E51RVv9-4?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIARAM4NNC5DUOXEDI7%2F20240522%2Fil-central-1%2Fs3%2Faws4_request&X-Amz-Date=20240522T184127Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEE-redacted-Z1BIyzgmj%2F9NhhNqdIPwnSV%2F6nvRhWrthEz0H8jRNU6U%2BoPh7zZTtQIrU5ahNmpWjLNGUnqNMYfNCNU%2FRX%2BUyERFwlMT7yrYIbxUyWUDwde1IXOHjkTns07kXmBlLG1uBvt6RDrE0xjFs%3D&X-Amz-Signature=c9498fb-redacted-9150ce4cb85"
}

The output is basically a random name of the file and an S3 presigned URL. Note that the file id is a base64 value, with the padding removed, which is why I have to add back the = character when decoding the value.
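The padding detail is easy to check locally. This is a minimal sketch using only the standard library; the id is generated on the spot rather than fetched from a server, which is an assumption made for the sake of the example.

```python
import base64
import secrets

file_id = secrets.token_urlsafe(32)   # same call the server uses for the id
print(len(file_id))                    # 43 characters, no '=' at the end

# 32 bytes encode to 44 base64 characters including one '=' of padding,
# so a single '=' restores the form that the decoder expects
raw = base64.urlsafe_b64decode(file_id + "=")
print(len(raw))                        # 32

# going the other way: encode and strip the padding to get the id back
round_trip = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
print(round_trip == file_id)           # True
```

The round trip matters on the receiving side too: whoever decrypts the 32 raw bytes has to re-encode them this way to get back the name of the uploaded file.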

We can use that URL to upload the file to S3 (or a compatible service), and then we PUT /register-id with the encrypted file id, again using the journalist public key.

In other words, we have broken the upload process into four separate stages:

  • Encrypt the file
  • Request an upload url and get a random file id
  • Upload the encrypted file
  • Encrypt the file id and register it

The entire process is done over Tor, ensuring that our physical location is not leaked.

It’s interesting to note that just using Tor isn’t enough. A Harvard student making bomb threats was arrested because he used Tor via the campus WiFi. It was easy enough to narrow down “all Tor users in a particular time frame from location” and go through them one at a time.

SecureDrop has a whole list of steps to take in order to increase your safety at this stage, by the way. Tor is basically just the start here, and physical security is of paramount importance.

This looks like a really convoluted way to do things, I know, but there is some logic behind everything. Let’s look at the actual implementation and see how all those pieces look from the other side. In terms of the architectural diagram, here is what we use to generate the upload URL:

The following code implements the logic for the upload-url endpoint. As you can see, there is really nothing here. We generate a random 32-byte token, which will serve as the file id, generate the pre-signed URL, and hand it over to the caller.


import json
import os
import secrets

import boto3

upload_bucket = os.environ.get('UPLOAD_BUCKET')
s3 = boto3.client('s3')

def generate_upload_url(event, context):
    # random 32-byte token, URL-safe base64 without padding (43 characters)
    file_id = secrets.token_urlsafe(32)
    url = s3.generate_presigned_url('put_object',
      Params={'Bucket': upload_bucket, 'Key': 'uploads/' + file_id},
      ExpiresIn=3600) # the URL is valid for one hour
    body = json.dumps({'id': file_id, 'url': url})
    return {'statusCode': 200, 'body': body}

The caller is free to make one or more such calls and use (or not) the pre-signed URLs that it got. The backend doesn’t have any say in this, nor any way to influence the caller.

The intent here is that we split the responsibilities, instead of having an upload that the server can gather information about. If we assume a malicious server, then requesting an upload URL doesn’t provide much information, just the Tor exit node IP that was used.

Proper setup would mean that the S3 bucket we use is not logging anything. Even if we assume that it is doing so, the only data in the log would be the Tor exit node IP. A malicious server would also have access to the actual uploaded file, but that is not usable without the private key of the journalist.

At this point we have an uploaded file, but how do we actually get the journalist to know about it? This is the part where the real fun happens. First, let’s look at the architectural diagram, then we’ll discuss how it works in detail.

There are a lot of moving pieces here. The design is intentionally meant to be hard to pierce since we are trying to ensure that even if the system is operated by a malicious entity, it will still retain much of its secured capabilities. (Again, this is a mental exercise for me, something to do for fun. If your life/liberty is at stake here, you probably want to get a second opinion on this design).

How do we let the journalist know that we have a new file for them (and along the way, not let anyone else know about it)? The whistleblower has the file id from the server, the one used as the file name for the upload.

I’m using Libsodium SealedBox. SealedBox is a way to encrypt a value given a public key in such a way that only the owner of the associated private key can access it.

SealedBox has an envelope size of 48 bytes. In other words, encrypting a 32-byte id and adding the 48-byte envelope gives us exactly 80 bytes.

The whistleblower will use SealedBox to encrypt the file name, and then call the register id endpoint. Here is the associated lambda backend for the register-id endpoint:


import os

import boto3

queue_url = os.environ.get('NOTIFICATIONS_QUEUE')
sqs = boto3.client('sqs')

def register_id(event, context):
    return register_id_internal(event['body'])

def register_id_internal(msg):
    # 32 bytes payload + SealedBox = 80 bytes -> base 64 == 108 bytes
    if len(msg) != 108:
        return {'statusCode': 400, 'body': 'Invalid ID'}
    sqs.send_message(QueueUrl=queue_url, MessageBody=msg)
    return {'statusCode': 204}

It… doesn’t do much (you may have noticed a theme here). We simply verify that the size of the value matches, and then we send it to an SQS queue.

By the way, if we are sending 80 bytes, why are we receiving 108 bytes? This is because we are using lambda, and the binary data is being base64’ed by the lambda infrastructure.
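The numbers can be verified with a few lines of arithmetic. In this sketch the 48 bytes of sealed box overhead are split into their two parts, a 32-byte ephemeral public key and a 16-byte authentication tag, which is how libsodium defines the sealed box format; the constant names are mine.

```python
import base64

ID_BYTES = 32                 # the random file id
SEAL_OVERHEAD = 32 + 16       # ephemeral public key + authentication tag
sealed = ID_BYTES + SEAL_OVERHEAD

# base64 turns every 3 bytes into 4 characters, rounding up
encoded_len = 4 * ((sealed + 2) // 3)

# cross-check against the standard library with 80 dummy bytes
actual_len = len(base64.b64encode(bytes(sealed)))
print(sealed, encoded_len, actual_len)  # 80 108 108
```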

Just registering the (encrypted) file id in the queue isn’t really helpful. It is just… sitting there, who is going to read or operate on that? That is where the two timers come into play. Note that the architecture diagram has events that happen every minute and every 5 minutes. Every minute, we have a lambda running to maybe publish a decoy value. Its code looks like this:


import base64
import os
import secrets

import boto3

queue_url = os.environ.get('NOTIFICATIONS_QUEUE')
sqs = boto3.client('sqs')

# register_id_internal() is the function from the register-id lambda above
def maybe_publish_decoy(event, context):
    if secrets.randbelow(4) != 0:
        return # 75% of the time, do nothing
    # 25% of the time, generate a decoy message
    return register_id_internal(
        base64.urlsafe_b64encode(secrets.token_bytes(80)).decode('ascii')
    )

The maybe_publish_decoy() lambda will usually do nothing, but 25% of the time it will register a decoy value in an SQS queue. Note that the actual message we generate is just random bytes, nothing meaningful.

Remember that when a user registers a file id, the id also ends up in the queue. Because both user ids and decoy ids end up in the same queue, and they both look like random bits, there is no way for an external observer to tell which is which.

For that matter, there is no way to tell for the system itself. Once a decoy is posted to the queue, there is no way to know whether it is a decoy value or a real one.
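To get a feel for how much cover traffic this produces, here is a back-of-the-envelope calculation. It assumes the timers fire exactly on schedule (every minute for decoys, every 5 minutes for publishing); real schedulers will drift a little.

```python
from fractions import Fraction

RUNS_PER_DAY = 24 * 60            # the decoy timer fires once a minute
P_DECOY = Fraction(1, 4)          # chance that secrets.randbelow(4) == 0

decoys_per_day = RUNS_PER_DAY * P_DECOY
decoys_per_publish = decoys_per_day / (24 * 12)   # publishing runs every 5 minutes
print(decoys_per_day)        # 360
print(decoys_per_publish)    # 5/4
```

So on average a bit more than one queued decoy is waiting each time the ids are published, even when nobody is using the system.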

The last component in our system is actually working with the messages in the queue. Those are handled by the publish_ids() lambda, which is invoked every 5 minutes. Let’s look at the code:


import base64
import datetime
import os
import secrets

import boto3

MAX_MSGS = 8
upload_bucket = os.environ.get('UPLOAD_BUCKET')
queue_url = os.environ.get('NOTIFICATIONS_QUEUE')
s3 = boto3.client('s3')
sqs = boto3.client('sqs')

def publish_ids(event, context):
    while True:
        result = sqs.receive_message(QueueUrl=queue_url,
                                     MaxNumberOfMessages=MAX_MSGS)
        msgs = result.get('Messages', [])
        ids = [msg['Body'] for msg in msgs]
        # pad the batch with random decoys so every file holds MAX_MSGS ids
        while len(ids) < MAX_MSGS:
            rnd = secrets.token_bytes(80)
            fake_id = base64.urlsafe_b64encode(rnd).decode('ascii')
            ids.append(fake_id)

        # random order, so the position says nothing about real vs. decoy
        ids = sorted(ids, key=lambda _: secrets.randbelow(1024))
        output = bytes('\n'.join(ids), 'ascii')
        now = datetime.datetime.now(datetime.timezone.utc).isoformat()
        s3.put_object(Bucket=upload_bucket, Key='ids/' + now, Body=output)

        if len(msgs) == 0:
            break
        sqs.delete_message_batch(QueueUrl=queue_url, Entries=[
           {'Id': msg['MessageId'], 'ReceiptHandle': msg['ReceiptHandle']}
           for msg in msgs
        ])

Every 5 minutes, the publish_ids() lambda runs, and it starts by reading messages from the queue. We read a batch of at most 8 messages at a time, and pad it with made-up ids if there are not enough messages to fill the file. Here is what such a file looks like:


K7VruHnGhlpzWssB92OpMUjAw0-FoDyED_6p4w2LMgcV7JrsVB4SQdH7VNQzAT-jywYZsVhHM8lNF-JiWWUgXONK_Qb2DJw29aLVqw9rvIs=
HG3vHyYVCC42gzeZqgugwIciPqzeQEdNcQrFdqcpUcY5dMRInZKA_ZSFBuyvPdAfJnZm8wkS-jE0cdXZUZmp1wx2CZYWcPGu1uXocdWn2D4=
OpJWZRfQkMGuXRci8x8YrHx0REE4PdBZctj27gXjH0JvRtaSFMweL47q9nB9r6XomGnOfu5632JbEuMKPOEkkdYiVvst-1Qpw1TNzTPcQmY=
drjhGt6d-aV3h8_BjC81cE5kayXiWikgD8qxWEPYL0T4l8BrW-MadhanXcr465vIs7eBzK-DdwrmtqO8rQsrHN60-f2KirpN-qHpdlpxSbk=
rDKdp0CHSm4-Dvf8BOToLQSv79GpfqLnV3fXLECwUK9HdVEDeRK-T3SycyDmwvjUgjkH0vNMB9Yx_AeaHIS87hD2mCpyEGYKNpGMsnWZlHg=
CQ-5sgobc29-1x6adr09tOgk2yb4WNirzZ2dflQOHkXKDY0uk5B9pq_KKDjNoyWZVsRazgvqRPz3mqan2yKb3P0xAQDmF2CjyN6hMR3bjsQ=
44DYBoGFiPeN8dP7FGn579W7vFgUp8-lblI7nfFP3a0TUqo5sjCnV_Ozr4aPXbdVam6kpyhkpqkQeSeroQNP_x7iq2dpNskjx2x4WO8ezJ0=
TMC5ralR9BjHwTf0xk36kuUcbseD6HVkZgK3e1bpckyk62O_trNINa7FMNLLEwUZeQRvUBuj1CRhNWiz0wRjBvv_hxpbi9ToFymJXkz1ocA=

We then write the file to another S3 folder. Note that we use the ISO format for the file name, which gives us lexical sorting for the values by time. Here is what this looks like on the bucket itself:

As you can see, roughly every 5 minutes, we have some values being written out. The file size is always the same, and we can’t tell based on the presence of a file if there were messages posted at a given time.
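The constant file size is easy to demonstrate without any AWS pieces. The following is a local sketch of the padding step only; build_ids_file is a hypothetical helper, and it uses SystemRandom().shuffle instead of the sort-by-random-key trick from the lambda, purely as a matter of taste.

```python
import base64
import secrets

MAX_MSGS = 8


def build_ids_file(real_ids):
    """Pad a batch with random decoys, shuffle it, and return the file body."""
    ids = list(real_ids)[:MAX_MSGS]
    while len(ids) < MAX_MSGS:
        decoy = base64.urlsafe_b64encode(secrets.token_bytes(80))
        ids.append(decoy.decode("ascii"))
    secrets.SystemRandom().shuffle(ids)  # shuffle driven by the OS random source
    return "\n".join(ids).encode("ascii")


empty = build_ids_file([])
one_real = build_ids_file(["A" * 108])   # stand-in for a registered id
print(len(empty), len(one_real))         # 871 871
```

Eight ids of 108 characters plus seven newlines always come to 871 bytes, whether the batch held zero real messages or eight.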

In the time frame shown here, I didn’t post any messages, but you can see that we still have two files written at 04:15, likely because there were enough messages in the queue to force us to run a second file (basically, a race condition between the decoy lambda and the publish lambda). That is a desirable outcome, by the way.

This is… pretty much it, I have to say.  There aren’t any additional behaviors or things to explore. This Rube Goldberg machine is meant to create a system that breaks apart the different sections and loses information as we move forward.

I’ll cover the impact of this design in the case of a malicious server later on in the post. For now, I want to cover how the journalist can read the data. The server currently holds two folders:

  • uploads/ - allows anonymous GET for files, auto-expires in 14 days, uploads require a pre-signed URL
  • ids/ - allows anonymous GET and LIST, auto-expires in 14 days, only written to by the backend (from the queue)

A journalist is going to be running a check every 5 - 10 minutes, using Tor, on the contents of the ids/ bucket, like so:


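import base64
import time

import xmltodict
from nacl.exceptions import CryptoError

def read_messages(session, base_url, reveal, last_file = ""):
    while True:
        res = session.get(base_url, params={
            'prefix': 'ids/',
            'start-after': last_file,
            'list-type': 2})
        res.raise_for_status()
        dict_data = xmltodict.parse(res.content)
        contents = dict_data.get('ListBucketResult').get('Contents') or []
        if isinstance(contents, dict):
            contents = [contents] # xmltodict returns a dict for a single entry
        if len(contents) == 0:
            time.sleep(5 * 60)
            continue
        for file in contents:
            last_file = file.get('Key')
            data = session.get(base_url + '/' + last_file)
            for line in data.content.splitlines():
                try:
                    raw_id = reveal.decrypt(base64.urlsafe_b64decode(line))
                except (CryptoError, ValueError):
                    continue # a decoy, or a message for someone else
                # back to the unpadded URL-safe form used as the S3 key
                yield base64.urlsafe_b64encode(raw_id).rstrip(b'=').decode('ascii')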

What we are doing here is scanning the ids/ folder, getting all the id files that we haven’t seen yet. For each of those files, we get it, and try to decrypt each of the lines in it. When done processing all the files in the folder, we’ll wait 5 minutes, and then read the next file.

This relies on the fact that S3 buckets return items in lexical sort order, and our publish_ids() lambda generates the file names in the ids/ folder using lexically sorted timestamps.
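Here is a quick local illustration of why ISO timestamps work as a cursor. The dates are made up, and the comparison is done on plain strings, the same way a start-after parameter compares keys.

```python
import datetime

start = datetime.datetime(2024, 5, 22, 23, 50, tzinfo=datetime.timezone.utc)
times = [start + datetime.timedelta(minutes=5 * i) for i in range(6)]

keys = ["ids/" + t.isoformat() for t in times]
print(keys[0])   # ids/2024-05-22T23:50:00+00:00

# sorting the strings gives the same order as sorting the timestamps,
# even across the midnight boundary
print(sorted(keys) == keys)   # True

# the last key seen works as a cursor: anything greater is newer
last_seen = keys[2]
newer = [k for k in keys if k > last_seen]
print(len(newer))             # 3
```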

If there are many journalists listening on the system, each one of them will be making 1 - 2 remote calls every 5 minutes, seeing if there are any ids in there that they can decrypt. Note that this gives the server absolutely no information about what data each journalist is able to access. It also drastically reduces the amount of information that you need to deal with and distribute.

Each journalist will need to go through roughly 250 KB per day of ids to scan for messages aimed at them. That is assuming that we have a low load system, with fewer than 8 messages every 5 minutes. Note that this is still over 2,300 messages/day.

Assuming that we have a very high load and have to push 100,000 messages a day, the amount of data that each journalist will have to scan is on the order of 10 MB. Those are good numbers, especially since we don’t intend this to be a high-traffic system.
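For reference, this is the arithmetic behind those estimates, assuming 108-character ids separated by newlines and one file of 8 ids every 5 minutes.

```python
ID_LEN = 108            # one base64-encoded id
IDS_PER_FILE = 8
FILES_PER_DAY = 24 * 12 # one file every 5 minutes

file_size = IDS_PER_FILE * ID_LEN + (IDS_PER_FILE - 1)  # ids plus newlines
low_load_bytes = FILES_PER_DAY * file_size
low_load_capacity = FILES_PER_DAY * IDS_PER_FILE

high_load_bytes = 100_000 * (ID_LEN + 1)   # id plus a newline each
print(file_size)          # 871
print(low_load_bytes)     # 250848
print(low_load_capacity)  # 2304
print(high_load_bytes)    # 10900000
```

For 100,000 ids a day, that is about 10.9 MB of base64 text, or 8 MB if you count only the raw 80-byte values.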

With the file id in hand, the journalist can now download and decrypt the data, like so:


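import base64

from nacl.exceptions import CryptoError
from nacl.public import PrivateKey, SealedBox
from torpy.http.requests import tor_requests_session

BASE_URL = "https://deaddrop0j22dp4vl2id.onion" # example only
PRIVATE_KEY = 'Iei28jYsIl5E/Kks9BzGeg/36CKsrojEh65IUE2eNvA='
key = PrivateKey(base64.b64decode(PRIVATE_KEY))
reveal = SealedBox(key)
with tor_requests_session() as s:
    for msg in read_messages(s, BASE_URL, reveal):
        res = s.get(BASE_URL + "/uploads/" + msg)
        try:
            print(reveal.decrypt(res.content))
        except CryptoError:
            pass # not something we can decrypt, skip it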

We get the file ids, download them from the S3 bucket, and decrypt them. And now the journalist can start actually looking at the data. To reply, the whistleblower would need to send their own public key to the journalist, and subscribe in the same manner to the updates.

Note that this is not meant to be an online protocol, and you can scan the data once a day or once a week, without any real problems. That makes this system attractive since you can schedule a weekly trip to a remote location where you can anonymously check if you have anything new in your “mailbox”, without anyone being able to tell.

With the raw technical details out of the way, let’s consider some of the implications of this sort of system design.

The cautious playbook

A journalist would publish their public key, likely in their paper or website. They are interested in getting such information from anonymous sources. At the same time, they need to ensure that no one can tell if someone sent them information. A whistleblower wants to send information, but be protected from anyone knowing that they sent it. Ideally, we should even limit the knowledge that any information was sent.

Following what SecureDrop recommends, you access the system only via Tor, usually from a location that isn’t near your usual haunts, which you traveled to without a phone and paid for in cash, using a live-CD Tails instance.

The idea of adding additional layers beyond the system encryption is that even with a powerful adversary, the number of hurdles they have to go through is very high. You have to break the system encryption, and then Tor, and then you reach some coffee shop in the middle of nowhere and need to go over footage to try to identify someone.

The crazy thing is that this is actually viable. It isn’t science fiction today to obtain the security footage (which we can safely assume exists), run it through a facial recognition system and get a list of people to check. So we need to have multiple layers of defense in place.

From the point of view of this dead drop system, all the data is encrypted, so the only information an adversary can get comes from analyzing traffic patterns. A good way to alleviate even that is to not run everything at once.

The whole point of breaking it into discrete steps is that you can execute it in isolation. You can get the upload url, upload the file, and then actually register the id a day later, for example. That makes timing analysis a lot harder.

Or consider a few bots that would (via different Tor exit nodes for each operation):

  • Request upload urls on an ongoing basis
  • Occasionally upload random files to those urls
  • Read the ids/ folder files
  • Download some of those urls

If we have a few of those bots, and they send each other “messages” that generate decoy traffic, it is going to be much harder to track who and what is happening, since you’ll have activity (anonymized via Tor) that hides the real actions.

I think that those bots are likely to be another important layer of security, defeating traffic analysis and masking actual usage by people.

Consequences of a malicious takeover

Let’s consider for a moment what will happen if the system is taken over by a malicious party. In this case, we assume that there is a valid system that was taken over. Therefore, we need to figure out the impact of old messages that were sent and new messages that will be sent from now on.

The design of the system calls for the following important configurations:

  • Disabling logging on access (S3, endpoints, etc).
  • All files (ids, uploads, etc) are deleted within 14 days.
  • We are routing data in such a way that information is lost (registered id will be merged with decoys, etc)

Assuming that a proper system was taken over by a malicious party, the only additional information that they now have is all the contents of the uploads/ folder, which aren’t visible to other parties (you have to know the file name to download).

Given that the files are encrypted, there isn’t much that is leaked. And the bots I talked about will mask the real traffic with dummy files.

Once the system has been taken over, we can assume that the following happens: There is correlation now between calls to upload-url and the actual uploaded file. You can also correlate a registered id with an upload, assuming sufficiently low traffic (which is likely).

Those are the only additional bits of information that you gain from having access to the system. When we register an id to be published, the whistleblower sends the encrypted value, which the server has no way of correlating to the recipient.

The server can now do traffic analysis, for example, monitor that someone is reading the ids/ folder and then downloading a file from the uploads/ folder. But that is of limited utility, since it would:

  • Be masked by the bots’ traffic.
  • Only give you the Tor exit node.

A malicious party could also disable expiration and retain long-term all the uploaded files, in case they get the keys at a later time, but beyond that, I can’t think of anything else that they will get from actually having control over the system.

In particular, for the whistleblower, there is no data leakage about who they are or who they are talking to. Even with the collaboration of the journalist, if the whistleblower didn’t provide that information, they remain anonymous.

Side channels

Another aspect of this system is that we don’t need to go with the id registration and publication to journalists. The fact that this is stored (and encrypted) means that you now don’t need to pass potentially a lot of data to a journalist, but can just give them the id itself. That is 108 bytes in base64 format, and doesn’t convey any additional information beyond that.

The question is, of course, if you can pass the id, why not just pass the encrypted file directly?

Attacks and mitigations

The design of the system makes certain attacks impossible to execute or impossible to hide. For example, you can perform denial of service attacks on a journalist by sending many messages that they would have to go through (you have the public key, after all). But that is obviously detectable.

In general, denial of service attacks on the system can be mitigated by requiring proof of work to submit the files.
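The post doesn’t specify a proof of work scheme, so the following is only a hashcash-style sketch of what it could look like. The choice of SHA-256, the 16-bit difficulty, and the use of the file id as the challenge are all assumptions made for illustration.

```python
import hashlib
import itertools

DIFFICULTY_BITS = 16  # hypothetical setting: about 65,000 hashes on average


def leading_zero_bits(digest):
    """Count the zero bits at the start of a hash digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def solve(challenge, bits=DIFFICULTY_BITS):
    """Client side: search for a nonce whose hash starts with `bits` zero bits."""
    for nonce in itertools.count():
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= bits:
            return nonce


def verify(challenge, nonce, bits=DIFFICULTY_BITS):
    """Server side: a single hash is enough to check the work."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= bits


challenge = b"ArMvWgDBEXyVap-O7VbD-ELzDJ0ZB_2ir9E51RVv9-4"  # e.g. the file id
nonce = solve(challenge)
print(verify(challenge, nonce))  # True
```

The asymmetry is the point: the sender pays tens of thousands of hashes per submission, while the server pays one, and no account or identity is needed to enforce it.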

What about blocking access from a source or to a particular journalist? There is no identifying the source, so you cannot block based on that. You can also not block the journalist, since the server doesn’t have any idea who the destination is.

You can block access to the system, simply by ignoring id registrations. From the outside world, it will look like someone just didn’t send us any messages. However, that is easily detectable. A journalist can send a message to herself, and upon not receiving it, detect that the system is not operational.

Key leakage

What happens if the key pair of a journalist leaks? That would be catastrophic for the journalist and the sources because it would enable decryption of any messages that were sent. This is mitigated by only keeping messages for a maximum of 14 days, but we must assume that an adversary has copies of all the messages that were ever sent.

In this scenario, key leakage would compromise all communications intended for the journalist. Technically speaking, we can try to do a key exchange using this system, and have a temporary key assigned for a particular conversation. The problem is that this sort of system is mostly offline, with days or weeks between interactions.

That means that we need to persist the keys (and can thus assume that a key pair leak will also leak any “temporary” keys). Something that you can do is publish not just a public key but several over time, with built-in expiry. Let’s say that you publish your key in January, and replace that with a new key in February. In March, you’ll destroy the January key, so you don’t have a way to leak that.

Having rotating keys is a cute idea, but I think in practice this is too complex. People have a hard enough time remembering things like their passwords; requiring them to remember (and change) multiple passphrases is too much. On a yearly basis, however, that makes a lot more sense. But then again, what does “destroy the old key” mean, exactly?

Summary

Well, this post has gone on entirely too long. I actually started writing it to play around with serverless system architecture, but I got sidetracked into everything else. It is long enough that I won’t try to dive into the serverless aspect in this post. Maybe in a future one.

As a reminder, this is a nice design, and the blog post and research consumed quite a few very enjoyable evenings, but I’m not a security or cryptography expert. If you require a system like this, I would recommend consulting an actual professional.

time to read 10 min | 1920 words

I just read this (excellent) introduction to the Secure Drop protocol, meant for journalists to have a way to safely accept anonymous information from sources.

Using Tor and something like Tails (with a pretty good guide on how to connect to it from an unusual location) you can probably get to the right website in an anonymous manner. With the current version of Secure Drop, you go to a website, such as TechCrunch, and get a .onion address that you can put into the Tor browser. You then get a codename and can upload files and a message.

You are trusting the server to a very large extent, as far as I can see. If the server you are reaching has been compromised, it can just send anything you add directly to the Bad Guys. I mean, the website recommends using GPG to encrypt the messages locally, but then it gives you the GPG key.

For fun, I tested this with the Washington Post. They have a couple of ways to provide information anonymously. One is via Secure Drop and the other is via GPG email. The keys for both are different. That doesn’t inspire confidence.

For that matter, the TechCrunch instance is running on version 2.5.2, whereas the current version is 2.8.0. I assume that those instances are basically set up once and then never maintained. For context, the version of Secure Drop that TechCrunch is running was released in Feb 2023, quite a while ago.

Let’s assume that there is a security vulnerability (I know of none, just to be clear) in an old version that hasn’t been updated. What happens if it is attacked and taken over by an adversary? Since the current model is to trust the server, you get access to everything.

I think that the new Secure Drop protocol's whole point is to get end-to-end encryption and break the requirement to trust the server. And more to the point, I think that the way they do that is really interesting.

Here are their stated requirements:

  1. No accounts, and therefore no user authentication.
  2. No message flow metadata, meaning messages can’t be linked together, and different types of messages are indistinguishable from one another.
  3. No changes in server state are observable externally.
  4. No ciphertext collection or information leaks via trial decryption – a given recipient receives pertinent ciphertext only.

Basically, there are three methods:

  • Send
  • Fetch
  • Download

A source needs to send information to a journalist. The journalist publishes their public key somewhere. For example, as a QR code in a physical magazine, etc. Using the journalist's public key, we can start the ball rolling.

They encrypt the data locally and then Send the encrypted file to the server, where it is stored. The next step is for the journalist to Fetch and Download the data.

Each time the source sends information to the server, it is considered a separate action, and there is no way to link them together. To reply back, the sender needs to include their public key in their message.

But there is a problem here. You don’t want the server to know who is the recipient of the messages. Note that this holds both ways, you should not be able to see that a particular journalist got a message or trace a reply to the source.

In other words, the server should have as little information as possible about what is going on. And the way they implemented that feature is absolutely beautiful. Remember, the source will encrypt the message locally using the journalist’s public key.

That is done using an ephemeral key pair, used only for this message. Along with the encrypted message, they send the (ephemeral) public key for the message and also compute a Diffie-Hellman key from the ephemeral private key and the journalist public key.

This is what it would look like:


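def send(msg, journalist_public_key):
   # ephemeral key pair, used only for this message
   tmp_key_pair = generate_key_pair()
   enc_msg = encrypt(msg, journalist_public_key)
   # Diffie-Hellman value: ephemeral private key x journalist public key
   msg_gdh = scalar_mult(tmp_key_pair.private, journalist_public_key)
   return enc_msg, msg_gdh, tmp_key_pair.public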

The Gdh here refers to Group Diffie-Hellman. Note that this isn’t actually used for anything, we just send that to the server.

When a journalist tries to fetch messages, they aren’t providing any information to the server. The server has no way of knowing who it is talking to. And yet the idea is to send information that only the journalist can read. In order to do that, the server will generate an ephemeral key pair (only for that request) and compute a Diffie-Hellman key from the current request’s private key and the Gdh value provided by the source.

The server will compute another Gdh value. This time between the message’s public key and the private key of the request. It will send the encrypted message, the public key of the request and the server computed Gdh value to the journalist. Here is what the code looks like:


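def fetch(msg_id, msg_gdh, msg_public_key):
   # ephemeral key pair, used only for this request
   tmp_key_pair = generate_key_pair()
   # key used to encrypt the message id: request private key x message Gdh
   enc_key = scalar_mult(tmp_key_pair.private, msg_gdh)
   # value the journalist needs to recompute enc_key on their side
   msg_srv_gdh = scalar_mult(tmp_key_pair.private, msg_public_key)
   enc_msg_id = encrypt(msg_id, enc_key)
   return enc_msg_id, msg_srv_gdh, tmp_key_pair.public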

That looks like utter nonsense, right? What is going on here? Let’s go back to first principles. Classic Diffie-Hellman is based on modular exponentiation in a finite field. Let’s assume that we agreed on Base = 5, and Modulus = 23 (values taken from Wikipedia).

The journalist generates a private key: 13 and computes 5^13 mod 23 = 21. The journalist thus publishes 21 as its public key.

Journalist key pair
Private: 13 | Public: 21

The source generates a private key (for a single message): 3 and computes 5^3 mod 23 = 10. The source also computes a Diffie-Hellman key exchange from the message private key and the journalist public key using: 21^3 mod 23 = 15. That is marked as the message GDH (Group Diffie-Hellman).

Journalist | Source (per message, ephemeral) | Adversary knows…
Private: 13, Public: 21 | Private: 3, Public: 10, Message GDH: 15 | Journalist public key: 21

Note that usually, the GDH value (15 in our case) would be used as the agreed-upon encryption key between the source and the journalist. Instead of doing that, we are jumping through a few more hoops.

When the source sends a message to the server, it sends it its public key (10) and the GDH (15). This is stored internally and isn’t really doing anything interesting until a journalist needs to fetch messages from the server.

This is where the magic happens. During the processing of the fetch request, the server doesn’t have any idea who the journalist is, there are no keys or authentication happening. A key requirement is that the journalist is able to get new messages without the server knowing what messages they are able to read.

Let’s consider Lois and Clark as two journalists, and assume for simplicity that there is only a single message involved. Both Lois and Clark send a request to the server to get new messages. Since the server cannot tell them apart, it will respond in the same way.

The first step in the server processing the request is to generate an ephemeral private key: 9 and compute 5^9 mod 23 = 11. It will then compute a Group Diffie-Hellman between its private key and the message public key: 10^9 mod 23 = 20. This is called the request GDH. It also computes another Group Diffie-Hellman, this time with the message GDH and its private key: 15^9 mod 23 = 14. That serves as the encryption key for the message, and the server uses that to send an encrypted message ID to the journalist.

Journalist (Lois) | Journalist (Clark) | Server
Private: 13, Public: 21 | Private: 17, Public: 15 | Request Private: 9, Request Public: 11, Request GDH: 20, Encryption key: 14
- | - | Server reply: Request GDH: 20, encrypted(msg_id, 14)

Note that in the table above, the reply from the server is computed in the same manner for both journalists (the server cannot tell which is which). However, for each request, the server will generate a new request key pair. So on each request, you’ll get a different result, even for the same data.

Now, what can the journalists do with this information? We have the message public key and the request GDH. Let’s have Lois do Group Diffie-Hellman between the request GDH and her private key: 20^13 mod 23 = 14. That gives us the encryption key that was used by the server. We can use that to decrypt the message the server sent (which by itself is a reference to the actual encrypted message held by the oblivious server).

What about Clark? Using his own key, he can compute 20^17 mod 23 = 7. And that means that he cannot get the encryption key to figure out what the server sent.
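If you want to check the arithmetic yourself, here is the whole exchange with the toy numbers from this post, using Python's three-argument pow() for modular exponentiation. The variable names are mine, and the parameters are far too small for anything real.

```python
BASE, MODULUS = 5, 23  # toy parameters from the text, not usable in practice

# Journalists publish their public keys
lois_private, clark_private = 13, 17
lois_public = pow(BASE, lois_private, MODULUS)    # 21
clark_public = pow(BASE, clark_private, MODULUS)  # 15

# Source: ephemeral key pair for one message addressed to Lois
message_private = 3
message_public = pow(BASE, message_private, MODULUS)      # 10
message_gdh = pow(lois_public, message_private, MODULUS)  # 15

# Server: ephemeral key pair for one fetch request
request_private = 9
request_gdh = pow(message_public, request_private, MODULUS)  # 20
server_enc_key = pow(message_gdh, request_private, MODULUS)  # 14

# Journalists: raise the request GDH to their own private key
lois_key = pow(request_gdh, lois_private, MODULUS)    # 14
clark_key = pow(request_gdh, clark_private, MODULUS)  # 7

print(lois_key == server_enc_key, clark_key == server_enc_key)  # True False
```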

How does this work? Let’s break it down.

Message GDH | 21^3 mod 23 = 15
Encryption key for server | 15^9 mod 23 = 14
Request GDH | 10^9 mod 23 = 20
Encryption key for journalist | 20^13 mod 23 = 14

The reason we get the same encryption key is that we actually end up computing:


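journalist_private_key, journalist_public_key = 13, 21
message_private_key, message_public_key = 3, 10
request_private_key = 9

enc_key_server = pow(
   pow(journalist_public_key, message_private_key),
   request_private_key
) % 23 # ((21^3)^9) % 23 = 14

enc_key_journalist = pow(
   pow(message_public_key, request_private_key),
   journalist_private_key
) % 23 # ((10^9)^13) % 23 = 14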

In other words, when the server computes the encryption key, it involves:

  • Message GDH (journalist public key ^ message private key)
  • Request private key

And when the journalist computes it, it involves:

  • Request GDH (message public key ^ request private key)
  • Journalist private key

Both sides end up raising the base to the product of the same three private keys (journalist, message, and request), which is why the results match.

That is a really cool result. It makes sense, but it was non-obvious. Hence why I wrote it out in detail to understand it properly.
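To convince myself that this isn't an accident of the small numbers, here is a sketch of the same flow with random keys. Everything here is an assumption for illustration: the group (a 127-bit Mersenne prime with base 5) is a stand-in, not something the protocol uses, and the function names only loosely mirror the pseudo-code above.

```python
import secrets

# Assumed toy group. A real system would use a vetted group or an
# elliptic curve such as X25519, not these parameters.
P = 2**127 - 1
G = 5


def keypair():
    private = secrets.randbelow(P - 3) + 2  # in [2, P - 2]
    return private, pow(G, private, P)


def send(journalist_public):
    """Source side: what gets stored on the server for one message."""
    msg_private, msg_public = keypair()
    msg_gdh = pow(journalist_public, msg_private, P)
    return msg_public, msg_gdh


def fetch(msg_public, msg_gdh):
    """Server side: fresh key pair per request, no idea who is asking."""
    req_private, _req_public = keypair()
    enc_key = pow(msg_gdh, req_private, P)     # encrypts the message id
    req_gdh = pow(msg_public, req_private, P)  # returned to the caller
    return enc_key, req_gdh


def journalist_key(req_gdh, journalist_private):
    return pow(req_gdh, journalist_private, P)


lois_private, lois_public = keypair()
clark_private, _ = keypair()

stored = send(lois_public)
enc_key, req_gdh = fetch(*stored)
print(journalist_key(req_gdh, lois_private) == enc_key)   # intended recipient
print(journalist_key(req_gdh, clark_private) == enc_key)  # anyone else
```

The first line prints True every time, since both sides compute the base raised to the same product of private keys. The second should print False, barring an astronomically unlikely collision.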

In terms of cryptographic theory, I’m not aware of any other use of something like that (note, not a cryptographer), and it would probably require a thorough review by actual cryptographers to verify.

In practical terms, actually making use of this is pretty hard. It uses a pretty low-level cryptographic primitive that is usually not exposed by modern encryption APIs. More to the point, I think that it isn’t a good direction to go toward, and I have in mind a much simpler solution instead.

time to read 2 min | 326 words

Take a look at this wonderful example of foresightedness (or hubris).

In a little over ten years, Let’s Encrypt root certificates are going to expire. There are already established procedures for how to handle this from other Certificate Authorities, and I assume that there will be a well-communicated plan for this in advance.

That said, I’m writing this blog post primarily because I want to put the URL in the notes for the meeting above. Because in 10 years, I’m pretty certain that I won’t be able to recall why this is such a concerning event for us.

RavenDB uses certificates for authentication, usually generated via Let’s Encrypt. Since those certificates expire every 3 months, they are continuously replaced. When we talk about trust between different RavenDB instances, that can cause a problem. If the certificate changes every 3 months, how can I trust it?

RavenDB trusts a certificate directly, as well as any later version of that certificate assuming that the leaf certificate has the same key and that they have at least one shared signer. That is to handle the scenario where you replace the intermediate certificate (you can go up to the root certificate for trust at that point).
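Here is a rough sketch of that trust rule as I described it. This is not RavenDB's actual code: the CertInfo type, its fields, and the idea of comparing signers by key hash are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CertInfo:
    """Hypothetical summary of a certificate chain, for illustration only."""
    thumbprint: str               # hash of the leaf certificate itself
    public_key_hash: str          # hash of the leaf's public key
    signer_key_hashes: frozenset  # key hashes of the issuers in the chain


def is_trusted(presented: CertInfo, registered: CertInfo) -> bool:
    # The exact same certificate is trusted directly.
    if presented.thumbprint == registered.thumbprint:
        return True
    # A renewed certificate needs the same leaf key and a signer in common.
    same_key = presented.public_key_hash == registered.public_key_hash
    shared_signer = bool(presented.signer_key_hashes & registered.signer_key_hashes)
    return same_key and shared_signer


registered = CertInfo("t1", "leaf-key", frozenset({"intermediate-1", "root-1"}))
renewed = CertInfo("t2", "leaf-key", frozenset({"intermediate-2", "root-1"}))
new_root = CertInfo("t3", "leaf-key", frozenset({"intermediate-3", "root-2"}))

print(is_trusted(renewed, registered))   # True, the root is still shared
print(is_trusted(new_root, registered))  # False, no signer in common
```

The second case is the one that worries me: if the whole chain changes at once, there is nothing left to anchor the trust on.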

Depending on the exact manner in which the root certificate will be replaced, we need to verify that RavenDB is properly handling this update process. This meeting is set for over a year before the due date, which should give us more than enough time to handle this.

Right now, if they are using the same key on the new root certificate, it will just work as expected. If they opt for cross-signing with another root certificate, we need to ensure that we can verify the signatures on both chains. That is hard to plan for because things change.

In short, future Oren, be sure to double-check this in time.

time to read 4 min | 668 words

I got a really interesting comment on a blog post talking about query optimization. The context was that working with dates is much easier for a query engine than working with date & time at the millisecond granularity. You can read the details on that in the post. Here, I want to focus on the modeling difference between the two.

In real-life scenarios, DateTime.Date is kind of useless. I don't know if any application will use such a query unless they are a UK-only application or an application that says they only deal with UTC. In practice, it is often hour or minute precision due to the time zone. The majority of time zones are hourly based, with some operating on minute bases, e.g. 15 or 30 minutes. (Newfoundland UTC-03:30, India UTC+5:30, Eucla UTC+8:45)

This is a great example of the difference in thinking between developers and business people. Because we typically work with date & time, we tend to assume that all the associated considerations for time also apply to dates. However, that isn’t actually the case.

The date 2024-05-13 is a date. It is the same in the UK and in India. The start time for that date may be different, but the date is the same. A date doesn’t have a time zone. Mostly because that isn’t a meaningful distinction.

Let’s consider the most famous date of all, your birthday. You were born at a given point in time (date time) in a particular location (time zone). However, for the vast majority of scenarios, such details are irrelevant.

If you were born on Baker Island (UTC-12) and currently reside in Line Islands (UTC+14), you’ll celebrate your birthday based on the date, not the time. That is made obvious when you consider that a year is not an exact measurement in terms of time and the duration of time within a year varies considerably between different years.

When we talk about businesses and dates, it really gets more complex. Consider the simplest scenario, we have a business that has stores in Honolulu, Hawaii, and Dallas, Texas. On Dec 31, 2023, a purchase was made in the Honolulu store at 8PM. That is already Jan 1st, 2024 in Dallas, mind. What year would taxes be paid on that sale?

You can make all sorts of cases here, for paying that in the current year or the previous one, etc. In practice, it doesn’t matter. The date the sale was made is what determines the tax year. That means that a sale in Honolulu will be registered in 2023, while a sale made at that very same moment in Dallas will be registered for 2024.
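A quick way to see this is to take the Honolulu sale and look at the same instant from Dallas. The sketch below uses fixed UTC offsets, which is an assumption that only holds because neither location is on daylight saving time in December.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets: valid for December, not a general-purpose time zone model.
HONOLULU = timezone(timedelta(hours=-10), "HST")
DALLAS = timezone(timedelta(hours=-6), "CST")

sale = datetime(2023, 12, 31, 20, 0, tzinfo=HONOLULU)  # 8PM in the Honolulu store
same_instant_in_dallas = sale.astimezone(DALLAS)

print(sale == same_instant_in_dallas)  # True, it is the same point in time
print(sale.date())                     # 2023-12-31
print(same_instant_in_dallas.date())   # 2024-01-01
```

The business cares about the local date of the store where the sale happened, so storing that date directly is simpler than storing an instant and re-deriving it later.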

The reason for that is simply that there is no really good answer here when you start working across time zones. And trying to maintain the distinction between when the day started is not really meaningful from a business perspective.

Here is another consideration. Say I signed an office lease for 6 months starting on January 1st, and in March, daylight saving time came into effect. When is my lease going to expire?

The answer is June 30, 23:59, regardless of changes in DST, because the duration is in months (and days), not in terms of time (and hours).

There are scenarios in business where I care deeply about the time that passed. A great example would be for payroll purposes. Say I did a night shift with a daylight saving time change in it. You bet that I want to get paid for the total number of hours that passed, not the difference between the hours on the clock. For fun, it gets really complex when you have shifts that cross payroll periods (how do you calculate overtime), but that is a topic for another time (pun intended).
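Here is what that night shift looks like in code. I'm assuming a US Central shift on the night the clocks fall back (November 3rd, 2024), and I hard-code the two UTC offsets instead of using a real time zone database, just to keep the sketch self-contained.

```python
from datetime import datetime, timedelta, timezone

# Assumed offsets before and after the clocks fall back.
CDT = timezone(timedelta(hours=-5), "CDT")
CST = timezone(timedelta(hours=-6), "CST")

shift_start = datetime(2024, 11, 2, 22, 0, tzinfo=CDT)  # 10PM, still on DST
shift_end = datetime(2024, 11, 3, 6, 0, tzinfo=CST)     # 6AM, after the change

# Aware datetimes subtract as instants, so this is the time that really passed.
elapsed = shift_end - shift_start

# Dropping the offsets gives the naive "difference on the clock".
wall_clock = shift_end.replace(tzinfo=None) - shift_start.replace(tzinfo=None)

print(elapsed)     # 9:00:00
print(wall_clock)  # 8:00:00
```

For payroll, the 9 hours is the number that matters.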

The common case, however, is that you only care about the date, and the timezone is not relevant.

time to read 27 min | 5255 words

I ran into an interesting Reddit comment about deniable encryption and decided to spend an evening playing with it. The concept is that we have a way to encrypt a message in such a way that we can provide a key that would reveal a different message.

The idea is that if you are forced to reveal your key, you can do so, without spilling your secret. From a technical perspective, this is a truly fascinating scenario. Of course, it comes with the problem that if you’ve provided a key that doesn’t show anything the adversary is happy with, they’ll assume that there is another key.

Note: As usual when talking about cryptography, I’m at best an amateur in this area. This is strictly me having fun, don’t try to keep your Bitcoin keys here (instead, send them to me by snail mail).

In theory, there is a simple way to do so. Behold my prediction for the winner of the 2024 US election. I don’t want to reveal it to you ahead of time, but here is the encrypted value:


4a/8AqOcEFzlMRP9VsLgEtXIq8+Cc11bKKp6+iR3c975qOrNtA==

After the election, I’ll share the key that will show that I properly predicted this (you should send me bitcoins at that point). Let’s commit further and show you how you can verify this, it’s really simple:


string Decrypt(string encrypted, string key)
{
    var t = Convert.FromBase64String(encrypted);
    var k = Convert.FromBase64String(key);
    return Encoding.UTF8.GetString(
         t.Select((b, i) => (byte)(b ^ k[i])).ToArray()
    );
}

In the interest of time (and those bitcoins), I’ll let you know that the answer is either:


tsaSbMbuMDODESHNZPbAR4bozqPnECkyR8Rak1dNU4qL3Ye9mg==
tsaSbMbuMDODESHNZPbAR4bozqPnECkyR8Rak1dNU5yQzI+jmg==

Voila, we are done, right? Not only did I demonstrate my ability to properly predict the future, but I was also able to show how you can use two separate keys to decrypt the same data.

This is just a property of the way I “encrypted” the data. What happened is that I took some random bytes, and when I needed to produce an answer, I XORed those bytes with the message I wanted to get and then I sent you the XORed value. When you XOR it again with the “message” you previously got, we get the output I want. In essence, that “message” you got is a one-time pad, and I can use that to send you any message I want.
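The same trick in a few lines of Python, with placeholder messages of my own (the real ones would spoil the fun). Note the order of operations: the "encrypted" value is published first, and the keys are manufactured afterward.

```python
import base64
import secrets


def xor(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("one-time pad inputs must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical messages; they must be the same length as the published bytes.
message_a = b"Team A wins the final"
message_b = b"Team B wins the final"

# Publish random bytes up front, posing as the "encrypted" prediction.
published = secrets.token_bytes(len(message_a))

# Later, build whichever "key" reveals the message I want.
key_a = xor(published, message_a)
key_b = xor(published, message_b)

print(base64.b64encode(published).decode())
print(xor(published, key_a).decode())  # Team A wins the final
print(xor(published, key_b).decode())  # Team B wins the final
```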

This also has all the usual limitations of one-time pads, you can only send data up to the size of the key, it doesn’t protect you from the text being corrupted, and it is malleable. In other words, you have no way to ensure (cryptographically) that the message you received was actually sent by me.
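Malleability is easy to demonstrate. In the sketch below (the message and amounts are made up), someone who knows the layout of the plaintext, but not the pad, changes the amount without ever decrypting anything.

```python
import secrets


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


plaintext = b"PAY ALICE $100"
pad = secrets.token_bytes(len(plaintext))  # known only to sender and receiver
ciphertext = xor(plaintext, pad)

# The attacker only touches the ciphertext, flipping the bits that
# turn "$100" into "$900" in the last four bytes.
tampered = bytearray(ciphertext)
delta = xor(b"$100", b"$900")
for i, d in enumerate(delta):
    tampered[len(ciphertext) - 4 + i] ^= d

print(xor(bytes(tampered), pad))  # b'PAY ALICE $900'
```

Nothing in the scheme lets the receiver notice the change, which is exactly the gap that authenticated encryption closes.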

Modern cryptography relies on something called AEAD (Authenticated Encryption with Associated Data), which lets you send as much encrypted data as you want and guarantees that if the decryption process is successful, no one has altered the data along the way.

What I aim to do is create a proper way to encrypt a message and be able to retrieve it later, but also provide another key if needed. Here is the API that I have in my mind:


var output = DeniableEncryption.Encrypt(
    ("P@ssw0rd", "Joey doesn't share food!"),
    ("swordfish", "Meet at dawn by the beach to toast the new year"),
    ("adm1n!str@t0r", "We were on a break!"),
    ("Qwerty!Asdf@2024", 
          "Bitcoin seed: lonely ghost need apology spend shy festival funds"
    )
);

As you can see, we are actually encrypting multiple messages here, each with its own password. The output of this code will be something like this:


OwlVWixT3NMen2vuklHsY34SEIfu+zuDKJx5UMQKEjP4GnDnSDsagUiMWxcl83kC4GRI1s64JEU7
x7vf4u15FZOw3DDzwCG71Mqatfjc7nTzAox7Cr9FtVnxqsqkIyeOVsq6yHbiP56HAlUkbu7/D3kp
RrRmNtdqo5S6Dl7Y50fH492W/+/wtEhNePfWP7YhO1KsvUcnX2S4B7VKnTZbhJDhvxgTUlMK4/Cl
UdQiP3H0R1CZ8ucB1mb1yP/gkwIPYA7ComAzKtM9VOviCzqP5wqzdq7KcWM2FdJH3Lqpuoi376Lr
3Dnh4FUDe8jJhU2xlNhj7O9tXPczIeeUu7GuG0FHegeHprqRc7AkKK5b5kiEN7VnQz58fk29WBAQ
2LTYrBwbDn47Sw4PybMFcl9Wy4Yuw9ElHQoZcvXGk9hAXsqWdRyRgOq8HREiuZpzvSyx5v56T2+u
3hGVeXStKfYF6T5R9w8wqPKQ8UB3/blKguQZjtJNlueXDSpJ5Jzl/7FKUCSJfK16l9NgiVpcGGLg
qke4cZ1aVdpL6rsCH8uLLjEO0MVliuLjjX2VP76UrJBpTdh2uuyjaNDa6tFxQ/zx0jz5VlVbNZv+
w4KOl9jkQYk1U9VZ4K1v8IP6swZ1AOmfkWDuXmeNHlBeFk7Au0+ZDgpAJptwZrSDP2rgRgnrTlDL
qN9YqSlV4Q2TBmcVmQy9+sW1aI7yHhUdzfx3wZ551BmJX0IBLJviMKm+1DglKlHfs/+BZ1fz1HR2
ygnSSq2FXMf8y8vDZRjxb54Q/4JzgvYMpq47/ukmzPgc6JcYfkLtDCNXeJ+0bUDjn5r+2I8d3DLF
FbkGa6oB+um1ipAAeO5paH6aqzERtK3qEDw+nmnHW+y7BQUrQ7VRq2rmmPI2Hq8aMGcyCu5MpI97
/TXgcZDsUxFgRvA4ZQWz0UJ1ZNZOHgCRWgsAOkp1Y6zUKYwcJlhiICfUlfuj+6I2YayldANn/GR/
jn8/U4W2UXN3cAkl9AtgkxooGVQThPnAVTqo4l+EiL9X7WLo3EyaTJSNzxSSgudMQZJJJc+nYK66
A592plmhCzsvWedUeBzacE/pifyWM9DGDHqa5K75VLGV1AenIIBsUkYc44UPDLazpLSUBfTXB7LZ
Tig4sHmmLvfiXrD/lH1jFKEsAdTHiYWLooxTBD0Q2COpEM6kyKkljwfko8FVpO6HRNwiyqwQCsDM
xSMOO2Vk5qQVAb7VOEXN0fQjxEjSeB890XODduP6d941dqk+L1iAQK50GHk8WWCkZFn6FricVCFs
5T6fEWRj6wJlD4EISmoNVmanqAmF79Spg4YCOr3N2EbSdokf5d4ZNA7GtDFX9esbHTIPS0SiXyUx
MHgS99CllTACSEpmEisu/JiEsWKHOg0oy6oCZbi9hrMbeMGGQ9jM4sIxQ20/u7dV1maut67CPN0H
F5VYkRQ0/PNqhvT4Tb3CuGLndxD/nsvs9MgOOZljVZWLaF//Gdno2CNNPjnnHTuURScpcIFKMN5c
80NDgQX6kO8Hoiho7NFc61QwDCNYk1xk2qVDo9jcOaJ2zdtFWJfVavhLCNnEfWoHxzBmS4oL6Ss=

And I’m able to turn that back into the original message using this code:


var msg = DeniableEncryption.Decrpyt(pwd, encrypted);
Console.WriteLine(msg);

If I don’t have the password, on the other hand, it should be completely infeasible for me to figure out what the message is. In fact, let’s try to list the requirements for such a scheme:

  • With a password, I can easily decrypt the message.
  • Knowing one password isn’t useful for decrypting a message using any other password.
  • I cannot tell how many messages are hiding in the encrypted text.
  • I cannot detect anything about the messages themselves.

I’m an amateur at best in cryptography, so I’m not going to try to construct something myself. Let’s see if I can cobble together something that would at least hold up for a bit.

I’m using real passwords, and I need to turn them into encryption keys. I’m going to be using PBKDF2 to do that, and for the encryption itself, I’ll use the AES-GCM algorithm. Here is the rough format of the output.

We start with a salt (32 bytes generated using a CSPRNG) which is used to feed into the PBKDF2 algorithm, then a set of offsets into the file, and the actual data itself. Note that we are always “storing” exactly 8 messages.

In practice, I’m going to allow up to 6 user-defined messages, to ensure that we always have “empty” slots. The size of the data is also meaningful, so we need to ensure that we aren’t leaking that.

What I’m doing is ensuring that we round up (by 64 bytes) the size of all the messages that we want to encrypt and ensure that each data block is of the same size. To avoid leaking even what is the exact size at 64-byte intervals, I’m writing some additional random bytes at the end.
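The size computation can be sketched like this. The constants follow the numbers in this post (8 slots, 64-byte alignment, 32-byte salt), and I'm assuming the usual AES-GCM sizes of 12 bytes for the nonce and 16 for the tag. The function names are mine.

```python
import secrets

BLOCK_SIZE = 64          # alignment used in the text
ITEMS_COUNT = 8          # slots that are always "stored"
SALT_SIZE = 32
OFFSETS_BLOCK_SIZE = ITEMS_COUNT * 4
OVERHEAD = 4 + 12 + 16   # length prefix + AES-GCM nonce + tag (assumed sizes)


def align_up(size: int, block: int = BLOCK_SIZE) -> int:
    """Round size up to the next multiple of block (a power of two)."""
    return (size + block - 1) & -block


def output_size(messages: list[bytes]) -> int:
    # Every slot is as big as the longest message, rounded up.
    slot = align_up(max(len(m) for m in messages) + OVERHEAD)
    # Random tail so the total isn't an exact multiple of the slot size.
    multiplier = secrets.randbelow(3) + 1                         # 1..3
    chunk = secrets.randbelow(BLOCK_SIZE // 2) + BLOCK_SIZE // 2  # 32..63
    return SALT_SIZE + OFFSETS_BLOCK_SIZE + ITEMS_COUNT * slot + multiplier * chunk


print(align_up(24 + OVERHEAD))  # 64
print(align_up(65))             # 128
print(output_size([b"Joey doesn't share food!", b"We were on a break!"]))
```

Because every slot has the same size and unused slots are just random bytes, the total only tells you roughly how long the longest message is, not how many messages there are.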

Let’s look a bit deeper into the format of the data block itself. We start by writing the actual size of the block, then the nonce and authentication tag (important for the AES-GCM usage), and then the encrypted message. The rest is filled with random data.

You’ll note that the offsets in the overall output format and the size in the data format imply that we are leaking information about the messages we encrypted. Given that I need to know where to look for the value in the file, and I need to know the size, why am I spending so much time trying to obfuscate that?

The idea is that I actually have two levels of encryption here. When I derive the key with PBKDF2, I’m asking it to use SHA512 and give me 40 bytes of derived key material. I’m actually only using 32 bytes of those as the actual encryption key, leaving me with 8 bytes (two groups of 4 bytes) that I can use to XOR with the offset and the length. That hides the actual offset and size (basically using some of the PBKDF2 output as a stream cipher).
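In Python terms, the derivation and the split look roughly like this. It is a sketch of the idea rather than a port of the C# code: I'm assuming little-endian signed integers for the two masks, and the derive() name is mine.

```python
import hashlib
import secrets
import struct

ITERATIONS = 210_000
DERIVED_KEY_SIZE = 32


def derive(password: str, salt: bytes):
    """Derive 40 bytes and split them into offset mask, length mask, and key."""
    material = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), salt, ITERATIONS,
        dklen=4 + 4 + DERIVED_KEY_SIZE,
    )
    # First 4 bytes mask the offset, the next 4 mask the length.
    offset_mask, length_mask = struct.unpack("<ii", material[:8])
    key = material[8:]  # the remaining 32 bytes are the AES key
    return offset_mask, length_mask, key


salt = secrets.token_bytes(32)
offset_mask, length_mask, key = derive("P@ssw0rd", salt)
print(len(key))  # 32
```

Without the password, you can't reproduce any of the three values, so the masked offset and length in the file look like more random noise.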

It has all the usual problems of a raw stream cipher, but I don’t care about malleability or authentication in this scenario. I rely on AES-GCM to handle that part of the process and just need to hide the information from other prying eyes. A man-in-the-middle attack targeting those values can at most cause me to try (and fail) to decrypt a value, so I don’t think that this matters.

With all of that said, let’s look at the actual code for the encryption portion:


public static byte[] Encrypt(params (string Password, string Value)[] items)
{
    if (items.Length > MaxUserItems)
        throw new ArgumentException("You are allowed up to 6 items");
    if (items.GroupBy(x => x.Password).Any(x => x.Count() != 1))
        throw new ArgumentException("No reusing passwords");


    var totalSize = items.Max(
        x => Encoding.UTF8.GetByteCount(x.Value) + 
             sizeof(int) + AesGcm.NonceByteSizes.MaxSize +
              AesGcm.TagByteSizes.MaxSize
    );
    var sizeAlignedUp = (totalSize + BlockSize - 1) & -BlockSize;


    var additionalSizeMixed = RandomNumberGenerator.GetInt32(1, 4) *
        RandomNumberGenerator.GetInt32(BlockSize / 2, BlockSize);
    var outputBuffer = RandomNumberGenerator.GetBytes(
       ItemsCount * sizeAlignedUp + OffsetsBlockSize + SaltSize +
       additionalSizeMixed
    );
    Span<byte> output = outputBuffer;


    var salt = output.Slice(0, SaltSize);
    var offsetsBlock = MemoryMarshal.Cast<byte, int>(
       output.Slice(SaltSize, OffsetsBlockSize)
    );


    int index = RandomNumberGenerator.GetInt32(ItemsCount);
    foreach (var (pwd, val) in items)
    {
        ReadOnlySpan<byte> derived = Rfc2898DeriveBytes.Pbkdf2(pwd, 
           salt, Iterations, HashAlgorithmName.SHA512, 
           sizeof(int) + DerivedKeySize + sizeof(int)
        );


        var plaintext = Encoding.UTF8.GetBytes(val);
        var requiredSize = sizeof(int) + AesGcm.NonceByteSizes.MaxSize +
           AesGcm.TagByteSizes.MaxSize + plaintext.Length;


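        // + 1 because the upper bound is exclusive; without it, a message that
        // exactly fills its block would call GetInt32(0), which throws
        var offset = sizeAlignedUp * index + SaltSize + OffsetsBlockSize +
            RandomNumberGenerator.GetInt32(sizeAlignedUp - requiredSize + 1);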


        // the first 4 derived bytes mask the offset (same name as in Decrpyt)
        var offsetMask = MemoryMarshal.Read<int>(derived.Slice(0, sizeof(int)));

        offsetsBlock[index] = offset ^ offsetMask;


        index = (index + 1) % ItemsCount;


        Span<byte> mem = output.Slice(offset, requiredSize);


        var lenMask = MemoryMarshal.Read<int>(
           derived.Slice(sizeof(int), sizeof(int))
        );
        var mask = lenMask ^ plaintext.Length;
        MemoryMarshal.Write(mem, mask);


        var derivedKey = derived.Slice(
           sizeof(int) + sizeof(int),
           DerivedKeySize
        );


        using var cipher = new AesGcm(derivedKey, AesGcm.TagByteSizes.MaxSize);


        cipher.Encrypt(
            nonce: mem.Slice(sizeof(int), AesGcm.NonceByteSizes.MaxSize),
            plaintext: plaintext,
            ciphertext: mem.Slice(sizeof(int) + AesGcm.NonceByteSizes.MaxSize +
                          AesGcm.TagByteSizes.MaxSize, plaintext.Length
            ),
            tag: mem.Slice(sizeof(int) + AesGcm.NonceByteSizes.MaxSize,
                   AesGcm.TagByteSizes.MaxSize)
            );
    }


    return outputBuffer;
}

We validate that the user provided us with up to 6 messages (MaxUserItems) to encrypt and that there are no repeated passwords, then we compute the size required to encrypt the longest message. We align that on 64 bytes (BlockSize) and use that to compute the actual overall buffer size. Note that we also add a bit of additional space at the end, to confuse attempts to figure out values based on size (in the spirit of length-based attacks such as CRIME and BREACH).

We then get the output buffer. Note that in this case, we are asking the RandomNumberGenerator class to give us a buffer that is already filled with random data. The idea is that we don’t need to worry about filling stuff up with cryptographically secured data. We start with random noise, and we add whatever meaning we need from there.

The first 32 bytes (SaltSize) are the salt, which is used to mitigate rainbow table attacks, among others. The next 32 bytes are used as the offsets array, which is used to store the location of the actual encrypted messages.

For the message we want to encrypt, we start by using PBKDF2 to derive a 40-byte cryptographic key. We are using SHA512 (which produces a 64-byte output, so the 40 bytes we need fit in a single PBKDF2 block) and 210,000 iterations to derive the key, per the OWASP recommendation.

We want to be unpredictable, so we aren’t writing the first element to the first offset position. Instead, we start the offset position in a random location. We figure out what is the size of the encrypted value (including the size, nonce, tag, and actual encrypted bytes) and stash that at a random offset inside its block of the output buffer.

We then take the first 4 bytes of the derived key value and XOR that with the offset of the value we’ll be writing. We are using those bytes as a stream cipher, basically. We write the encrypted offset to the offsets table. Note that in order to decrypt that, you need to re-run the PBKDF2 computation, which requires that you have the password.
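To illustrate the offsets table, here is a small sketch where the table starts as random noise and a single slot gets a masked offset written into it. The sizes and the plausible_offsets() helper are my own assumptions, and a random int stands in for the first 4 bytes of the PBKDF2 output.

```python
import secrets
import struct

ITEMS_COUNT = 8
HEADER_SIZE = 32 + 32  # salt + offsets table


def random_int32() -> int:
    return struct.unpack("<i", secrets.token_bytes(4))[0]


def plausible_offsets(table, offset_mask, total_size):
    """Unmask every slot and keep only offsets that land inside the data area."""
    for masked in table:
        offset = masked ^ offset_mask
        if HEADER_SIZE <= offset <= total_size - 4:
            yield offset


total_size = 1024
table = [random_int32() for _ in range(ITEMS_COUNT)]  # starts as noise

right_mask = random_int32()  # stands in for the first 4 PBKDF2 bytes
real_offset = 200
table[3] = real_offset ^ right_mask

print(list(plausible_offsets(table, right_mask, total_size)))
print(list(plausible_offsets(table, random_int32(), total_size)))
```

With the right mask, the real offset always shows up in the first list. With any other mask, the slots decode to what are effectively random 32-bit values, which will almost always fall outside the buffer and get skipped.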

The next 4 bytes (4..8) are used as a stream cipher to encrypt the length of the value we are about to encrypt. And the other 32 bytes (8..40) are used as the encryption key itself.

Note that we are “missing” things like nonce generation. We don’t need that, since the nonce buffer we point to has already been seeded with random values from a cryptographic source.

The Encrypt() does most of the work, and… this is pretty much it. There isn’t a lot of code, most of it is in how we put things together.

The decryption portion is a lot more interesting, I think, so let’s take a look at it:


public static string? Decrpyt(string pwd, byte[] encrypted)
{
    Span<byte> mem = encrypted;
    var salt = mem.Slice(0, SaltSize);


    ReadOnlySpan<byte> derived = Rfc2898DeriveBytes.Pbkdf2(pwd, salt,
       Iterations, HashAlgorithmName.SHA512, sizeof(int) + 
       DerivedKeySize + sizeof(int)
    );


    var offsetMask = MemoryMarshal.Read<int>(derived.Slice(0, sizeof(int)));
    var lenMask = MemoryMarshal.Read<int>(
       derived.Slice(sizeof(int), sizeof(int))
    );
    var derivedKey = derived.Slice(sizeof(int) + sizeof(int), DerivedKeySize);


    var offsetsBlock = MemoryMarshal.Cast<byte, int>(
        mem.Slice(SaltSize, OffsetsBlockSize)
    );


    for (int i = 0; i < ItemsCount; i++)
    {
        var offset = offsetsBlock[i] ^ offsetMask;
        // compare against (Length - size) so a huge bogus offset can't overflow
        if (offset < SaltSize + OffsetsBlockSize ||
            offset > mem.Length - sizeof(int))
            continue;


        var maskedLen = MemoryMarshal.Read<int>(
           mem.Slice(offset, sizeof(int))
        );


        var len = maskedLen ^ lenMask;


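        // the record holds length, nonce, tag, and ciphertext; all must fit
        var payloadStart = offset + sizeof(int) +
            AesGcm.NonceByteSizes.MaxSize + AesGcm.TagByteSizes.MaxSize;
        if (len < 0 || payloadStart > mem.Length ||
            len > mem.Length - payloadStart)
            continue;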


        using var cipher = new AesGcm(derivedKey, AesGcm.TagByteSizes.MaxSize);
        var outputBuf = new byte[len];
        try
        {
            cipher.Decrypt(
                nonce: mem.Slice(
                          offset + sizeof(int), 
                          AesGcm.NonceByteSizes.MaxSize
                ),
                ciphertext: mem.Slice(
                    offset + sizeof(int) + AesGcm.NonceByteSizes.MaxSize +
                    AesGcm.TagByteSizes.MaxSize,
                    len
                ),
                tag: mem.Slice(
                        offset + sizeof(int) + AesGcm.NonceByteSizes.MaxSize,
                        AesGcm.TagByteSizes.MaxSize
                ),
                outputBuf);
        }
        catch (CryptographicException)
        {
            // expected, we may hit a dummy value or wrong password
            continue; // try the next offset instead of returning garbage
        }
        return Encoding.UTF8.GetString(outputBuf);
    }
    return null;
}

Here we take the first 32 bytes (the salt) and use PBKDF2 and the password to generate the derived key. Again, we are getting 40 bytes back. The first 4 bytes are the offset mask (to figure out where to look for the values), the next 4 bytes are the length mask (to figure out the length for decryption), and the last 32 bytes are the decryption key.

Without the password, we cannot get to the derived key, remember. Then we start scanning through the offsets block. For each of the items we XOR the value in the offsets with the mask. Here we have three options:

  • The XORed value is completely off, which we detect and skip.
  • The XORed value is correct and points to the right offset to continue the operation.
  • The XORed value appears to be correct (its value is in bounds). We’ll continue the operation, but fail in the next stage when we actually try to decrypt the value. This is because we are using AES-GCM, which is an AEAD (authenticated encryption) that validates (using cryptographic primitives) that the decrypted value matches the value that was encrypted. I wrote a blog post (part of a larger series) explaining this in detail.
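The third case is what makes the scan safe, so here is a small sketch of it in JavaScript (Node.js) instead of C#. The helpers `encrypt` and `tryDecrypt` are names I made up for illustration, and I’m assuming a 12-byte nonce and a 16-byte tag, which is what the AesGcm maximum sizes are in .NET. The point is that a wrong key doesn’t produce garbage output; it produces an authentication failure that we can catch.

```javascript
const crypto = require("node:crypto");

const NONCE_SIZE = 12; // assumed: matches AesGcm.NonceByteSizes.MaxSize

function encrypt(key, plaintext) {
  const nonce = crypto.randomBytes(NONCE_SIZE);
  const cipher = crypto.createCipheriv("aes-256-gcm", key, nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // getAuthTag() returns the 16-byte tag by default
  return { nonce, tag: cipher.getAuthTag(), ciphertext };
}

// Returns the plaintext, or null when authentication fails.
function tryDecrypt(key, { nonce, tag, ciphertext }) {
  try {
    const decipher = crypto.createDecipheriv("aes-256-gcm", key, nonce);
    decipher.setAuthTag(tag);
    // final() throws if the tag does not match the key + ciphertext
    return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
  } catch {
    return null; // wrong key or a dummy slot: move on to the next candidate
  }
}

const realKey = crypto.randomBytes(32);
const wrongKey = crypto.randomBytes(32);
const record = encrypt(realKey, "I don't like Mondays");
console.log(tryDecrypt(wrongKey, record)); // null
console.log(tryDecrypt(realKey, record));  // I don't like Mondays
```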

With the offset, we can now read the masked length of the buffer, which has the same problems as the masked offset. We XOR that with the right mask and need to deal with the same three scenarios: obviously wrong, correct, or apparently correct but actually wrong. We don’t really care, since we leave the actual validation to the authenticated encryption portion.

If we are able to correctly decrypt the value, we immediately return it. But if not, we’ll try with the next offset, etc. Note that for decryption, we are scanning the offsets array and attempting to check whether the key we derived from the password is able to decrypt the current value. During encryption, we randomized where everything goes, and here we can just do a simple scan and stop on the first value that was successfully decrypted.

As I mentioned, that was a lovely evening to spend on an interesting exercise. I think that this is a valid way to go about building a deniable encryption scheme. The full code is here, and I would love your feedback on both the code and the actual idea.

I like that I can provide multiple passwords and messages in a simple manner. I think that a viable use case would be to encrypt three values: safe, honeypot, and the real deal. For example:


var output = DeniableEncryption.Encrypt(
    ("safe", "I don't like Mondays"),
    ("honeypot", "I microwave fish in the office break room and I’m not going to stop"),
    ("motherlode", "Bitcoin seed: armor cactus gaze off future blade artist")
);

There is no way to tell whether there is a third option here, and the format intentionally always assumes 8 “entries”, even if you provide fewer than the maximum. Of course, that also raises the problem of what happens if, after you give up the motherlode, the other side still suspects there are more secrets. At this point, I’ll point you to Mickens and a wonderful article about threat models.

 Check out the code and let me know what you think about this.

time to read 3 min | 566 words

We got an interesting question in the RavenDB Discussion:

We have Polo (shirts) products. Some customers search for Polo and others search for Polos. The term Polos exists in only a few of the descriptions and marketing info so the results are different.

Is there a way to automatically generate singular and plural forms of a term or would I have to explicitly add those?

What is actually requested here is to perform a process known as stemming: turning a word into its root. That is a core concept in full-text search, and RavenDB allows you to make use of that.

The idea is that during indexing and queries, RavenDB will transform the search terms into a common stem and search on that. Let’s look at how this works, shall we?

The first step is to make an index named Products/Search with the following definition:


from p in docs.Products
select new { p.Name }

That is about as simple an index as you can get, but we still need to configure the indexing of the Name field on the index, like so:

You can see that I customized the Name field and marked it for full-text search using the SnowballAnalyzer, which is responsible for properly stemming the terms.

However, if you try to create this index, you’ll get an error. By default, RavenDB doesn’t include the SnowballAnalyzer, but that isn’t going to stop us. This is because RavenDB allows users to define custom analyzers.

In the database “Settings”, go to “Custom Analyzers”:

And there you can add a new analyzer. You can find the code for the analyzer in question in this Gist link.

You can also register analyzers by compiling them and placing the resulting DLLs in the RavenDB binaries directory. I find that having it as a single source file that we push to RavenDB in this manner is far cleaner.

Registering the analyzer via source means that you don’t need to worry about versioning, deploying to all the nodes in the cluster, or any such issues. It’s the responsibility of RavenDB to take care of this.

I produced the analyzer file by simply concatenating the relevant classes into a single file, basically creating a consolidated version containing everything required. That is usually done for C or C++ projects, but it is very useful in this case as well. Note that the analyzer in question must have a parameterless constructor. In this case, I just selected an English stemmer as the default one.

With the analyzer properly registered, we can create the index and start querying on it.

As you can see, we are able to find both plural and singular forms of the term we are searching for.
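If you want a feel for why this works, here is a toy sketch in JavaScript. It is not the Snowball algorithm and has nothing to do with how RavenDB implements this; `toyStem`, `buildIndex`, and `search` are made-up names, and the stemmer only strips a couple of English plural suffixes. What it shows is the principle: the same transformation runs at indexing time and at query time, so both forms land on the same term.

```javascript
// Toy stemmer: strips a few English plural suffixes. NOT the Snowball algorithm.
function toyStem(word) {
  const w = word.toLowerCase();
  if (w.length > 4 && w.endsWith("ies")) return w.slice(0, -3) + "y";
  if (w.length > 3 && w.endsWith("s") && !w.endsWith("ss")) return w.slice(0, -1);
  return w;
}

// Build an inverted index: stem -> set of document ids.
function buildIndex(docs) {
  const index = new Map();
  for (const [id, text] of Object.entries(docs)) {
    for (const term of text.split(/\W+/).filter(Boolean)) {
      const stem = toyStem(term);
      if (!index.has(stem)) index.set(stem, new Set());
      index.get(stem).add(id);
    }
  }
  return index;
}

// Queries go through the same stemmer as the indexed text.
function search(index, term) {
  return [...(index.get(toyStem(term)) ?? [])].sort();
}

const index = buildIndex({
  "products/1": "Classic Polo in navy",
  "products/2": "Summer Polos, pack of three",
});
console.log(search(index, "Polo"));  // [ 'products/1', 'products/2' ]
console.log(search(index, "Polos")); // [ 'products/1', 'products/2' ]
```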

To make things even more interesting, this functionality is available with both Lucene and Corax indexes, as Corax is capable of consuming Lucene Analyzers.

The idea behind full-text search in RavenDB is that you have a full-blown indexing engine at your fingertips, but none of the complexity involved. At the same time, you can utilize advanced features without needing to move to another solution. Everything is in a single box.

time to read 2 min | 267 words

This happened a few minutes ago. I got a call from an unknown number, which turned out to be my wife’s work number. She called to ask me an urgent question, it seems:

 

“Can you tell me how to compress a PDF file?” she asked.

For the next part, it might be better if I paint you the whole picture. Imagine bullet time, where everything slows down, and I start to analyze the question and my possible answer. The following thoughts run through my mind during that time.

  • PDF files are already compressed by default.
  • Pretty sure that the file format is already using compression.
  • You could strip unneeded elements from the file, removing fonts is one example, I think.
  • If there are images, can probably downscale or re-sample them to reduce their size.
  • What about just running this through Zip?
  • Where did this question come from?

That took about two seconds in real time. The decision tree for any possible answer here grew exponentially. I had to make a call.

“No, that isn’t easily possible,” I answered.

I got some more details as well.

“This is for uploading a document to the XYZ system, it only accepts up to 4MB files, but this PDF is 5.5MB. I guess I can just scan this document as two separate pages instead of one, right?”

A workaround found, and a detailed dive into lossless vs. lossy compression compared to the file format choice avoided, I agreed that this was probably the best option and finished my coffee, pondering the ethical dilemma of answering the actual question or the intended question.

time to read 9 min | 1696 words

Following my previous post about updating the publishing platform of this blog, I realized that I dug myself into a hole. The new workflow was pretty sweet. To the point where I wrote my blog posts a lot more frequently than before, as you can probably tell.

The problem was that I wanted to edit and process the blog post inside Google Docs, where I have a great workflow for editing, reviews, collaboration, etc. And then I want to push that same document to the blog. The killer for me is that I want that to be a smooth process, and the end text should fit into the blog. That means, if I want to emphasize something, it should be seen in the blog as bold. And if I want to write some code, that should work as well. In fact, the reason that I started this process is that it got so annoying to post code to the blog.

I’m using Google Docs’ export functionality to get the HTML back, and I did some basic cleaning to get it blog-ready instead of being focused on visual fidelity. I was using HTML Agility Pack to do that, and it turned out to be the wrong tool for the job. The issue is that it processed the data as if it were an XML document. I actually have quite a track record with XML, so that wasn’t the issue. The problem is that I wanted to do a series of non-trivial things with the HTML, and there aren’t any off-the-shelf facilities to do that in .NET that I could find.

For example, given how important it is to me to show code snippets properly, I wanted to be able to grab them from the document, figure out what language I’m actually using there and syntax highlight it properly. There isn’t anything like that in .NET, all the libraries I found were for JavaScript.

You know the adage about: Let’s rewrite it in Rust? I rewrote my entire publishing process to JavaScript. Which then led me to another adventure. How can I do two contrary things? When I’m writing this document, I want to be able to just write the code. When I publish it, I want to see the syntax highlighted code, properly formatted and working.

Google Docs has support for writing code blocks inline (for some small number of languages), which is great for the editing process. However, the HTML that this generates is beyond atrocious. What is even worse, in HTML, it doesn’t align things properly using fixed-sized fonts, etc. In other words, it is almost there, but not quite.

When analyzing the Google Docs output, I noticed a couple of funny characters in the code output. Here is what it looks like. I believe this is a bug in the export process, probably related to the way code blocks work in Google Docs.

Dear Googlers, if you are reading this, please note that this behavior is now subject to Hyrum's Law. It is an observable state, and I’m relying on it to do important tasks. Don’t break this in the future.

It turns out these are actually a pair of Unicode characters. More specifically, they are Unicode characters that are marked for private use:

  • 0xEC03 - appears to be used to mark the beginning of a code block
  • 0xEC02 - appears to be used to mark the end of a code block
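Here is a tiny self-contained sketch of working with those markers in JavaScript. The `extractCodeBlocks` name and the sample text are mine, and the whole thing rests on the same assumption as above: that the export keeps emitting these two private-use characters. It also shows the decimal values of the code points, which is how they show up when written as numeric HTML entities.

```javascript
// Private-use code points observed in the Google Docs export (per the text).
const START = "\uEC03";
const END = "\uEC02";

// The decimal values are what a numeric HTML entity for them looks like.
console.log(START.codePointAt(0), END.codePointAt(0)); // 60419 60418

// Pull out everything between a start marker and the next end marker.
function extractCodeBlocks(text) {
  const pattern = new RegExp(START + "(.*?)" + END, "gs");
  return [...text.matchAll(pattern)].map(m => m[1].trim());
}

const sample =
  "Intro\n" + START + "\nlet x = 1;\n" + END + "\nOutro " + START + "print(2)" + END;
console.log(extractCodeBlocks(sample)); // [ 'let x = 1;', 'print(2)' ]
```

The `s` flag matters here, since code blocks span multiple lines and `.` doesn’t match newlines without it.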

Note the “appears”, and my blatant disregard for things like software maintenance discipline and all things proper and good in the world of Computer Science. This is a project where there are no rules, there is one customer, and he can code 🙂.

As mentioned earlier, while extracting the Google Doc as HTML and processing it, I encounter those Unicode markers that delineate the code section. This is good, because in terms of HTML itself, what it is doing inside is a… mess. Getting the actual text as it is supposed to be is not easy. So I exported the file again, as text. Those markers are showing up in the textual edition as well, which made things a lot easier for me.

With all of this done, allow me to show you some truly horrifying beautiful code:


let blocks = [];
for (const match of text.data.matchAll(/\uEC03(.*?)\uEC02/gs)) {
    const code = match[1].trim();
    const lang = flourite(code, { shiki: true, noUnknown: true }).language;
    const formattedCode = Prism.highlight(code, Prism.languages[lang], lang);


    blocks.push("<hr/><pre class='line-numbers language-" + lang + "'>" +
        "<code class='line-numbers language-" + lang + "'>" +
        formattedCode + "</code></pre><hr/>");
}


let codeSegmentIndex = 0; // index of the next formatted block to splice in
let inCodeSegment = false;
htmlDoc.findAll().forEach(e => {
    var text = e.getText().trim();
    if (text == "&#60419;") {
        e.replaceWith(blocks[codeSegmentIndex++]);
        inCodeSegment = true;
    }
    if (inCodeSegment) {
        e.extract();
    }
    if (text == "&#60418;") {
        inCodeSegment = false;
    }
})

That isn’t a lot of code, but it does plenty. We scan through the textual version of the document and find all the code blocks using a regular expression. We then try to figure out what language I’m using and apply code formatting during the publication process (this saves the need to change anything on the blog, which is nice, especially since we have to take into account syndication).

I push the code snippets into an array and then I process the actual HTML document using the DOM and find all the code snippets. I replace the start marker with the actual formatted code and continue to discard all the other elements until I hit the end of the code segment. The rest of the code remains pretty much the same as before.

I was writing this in VS Code, and Copilot suggested the following code for handling images:


htmlDoc.findAll('img').forEach(img => {
    if (img.attrs.hasOwnProperty('src')) {
        let src = img.attrs.src;
        let imgName = src.split('/').pop();
        let imgData = entries.find(e => e.entryName === 'images/' + imgName).getData();
        let imgType = imgName.split('.').pop();
        let imgSrc = 'data:image/' + imgType + ';base64,' + imgData.toString('base64');
        img.replaceWith('<img src="' + imgSrc + '" style="float: right"/>');
    }
})

In other words, instead of uploading the images as separate files, I can just encode them into the blog post directly. I like that idea very much because it means that I don’t have to store the images elsewhere.
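The core of that trick fits in a few lines, so here is a standalone sketch of just the encoding step. The `toDataUri` helper and the `MIME_TYPES` table are my own names for illustration. One detail worth handling is that the file extension isn’t always the MIME subtype (a .jpg file is image/jpeg), so I’m using a small lookup table instead of pasting the extension in directly.

```javascript
// Map of file extensions to MIME types; extend as needed.
const MIME_TYPES = {
  png: "image/png",
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  gif: "image/gif",
  svg: "image/svg+xml",
};

// Turn raw image bytes into a data: URI that can go straight into src="...".
function toDataUri(fileName, data) {
  const ext = fileName.split(".").pop().toLowerCase();
  const mime = MIME_TYPES[ext];
  if (!mime) throw new Error("Unknown image type: " + fileName);
  return "data:" + mime + ";base64," + Buffer.from(data).toString("base64");
}

// The first three bytes of a JPEG file, just to have something to encode.
console.log(toDataUri("pixel.jpg", Buffer.from([0xff, 0xd8, 0xff])));
// data:image/jpeg;base64,/9j/
```

Keep in mind that base64 makes the payload roughly a third bigger than the original file, which is the price for not hosting the images separately.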

Given that I don’t have any npm packages to abandon, I don’t know if I can call myself a JavaScript developer, but I did put the full code up for people to take a peek and then recoil.

time to read 14 min | 2727 words

Fungible is a funny word, mostly because you are most likely familiar with the term from NFT (non-fungible tokens) and other similar scams. At its core, it is the idea that for certain things, the instance doesn’t matter, just the amount.

The classic example is that if I lend you a $50 bill, and you give me back two $20 bills and a $10 bill, you’ve still given me back my money. That holds even though I very clearly didn’t get the same physical $50 paper bill back; I got bills for that same amount. On the other hand, if I give you my dog for the weekend, I would be quite upset if I got back three different dogs, even if the total weight is the same.

This is actually a lot more than I want to know about fungibility, to be honest. But it turns out that if you are running a cloud business or just use the cloud in general, you have to be well-versed in the matter. Because in the cloud, money isn’t fungible. In fact, it doesn’t behave a lot like money at all.

Let’s assume that we are a cloud company called cloud.example.com, offering VPS for our users. You are in charge of writing the billing code, and it is pretty simple, right? Here is some code that can compute the charges:


function compute_charges(custId, start, end) {
  let total = 0;
  let predicate = instance =>
    (instance.custId === custId  && instance.started < end) &&
    (instance.ended > start      || instance.ended == null);


  for (let instance of query_instances(predicate)) {
    total += instance.hours_running(start, end) *
             instance.price_per_hour;
  }


  return total;
}

As you can see, there isn’t much there. We find all the instances that were running in the billing period and then calculate the total hours they ran during that period. Please note, this is a simplified model as we aren’t dealing with stopping & starting instances, etc.
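The code above leans on hours_running() without showing it. Here is one way it could look, as a standalone function. This is my own sketch under the same simplified model (a single started / ended pair per instance, no stops and restarts), and the `hoursRunning` name is mine. The only real work is clamping the instance’s lifetime to the billing period.

```javascript
const MS_PER_HOUR = 60 * 60 * 1000;

// Hours the instance ran inside the billing period [start, end).
// `ended` may be null when the instance is still running.
function hoursRunning(instance, start, end) {
  const from = Math.max(instance.started.getTime(), start.getTime());
  const to = Math.min((instance.ended ?? end).getTime(), end.getTime());
  // An instance entirely outside the period yields 0, not a negative number.
  return Math.max(0, (to - from) / MS_PER_HOUR);
}

const start = new Date("2024-01-01T00:00:00Z");
const end = new Date("2024-02-01T00:00:00Z");
const instance = { started: new Date("2024-01-31T00:00:00Z"), ended: null };
console.log(hoursRunning(instance, start, end)); // 24
```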

The output of the compute_charges() function is a number, which will presumably be handed over to be charged over a credit card. There are other things that we need to do as well (generate an invoice, have a usage report, etc), but I want to focus on the money issue here.

The simplest model is that at the end of the billing period, we charge the customer (using a credit card, for example) and receive our payment. Everyone is happy and we can go home, hopefully richer.

The challenge arises when we want to offer additional options to the customer. For example, we may be willing to give the customer a discount if they are going to commit to a minimum amount of money they’ll spend each month. We may want to offer them upfront payment options or give monetary incentives to a particular aspect of the business (run on ARM instances instead of X64, for example).

Each time that we make such an offer, we are going to be turning around and (significantly) complicating the way we bill the customer. Let’s talk about something as simple as committing to run an instance for a whole year. No upfront payment, just a commitment to pay for a particular server for a year. In AWS or Azure, that would be Reserved Instances, so you are likely very familiar with the idea.

How is that going to be expressed in code? Probably something like this:


function compute_charges(custId, start, end) {
    let total = 0;
    let predicate = instance => /*..redacted.*/;
   
    var hrsPerIns = {};
    for (let i of this.instances(predicate)) {
        let hours = i.hours_running(start, end);
        hrsPerIns[i.type] = hours + (hrsPerIns[i.type] || 0);
        total += hours * i.price_per_hour;
    }


    for (let c of this.commitmentsFor(custId, start, end)) {
        let hours = c.committed_time(start, end);
        let hoursUsed = hrsPerIns[c.type] || 0;
        let unusedCommittedHours = Math.max(0, hours - hoursUsed);
        total += unusedCommittedHours *
                this.instance(c.type).price_per_hour;
    }
 
    return total;
}

To be clear, the code above is not a good way to handle such a task, but it does show in a pretty succinct way the hidden complexities. In this case, if you didn’t meet your commitment, we’ll charge you for the unused commitment as well.

A more complex system would have to account for discounted rates while using the committed values, for example, and in that case, for the priority of applying such rates between different matching commitments.

Other aspects may be giving the user a discount for a particular level of usage. So the first 100GB are priced differently from the rest, applying a free tier and… you get the point, I think. It gets complex.
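Even the “first 100GB are priced differently” rule is more code than you would expect. Here is a minimal sketch of tiered pricing; the `tieredCharge` function, the tier boundaries, and the rates are all made up for illustration.

```javascript
// Hypothetical tiers: each entry prices the usage between the previous
// tier's limit and its own `upTo` limit. The last tier has no upper bound.
const STORAGE_TIERS = [
  { upTo: 100, rate: 0 },        // free tier (made-up numbers)
  { upTo: Infinity, rate: 0.5 }, // everything above 100GB
];

function tieredCharge(quantity, tiers) {
  let total = 0;
  let previousLimit = 0;
  for (const { upTo, rate } of tiers) {
    // How much of the usage falls inside this tier
    const inTier = Math.min(quantity, upTo) - previousLimit;
    if (inTier <= 0) break;
    total += inTier * rate;
    previousLimit = upTo;
  }
  return total;
}

console.log(tieredCharge(250, STORAGE_TIERS)); // 75
console.log(tieredCharge(80, STORAGE_TIERS));  // 0
```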

Note that at this point, we aren’t even talking about money yet, we are discussing computing the charges. The situation is more interesting when we move to the next stage. On the face of it, this seems pretty simple, all you need to do is charge the credit card, no?

Okay, maybe you need to send an invoice, but that is about it, right?

Well… what happens if the customer made an upfront payment for one of those commitments? Or just accidentally paid twice last month and now has credit on your system?

I’m going to leave aside the whole complexity around payments bouncing (which is a whole other interesting topic) and how to deal with the actual charging. Right now I want to focus on the nature of money itself.

Imagine you have a commitment with a customer for an 8-core / 64 GB VPS server for a whole year. And they paid upfront, getting a nice discount along the way. How would you record that in your system?

The easiest is to create the notion of credit for the user, which you deduct whenever you need to charge them. So we’ll first compute the charges, then deduct the existing credits, and debit the customer if anything remains. This is simple, easy to work with, and wrong.

Remember that discount the user received? They paid for that particular VPS type, and if you now need to charge them for anything else (such as storage charges), that money cannot come from the funds paid for the VPS.

In other words, the money the customer paid is not fungible. It isn’t applicable for any charge, it is colored. It is dedicated to a particular purpose. And managing that turns out to be pretty complex. Mostly because we are trying to fit everything into the debits and credits on the account.
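To show what “colored” means in practice, here is a small sketch of applying credits to charges. The `applyCredits` function, the type names, and the amounts are all hypothetical. A credit with a type can pay only for charges of that same type, and a credit with a null type is ordinary money that can pay for anything.

```javascript
// Apply prepaid credits to charges. A credit with a `type` can only pay for
// charges of that type; a credit with type null is general-purpose.
function applyCredits(charges, credits) {
  // Work on copies, and spend dedicated credits before general ones.
  const remaining = credits
    .map(c => ({ ...c }))
    .sort((a, b) => Number(a.type === null) - Number(b.type === null));
  let due = 0;
  for (const charge of charges) {
    let unpaid = charge.amount;
    for (const credit of remaining) {
      if (credit.type !== null && credit.type !== charge.type) continue;
      const used = Math.min(unpaid, credit.amount);
      credit.amount -= used;
      unpaid -= used;
    }
    due += unpaid;
  }
  return { due, remaining };
}

const result = applyCredits(
  [{ type: "vps-8core-64gb", amount: 300 }, { type: "storage", amount: 40 }],
  [{ type: "vps-8core-64gb", amount: 500 }]
);
console.log(result.due); // 40
```

The customer still has 200 of VPS credit sitting in the account, and still owes 40 for storage. With a single fungible balance, that storage charge would have silently eaten the discounted prepayment.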

A better model is to avoid using money, in the same way that if you mix inches and centimeters you’ll eventually end up in a bad place on Mars. The solution is to treat each individual charge as its own “currency”.

In other words, when computing the charges, we aren’t trying to find the cost of running a particular instance for the billing period. We are trying to find how many “cost units” we have for that time period.

Instead of getting a single number that we’ll charge the customer, we’ll obtain a detailed set of the charges in question. Not as money, but as cost units. Think about those in a similar way to currency. Note that all the units are multiples of 730 hours (number of hours per month, on average).


compute_charges(custId, start, end) => {
    custId: 'customers/3291-B',
    start: '2024-01-01', end: '2024-01-31',
    costs: [
      {type: '8Cores-64GB-hours',  qty: 2190},
      {type: '4Cores-32GB-hours',  qty:  730},
      {type: 'disk-5000-iops',     qty: 2920},
    ],
}

The next step after that is to get your allocated budget for the same billing period, which will look something like this:


compute_budget(custId, start, end) => {
    custId: 'customers/3291-B',
    start: '2024-01-01', end: '2024-01-31',
    commitments: [
      {type: '8Cores-64GB-hours',  qty: 2190},
      {type: '4Cores-32GB-hours',  qty: 1460},
      {type: 'disk-5000-iops',     qty: 730},
    ],
}

In other words, just as we compute the charges based on the actual usage for that billing period, we apply the same approach on the commitments we have. The next stage is to net the two against each other, subtracting the committed quantity from the used quantity for each type. In this case, we’ll end up with the following:

  • 8Cores-64GB-hours ⇒ 0 (we used as much as we committed to)
  • 4Cores-32GB-hours ⇒ -730 (we committed to more than we used)
  • disk-5000-iops ⇒ 2190 (remaining use after applying commitment, priced as you go)
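The netting step is simple enough to sketch directly, using the same numbers as the two outputs above. The `netUnits` name is mine; the rule is usage minus commitment, per cost unit type.

```javascript
// Net each cost unit: usage minus commitment, per type.
function netUnits(costs, commitments) {
  const net = new Map();
  for (const { type, qty } of costs) net.set(type, (net.get(type) ?? 0) + qty);
  for (const { type, qty } of commitments) net.set(type, (net.get(type) ?? 0) - qty);
  return Object.fromEntries(net);
}

const costs = [
  { type: "8Cores-64GB-hours", qty: 2190 },
  { type: "4Cores-32GB-hours", qty: 730 },
  { type: "disk-5000-iops", qty: 2920 },
];
const commitments = [
  { type: "8Cores-64GB-hours", qty: 2190 },
  { type: "4Cores-32GB-hours", qty: 1460 },
  { type: "disk-5000-iops", qty: 730 },
];

const net = netUnits(costs, commitments);
console.log(net["8Cores-64GB-hours"]); // 0
console.log(net["4Cores-32GB-hours"]); // -730
console.log(net["disk-5000-iops"]);    // 2190
```

Note that the units never mix. A surplus of VM hours does nothing for the disk overage, which is exactly the non-fungible behavior we are after.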

We aren’t done yet. After commitments, there are other plans that we may need to run. For example, we’ll provide you with some global discounts for VM rental (which doesn’t apply to disks, however). Working at the level of cost units (or colors, or currency, whatever term you like) allows us to apply those things in a very fine-grained manner. More importantly, the end result and all its intermediate steps are very clear. That is quite important when you look at a six-figure bill with hundreds of line items and you want to see whether the billing matches your contract or not.

As you can imagine, given the inherent complexity of the system, being able to clearly “show your work” is quite important. Especially when there is a misunderstanding or questions are being raised (and there will be).

What we have done now is compute the actual charges based on their type, but we need to convert that to real money. There are several steps along this process:

  1. We need to charge all the active commitments. Those may have been pre-paid (in which case there is no current charge), but they may have a (fixed) monthly cost that we need to add to the current invoice.
  2. We need to perform a “currency conversion” between the units we have and actual money. In the example above, we have a negative number of units (for 4Cores-32GB-hours), as we committed to more hours than we actually used. We are still being charged for this by applying the rate from the commitment.
  3. On the other hand, when we examine the disk costs, we used more than we committed to. Here we need to make a decision about what price we’ll charge the user. It can be the commitment price or the pay-as-you-go price. So even for the same currency we may have different rules.
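Here is a rough sketch of that conversion for a single cost unit type. Everything specific in it is an assumption on my part: the `priceLine` name, the rates (expressed in cents per unit to stay away from floating point rounding), and the policy switch for pricing overage. The quantities are the 4Cores-32GB-hours and disk-5000-iops numbers from the example above.

```javascript
// All amounts are in cents to avoid floating point rounding.
// `overagePolicy` picks which rate applies to usage beyond the commitment.
function priceLine({ used, committed, prepaid }, rate, overagePolicy = "payAsYouGo") {
  const overage = Math.max(0, used - committed);
  // The whole commitment is billed whether it was used or not,
  // unless it was already paid for upfront.
  const commitmentCharge = prepaid ? 0 : committed * rate.committed;
  const overageCharge = overage * rate[overagePolicy];
  return commitmentCharge + overageCharge;
}

// Made-up rates, in cents per unit.
const vmRate = { committed: 10, payAsYouGo: 15 };
const diskRate = { committed: 2, payAsYouGo: 3 };

// Committed to 1460 hours, used 730: still billed for all 1460.
console.log(priceLine({ used: 730, committed: 1460, prepaid: false }, vmRate)); // 14600

// Committed to 730 units, used 2920: the extra 2190 is priced as you go.
console.log(priceLine({ used: 2920, committed: 730, prepaid: false }, diskRate)); // 8030

// Same usage, but with the overage priced at the commitment rate.
console.log(priceLine({ used: 2920, committed: 730, prepaid: false }, diskRate, "committed")); // 5840
```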

After all of this is done, we are now left with a final number. The actual amount of money that we need to charge the customer. This is the point at which we check if the customer has any credit already paid in the system or if we need to make an actual charge. That aspect is complicated by whether you are charging a credit card (same for any other automatic billing option) or issuing an invoice to be paid manually.

For a manual invoice, you now have a whole other process. For example, you may offer discounts for the customer if they pay within 14 days versus the usual 30, or charge a fee for paying within 60 days, etc.

I’m not touching on collections or what to do when you fail to charge the customer. It is shockingly common to encounter payment failures, to the point where we never had a single payment run that didn’t include at least several such cases. The reasons range from the deal size being too big to a (temporary) lack of funds to suspicious-seeming activity. You need to be able to handle that as well. But those are topics for another post.

In this post, my aim was to discuss just the issue of the complexity of money in the cloud business. I find the model of treating the charges as separate “currencies” to be a nice one overall, but I would love to hear about other people’s experiences in this matter.
