Unsafe operations terminology and operational hazards

One of the features we are working on has the notion of a consensus cluster, as well as the ability to force a new cluster if a majority of the nodes in the cluster are down. The details aren’t important, but the first iteration of the UI went something like this:

[image: screenshot of the first-draft cluster UI]

Initialize new cluster is an unsafe operation: it turns the current node into a single node cluster (which obviously has its own majority). Take over a node forces a node that is part of an existing cluster to join the current cluster, bypassing the usual safety measures. The Leave cluster command is for the usual case, when you want to safely remove a node from the cluster.
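To make the difference between the safe and unsafe commands concrete, here is a rough sketch of the three operations. It is only a toy model, not our actual implementation: the `Cluster` class, the `up` set, and every function name here are made up for illustration, and I'm assuming a plain "more than half of the members" majority rule.

```python
from dataclasses import dataclass, field
from typing import Set


def majority(size: int) -> int:
    """Smallest number of nodes that is more than half of `size`."""
    return size // 2 + 1


@dataclass
class Cluster:
    # Hypothetical model: `members` are the voting nodes,
    # `up` are the members we can currently reach.
    members: Set[str] = field(default_factory=set)
    up: Set[str] = field(default_factory=set)

    def has_quorum(self) -> bool:
        return len(self.up & self.members) >= majority(len(self.members))

    def leave(self, node: str) -> None:
        """Safe removal: refused unless a majority is available to agree."""
        if not self.has_quorum():
            raise RuntimeError("no majority available; cannot safely remove a node")
        self.members.discard(node)
        self.up.discard(node)


def initialize_new_cluster(node: str) -> Cluster:
    """Unsafe: the node walks away from its old cluster and becomes a cluster of one."""
    return Cluster(members={node}, up={node})


def take_over(target: Cluster, node: str) -> None:
    """Unsafe: pull `node` into `target` without the consent of its old cluster."""
    target.members.add(node)
    target.up.add(node)


# A three node cluster where only A is still reachable has lost its majority.
old = Cluster(members={"A", "B", "C"}, up={"A"})
# A on its own is a cluster of one, which trivially has a majority.
new = initialize_new_cluster("A")
# Force B into the new cluster, skipping the usual checks.
take_over(new, "B")
```

Note that nothing in the two unsafe functions checks for a majority. That missing check is exactly what makes them unsafe, and it is why they shouldn't look like everyday commands.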

We had a few problems with this UI (note that it was there simply to make it easy to test the behavior of the system, so don’t get too hung up on the first draft).

One problem we had is that these commands are shown front and center. They aren’t operations that we want to make easy for the admin to run accidentally (maybe just through exploring the interface).

That is easy to fix: just drop them into an “Advanced” section, right? But I also had an issue with the terminology. It is too... bland.

Instead, we are going to rename the buttons as follows:

  • Go AWOL from cluster – step down into a single node cluster.
  • Kidnap node into cluster – force a node into the current cluster.

The idea with this terminology is that it is obvious (hopefully) that those aren’t standard operations, and that you should consider them carefully.

I’m not sure about Go AWOL, because that might be a very US-centric term. Other options we are considering:

  • Abrogate cluster
  • Repudiate cluster

The same logic applies to those.

Thoughts?