An algorithm for generating compact UUIDs efficiently.
Snowflake itself is an algorithm designed at Twitter (https://github.com/twitter/snowflake),
which constructs a 64-bit UUID using:
- 41-bit timestamp--allows ~ 69 years of unique Milliseconds before repeating
- 10-bit "configured machine id" --allows 1024 machines to generate ids concurrently
- 12-bit counter -- allows 4096 UUIDs to be generated every millisecond
This organization is roughly time-ordered, which was a requirement for Twitter's use case, but
is awkward for SpliceMachine (since random UUIDs would be preferable). Additionally, 1024
machines is close, but may in some (highly special) cases be insufficient.
Thus, this is an adjusted Snowflake algorithm, using similar components, but in different order
(to ensure randomness), and with slightly adjusted sizes.
First, one notices that 41+10+12 = 63 bits. When considering the actual implementation
of Snowflake, one notices that the sign bit on the long is not used. If that bit is used,
an additional bit is available for use. Adding that to the machine id allows for 12 bits,
which allows 4096 machines to concurrently run.
Finally, to remove the sort order, placing the counter before the timer will essentially
bucket UUIDs into 4096 separate buckets, which gives a much faster roll-over rate on
UUIDs, causing order to be removed. In summary the structure used here is:
- 11-bit: counter
- 41-bit: timestamp
- 12-bit: machineId