Sorry, the UUID should have been changed long ago!

ULID: Universally Unique Lexicographically Sortable Identifier

UUID: Universally Unique Identifier

Why not choose UUID?

UUID currently has 5 versions:

Version 1: Impractical in many environments because it requires access to a unique, stable MAC address; exposing that address also makes the IDs easier to attack;

Version 2: Replaces the first four bytes of the version 1 timestamp with a POSIX UID or GID, so it has the same problems as version 1;

Version 3: Based on the MD5 hash algorithm; it requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures;

Version 4: Based on random or pseudo-random number generation; it carries no information other than randomness;

Version 5: Generated with the SHA-1 hash algorithm; like version 3, it requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures.

The most commonly used version is UUID4, but even though it is random, there is still a risk of collision.

UUIDs are based either on random numbers or on timestamps, whereas a ULID is based on both. The timestamp has millisecond precision, and there are 1.21e+24 possible random values within each millisecond, so a collision is extremely unlikely. The string form is also friendlier than a UUID's.
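To put the 1.21e+24 figure in context, the following is a rough sketch of the arithmetic, not part of any ULID library. It assumes the 80 random bits are uniformly distributed and uses the standard birthday-bound approximation; the function name collision_probability is made up for illustration.

```python
import math

RANDOM_BITS = 80
space = 2 ** RANDOM_BITS          # distinct random components per millisecond
print(f"{space:.3e}")             # 1.209e+24


def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Birthday-bound approximation: P ~ 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-n * (n - 1) / 2 ** (bits + 1))


# Chance that at least two of one million ULIDs created in the
# same millisecond share the same random component.
print(collision_probability(1_000_000))
```

The probability is tiny but not zero, which is why "extremely unlikely" is a safer claim than "impossible".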

ULID Properties:

ulid() # 01ARZ3NDEKTSV4RRFFQ69G5FAV
  • 128-bit compatibility with UUIDs
  • 1.21e+24 unique ULIDs per millisecond
  • Lexicographically (i.e. alphabetically) sortable
  • Canonically encoded as a 26-character string, instead of the 36 characters of a UUID
  • Uses Crockford’s Base32 for better efficiency and readability (5 bits per character)
  • Case insensitive
  • No special characters (URL safe)
  • Monotonic sort order (correctly detects and handles identical milliseconds)

ULID Specification
The following is the current ULID specification, which the Python library ulid-py implements, including the binary format.

 01AN4Z07BY      79KA1307SR9X4MV3

|----------|    |----------------|
 Timestamp          Randomness
  10chars            16chars
   48bits             80bits

Composition

Timestamp

  • 48-bit integer
  • UNIX time in milliseconds
  • Space will not be exhausted until AD 10889.
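The year 10889 can be checked with a little arithmetic. This sketch assumes a mean Gregorian year of 365.2425 days; datetime is not used because Python's datetime stops at year 9999.

```python
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365.2425   # mean Gregorian year in milliseconds
max_timestamp_ms = 2 ** 48 - 1                 # largest value a 48-bit field can hold

last_year = int(1970 + max_timestamp_ms / MS_PER_YEAR)
print(last_year)  # 10889
```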

Randomness

  • 80 bit random number
  • Where possible, use a cryptographically secure source of randomness

Sort

The leftmost character must be sorted first and the rightmost character last (lexical order). The default ASCII character set must be used. Within the same millisecond, sort order is not guaranteed.
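When ordering within a millisecond does matter, the ULID specification describes monotonic generation: if a new ID is requested in the same millisecond as the previous one, the previous random component is incremented by 1 instead of being drawn again. Below is a minimal sketch of that idea using only the standard library. The class name MonotonicULID and its clock parameter are hypothetical, the IDs are returned as 128-bit integers, and reusing the last timestamp when the clock moves backwards is an assumption of this sketch, not something the source prescribes.

```python
import os
import time


class MonotonicULID:
    """Sketch of monotonic ULID generation; returns 128-bit integers."""

    def __init__(self, clock=lambda: int(time.time() * 1000)):
        self._clock = clock       # returns UNIX time in milliseconds
        self._last_ms = -1
        self._last_rand = 0

    def next(self) -> int:
        now = self._clock()
        if now <= self._last_ms:
            # Same millisecond (or clock went backwards): bump the random part.
            self._last_rand += 1
            if self._last_rand >= 2 ** 80:
                raise OverflowError("random component overflowed in this millisecond")
        else:
            self._last_ms = now
            self._last_rand = int.from_bytes(os.urandom(10), "big")
        # 48-bit timestamp in the high bits, 80-bit randomness in the low bits.
        return (self._last_ms << 80) | self._last_rand


gen = MonotonicULID()
first, second = gen.next(), gen.next()
print(first < second)  # True
```

The trade-off is that IDs created in the same millisecond become predictable from one another, so this suits ordering needs rather than unguessable tokens.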

Encoding method

Crockford’s Base32 is used, with the alphabet shown below. It excludes the letters I, L, O, and U to avoid confusion and misuse.

0123456789ABCDEFGHJKMNPQRSTVWXYZ
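As an illustration of how this alphabet maps to and from a 128-bit value, here is a minimal sketch written for this article (it is not the ulid-py implementation). The function names encode and decode are made up, and the decoder does not handle Crockford's optional aliases such as treating O as 0.

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode(value: int, length: int = 26) -> str:
    """Encode a non-negative integer as a fixed-width Crockford Base32 string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    chars = []
    for _ in range(length):
        value, remainder = divmod(value, 32)   # peel off 5 bits at a time
        chars.append(ALPHABET[remainder])
    if value:
        raise ValueError("value does not fit in the requested length")
    return "".join(reversed(chars))


def decode(text: str) -> int:
    """Decode a Crockford Base32 string (case insensitive) into an integer."""
    value = 0
    for ch in text.upper():
        value = value * 32 + ALPHABET.index(ch)  # ValueError on invalid characters
    return value


# The alphabet is in ascending ASCII order, so comparing fixed-width
# encoded strings gives the same result as comparing the integers.
print(list(ALPHABET) == sorted(ALPHABET))  # True
print(encode(2 ** 128 - 1))                # 7ZZZZZZZZZZZZZZZZZZZZZZZZZ
```

Because 26 characters carry 130 bits, the first character of a valid ULID never exceeds 7.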

Binary layout and byte order

Components are encoded into 16 octets. Each component is encoded with the most significant byte first (network byte order).

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     32_bit_uint_time_high                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     16_bit_uint_time_low      |      16_bit_uint_random       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      32_bit_uint_random                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      32_bit_uint_random                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
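The layout above amounts to 6 big-endian timestamp bytes followed by 10 random bytes. A minimal standard-library sketch of that packing follows; the helper name ulid_bytes is hypothetical and the code is an illustration, not the ulid-py implementation.

```python
import os
import time


def ulid_bytes(timestamp_ms: int, randomness: bytes) -> bytes:
    """Pack a 48-bit timestamp and 80 bits of randomness into 16 octets."""
    if not 0 <= timestamp_ms < 2 ** 48:
        raise ValueError("timestamp must fit in 48 bits")
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes")
    # Most significant byte first (network byte order).
    return timestamp_ms.to_bytes(6, "big") + randomness


now_ms = int(time.time() * 1000)
raw = ulid_bytes(now_ms, os.urandom(10))
print(len(raw))                                   # 16
print(int.from_bytes(raw[:6], "big") == now_ms)   # True
```

Because the timestamp comes first and is big-endian, comparing the raw bytes orders ULIDs by creation time, just as comparing their string forms does.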

Application scenarios

  • Replace database auto-increment IDs, so the database no longer has to take part in primary key generation
  • Replace UUIDs in a distributed environment; ULIDs are globally unique and ordered with millisecond precision
  • If you partition the database by date, use the timestamp embedded in the ULID to select the correct partition and table
  • If millisecond precision is acceptable (there is no ordering within a millisecond), sort by ULID instead of a separate created_at field
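The partitioning scenario can be sketched with the standard library alone by decoding the first 10 characters of the ULID string. The function names and the events_YYYY_MM naming scheme below are hypothetical, chosen only for illustration, and the month is computed in UTC.

```python
from datetime import datetime, timezone

ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # Crockford's Base32


def ulid_timestamp_ms(ulid_str: str) -> int:
    """Decode the first 10 characters (the 48-bit timestamp) of a ULID string."""
    if len(ulid_str) != 26:
        raise ValueError("a ULID string has exactly 26 characters")
    value = 0
    for ch in ulid_str[:10].upper():
        value = value * 32 + ALPHABET.index(ch)
    return value


def partition_for(ulid_str: str, prefix: str = "events") -> str:
    """Hypothetical scheme: one table per UTC month, e.g. events_2016_07."""
    created = datetime.fromtimestamp(ulid_timestamp_ms(ulid_str) / 1000, tz=timezone.utc)
    return f"{prefix}_{created:%Y_%m}"


print(partition_for("01ARZ3NDEKTSV4RRFFQ69G5FAV"))  # events_2016_07
```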

Usage (Python)

Installation

pip install ulid-py

Create a brand new ULID.

The timestamp value (48 bits) comes from time.time() with millisecond precision.

The random value (80 bits) comes from os.urandom().

>>> import ulid
>>> ulid.new()
<ULID('01BJQE4QTHMFP0S5J153XCFSP9')>

Create a new ULID based on an existing 128-bit value (e.g. UUID). Supported ULID value types are int, bytes, str, and UUID.

>>> import ulid, uuid
>>> value = uuid.uuid4()
>>> value
UUID('0983d0a2-ff15-4d83-8f37-7dd945b5aa39')
>>> ulid.from_uuid(value)
<ULID('09GF8A5ZRN9P1RYDVXV52VBAHS')>

Create a new ULID from an existing timestamp value (such as a datetime object). Supported timestamp value types are int, float, str, bytes, bytearray, memoryview, datetime, Timestamp, and ULID.

>>> import datetime, ulid
>>> ulid.from_timestamp(datetime.datetime(1999, 1, 1))
<ULID('00TM9HX0008S220A3PWSFVNFEH')>

Create a new ULID based on an existing random number.

Supported random value types are int, float, str, bytes, bytearray, memoryview, Randomness, and ULID.

>>> import os, ulid
>>> randomness = os.urandom(10)
>>> ulid.from_randomness(randomness)
<ULID('01BJQHX2XEDK0VN0GMYWT9JN8S')>

Once you have a ULID object, there are multiple ways to interact with it.

The timestamp() method will give you a timestamp snapshot of the first 48 bits of the ULID, while the randomness() method will give you a random number snapshot of the last 80 bits.

>>> import ulid
>>> u = ulid.new()
>>> u
<ULID('01BJQM7SC7D5VVTG3J68ABFQ3N')>
>>> u.timestamp()
<Timestamp('01BJQM7SC7')>
>>> u.randomness()
<Randomness('D5VVTG3J68ABFQ3N')>

GitHub: https://github.com/ahawker/ulid
syntaxbug.com © 2021 All Rights Reserved.