Most compact way to store UUIDs in sqlite

‹ Version in which .vagrant changed from a file to a directory | Vagrant setup for Drupal on Ubuntu ›

If you want to store a UUID in sqlite and use the minimal possible amount of disk space, it seems like the best solution may be to represent each UUID as a 16 byte binary string and store it in a BLOB column.  I tested inserting 100,000 UUIDs into a table using three strategies with Sqlite 3.7.9, and the resulting file sizes were:

  • 2447360 bytes (46% smaller than canonical form): One BLOB column to store the 16-byte binary string
  • 2568192 bytes (43% smaller than canonical form): Two INTEGER columns to store the most significant and least significant 64 bits in the UUID, respectively
  • 4112384 bytes (9% smaller than canonical form): One TEXT or BLOB column storing the minimal hex string (e.g. ca7b9ad52afa4c78bad98d0d751fee33)
  • 4498432 bytes: One TEXT column storing the canonical form (e.g. ca7b9ad5-2afa-4c78-bad9-8d0d751fee33)

I used code similar to the following:

import os
import os.path
import sqlite3
import uuid

COUNT = 100000

database = 'testinteger.sqlite'
if os.path.exists(database):
   os.remove(database)
conn = sqlite3.connect(database)
c = conn.cursor()
c.execute("CREATE TABLE integers (msl integer, lsl integer)")

def signed(iv):
   if(iv & 0x8000000000000000):
      iv = -0x10000000000000000 + iv
   assert int(iv) == iv
   return int(iv)

for i in xrange(COUNT):
   u = uuid.uuid4()
   msl = signed(u.int >> 64)
   lsl = signed(u.int &  (1<<64) - 1)
   c.execute('INSERT INTO integers (msl, lsl) VALUES (?,?)', (msl, lsl))

conn.commit()
conn.close()

database = 'testblob.sqlite'
if os.path.exists(database):
   os.remove(database)
conn = sqlite3.connect(database)
conn.text_factory = str
c = conn.cursor()
c.execute("CREATE TABLE blob (uuid blob)")

for i in xrange(COUNT):
   u = uuid.uuid4()
   c.execute('INSERT INTO blob (uuid) VALUES (?)', (u.bytes,))

conn.commit()
conn.close()

database = 'teststring.sqlite'
if os.path.exists(database):
   os.remove(database)
conn = sqlite3.connect(database)
c = conn.cursor()
c.execute("CREATE TABLE string (uuid TEXT)")

for i in xrange(COUNT):
   u = uuid.uuid4()
   c.execute('INSERT INTO string (uuid) VALUES (?)', (unicode(u),))

conn.commit()
conn.close()

Subscribe to All Posts - Wesley Tanaka