UUIDv7 vs UUIDv4 as primary keys

By Sundeep Chand

📅

10 Jan 2026

Intro

I recently came across a few blog posts around how UUIDv4s (Universally Unique Identifier) are suboptimal to be used as primary keys in relational databases. This idea has been out there on the internet since more than 2 years now. The development of UUIDv7 is based on the concepts of Shopify's ULID (Universally Unique Lexicographically Sortable Identifier). While both UUIDv4 & UUIDv7 are 128-bits long, UUIDv7 encodes a timestamp (with ms precision) in the most significant 48-bits. Hence, when generating IDs with UUIDv7, the IDs will be a monotonically increasing sequence. Whereas UUIDv4 is purely random. Refer to the images below for a high-level structure of these IDs:

In this blog post I briefly note down my learnings on database internals while trying to understand the subject. In the next few sections, I'll walk you through the internals of MySQL and how switching to UUIDv7 from UUIDv4 helps improve index performance with certain benchmark numbers.

A quick primer on how MySQL stores data in disk

All databases read & write to disk in units known as pages (different from OS pages). This is because disk I/O is significantly costly & hence it is more efficient to read colocated groups of data than performing random reads/writes for individual rows. These pages are then cached in the DB memory (known as the 'buffer pool' in MySQL) to respond to subsequent queries.

MySQL's InnoDB storage engine stores the tables as B+ Trees in the disk, where the primary key column values form the keys of the B+ Tree nodes. This is known as the primary index, which is used to efficiently search for records based on primary key. The storage engine is designed in such a way that the a single page (16kb in InnoDB) can accommodate exactly 1 B+ Tree node. The figures below show a rough B+ Tree structure that MySQL will store in the disk for the given table.

The B+ Tree structure for the given table. For this ex, each of the nodes contain only 2 records, but in real world scenario, each of the nodes will contain as many records as a single page can accommodate (i.e. page_size / record_size).

In MySQL the leaf nodes of the primary index's B-Tree contain the actual data rows. This kind of index is known as a clustered index, where the physical layout of data on the disk is ordered based on the primary key. This makes data access faster with primary keys because additional page lookup is not needed, but, at the same time it has other trade offs like costlier B+ Tree rebalances during page splits (because of shifting of actual row entries), 2 B+ Tree traversals in case of reading by secondary keys (See fig below), etc.

The B+ Tree structure for a secondary index created on name column. Note that the data nodes contain the primary key, which is then used to perform another search on the primary index to retrieve the entire row

Benchmarks

For benchmarking the performance difference between UUIDv4 & UUIDv7, I wrote a Java program to insert 5 million records to my local MySQL DB in batches. Performing the inserts in batches, helps us benchmark the effect of B+ Trees and minimize the effect of network calls in the reported time. The table schema is as follows, where we keep only 2 columns: id of BINARY type to hold raw bytes of UUID string & payload of type VARCHAR to hold arbitary payload:

  CREATE TABLE uuid_bench
 (
   id BINARY(16) PRIMARY KEY,
   payload VARCHAR(100)
 ) ENGINE=InnoDB

The reason for preferring BINARY over VARCHAR for UUIDs is that the former is much more space efficient as compared to the later. The program is presented in the following section & I discuss the results in the later section.

Dependencies (pom.xml)

  <dependencies>
 	<dependency>
 		<groupId>com.mysql</groupId>
 		<artifactId>mysql-connector-j</artifactId>
 		<version>9.5.0</version>
 	</dependency>

 	<!--Dependency for generating UUIDv7-->
 	<dependency>
 		<groupId>com.fasterxml.uuid</groupId>
 		<artifactId>java-uuid-generator</artifactId>
 		<version>5.1.0</version>
 	</dependency>
 </dependencies>

Code (Main.java)

  package com.sundeepchand;
 
 import com.fasterxml.uuid.Generators;
 
 import java.sql.*;
 import java.util.List;
 import java.util.UUID;
 
 class Row {
   public byte[] id;
   public String payload;
 }
 
 @FunctionalInterface
 interface UuidGenerator {
   UUID generateUuid();
 }
 
 public class Main {
   private static final int ROWS = 5_000_000;
   private static final int BATCH_SIZE = 10_000;
 
   private static final String JDBC_URL =
       "jdbc:mysql://localhost:3306/uuid_benchmark?useSSL=true&rewriteBatchedStatements=true";
   private static final String USER = "root";
   private static final String PASS = "my-secret-pw";
 
   public static void main(String[] args) {
     try (Connection conn = DriverManager.getConnection(JDBC_URL, USER, PASS)) {
 
       conn.setAutoCommit(false);
 
       createTable(conn, "uuid_v4_bench");
       benchmark(
         conn,
         "UUIDv4",
         "uuid_v4_bench",
         generateDataset(() -> UUID.randomUUID())
       );
 
       createTable(conn, "uuid_v7_bench");
       benchmark(
         conn,
         "UUIDv7",
         "uuid_v7_bench",
         generateDataset(() -> Generators.timeBasedEpochGenerator().generate())
       );
 
     } catch (Exception e) {
       e.printStackTrace();
     }
   }
 
   private static void benchmark(Connection conn, String name, String tableName, List<Row[]> dataset)
       throws SQLException {
 
     long start = System.nanoTime();
 
     try (PreparedStatement ps =
            conn.prepareStatement("INSERT INTO " + tableName + " (id, payload) VALUES (?, ?)")) {
 
       for (Row[] batch : dataset) {
         for (Row row : batch) {
           ps.setBytes(1, row.id);
           ps.setString(2, row.payload);
           ps.addBatch();
         }
         ps.executeBatch();
         conn.commit();
       }
     }
 
     long end = System.nanoTime();
     double seconds = (end - start) / 1e9;
 
     System.out.printf(
         "%s: %,d inserts in %.2f s (%,.0f inserts/s)%n",
         name, ROWS, seconds, ROWS / seconds
     );
   }
 
   private static void createTable(Connection conn, String tableName) throws SQLException {
     try (Statement st = conn.createStatement()) {
       st.execute("DROP TABLE IF EXISTS " + tableName);
       st.execute("CREATE TABLE " + tableName + """
         (
          id BINARY(16) PRIMARY KEY,
          payload VARCHAR(100)
         ) ENGINE=InnoDB""");
     }
     conn.commit();
   }
 
   private static List<Row[]> generateDataset(UuidGenerator uuidGenerator) {
     List<Row[]> dataset = new java.util.ArrayList<>();
 
     for (int i = 0; i < ROWS;) {
       Row[] batch = new Row[BATCH_SIZE];
       for (int j = 0; j < BATCH_SIZE; j++) {
         Row row = new Row();
         row.id = convertUuidToBytes(uuidGenerator.generateUuid());
         row.payload = "Payload for row " + (i + j);
         batch[j] = row;
       }
       dataset.add(batch);
       i += BATCH_SIZE;
     }
     return dataset;
   }
 
   public static byte[] convertUuidToBytes(UUID uuid) {
     byte[] bytes = new byte[16];
     long mostSigBits = uuid.getMostSignificantBits();
     long leastSigBits = uuid.getLeastSignificantBits();
     for (int i = 0; i < 8; i++) {
       bytes[i] = (byte) ((mostSigBits >> (8 * (7 - i))) & 0xFF);
     }
     for (int i = 8; i < 16; i++) {
       bytes[i] = (byte) ((leastSigBits >> (8 * (15 - i))) & 0xFF);
     }
     return bytes;
   }
 }

Benchmark results

The output after running the program is as follows:

  UUIDv4: 5,000,000 inserts in 217.83 s (22,954 inserts/s)
 UUIDv7: 5,000,000 inserts in 28.58 s (174,939 inserts/s)

For 5,000,000 records with batch size 10,000, UUIDv7 is ~87% faster than UUIDv4

  UUIDv4: 5,000,000 inserts in 290.90 s (17,188 inserts/s)
 UUIDv7: 5,000,000 inserts in 55.84 s (89,538 inserts/s)

For 5,000,000 records with batch size 1,000, UUIDv7 is ~81% faster than UUIDv4

This clearly shows that batch inserts of UUIDv7 is ~87% faster than UUIDv4s. As I tried to reduce the batch-size, the performance difference between the two, reduced slightly. This is because the IDs in a batch of UUIDv7 are sequential and thus lead to predictable order of insertions in the B+ Tree pages. It is also able to take better advantage of the cached pages, as the page that is being written would have already been cached by a previous write. Whereas in the case of UUIDv4, the IDs in a batch are completely random and thus leads to reading of multiple different pages & the cache usage isn't that optimal.

Since the data is placed sequentially on the disk based on primary key, when using UUIDv7, range queries are more performant as compared to UUIDv4. The performance difference between the two IDs is very clear for the kind of queries where we want to fetch the last few records from the DB as can be seen in the EXPLAIN ANALYZE result below (as can be seen, the time difference between UUIDv7 & UUIDv4 is quite significant):

  mysql> EXPLAIN ANALYZE SELECT * FROM uuid_v4_bench ORDER BY id DESC LIMIT 100;
 +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 | EXPLAIN                                                                                                                                                                                                          |
 +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   -> Limit: 100 row(s)  (cost=0.574 rows=100) (actual time=10.1..10.1 rows=100 loops=1)
     -> Index scan on uuid_v4_bench using PRIMARY (reverse)  (cost=0.574 rows=100) (actual time=10..10.1 rows=100 loops=1)
 +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 1 row in set (0.012 sec)
 
 mysql> EXPLAIN ANALYZE SELECT * FROM uuid_v7_bench ORDER BY id DESC LIMIT 100;
 +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 | EXPLAIN                                                                                                                                                                                                                |
 +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   -> Limit: 100 row(s)  (cost=0.489 rows=100) (actual time=0.394..0.465 rows=100 loops=1)
     -> Index scan on uuid_v7_bench using PRIMARY (reverse)  (cost=0.489 rows=100) (actual time=0.377..0.441 rows=100 loops=1)
 +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 1 row in set (0.007 sec)

The random order of writes in case of UUIDv4 causes random page splits (& consequently a rebalancing of the node entries) thus contributing to somewhat more number of Disk I/O operations per write. This also leads to the B+ Tree nodes in the indexes being filled non-uniformly as can be seen in the primary index size differences between the two (UUIDv4 index is ~36% bigger than UUIDv7). This is because with sequential writes the B+ Tree nodes are tightly packed as the writes always happen in the rightmost node. Random writes with UUIDv4 on the other hand leads to a sparse packing of entries in the B+ Tree nodes, which causes the data to be fragmented causing the index size to be larger:

  mysql> SELECT TABLE_NAME, ROUND(DATA_LENGTH / 1024 / 1024, 2)
 -> AS `Primary_Index_Size_MB` FROM information_schema.TABLES
 -> WHERE TABLE_SCHEMA = 'uuid_benchmark';
 +---------------+-----------------------+
 | TABLE_NAME    | Primary_Index_Size_MB |
 +---------------+-----------------------+
 | uuid_v4_bench |                140.00 |
 | uuid_v7_bench |                 88.78 |
 +---------------+-----------------------+
 2 rows in set (0.033 sec)

Conclusion

So based on the results above, I will say that, UUIDv7 has certain advantages over purely random UUIDv4, because of better page cache usage, with the trade-off that sequential IDs may leak certain information about the dataset to unintended parties. But like any other solution, if used at the right place it can give significant performance benefits. By supporting sequential generation of IDs & having randomness of UUIDs it offers a great trade-off for cases where range scans are frequent. It is a solid replacement for UUIDv4 in cases where the IDs don't have to be purely random (due to privacy concerns, refer to this Reddit thread for more info). Like UUIDv4 it is also great replacement for Auto-Increment IDs which have their own challenges when used in Multi-Primary replicated clusters. All-in-all its a great middle ground between sequential Auto-Increment IDs & purely random UUIDs.