Common questions about how UUIDs work, which version to use, and how to generate them in JavaScript, Python, and other languages
What is a UUID and what is it used for?
A UUID (Universally Unique Identifier), also called a GUID (Globally Unique Identifier) on Windows platforms, is a 128-bit identifier originally standardised in RFC 4122 and now specified by RFC 9562, which obsoletes it. It is written as 32 hexadecimal digits in five hyphen-separated groups of 8, 4, 4, 4, and 12 digits, for example 550e8400-e29b-41d4-a716-446655440000. UUIDs are designed to be globally unique without requiring a central registry or coordination between systems. They are used as primary keys in databases, identifiers for distributed system resources, session tokens in web applications, file and object identifiers in storage systems, node IDs in microservices, and unique identifiers in logging and event tracking systems.
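As a minimal sketch, Python's standard uuid module can parse the example string above and expose the structure just described. The helper name describe is hypothetical and exists only for illustration.

```python
import uuid

def describe(text: str) -> dict:
    """Parse a UUID string and report its basic structure."""
    u = uuid.UUID(text)  # raises ValueError if the text is not a valid UUID
    return {
        "canonical": str(u),                             # lowercase hyphenated form
        "groups": [len(g) for g in str(u).split("-")],   # digits per group
        "bits": len(u.bytes) * 8,                        # 16 bytes = 128 bits
        "version": u.version,
        "variant": u.variant,
    }

info = describe("550e8400-e29b-41d4-a716-446655440000")
print(info)
```

The version digit is the first character of the third group, so the example value is a version 4 UUID.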
What is the difference between UUID v4 and UUID v5?
UUID v4 is generated from random data, ideally from a cryptographically secure source -- it carries 122 bits of randomness (the remaining 6 bits encode the version and variant), making every generated UUID effectively unique. UUID v5 is deterministic -- it takes a namespace UUID and a name string, concatenates the 16 bytes of the namespace with the bytes of the name, and hashes the result with SHA-1. The 160-bit hash is truncated to 128 bits, and the version and variant bits are then overwritten to form the UUID. The critical difference is reproducibility: calling uuidV4() twice produces two different UUIDs, but calling uuidV5 with the same namespace and name always produces the same UUID. Use v4 when you need a unique identifier for a new resource. Use v5 when you need a stable, reproducible identifier that can be independently regenerated from known inputs.
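The v5 steps can be sketched by hand in Python and compared with the standard library's uuid.uuid5(). The function name uuid5_by_hand is hypothetical, and the sketch assumes the name is encoded as UTF-8, which is what Python's own implementation does for string names.

```python
import hashlib
import uuid

def uuid5_by_hand(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Illustrative re-implementation of the v5 algorithm."""
    digest = hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()
    raw = bytearray(digest[:16])        # SHA-1 yields 20 bytes; keep the first 16
    raw[6] = (raw[6] & 0x0F) | 0x50     # version 5 in the high nibble of byte 6
    raw[8] = (raw[8] & 0x3F) | 0x80     # variant bits 10 at the top of byte 8
    return uuid.UUID(bytes=bytes(raw))

by_hand = uuid5_by_hand(uuid.NAMESPACE_DNS, "example.com")
builtin = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
random_a, random_b = uuid.uuid4(), uuid.uuid4()

print(by_hand == builtin)    # deterministic: same inputs, same UUID
print(random_a == random_b)  # random: two calls, two different UUIDs
```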
What are UUID v5 namespaces and which one should I use?
UUID v5 namespaces are UUIDs that scope the names being hashed. They prevent collisions between different types of names -- the same name string in different namespaces produces different UUIDs. RFC 4122 predefines four well-known namespaces: DNS (6ba7b810-...) for domain names and hostnames, URL (6ba7b811-...) for full URL strings, OID (6ba7b812-...) for ISO Object Identifiers, and X500 (6ba7b814-...) for X.500 distinguished names. Use the DNS namespace if your names are domain names, or the URL namespace if your names are URLs. If you need a completely custom namespace for your application, generate a random UUID v4 once, record it as a constant, and use that as your namespace UUID from then on.
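A short Python sketch of both points: the same name hashed under two standard namespaces, and a custom application namespace. The APP_NAMESPACE value and the name "order-1001" are made-up placeholders; in a real application you would generate the namespace once with uuid.uuid4() and hard-code the result.

```python
import uuid

name = "example.com"
in_dns = uuid.uuid5(uuid.NAMESPACE_DNS, name)
in_url = uuid.uuid5(uuid.NAMESPACE_URL, name)
print(in_dns != in_url)  # same name, different namespace, different UUID

# Hypothetical application namespace: a v4 UUID generated once and then
# kept as a constant so that derived IDs stay reproducible.
APP_NAMESPACE = uuid.UUID("3f2b8c1e-9d4a-4e6f-8b7a-1c2d3e4f5a6b")

order_id = uuid.uuid5(APP_NAMESPACE, "order-1001")
print(order_id)
```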
Are UUID v4 values truly unique -- can they collide?
In practice, UUID v4 values are effectively unique for all real-world use cases. A UUID v4 contains 122 random bits. The probability that two independently generated UUID v4 values are identical is approximately 1 in 5.3 x 10^36 (1 in 2^122) -- an incomprehensibly small chance. To put it in context: to reach even a 50% probability of a single collision anywhere in a set, you would need to generate approximately 2.7 x 10^18 (2.7 quintillion) UUIDs. At a rate of 1 billion UUIDs per second, that would take about 86 years. These figures assume a good random source, so generation quality depends on the random number generator -- this tool uses crypto.randomUUID() (or crypto.getRandomValues() as a fallback), which draws on the operating system's cryptographically secure PRNG.
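These figures follow from the standard birthday approximation, p ≈ 1 - exp(-n^2 / 2N) with N = 2^122. A small Python sketch reproduces them; the function names are hypothetical, and the approximation assumes uniformly random, independent UUIDs.

```python
import math

SPACE = 2 ** 122  # number of distinct v4 UUIDs (122 random bits)

def n_for_collision_probability(p: float) -> float:
    """Approximate number of UUIDs needed to reach collision probability p."""
    return math.sqrt(2 * SPACE * math.log(1 / (1 - p)))

def collision_probability(n: float) -> float:
    """Approximate probability of at least one collision among n UUIDs."""
    return -math.expm1(-n * n / (2 * SPACE))

n_half = n_for_collision_probability(0.5)
years = n_half / 1e9 / (365.25 * 24 * 3600)  # at one billion UUIDs per second

print(f"{SPACE:.2e} possible values")
print(f"{n_half:.2e} UUIDs for a 50% collision chance, about {years:.0f} years")
```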
How do I generate a UUID in JavaScript or TypeScript?
In modern JavaScript and TypeScript, the simplest approach is crypto.randomUUID(), which is available in all modern browsers (in secure contexts only, meaning HTTPS or localhost) and returns a string like '550e8400-e29b-41d4-a716-446655440000' directly. In Node.js the same function is exported by the built-in crypto module from version 14.17 onward: const { randomUUID } = require('crypto'); randomUUID(). For environments where crypto.randomUUID() is not available, you can use crypto.getRandomValues(new Uint8Array(16)) to get 16 random bytes, then set the version bits (high nibble of byte 6 to 4) and the variant bits (two high bits of byte 8 to binary 10), and format the bytes as a hyphenated hex string. For UUID v5 there is no native one-call API -- you need to compute the hash yourself with the WebCrypto crypto.subtle.digest('SHA-1', ...) API, or use a library like the uuid npm package.
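The byte-level fallback described above can be sketched as follows. The sketch is written in Python rather than JavaScript so that it runs with only the standard library; the same masks apply unchanged to a Uint8Array. The function name uuid4_from_bytes is hypothetical.

```python
import os
import uuid

def uuid4_from_bytes(random_bytes: bytes) -> str:
    """Turn 16 random bytes into a version 4 UUID string."""
    if len(random_bytes) != 16:
        raise ValueError("expected exactly 16 bytes")
    raw = bytearray(random_bytes)
    raw[6] = (raw[6] & 0x0F) | 0x40   # version: high nibble of byte 6 becomes 4
    raw[8] = (raw[8] & 0x3F) | 0x80   # variant: two high bits of byte 8 become 10
    h = raw.hex()
    return f"{h[0:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:32]}"

generated = uuid4_from_bytes(os.urandom(16))  # os.urandom is the OS secure source
print(generated)
```

Only 6 of the 128 bits are fixed, which is where the 122 random bits of a v4 UUID come from.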
How do I generate a UUID in Python?
Python's standard library includes a uuid module with direct support for UUID versions 1, 3, 4, and 5 (Python 3.14 adds versions 6, 7, and 8). For UUID v4 (random): import uuid; str(uuid.uuid4()) produces a lowercase hyphenated UUID string. For UUID v5 (deterministic): uuid.uuid5(uuid.NAMESPACE_DNS, 'example.com') takes a namespace constant and a name string. The built-in namespace constants are uuid.NAMESPACE_DNS, uuid.NAMESPACE_URL, uuid.NAMESPACE_OID, and uuid.NAMESPACE_X500 -- these correspond exactly to the namespaces in this tool. To remove hyphens, use uuid.uuid4().hex or str(uuid.uuid4()).replace('-', ''). To get uppercase, call str(uuid.uuid4()).upper(). UUID v4 in Python uses os.urandom() internally, which reads from the operating system's cryptographically secure random source.
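Put together, the calls mentioned above look like this. It is a minimal usage sketch; the variable names are arbitrary.

```python
import uuid

random_id = uuid.uuid4()                                    # v4: random
stable_id = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")   # v5: deterministic

hyphenated = str(random_id)       # lowercase, 36 characters
compact = random_id.hex           # 32 hex characters, no hyphens
upper = str(random_id).upper()    # uppercase, hyphens kept

print(hyphenated, compact, upper, stable_id, sep="\n")
```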
Should I use UUIDs as database primary keys?
UUIDs are widely used as database primary keys, particularly in distributed systems where rows are created across multiple nodes without a central auto-increment sequence. UUID v4 has excellent uniqueness properties, but its random nature causes index fragmentation in B-tree indexes (like those used by PostgreSQL and MySQL) at high insert rates, because each new row lands at a random position in the index rather than at the end. UUID v7 (standardised in RFC 9562, published in May 2024) addresses this by putting a sortable timestamp in the leading bits, so new rows are inserted at or near the end of the index, which substantially improves write performance. UUID v7 is not yet universally supported in ORMs and database drivers, but adoption is growing. For most applications with moderate write volumes, UUID v4 performs acceptably.
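To make the v7 layout concrete, here is a hand-rolled Python sketch: a 48-bit Unix millisecond timestamp, then the version and variant bits, with the rest random. The function name uuid7_sketch is hypothetical, and the sketch omits the optional monotonic counter, so IDs created within the same millisecond are not ordered relative to each other. Prefer a maintained implementation in production.

```python
import os
import time
import uuid
from typing import Optional

def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a version 7 UUID: 48-bit millisecond timestamp, then random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    raw = bytearray(unix_ms.to_bytes(6, "big") + os.urandom(10))
    raw[6] = (raw[6] & 0x0F) | 0x70   # version 7
    raw[8] = (raw[8] & 0x3F) | 0x80   # variant bits 10
    return uuid.UUID(bytes=bytes(raw))

earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
print(str(earlier) < str(later))  # later timestamps sort after earlier ones
```

Because the timestamp occupies the most significant bits, sorting by the string or binary form also sorts by creation time, which is what keeps B-tree inserts near the end of the index.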
What is the No Hyphens format and when is it used?
The no-hyphens format removes the four hyphen separators from a UUID, producing a 32-character hexadecimal string like 550e8400e29b41d4a716446655440000 instead of the standard 36-character 550e8400-e29b-41d4-a716-446655440000. This compact format is used where hyphens cause problems or waste space. Common use cases include: URL path segments where a shorter identifier is preferred, database storage in a CHAR(32) column instead of CHAR(36), API keys and token strings that users copy and paste, file names where hyphens are interpreted as word separators, and systems that store UUIDs as raw binary (16 bytes, for example in a BINARY(16) column) but display them as hex strings. Both formats represent the same identifier -- the hyphens carry no information and are purely cosmetic formatting.
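A brief Python sketch of converting between the three representations mentioned above (hyphenated text, compact hex, and raw bytes) and back again:

```python
import uuid

u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

hyphenated = str(u)   # 36 characters, suits CHAR(36)
compact = u.hex       # 32 characters, suits CHAR(32)
raw = u.bytes         # 16 bytes, suits a binary column

# All three forms parse back to the same identifier.
from_compact = uuid.UUID(compact)   # the constructor accepts hex without hyphens
from_raw = uuid.UUID(bytes=raw)

print(len(hyphenated), len(compact), len(raw))
print(from_compact == u == from_raw)
```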
What is the difference between UUID v1, v3, v4, v5, v6, and v7?
RFC 4122 defined five versions, numbered 1 to 5; RFC 9562, which replaced it in 2024, added versions 6, 7, and 8. UUID v1 combines a 60-bit timestamp with the MAC address of the generating machine -- it embeds its creation time but reveals the host's MAC address, a privacy concern. UUID v2 (DCE Security) is rarely used. UUID v3 is like v5 but uses MD5 instead of SHA-1 for hashing -- v5 is preferred because SHA-1 is stronger than MD5. UUID v4 is fully random with no timestamp component. UUID v5 is deterministic and SHA-1 based. UUID v6 reorders v1's timestamp fields so that values sort by creation time. UUID v7 uses a Unix millisecond timestamp prefix followed by random bits -- it is the recommended choice for database primary keys that benefit from time-ordered insertion. UUID v8 reserves space for custom, implementation-specific layouts. This tool generates v4 (random) and v5 (deterministic).
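The versions available in Python's uuid module can be compared directly. This sketch also recovers the creation time embedded in a v1 UUID; the names GREGORIAN_EPOCH and v1_timestamp are hypothetical, and the conversion relies on the v1 timestamp counting 100-nanosecond intervals since 15 October 1582.

```python
import datetime
import uuid

GREGORIAN_EPOCH = datetime.datetime(1582, 10, 15, tzinfo=datetime.timezone.utc)

def v1_timestamp(u: uuid.UUID) -> datetime.datetime:
    """Recover the creation time embedded in a version 1 UUID."""
    if u.version != 1:
        raise ValueError("not a version 1 UUID")
    # u.time is the 60-bit count of 100 ns intervals since the Gregorian epoch
    return GREGORIAN_EPOCH + datetime.timedelta(microseconds=u.time // 10)

samples = {
    "v1": uuid.uuid1(),                                   # timestamp + node
    "v3": uuid.uuid3(uuid.NAMESPACE_DNS, "example.com"),  # MD5, deterministic
    "v4": uuid.uuid4(),                                   # random
    "v5": uuid.uuid5(uuid.NAMESPACE_DNS, "example.com"),  # SHA-1, deterministic
}
versions = {label: u.version for label, u in samples.items()}
created = v1_timestamp(samples["v1"])

print(versions)
print(created.isoformat())
```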
Can I use UUIDs for session tokens or API keys?
UUID v4 values can be used as session tokens and API keys, and this is a common practice in many systems. Provided they come from a cryptographically secure generator, they carry 122 bits of randomness, which is sufficient entropy to prevent brute-force guessing attacks. However, purpose-built tokens, such as 32 bytes (256 bits) taken from crypto.getRandomValues(), are generally preferred for security-critical use because they provide more random bits and do not depend on how a particular UUID library sources its randomness. If you do use UUIDs as session tokens, always store only a hashed version (using SHA-256 or similar) in the database, transmit them only over HTTPS, set appropriate cookie security flags (HttpOnly, Secure, SameSite), and implement expiry and rotation policies. UUID v5 should never be used for session tokens because it is deterministic -- anyone who knows the namespace and name can reproduce the token.
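A minimal Python sketch of the 256-bit token and hashed-storage advice, using the standard secrets module in place of the browser's crypto.getRandomValues(). The function names are hypothetical, and cookie flags, expiry, and rotation are out of scope here.

```python
import hashlib
import hmac
import secrets

def new_session_token() -> str:
    """Create a URL-safe token from 32 random bytes (256 bits)."""
    return secrets.token_urlsafe(32)

def token_digest(token: str) -> str:
    """SHA-256 digest to store in the database instead of the token itself."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()

def token_matches(presented: str, stored_digest: str) -> bool:
    """Compare in constant time to avoid leaking information through timing."""
    return hmac.compare_digest(token_digest(presented), stored_digest)

token = new_session_token()    # sent to the client once, over HTTPS
stored = token_digest(token)   # the only value kept server-side
print(token_matches(token, stored))
```

A plain SHA-256 digest is adequate here because the input is already high-entropy random data; low-entropy secrets such as passwords would need a slow, salted hash instead.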