Common questions about when and how to encode URLs, what characters need encoding, and how decoding errors occur
What is URL encoding and why is it necessary?
URL encoding (also called percent-encoding) is the process of converting characters that are not permitted in URLs -- or that have special reserved meaning -- into a safe representation. A URL can only contain characters from a restricted subset of ASCII: letters, digits, a small set of unreserved characters (hyphen, underscore, period, tilde), and reserved characters that act as delimiters. Any other character, or a reserved character used as plain data, must be encoded as a percent sign followed by two hexadecimal digits for each byte of the character's UTF-8 representation. For example, a space becomes %20, an ampersand becomes %26, and the hash symbol becomes %23. Without encoding, a space would make the URL invalid, while & and # would be misinterpreted as structural delimiters, breaking the request or causing unexpected behavior in browsers and servers.
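As a small illustration of the examples above, the following sketch percent-encodes a few single characters with JavaScript's built-in encodeURIComponent. The variable names are illustrative only.

```javascript
// Percent-encode a handful of single characters.
// Unreserved characters such as 'a' and '~' pass through unchanged.
const samples = [' ', '&', '#', 'a', '~'];
const encodedSamples = samples.map((ch) => encodeURIComponent(ch));

console.log(encodedSamples); // [ '%20', '%26', '%23', 'a', '~' ]
```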
What is the difference between encodeURIComponent and encodeURI in JavaScript?
Both functions apply percent-encoding, but they differ in which characters they leave untouched. encodeURIComponent is designed for encoding individual values within a URL -- it encodes everything except letters, digits, and the characters - _ . ! ~ * ' ( ). The characters it encodes therefore include / : ? = & #, which are structural in URLs, making it safe for parameter values where those characters would otherwise break the URL. encodeURI, on the other hand, is designed for encoding a complete URL string -- it intentionally preserves all URL-structural characters (/ : ? = & # @ and others) so the URL remains valid, while still encoding spaces, non-ASCII characters, and symbols like < > { } | \ ^ `. Use encodeURIComponent for individual query parameter values, and encodeURI for a full URL string.
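To make the difference concrete, this sketch runs the same string through both functions. The example.com URL is a made-up input chosen for illustration.

```javascript
// The same input passed through both built-in functions.
const sampleUrl = 'https://example.com/a b?q=x&y=1#top';

// encodeURI keeps the structural characters and only encodes the space.
const asWholeUrl = encodeURI(sampleUrl);
console.log(asWholeUrl);
// https://example.com/a%20b?q=x&y=1#top

// encodeURIComponent also encodes : / ? = & #, so the result is only
// suitable as a single value inside another URL.
const asComponent = encodeURIComponent(sampleUrl);
console.log(asComponent);
// https%3A%2F%2Fexample.com%2Fa%20b%3Fq%3Dx%26y%3D1%23top
```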
What is the difference between %20 and + for encoding spaces in URLs?
Both %20 and + can represent a space character, but they apply to different parts of a URL. %20 is the standard percent-encoding of a space and is valid everywhere, including URL path segments -- for example, /my%20file.txt. The + character as a space representation comes from the application/x-www-form-urlencoded encoding format, which is used in HTML form submissions and query strings. In query strings submitted by HTML forms, spaces are encoded as + for historical reasons; in a path segment, a + is simply a literal plus sign. When parsing query string values, a URL decoder must therefore convert both %20 and + to spaces. This tool handles the + to space conversion during decoding by replacing + with %20 before calling decodeURIComponent.
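The decoding step described above can be sketched as follows. The function name decodeQueryValue is hypothetical, and this is an approximation of the described approach rather than the tool's actual source.

```javascript
// Decode a form-encoded query value: treat '+' as a space, then
// percent-decode. A literal plus sign in the original data arrives
// as %2B, so it survives the replacement step.
function decodeQueryValue(value) {
  return decodeURIComponent(value.replace(/\+/g, '%20'));
}

console.log(decodeQueryValue('New+York'));   // New York
console.log(decodeQueryValue('New%20York')); // New York
console.log(decodeQueryValue('a%2Bb+c'));    // a+b c
```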
How does URL encoding work for Unicode and non-ASCII characters?
Unicode characters (including non-Latin scripts, accented characters, emoji, and Chinese/Japanese/Korean text) cannot appear directly in URLs. They are first converted to their UTF-8 byte representation, then each byte is percent-encoded separately. For example, the copyright symbol © is U+00A9, which encodes to the two-byte UTF-8 sequence 0xC2 0xA9, resulting in %C2%A9 in a URL. The rocket emoji (U+1F680) encodes to the four-byte UTF-8 sequence 0xF0 0x9F 0x9A 0x80, resulting in %F0%9F%9A%80. This multi-byte encoding is why non-ASCII characters produce longer percent-encoded strings. JavaScript's encodeURIComponent handles this automatically using UTF-8 encoding.
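The two-step process (UTF-8 bytes first, then one %HH per byte) can be spelled out by hand. This is a minimal sketch assuming a runtime that provides the standard TextEncoder global (modern browsers and Node.js). The helper name percentEncodeUtf8 is hypothetical, and unlike encodeURIComponent it encodes every byte, including plain ASCII.

```javascript
// Convert text to UTF-8 bytes, then write each byte as %HH.
function percentEncodeUtf8(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(
    bytes,
    (b) => '%' + b.toString(16).toUpperCase().padStart(2, '0')
  ).join('');
}

console.log(percentEncodeUtf8('\u00A9'));    // %C2%A9 (copyright sign)
console.log(percentEncodeUtf8('\u{1F680}')); // %F0%9F%9A%80 (rocket emoji)

// For non-ASCII input the built-in function gives the same result.
console.log(encodeURIComponent('\u00A9') === percentEncodeUtf8('\u00A9')); // true
```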
When should I use Query String mode instead of Encode Component?
Use Query String mode when you have a complete key=value&key=value query string and you want to encode the keys and values without destroying the = and & separators. For example, if you have the query string name=John Smith&city=New York, using Encode Component would encode the entire string including the = and & characters, producing name%3DJohn%20Smith%26city%3DNew%20York, which is incorrect. Query String mode instead encodes each key and value independently -- producing name=John%20Smith&city=New%20York -- while preserving the separators. Use Query String mode when building URLs with user-supplied parameters. Use Encode Component when encoding a single value that will be inserted into a larger URL template.
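One way to get the behavior described above is to split on the separators first and encode each piece afterwards. The function encodeQueryString below is a hypothetical sketch of that idea, not the tool's actual implementation. It assumes that every & and the first = in each pair are separators, so it cannot handle input where those characters are meant as data.

```javascript
// Encode keys and values independently while keeping '=' and '&' intact.
function encodeQueryString(query) {
  return query
    .split('&')
    .map((pair) => {
      const eq = pair.indexOf('=');
      // A pair without '=' is treated as a bare key.
      if (eq === -1) return encodeURIComponent(pair);
      const key = pair.slice(0, eq);
      const value = pair.slice(eq + 1);
      return encodeURIComponent(key) + '=' + encodeURIComponent(value);
    })
    .join('&');
}

const rawQuery = 'name=John Smith&city=New York';
console.log(encodeQueryString(rawQuery));
// name=John%20Smith&city=New%20York

// Encoding the whole string as one component destroys the separators.
console.log(encodeURIComponent(rawQuery));
// name%3DJohn%20Smith%26city%3DNew%20York
```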
What characters are safe in URLs and do not need encoding?
RFC 3986 defines two categories of safe characters. Unreserved characters never need encoding: letters A-Z and a-z, digits 0-9, and the four symbols hyphen (-), underscore (_), period (.), and tilde (~). Reserved characters are allowed in URLs but have special meaning as delimiters -- they include: / (path separator), : (scheme/port separator), ? (query start), # (fragment start), [ ] @ (authority components), ! $ & ' ( ) * + , ; = (sub-delimiters). Reserved characters only need to be encoded when they appear inside a component where they would be misinterpreted -- for instance, if a query parameter value contains an & it must be encoded as %26 to prevent it from being parsed as a parameter separator.
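Note that encodeURIComponent leaves a few RFC 3986 sub-delimiters ( ! ' ( ) * ) unencoded. If a stricter result is wanted, where only the unreserved characters stay as-is, a small wrapper can encode those as well. The function name encodeUnreservedOnly is hypothetical, and this is just one possible approach.

```javascript
// Percent-encode everything except RFC 3986 unreserved characters
// (A-Z a-z 0-9 - _ . ~) by also encoding the five characters that
// encodeURIComponent leaves alone.
function encodeUnreservedOnly(text) {
  return encodeURIComponent(text).replace(
    /[!'()*]/g,
    (ch) => '%' + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeUnreservedOnly('a&b'));             // a%26b
console.log(encodeUnreservedOnly("it's (a) test!*")); // it%27s%20%28a%29%20test%21%2A
console.log(encodeUnreservedOnly('A-z_0.9~'));        // A-z_0.9~
```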
How do I encode a URL in JavaScript for use in a fetch or XMLHttpRequest call?
For constructing URLs with query parameters in JavaScript, the best practice is to use the URLSearchParams API, which handles encoding automatically: const params = new URLSearchParams({ name: 'John Smith', city: 'New York' }); fetch('/api/search?' + params.toString()). If you need to encode individual values manually, use encodeURIComponent on each value: const url = '/api/search?name=' + encodeURIComponent(name) + '&city=' + encodeURIComponent(city). Never call encodeURI on a full URL that already has dynamic query values appended: encodeURI leaves = and & (and other reserved characters such as ? # + /) untouched, so a value containing them silently changes the structure of the query string. This tool's Encode Component mode replicates encodeURIComponent, and its Query String mode plays the same role as URLSearchParams; note that URLSearchParams serializes spaces as + rather than %20.
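The two approaches from the answer above are shown side by side here, without the network call so the snippet runs anywhere. The /api/search path comes from the text; the variable names are illustrative.

```javascript
// Option 1: URLSearchParams builds and encodes the query string.
// It uses form encoding, so spaces become '+'.
const searchParams = new URLSearchParams({ name: 'John Smith', city: 'New York' });
const urlFromParams = '/api/search?' + searchParams.toString();
console.log(urlFromParams);
// /api/search?name=John+Smith&city=New+York

// Option 2: encode each value manually with encodeURIComponent.
// Spaces become %20.
const personName = 'John Smith';
const cityName = 'New York';
const urlManual =
  '/api/search?name=' + encodeURIComponent(personName) +
  '&city=' + encodeURIComponent(cityName);
console.log(urlManual);
// /api/search?name=John%20Smith&city=New%20York
```

Either string can then be passed to fetch or XMLHttpRequest; a server that parses query strings as form data decodes both to the same values.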
What causes 'URI malformed' errors when decoding a URL?
A 'URI malformed' error (thrown by JavaScript's decodeURIComponent as a URIError) occurs when the input contains a percent sign (%) that is not followed by exactly two valid hexadecimal digits, or when a percent-encoded sequence forms an incomplete or invalid UTF-8 byte sequence. Common causes include: truncated percent sequences like %2 or a lone % character, invalid hex characters like %GG, a high UTF-8 byte without its continuation bytes (e.g. %C2 without %A9), or text that was never encoded but contains literal % characters (such as a percentage value like 50%). To fix this, either ensure the input is a properly encoded URL component, or pre-process the input to escape literal % characters as %25 before decoding.
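Both fixes can be sketched in a few lines: catching the URIError, and escaping stray percent signs before decoding. The helper names safeDecode and escapeStrayPercents are hypothetical. The regular expression only repairs a % that is not followed by two hex digits, so an incomplete UTF-8 sequence such as %C2 still fails to decode.

```javascript
// Decode without throwing: report failure instead of raising URIError.
function safeDecode(input) {
  try {
    return { ok: true, value: decodeURIComponent(input) };
  } catch (err) {
    if (err instanceof URIError) return { ok: false, value: input };
    throw err; // anything else is unexpected
  }
}

// Escape any '%' that does not start a valid %HH sequence.
function escapeStrayPercents(input) {
  return input.replace(/%(?![0-9A-Fa-f]{2})/g, '%25');
}

console.log(safeDecode('50%').ok);                      // false
console.log(safeDecode(escapeStrayPercents('50%')));    // { ok: true, value: '50%' }
console.log(safeDecode(escapeStrayPercents('%GG')));    // { ok: true, value: '%GG' }
console.log(safeDecode(escapeStrayPercents('%C2')).ok); // false (incomplete UTF-8)
```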
How is URL encoding used in SEO and marketing tracking parameters?
URL encoding is essential for UTM tracking parameters and marketing campaign URLs. When campaign names, content descriptions, or source names contain spaces, slashes, or special characters, they must be encoded before being appended to a URL. For example, a campaign name 'Summer Sale 2024 / Email' would need to be encoded as Summer%20Sale%202024%20%2F%20Email when used as a utm_campaign value. Analytics platforms like Google Analytics and Adobe Analytics automatically decode these values when processing hits, so the report shows the original readable string. Incorrectly encoded or unencoded tracking parameters can result in broken URLs, misattributed sessions, or parameters being split across multiple values in analytics reports.
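The campaign example above can be reproduced as follows. The landing-page URL and the utm_source value are made up for illustration.

```javascript
// Build a tracked URL with an encoded utm_campaign value.
const campaign = 'Summer Sale 2024 / Email';
const trackedUrl =
  'https://example.com/landing?utm_source=newsletter&utm_campaign=' +
  encodeURIComponent(campaign);

console.log(trackedUrl);
// https://example.com/landing?utm_source=newsletter&utm_campaign=Summer%20Sale%202024%20%2F%20Email

// Parsing the URL again recovers the original readable string, which is
// roughly what an analytics platform does when it processes the hit.
const decodedCampaign = new URL(trackedUrl).searchParams.get('utm_campaign');
console.log(decodedCampaign); // Summer Sale 2024 / Email
```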
How does URL encoding differ from HTML encoding and Base64 encoding?
These three encoding schemes serve different purposes. URL encoding (percent-encoding) converts characters unsafe in URL contexts into %HH hex sequences -- it is specifically designed for URLs and query strings. HTML encoding (also called HTML entity encoding) converts characters that have special meaning in HTML markup into safe entity representations: < becomes &lt;, > becomes &gt;, & becomes &amp;, and " becomes &quot;. HTML encoding is applied when inserting dynamic text into HTML to prevent XSS attacks. Base64 encoding converts arbitrary binary data (including text) into a string using only 64 ASCII characters (A-Z, a-z, 0-9, +, /), typically for embedding binary data in text contexts like JSON payloads, data URIs, or email attachments. A URL might contain Base64-encoded data that is itself URL-encoded, so the encodings can be combined.
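The sketch below applies URL encoding and HTML encoding to the same string, then Base64 to a short one. The escapeHtml helper is a hypothetical, minimal implementation covering only the four characters named above, and btoa is assumed to be available as a global (browsers and recent Node.js versions).

```javascript
// Minimal HTML entity encoding for the four characters named above.
// '&' must be replaced first so later entities are not double-encoded.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

const rawText = 'a<b & "c"';
console.log(encodeURIComponent(rawText)); // a%3Cb%20%26%20%22c%22
console.log(escapeHtml(rawText));         // a&lt;b &amp; &quot;c&quot;

// Base64 output can contain + / =, so it is URL-encoded before being
// placed in a query string.
const base64Value = btoa('Hi');
console.log(base64Value);                     // SGk=
console.log(encodeURIComponent(base64Value)); // SGk%3D
```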