Encode URL String Online | With article

What is URL encoding?

URL encoding converts non-ASCII, unprintable, and otherwise unsafe characters into a URL-safe format by replacing them with ASCII characters. The process has two steps:

1. The character is converted into one or more 8-bit bytes (UTF-8 in the examples here).

2. Each byte is written as two hex digits, for example B3, with a percent sign (%) in front, so the encoded byte looks like this: %B3.

How does the encoding process work?

Let's take the Japanese character 平. It is first converted into bytes, which in binary look like this: 11100101 10111001 10110011. Each byte is then written in hex. One hex digit represents 4 bits of data, so 2 hex digits represent 1 byte (8 bits), and after encoding the character looks like this: %E5%B9%B3. We get three encoded hex values because non-ASCII characters take more than one byte (three for 平 in UTF-8), while an ASCII character takes only one byte.

In short, URL encoding is a hex scheme with a percent sign (%) in front of each byte. Encoding schemes like hex and base64 are mainly used where only ASCII characters can be transferred. Hex uses 16 ASCII characters, 0-9 and A-F. Every character can be encoded into hex, but URL encoding converts only some of them, as you can see in the table below.
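The steps above can be sketched in a few lines of Python. This is a minimal sketch, not the tool's own implementation: percent_encode_char is a hypothetical helper, UTF-8 is assumed as the byte encoding, and quote from Python's standard urllib.parse module is used only to cross-check the result.

```python
from urllib.parse import quote


def percent_encode_char(ch: str) -> str:
    """Percent-encode one character via its UTF-8 bytes (illustrative helper)."""
    utf8_bytes = ch.encode("utf-8")  # step 1: character -> 8-bit bytes
    # step 2: each byte -> "%" followed by two uppercase hex digits
    return "".join(f"%{b:02X}" for b in utf8_bytes)


ch = "平"
bits = " ".join(f"{b:08b}" for b in ch.encode("utf-8"))

print(bits)                     # 11100101 10111001 10110011
print(percent_encode_char(ch))  # %E5%B9%B3
print(quote(ch))                # %E5%B9%B3 (standard library, same result)
```

Note that the helper encodes every character it is given, including ASCII letters, while quote leaves safe characters untouched.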

Classification          Characters                                                                           Encoding required?
Unreserved characters   Letters (A-Z a-z), digits (0-9), tilde (~), underscore (_), hyphen (-), and dot (.)  No
Reserved characters     : / ? # [ ] @ * $ & ' ( ) ! + , ; =                                                  Yes
Unsafe characters       space < > { } | ` ^ \                                                                Yes
Non-ASCII characters    Characters outside the US-ASCII set.                                                 Yes

Categorization of URL encoding:

For URL encoding, characters can be categorized into four groups.

1. Unreserved characters: Also known as safe characters, these do not require encoding. They include letters (A-Z a-z), digits (0-9), hyphen (-), underscore (_), tilde (~), and dot (.).

2. Reserved characters: Some characters, like /, have a special meaning in a URL, so they must be encoded whenever they are used as data rather than as delimiters.

3. Unsafe characters: Characters like | are known as unsafe characters, so they require encoding before being placed in a URL.

4. Non-ASCII characters: Characters like 元 require encoding because URLs support ASCII characters only.
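As a rough sketch, the categories can be written as a small Python lookup. The classify function and the three sets are hypothetical names made up for illustration, with the character lists copied from the classification table; anything the table does not list (control characters, for example) falls through to "other".

```python
import string

# Character lists taken from the classification table in this article.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
RESERVED = set(":/?#[]@!$&'()*+,;=")
UNSAFE = set(" <>{}|`^\\")


def classify(ch: str) -> str:
    """Return the category name for a single character (illustrative helper)."""
    if ch in UNRESERVED:
        return "unreserved"
    if ch in RESERVED:
        return "reserved"
    if ch in UNSAFE:
        return "unsafe"
    if ord(ch) > 127:
        return "non-ASCII"
    return "other"  # not covered by the table, e.g. control characters


for ch in ["a", "/", "|", "元"]:
    print(repr(ch), classify(ch))
```

Only the first category can go into a URL as-is; the other three are the ones an encoder has to look at.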

URL encoding usage:

URL encoding takes place in nearly every search, but it happens so fast that we usually don't pay much attention to it.

1. While entering the URL:

https://coderkit.org/jp/サンプルページ

2. After pressing Enter, the URL is encoded and looks like this:

https://coderkit.org/jp/%E3%82%B5%E3%83%B3%E3%83%97%E3%83%AB%E3%83%9A%E3%83%BC%E3%82%B8

3. When the page has fully loaded (this happens very quickly):

https://coderkit.org/jp/サンプルページ
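The same round trip can be reproduced with Python's standard urllib.parse module. This is only a sketch of what happens to the address above, assuming UTF-8; how a browser does it internally is not shown here.

```python
from urllib.parse import quote, unquote

path = "/jp/サンプルページ"

# quote() keeps "/" as-is by default (safe="/") and encodes the rest as UTF-8.
encoded_url = "https://coderkit.org" + quote(path)
print(encoded_url)

# unquote() reverses the process and restores the readable form.
decoded_url = unquote(encoded_url)
print(decoded_url)
```

Only the path is passed to quote here, because encoding the whole address would also encode the colon in https://.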

HTML Forms

Developers mostly run into URL encoding and decoding while working with HTML forms, because the URL is also a medium for transferring information. Consider a form that sends a user's data through the URL.

While the user data is being transferred, characters like / and @ get encoded. On the other side, the backend has to decode these values before it can extract the user data. In PHP, the values in $_GET and $_REQUEST are already decoded, so the urldecode() function is only needed for strings that are still encoded, such as a raw query string.
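A minimal Python sketch of the same idea follows. The field names and values are hypothetical, and urlencode and parse_qs from the standard urllib.parse module stand in for the browser and the backend. Note that form encoding writes a space as + rather than %20.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields; names and values are made up for illustration.
form_data = {"email": "user@example.com", "redirect": "/home page"}

# "Browser" side: build the encoded query string.
query = urlencode(form_data)
print(query)    # email=user%40example.com&redirect=%2Fhome+page

# "Backend" side: decode the query string back into the original values.
decoded = parse_qs(query)
print(decoded)  # {'email': ['user@example.com'], 'redirect': ['/home page']}
```

parse_qs returns a list per field because the same field name may appear more than once in a query string.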

URL encoding table:

The table has five columns. The first shows the character number, formally known as the decimal (Dec) value. The second shows the hex value of the character. The third shows the URL-encoded value; it is blank for characters that do not require encoding. It is also blank in the control-character rows (0-31 and 127), although a control character has to be written as % followed by its hex value if it ever appears in a URL. The fourth column, the main one, shows the ASCII character itself, such as 0-9 and a-z. Some entries in this column, such as ^@, are in caret notation, which is used to denote control characters that cannot be printed. The fifth column is a description of the character.

Dec  Hex  Enc  Char     Description
0    00        ^@       Null (NUL)
1    01        ^A       Start of heading (SOH)
2    02        ^B       Start of text (STX)
3    03        ^C       End of text (ETX)
4    04        ^D       End of transmission (EOT)
5    05        ^E       Enquiry (ENQ)
6    06        ^F       Acknowledge (ACK)
7    07        ^G       Bell (BEL)
8    08        ^H       Backspace (BS)
9    09        ^I       Horizontal tab (HT)
10   0A        ^J       Line feed (LF)
11   0B        ^K       Vertical tab (VT)
12   0C        ^L       New page/form feed (FF)
13   0D        ^M       Carriage return (CR)
14   0E        ^N       Shift out (SO)
15   0F        ^O       Shift in (SI)
16   10        ^P       Data link escape (DLE)
17   11        ^Q       Device control 1 (DC1)
18   12        ^R       Device control 2 (DC2)
19   13        ^S       Device control 3 (DC3)
20   14        ^T       Device control 4 (DC4)
21   15        ^U       Negative acknowledge (NAK)
22   16        ^V       Synchronous idle (SYN)
23   17        ^W       End of transmission block (ETB)
24   18        ^X       Cancel (CAN)
25   19        ^Y       End of medium (EM)
26   1A        ^Z       Substitute (SUB)
27   1B        ^[       Escape (ESC)
28   1C        ^\       File separator (FS)
29   1D        ^]       Group separator (GS)
30   1E        ^^       Record separator (RS)
31   1F        ^_       Unit separator (US)
32   20   %20  (space)  Space
33   21   %21  !        Exclamation mark
34   22   %22  "        Quotation mark/Double quote
35   23   %23  #        Number sign
36   24   %24  $        Dollar sign
37   25   %25  %        Percent sign
38   26   %26  &        Ampersand
39   27   %27  '        Apostrophe/Single quote
40   28   %28  (        Left parenthesis
41   29   %29  )        Right parenthesis
42   2A   %2A  *        Asterisk
43   2B   %2B  +        Plus sign
44   2C   %2C  ,        Comma
45   2D        -        Hyphen/Minus
46   2E        .        Full stop/Period
47   2F   %2F  /        Solidus/Slash
48   30        0        Digit zero
49   31        1        Digit one
50   32        2        Digit two
51   33        3        Digit three
52   34        4        Digit four
53   35        5        Digit five
54   36        6        Digit six
55   37        7        Digit seven
56   38        8        Digit eight
57   39        9        Digit nine
58   3A   %3A  :        Colon
59   3B   %3B  ;        Semicolon
60   3C   %3C  <        Less-than sign
61   3D   %3D  =        Equal/Equality sign
62   3E   %3E  >        Greater-than sign
63   3F   %3F  ?        Question mark
64   40   %40  @        Commercial at/At sign
65   41        A        Latin capital letter A
66   42        B        Latin capital letter B
67   43        C        Latin capital letter C
68   44        D        Latin capital letter D
69   45        E        Latin capital letter E
70   46        F        Latin capital letter F
71   47        G        Latin capital letter G
72   48        H        Latin capital letter H
73   49        I        Latin capital letter I
74   4A        J        Latin capital letter J
75   4B        K        Latin capital letter K
76   4C        L        Latin capital letter L
77   4D        M        Latin capital letter M
78   4E        N        Latin capital letter N
79   4F        O        Latin capital letter O
80   50        P        Latin capital letter P
81   51        Q        Latin capital letter Q
82   52        R        Latin capital letter R
83   53        S        Latin capital letter S
84   54        T        Latin capital letter T
85   55        U        Latin capital letter U
86   56        V        Latin capital letter V
87   57        W        Latin capital letter W
88   58        X        Latin capital letter X
89   59        Y        Latin capital letter Y
90   5A        Z        Latin capital letter Z
91   5B   %5B  [        Left square bracket
92   5C   %5C  \        Reverse solidus/Backslash
93   5D   %5D  ]        Right square bracket
94   5E   %5E  ^        Circumflex accent/Caret
95   5F        _        Underscore/Low line
96   60   %60  `        Grave accent
97   61        a        Latin small letter a
98   62        b        Latin small letter b
99   63        c        Latin small letter c
100  64        d        Latin small letter d
101  65        e        Latin small letter e
102  66        f        Latin small letter f
103  67        g        Latin small letter g
104  68        h        Latin small letter h
105  69        i        Latin small letter i
106  6A        j        Latin small letter j
107  6B        k        Latin small letter k
108  6C        l        Latin small letter l
109  6D        m        Latin small letter m
110  6E        n        Latin small letter n
111  6F        o        Latin small letter o
112  70        p        Latin small letter p
113  71        q        Latin small letter q
114  72        r        Latin small letter r
115  73        s        Latin small letter s
116  74        t        Latin small letter t
117  75        u        Latin small letter u
118  76        v        Latin small letter v
119  77        w        Latin small letter w
120  78        x        Latin small letter x
121  79        y        Latin small letter y
122  7A        z        Latin small letter z
123  7B   %7B  {        Left curly bracket
124  7C   %7C  |        Vertical line/Vertical bar
125  7D   %7D  }        Right curly bracket
126  7E        ~        Tilde
127  7F        DEL      Delete (DEL)
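The Dec, Hex and Enc columns can also be derived in code. The following sketch uses a hypothetical table_row helper built on Python's quote with safe="" so that no character is exempted beyond the unreserved set; it assumes Python 3.7 or newer, where ~ is left unencoded. Unlike the table, quote also encodes control characters, so NUL comes out as %00 and DEL as %7F.

```python
from urllib.parse import quote


def table_row(code: int) -> tuple:
    """Return (Dec, Hex, Enc) for one ASCII code (illustrative helper)."""
    ch = chr(code)
    enc = quote(ch, safe="")          # safe="" so that "/" is encoded too
    # Leave Enc empty when the character needs no encoding.
    return code, f"{code:02X}", "" if enc == ch else enc


for code in (32, 47, 65, 126):
    print(table_row(code))
```

For the printable characters 32-126 this should reproduce the Enc column above.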