Playing with Base64 Encoding (with examples)

Lowyat.NET forums

Playing with Base64 Encoding (with examples), When 'abc' becomes 'YWJj'

MatQuasar (TS)	Jul 24 2023, 07:38 PM | Post #1
Casual Validating 329 posts Joined: Jun 2023

Since I gained an understanding of UTF-8 (my old forum post by FlierMate), I haven't done any research on another encoding. This time I am playing with Base64 encoding. It is a bit easier to understand than UTF-8.

I read "How Base64 Encoding Works" on https://www.lifewire.com/ and started experimenting with examples using an online encoder. (Decoding is the reverse process, though it is not demonstrated in this post.) I will show you four examples.

First of all, as its name implies, Base64 uses an alphabet of 64 different characters. Since 2^6 = 64, each output character carries exactly 6 bits, so basically Base64 is a 6-bit encoding, can I say so? Let me quote the original paragraph:

QUOTE
;The 64 characters (hence the name Base64) are 10 digits,
;26 lowercase characters, 26 uppercase characters as well
;as the Plus sign (+) and the Forward Slash (/).
;There is also a 65th character known as a pad, which is
;the Equal sign (=). This character is used when the last
;segment of binary data doesn't contain a full 6 bits.

This is the encoding table, from 0 to 63: A-Z are 0 to 25, a-z are 26 to 51, 0-9 are 52 to 61, "+" is 62 and "/" is 63.

QUOTE
;To ensure the encoded data can be properly printed and does
;not exceed any mail server's line length limit, newline
;characters are inserted to keep line lengths below 76 characters.

This line wrapping is optional, as seen from the online encoder.

QUOTE
;At the end of the encoding process, there might be a problem.
;If the size of the original data in bytes is a multiple of three,
;everything works fine. If it is not, there may be empty bytes.
;For proper encoding, exactly 3-bytes of binary data is needed.
;The solution is to append enough bytes with a value of 0 to
;create a 3-byte group. Two such values are appended if the data
;needs one extra byte of data, one is appended for two extra bytes.

This is the trickiest part, in particular the sentences from "For proper encoding" onwards (highlighted in the original post).

--

So now let's start with examples!
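The 0-to-63 table described above can be rebuilt in a few lines. This is only a minimal sketch in Python; the names `B64_ALPHABET`, `PAD_CHAR` and `print_table` are made up for illustration and are not from the Lifewire article or the online encoder.

```python
import string

# Standard Base64 alphabet:
# index 0-25 = A-Z, 26-51 = a-z, 52-61 = 0-9, 62 = '+', 63 = '/'.
B64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# The "65th character". It is not part of the 0-63 table.
PAD_CHAR = "="


def print_table(columns: int = 8) -> None:
    """Print the table as rows of 'index:character' pairs."""
    for start in range(0, 64, columns):
        row = [f"{i:2d}:{B64_ALPHABET[i]}" for i in range(start, start + columns)]
        print("  ".join(row))


if __name__ == "__main__":
    print_table()
```

Looking up a 6-bit value is then just `B64_ALPHABET[value]`, with the index starting from 0.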
abc (input)
97, 98, 99 (corresponding ASCII values)
0110 0001, 0110 0010, 0110 0011 (binary representation of the ASCII values)
011000 010110 001001 100011 (regrouped into 6-bit blocks)
24, 22, 9, 35 (value of each 6-bit block, used as an index into the Base64 table)
YWJj (output)

Notice how "abc" becomes "YWJj"? a = 97, b = 98, c = 99, each an 8-bit value, so 8 x 3 = 24 bits. That is a perfect match for 6-bit grouping, since 24 / 6 = 4, which is why 3 input characters become 4 output characters (Base64 encoded).

I think the most difficult part when programming this (assuming you don't use a library API) is the 6-bit grouping. After grouping, it is easy to look up 24 = Y, 22 = W, 9 = J, and 35 = j (the index starts from 0).

Now let's go to the tricky part: what if the input does not fit exactly into 6-bit groups?

abcd
97, 98, 99, 100
0110 0001, 0110 0010, 0110 0011, 0110 0100
011000 010110 001001 100011 011001 00[00][00]
24, 22, 9, 35, 25, 0
YWJjZA==

This one is the same as the first example, except that I appended a "d" to the input string. I believe you understand most of it, except the "pad" (the 65th character, "="). Let me quote again the original paragraph you read at the beginning of this post:

QUOTE
;For proper encoding, exactly 3-bytes of binary data is needed.
;The solution is to append enough bytes with a value of 0 to
;create a 3-byte group. Two such values are appended if the data
;needs one extra byte of data, one is appended for two extra bytes.

The 00[00][00] means the last block has only two real bits, so four zero bits (shown as two bracketed pairs) are appended to make it a complete 6-bit group. Each bracketed pair corresponds to one "=" in the output, hence the two "=" characters at the end. They also bring the output length up to a multiple of four characters.

The example below requires just one "=":

abcde
97, 98, 99, 100, 101
0110 0001, 0110 0010, 0110 0011, 0110 0100, 0110 0101
011000 010110 001001 100011 011001 000110 0101[00]
24, 22, 9, 35, 25, 6, 20
YWJjZGU=

As you see, only one bracketed [00] is needed to make the last block a complete 6-bit group, so only one "=" is appended.

What if we only encode a single 8-bit character "a"?
Since it must be grouped into 6-bit blocks, four zeros (0000) are needed to make the last block a full 6 bits.

a
97
0110 0001
011000 01[00][00]
24, 16
YQ==

Remember, every two zeros appended correspond to one "=" character. So "a" becomes "YQ==" (Base64 encoded). Wonderful, isn't it?

Am I clear with my four examples? I learned from them myself too. You can try to write a program using raw functions (no library) to experiment yourself, and I will let you explore how Base64 decoding works; I believe it is just the reverse of encoding.

Hope you enjoy this post. Corrections are welcome if there is any mistake in my explanation above.

This post has been edited by MatQuasar: Jul 24 2023, 07:59 PM
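For anyone who wants to try the "raw function" approach, here is one possible sketch in Python that follows the same steps as the examples: 8-bit values, regrouping into 6-bit blocks, table lookup, then "=" padding. The function names `b64_encode_manual` and `b64_decode_manual` are hypothetical, and the bit-string approach is chosen for readability rather than speed. The decoder is simply the reverse process and assumes well-formed input.

```python
import string

# Index 0-63 lookup table: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def b64_encode_manual(data: bytes) -> str:
    # Join all bytes into one long bit string, 8 bits per byte.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Append zero bits until the length is a multiple of 6.
    bits += "0" * (-len(bits) % 6)
    # Cut into 6-bit groups and look each value up in the table.
    indexes = [int(bits[i:i + 6], 2) for i in range(0, len(bits), 6)]
    encoded = "".join(ALPHABET[i] for i in indexes)
    # Fill the output up to a multiple of 4 characters with "=".
    return encoded + "=" * (-len(encoded) % 4)


def b64_decode_manual(text: str) -> bytes:
    # Remove the "=" padding, then turn each character back into 6 bits.
    stripped = text.rstrip("=")
    bits = "".join(f"{ALPHABET.index(ch):06b}" for ch in stripped)
    # Ignore the leftover zero bits that do not form a full byte.
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))


if __name__ == "__main__":
    for word in ("abc", "abcd", "abcde", "a"):
        encoded = b64_encode_manual(word.encode("ascii"))
        decoded = b64_decode_manual(encoded).decode("ascii")
        print(word, "->", encoded, "->", decoded)
```

Running it prints the encoded form of the four inputs used in this post and decodes each one back, so the results can be compared against an online encoder.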

jibpek	Jul 24 2023, 08:04 PM | Post #2
Enthusiast Junior Member 710 posts Joined: Jul 2012	padding is optional. b64 url safe version not mentioned. TS rookie

flashang	Jul 25 2023, 09:06 AM | Post #3
Casual Junior Member 355 posts Joined: Aug 2021

QUOTE(jibpek @ Jul 24 2023, 08:04 PM)
padding is optional. b64 url safe version not mentioned. TS rookie

The general strategy is to choose 64 characters that are common to most encodings and that are also printable.

For base64url (the URL- and filename-safe standard), replace "+" with "-" and "/" with "_". But "-" and "_" may confuse readers in print media.

Full document: Base64 - Wikipedia
https://en.wikipedia.org/wiki/Base64#Variants_summary_table

This post has been edited by flashang: Jul 25 2023, 09:07 AM
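To see the two points above (URL-safe alphabet, optional padding) in action, here is a small sketch using Python's standard `base64` module. The helper names `encode_urlsafe_nopad` and `decode_urlsafe_nopad` are made up for this example. The sample bytes are chosen only because they produce indexes 62 and 63, so the difference between the two alphabets is visible.

```python
import base64


def encode_urlsafe_nopad(data: bytes) -> str:
    """Encode with the base64url alphabet and drop the trailing '=' padding."""
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")


def decode_urlsafe_nopad(text: str) -> bytes:
    """Put the padding back (up to a multiple of 4 characters), then decode."""
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)


if __name__ == "__main__":
    sample = bytes([0xFB, 0xFF])  # chosen so that '+' and '/' show up
    print(base64.b64encode(sample).decode("ascii"))          # standard alphabet
    print(base64.urlsafe_b64encode(sample).decode("ascii"))  # base64url alphabet
    print(encode_urlsafe_nopad(sample))                      # base64url, no padding
    print(decode_urlsafe_nopad(encode_urlsafe_nopad(sample)) == sample)
```

Dropping the padding is safe here because the number of missing characters can be worked out from the length of the text, which is what the decode helper does before handing it to the library.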

iammyself	Jul 28 2023, 07:42 PM | Post #4
Getting Started Junior Member 238 posts Joined: May 2011	Nice writeup. This post has been edited by iammyself: Jul 28 2023, 07:43 PM

MatQuasar (TS)	Jul 28 2023, 08:11 PM | Post #5
Casual Validating 329 posts Joined: Jun 2023	QUOTE(iammyself @ Jul 28 2023, 07:42 PM) Nice writeup. Thanks for your support!
