What is the real purpose of Base64 encoding?

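Why do we have Base64 encoding? I am a beginner and I really don't understand why you would obfuscate bytes into something else (unless it is encryption). In a book I read, Base64 encoding is said to be useful when binary transmission is not possible, e.g. when we post a form, it is encoded. But why do we convert bytes into letters? Couldn't we just convert the bytes into string format, with a space in between? For example, 00000001 00000004? Or simply 0000000100000004 without any space, since bytes always come in groups of 8?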


Base64 is a more or less compact way of transmitting (encoding, in fact, but with the goal of transmitting) any kind of binary data.

See http://en.wikipedia.org/wiki/Base64

"The general rule is to choose a set of 64 characters that is both part of a subset common to most encodings, and also printable."

That is a very general goal; the usual additional requirement is not to waste more space than necessary.

Historically, it's based on the fact that there is a common subset of (almost) all encodings used to store chars into bytes and that a lot of the 2^8 possible bytes risk loss or transformations during simple data transfer (for example a copy-paste-emailsend-emailreceive-copy-paste sequence).

(please redirect upvote to Brian's comment, I just make it more complete and hopefully more clear).

Base64 is a way to encode binary data into an ASCII character set known to pretty much every computer system, in order to transmit the data without loss or modification of the contents. For example, mail systems cannot deal with binary data because they expect ASCII (textual) data. So if you transfer an image or another file without encoding it, it will get corrupted because of the way the mail system handles the data.
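To make the "no loss or modification" point concrete, here is a minimal sketch using Python's standard `base64` module. The sample bytes are arbitrary and chosen only for illustration; the variable names are not from the answer above.

```python
import base64

# Arbitrary binary data, including bytes that are not printable ASCII.
raw = bytes([0x00, 0x01, 0x04, 0xFF, 0x89, 0x50, 0x4E, 0x47])

encoded = base64.b64encode(raw)      # bytes object holding only ASCII characters
decoded = base64.b64decode(encoded)  # back to the original bytes

# Every byte of the encoded form fits in 7 bits, unlike the raw input.
all_seven_bit = all(b < 0x80 for b in encoded)
round_trip_ok = decoded == raw

print(encoded.decode("ascii"), all_seven_bit, round_trip_ok)
```

The encoded form can pass through a text-only channel, and decoding it on the other side gives back exactly the bytes that were sent.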

Note: Base64 encoding is NOT a way of encrypting, nor a way of compacting data. In fact, a Base64-encoded piece of data is 1.333… times bigger than the original, because every 3 bytes of input become 4 characters of output. It is only a way to be sure that no data is lost or modified during the transfer.
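The size overhead can be checked directly. The sketch below also compares Base64 with the idea from the question (writing every byte as eight '0'/'1' characters) and with hexadecimal; the 768-byte sample is an arbitrary choice made so that the length is a multiple of 3.

```python
import base64

raw = bytes(range(256)) * 3   # 768 bytes of sample binary data

b64_len = len(base64.b64encode(raw))
# The question's idea: write each byte as eight '0'/'1' characters.
bits_len = len("".join(format(b, "08b") for b in raw))
hex_len = len(raw.hex())      # two hexadecimal characters per byte

print(len(raw), b64_len, hex_len, bits_len)
# prints: 768 1024 1536 6144
```

So the bit-string approach would work in principle, but it costs 8 characters per byte, while Base64 costs 4 characters per 3 bytes.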

Base64 is a mechanism for representing and transferring binary data over media that allow only printable characters. It is the most popular form of "Base Encoding", the other forms in common use being Base16 and Base32.

The need for Base64 arose from the need to attach binary content, such as images, videos or arbitrary binary data, to emails. Since SMTP [RFC 5321] only allowed 7-bit US-ASCII characters within messages, there was a need to represent these binary octet streams using seven-bit ASCII characters...

Hope this answers the Question

During data transmission, data can be textual or non-textual (binary), such as an image, a video or another file.

Many transmission channels can only send or receive a stream of textual (printable) characters, hence we need a way to encode non-text data such as images, videos and files.

The binary representation of non-text data (image, video, file) is easily obtainable. That binary representation is then encoded in a textual format in which each character is one of sixty-four possible characters (A-Z, a-z, 0-9, + and /).

                  Table 1: The Base 64 Alphabet


Value Encoding  Value Encoding  Value Encoding  Value Encoding
0 A            17 R            34 i            51 z
1 B            18 S            35 j            52 0
2 C            19 T            36 k            53 1
3 D            20 U            37 l            54 2
4 E            21 V            38 m            55 3
5 F            22 W            39 n            56 4
6 G            23 X            40 o            57 5
7 H            24 Y            41 p            58 6
8 I            25 Z            42 q            59 7
9 J            26 a            43 r            60 8
10 K            27 b            44 s            61 9
11 L            28 c            45 t            62 +
12 M            29 d            46 u            63 /
13 N            30 e            47 v
14 O            31 f            48 w         (pad) =
15 P            32 g            49 x
16 Q            33 h            50 y

This set of sixty-four characters is called the Base64 alphabet, and encoding data using only these sixty-four allowed characters is called Base64 encoding.
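The table can be written as a single string in which the position of each character is its value. This is only a sketch; `ALPHABET` and `value_to_char` are illustrative names, not part of any library.

```python
import string

# Standard Base64 alphabet: the index in this string is the 6-bit value.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def value_to_char(value: int) -> str:
    """Map a 6-bit value (0..63) to its Base64 character."""
    if not 0 <= value <= 63:
        raise ValueError("a Base64 digit carries exactly 6 bits (0..63)")
    return ALPHABET[value]

print(len(ALPHABET), value_to_char(0), value_to_char(26), value_to_char(63))
# prints: 64 A a /
```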

Let us look at a few ASCII strings and their Base64 encodings.

1 ==> MQ==

12 ==> MTI=

123 ==> MTIz

1234 ==> MTIzNA==

12345 ==> MTIzNDU=

123456 ==> MTIzNDU2
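These examples can be reproduced with Python's standard `base64` module. A minimal sketch, assuming the inputs are the ASCII strings shown above:

```python
import base64

samples = ["1", "12", "123", "1234", "12345", "123456"]

# Encode each ASCII string to bytes, then Base64-encode those bytes.
results = {s: base64.b64encode(s.encode("ascii")).decode("ascii") for s in samples}

for text, enc in results.items():
    print(f"{text} ==> {enc}")
```

The number of trailing = characters depends only on the input length modulo 3.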

A few points to note here:

  • Base64 output comes in groups of 4 characters. Each Base64 character carries 6 bits (2^6 = 64), so 3 input bytes (24 bits) map to exactly 4 characters. If the last group of input has only 1 or 2 bytes, the output group is filled up to 4 characters with =.
  • = is not part of the Base64 character set. It is used only for padding.
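The grouping and padding can be sketched by hand in a few lines. This is an illustrative implementation written for this explanation, not how any particular library does it; `b64_encode_manual` is a hypothetical name, and the standard `base64` module is used only to cross-check the result.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def b64_encode_manual(data: bytes) -> str:
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1 or 2 bytes short of a full group
        # Pack the chunk into a 24-bit integer, zero-filling missing bytes.
        n = int.from_bytes(chunk + b"\x00" * missing, "big")
        # Split the 24 bits into four 6-bit values, most significant first.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Characters that carry only filler bits are replaced by '='.
        if missing:
            chars[-missing:] = "=" * missing
        out.append("".join(chars))
    return "".join(out)

for sample in (b"1", b"12", b"123", b"1234"):
    # Compare the hand-written encoder with the standard library.
    print(sample, b64_encode_manual(sample),
          b64_encode_manual(sample) == base64.b64encode(sample).decode("ascii"))
```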

Hence, one can see that Base64 encoding is not encryption but just a way to transform any given data into a stream of printable characters that can be transmitted over a network.