Chapter 744

I can use it too

Author: Erzi Congzhou

However, Chinese character encoding in his previous life had also had a problem: Unicode arrived too late, so Microsoft had to adopt an extended encoding based on GB 13000. As a result, the national standard had to build on the GB 13000 encoding as well, first patched and expanded into GBK, and then expanded again into GB 18030.

The final version, the national standard GB 18030-2005, "Chinese Coded Character Set for Information Technology", is fully compatible with GB 2312-1980, largely compatible with GBK, supports all the unified Chinese characters of GB 13000 and Unicode, and includes 70,244 Chinese characters in total.

At that time, Unicode did not include as many Chinese characters as GB 18030-2005. Although in theory it could hold every Chinese character, countless code points remained empty.

The end result was an old system weighed down with patches and a new system with a large number of empty code points that nobody was filling in. Decades later, information systems still had not fully solved the problem of Chinese character transcoding.

Zhou Zhi, who had been a programmer at a state-owned enterprise in his later life, had suffered badly from this. He therefore believed that the key to solving the problem was for the country to abandon the cramped ISO/IEC 10646 from the very beginning and first seize enough space for Chinese characters in the Unicode standard: grab at least 100,000 code points, fill them in, and make it the one mandatory standard, used all over the world.

So he said: "Isn't it just as well that nothing has been settled yet? Only because nothing has been settled can we take part in depth. As long as three blocks of code space are left for us, we can accommodate 100,000 Chinese characters."

"Furthermore, Unicode deals only with the concept of encoding, and its very design purpose is to contain every kind of character in the world."

"Chinese character encoding is undoubtedly the most complicated character encoding work in the world. If we complete it, we will have a full say in the organization. In the future we can also guide the work of other countries and organizations, and it will lay a foundation for encoding the scripts of our other ethnic groups."

Now it was the turn of the literary and historical experts on Mr. Gu's side to be unable to follow what Zhou Zhi and Li Hongjiang were discussing.

Mr. Gu interrupted the lively discussion between the two: "Elbow, Xiao Li, which of you can first explain it in words that we old men can understand?"

Mai Mingchuan smiled and said: "I think I get the gist. Let me explain it first and see whether I have it right. If not, Xiao Li and Zhou Zhi can fill in the gaps afterwards."

"There are currently two sets of standards. One is ISO/IEC 10646. This system is already mature. Although only its first part has been promulgated, our country has developed GB 13000 on the basis of it, and it can be implemented quickly."

"But this system has a big problem: it has too few code points and can accommodate only 21,003 Chinese characters. As things stand, it is still a long way from fully meeting our needs."

"The other set of standards is Unicode."

"As long as the code range allocated to Chinese characters is large enough, this standard can accommodate all of our Chinese characters. In the future we can also claim further code ranges for expansion, or use them to encode the scripts of other ethnic minorities."

"In terms of design principles, the Unicode standard is actually better than ISO/IEC 10646. However, it is still only half finished, and its first version has not even been released. If we want to use the Unicode standard, we must first help complete it, and only then can we discuss range allocation and the next steps."

"What Xiao Li means is that we should use GB 13000 first. We already have the foundation of GB 2312, so the approach is familiar and will produce quick results."

"What Elbow means is that we should work on Unicode from the very beginning and get it right in one step. Since the Unicode standard has not been finalized yet, we should take an active part and work on the standard together with them!"

"Of course it would be the best outcome if we could really achieve what Elbow describes. But do we have the strength?" Mr. Gu still had the impression that the country's information industry was only beginning to catch up. What worried him was that the country's current technical strength would not be enough to complete the work.

"In fact, they have basically completed this work." Li Hongjiang said: "Most computers use the American Standard Code for Information Interchange, that is, ASCII. It is a 7-bit encoding scheme that represents all uppercase and lowercase letters, digits, punctuation marks, and control characters. Unicode has already incorporated the ASCII codes: '\u0000' through '\u007F' correspond to all 128 ASCII characters."

"In other words, computer systems can in fact already use Unicode encoding; it just has not yet become a major standard?"

"There are still many areas that need improvement." Li Hongjiang said: "Of course, now that the ASCII part has been settled, at least the architecture is mature, and what remains are minor problems."

"If, and I mean if, we can supply characters on the order of 100,000 for them to fill in, I believe the consortium will be very interested."

A saying from later years goes: "First-rate companies make standards, second-rate companies make brands, and third-rate companies make products." The contest between GBK and Unicode was in fact a battle over standards.

Zhou Zhi added: "This is a matter where one small move affects everything. To put it bluntly, it is a battle over standards."

"China's voice in the world's information industry is close to negligible, but the Chinese character set can be regarded as a special resource."

"I'm afraid that even if all the countries with alphabetic scripts pooled every one of their symbols, they still would not have as many as China has Chinese characters."

"If we complete this character set first, Unicode can present it to the world as its decisive advantage."

"It's as if GBK were still firing tank cannons while Unicode has detonated a hydrogen bomb."

"We can certainly bring our results, pay the membership fees, and become members of the organization."

Li Hongjiang had done some research on this organization and said: "The Unicode Consortium is an organization based in California, USA. It actually allows any company or individual willing to pay membership fees to join."

"Two groups were set up in the late 1980s: one by commercial companies, which became the Unicode organization, and the other under the International Organization for Standardization, which works through international cooperation, as the ISO 10646 working group. Both were responses to the spread of computers and the internationalization of information."

"They soon discovered each other's existence. Since both were working toward the same goal, the two organizations cooperated to develop a universal code suited to all languages, and published the Unicode and ISO 10646 character sets in tacit coordination. Although the character encodings of the two are in fact identical, they remain two separate standards."

"The Unicode Consortium first released The Unicode Standard two years ago. Unicode has been developed in step with ISO/IEC 10646, the Universal Character Set formulated by the International Organization for Standardization. In terms of how the encoding works, the two are effectively the same."

"But The Unicode Standard contains more detailed implementation information, covering finer topics such as bitwise encoding, collation, and rendering. It even enumerates many character properties, including those needed to support both reading directions, such as the ordinary left-to-right direction and the right-to-left direction of Arabic."

Damn it! Zhou Zhi's eyes instantly met Gu Kailai's and Denzin's across the room. Ancient Chinese classics are also read from right to left!

If Arabic can use it, so can the Chinese classics!