Note on the IETF linguistic doctrine and RFC 4646 text tuning

From Wikidna.org

JFC Morfin


Background

There are several schools of linguistics. One of them, Anglo-Saxon structuralism, is famous for its technical impact on software, but also for its eventual failure to fully deliver in the language area. Its best-known figure is Noam Chomsky.

Its basic concept is the existence of a Universal Grammar that structures all natural languages. This school has influenced:

  • the EDP approach to language support, for two decades, through the "globalization" doctrine,
  • the IETF through the internationalization RFCs of Harald Alvestrand, which apply it to the Internet,
  • the entire world through the "shaping the world" US knowledge war strategy.

Globalization

Globalization was first an IBM attempt to remove the language barrier between IBM computers and their foreign users. Its main thinker is Mark Davis (formerly of IBM, now at Google), the President of Unicode and co-author of RFC 4646 with Addison Phillips (his counterpart at Yahoo!). As he describes it, Globalization comprises:

  • internationalizing the English support of the medium (the Internet) by using Unicode to support every character set,
  • localizing computers by using the "locale" files of the Unicode CLDR project,
  • langtagging the content in order to permit its consistent filtering and support throughout IDNs, locales, applications, search engines, etc.

This analysis is based on the idea that a Universal Grammar does exist, so that what is good for English is equally good for other languages. It follows that internationalizing English to support every language is legitimate and will be transparent to users (and rewarding for the English-speaking industry and e-commerce).
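
To make the langtagging and filtering item above concrete, here is a minimal Python sketch. It assumes a deliberately simplified tag shape, language[-script][-region], where the subtags follow ISO 639, ISO 15924, and ISO 3166-1 (or a three-digit UN M.49 area code) respectively. Real RFC 4646 tags may also carry extended language, variant, extension, and private-use subtags, which this sketch rejects rather than handles. The names `LangTag`, `parse_langtag`, and `filter_by_language` are hypothetical and come neither from the RFC nor from any library.

```python
import re
from typing import NamedTuple, Optional

# Simplified shape only: language[-script][-region].
# This is an illustration, not the full RFC 4646 grammar.
_LANGTAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"               # ISO 639 language code
    r"(?:-(?P<script>[A-Za-z]{4}))?"             # ISO 15924 script code
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"    # ISO 3166-1 or UN M.49
)


class LangTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


def parse_langtag(tag: str) -> Optional[LangTag]:
    """Split a simplified tag into subtags; return None if it does not fit."""
    match = _LANGTAG.fullmatch(tag)
    if match is None:
        return None
    script = match.group("script")
    region = match.group("region")
    # Conventional casing: lowercase language, Titlecase script, UPPERCASE region.
    return LangTag(
        match.group("language").lower(),
        script.title() if script else None,
        region.upper() if region else None,
    )


def filter_by_language(tagged_items, language):
    """Keep the (tag, text) pairs whose primary language subtag matches."""
    wanted = language.lower()
    kept = []
    for tag, text in tagged_items:
        parsed = parse_langtag(tag)
        if parsed is not None and parsed.language == wanted:
            kept.append((tag, text))
    return kept
```

For example, `filter_by_language([("fr-FR", "bonjour"), ("en", "hello")], "fr")` keeps only the first pair. Note that the region subtag is where such a tag meets ISO 3166, which is the interoperability point discussed in this note.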

Multilingualization

Unfortunately, there is no Universal Grammar, even if there are common semiotic practices. We now know that human languages do not work that way, and human beings are not just IBM machines or Google users. Moreover, there are far more languages than the 130 supported by the Unicode CLDR locale files (20,000 language entities should be langtagged by ISO 639-6).

We can currently observe the impact of this confusion in the proposed WG-IDNABIS charter. It comes eight years after China, India, Russia, Israel, etc. started using IDNs and presented them at an ICANN general meeting at the request of the NTIA, after which the IETF rushed a WG-IDNA in order to document them along a structuralist vision. In the words of Vint Cerf: "the IDNAbis [] work arose as a consequence of experiences and problems with IDNA2003, which ICANN did its best to implement. The initiative arose out of IETF contributors' efforts to improve the specifications and to reduce various kinds of confusion and potential abuse."

Linguistic diversity obliges us to investigate and respect the complex ways in which language is uttered, in order to devise solutions that permit the Internet to support every language and every speaker equally.

ISO 3166/639-6 vs. IETF Langtags

Countries, scripts, and administrative and normative languages are fundamental referents of the reality of the world and of the Information Society. They are internationally standardized by ISO 3166 of ISO/TC46 (I made it the reference for the cyberworld in 1978, which is why I am concerned by its possible corruption). ISO 3166 documents the countries, their languages, and their scripts. Its support is backed by consistent WTO and WIPO agreements, by Human Rights provisions concerning linguistic diversity, and by the unanimous WSIS (World Summit on the Information Society) positions on multilingualism.

A text was proposed at the IETF by Unicode experts to document their language tagging. It was twice defeated in IETF Last Call as an independent submission. Randy Presuhn (WG-LTRU Chair) then sponsored it to become an IETF RFC. This text did not consider interoperability with ISO 3166. Had it been generalized to the Internet through its IANA registry, it would eventually have confused the technical and cultural support of language diversity enough to impede, in practice, the deployment of a truly multilingual Internet.

The reason for this, in addition to the structuralist approach, was a total lack of relations with the ISO 3166/MA (Maintenance Agency), even though ICANN is one of its ten members.

The mail-combat action necessary for RFC 4646 interoperability

As a volunteer confronted with the salaried experts of large corporations, I was delegated by various industrial, cultural, and sovereign (governmental) interests to carry out a "mail-combat" action. Its aim was to obtain a text that would remain interoperable with ISO 3166, with ISO 639-6 (the simpler and exhaustive equivalent planned by ISO), and with various other ongoing ISO and ISO-based works, including my non-profit organization's MDRS (Multilingual Distributed Referential System, an open extension of the IANA).

I was successful. RFC 4646 is acceptable (but not respected by the IETF). As a result, the IETF "Globalization/Structural" affinity group led a PR action against me in the hope that I could no longer steer their consensus and calendar. It was too late: ideas have their own momentum. The IESG accepted RFC 4646 a few hours after the WSIS "Tunis deal", which made sure that it could not be globally enforced. In addition, the IESG did not respect its RFC 4646 obligations, which should have made it possible to better embody interoperability. It is hoped that the coming RFC 4646bis will help in that area.

The counterattack of the supporters of structuralism

Debbie Garside led the counterattack. She defended Anglo-Saxon structuralism with interesting technical additions that the IETF did not even understand, but which interested me and a UNESCO friend of mine. She was brilliant, very active, and efficient. However, she was confronted with the technical inadequacy of her positions:

  • she joined the BSI (British standards agency) and took over the ISO 639-6 work. It has been delayed for years so far.
  • she introduced herself into the British GAC team, into GAC/ccNSO (where she obtained a vote from the ICANN BoD, in turn requesting an IDN 3166 project), and into ISO 3166/MA, where she voted in vain against its multilingual nature.
  • she created the WLDC [1], of which she is the CEO: it gathers [2] some members of the WG-LTRU, of ISO TC37, of a fake MINC (she even used one of my MINC titles and co-hijacked the MINC website), etc.
  • she then had the BSI introduce an ISO TC46 NWIP (new work item proposal) concerning an ISO 3166-4 project to internationalize ISO 3166. This prevented ISO 3166 from becoming a working international example (paradigm) of how to support multilingualization. It also gave her full responsibility for the ISO counterpart of RFC 4646, which she could then constrain in the same way.

The vote on "internationalization"

I opposed that NWIP because internationalization is a closed process that is not neutral towards non-English languages, whereas multilingualization must be open and equally neutral towards every language. However, I also proposed that we cooperate, because there is plenty of room for work in the open multilingualization field. Her response was so ad hominem that ISO proposed a conciliation meeting. This meeting gathered the ISO/Central Secretariat, ISO/TC46, AFNOR, BSI, ICANN, ISO 3166/MA, and me (I had also proposed that the IETF and the WG-LTRU participate).

During that meeting, ICANN stated that they were neutral, that they needed solutions, and that they would follow the users. The BSI/WLDC/Debbie Garside NWIP on "internationalizing (supporting non-ASCII scripts) ISO 3166" was nevertheless introduced and submitted to a vote of TC46/SC2, against my formal opposition, because of the time already spent and in order to preserve the image of the BSI and the British GAC.

In spite of an uncommon extension of the regular three-month voting period, only the UK (BSI), Ireland (a colleague of Michael Everson, the current IETF Language Subtag Registry reviewer), and the USA (after the regular initial deadline) actively supported the NWIP. It was disregarded by most countries and formally opposed by one.

Initiative 3166-4

As a consequence, the MLTF (Multilinguistique et Technologies de Facilitation[3]) is pursuing its 3166-4 project [4], reporting to ISO 3166/MA according to the ISO 3166 rules. It wants to capitalize on ISO 3166 by using it as a multilingualization paradigm. On June 27, 2008, the MLTF should hold a meeting in Paris (during the World Internet Week de Paris), sponsored by various organizations. It should discuss a series of contributions leading towards a Multilingual Internet, and towards multilinguistics as the science that studies the implications of the parallel or simultaneous use of multiple languages.

The ccTAG/UNITAG-oriented draft http://www.ietf.org/internet-drafts/draft-mltf-jfcm-cctags-01.txt is the first part of this public work. It documents ccTAGs as interoperable with RFC 4646.
