Nový encTeX - kódování UTF-8 v TeXu
Zpravodaj Československého sdružení uživatelů TeXu, Volume 13 (2003), no. 2, pp. 98-106.

See the article record from the source Czech Digital Mathematics Library

The UTF-8 encoding leaves the standard ASCII characters unchanged and encodes the accented letters of our alphabets in two bytes. Standard 8-bit TeX is not ready for UTF-8 input because it has to handle such a single character as two tokens. This means that you cannot set the \catcode, \uccode, etc. of these characters, and \futurelet does not see the next character in the usual sense. The second version of my encTeX solves these problems. encTeX is fully backward compatible with the original TeX. It adds ten new primitives with which you can set or read the conversion tables used by TeX's input processor and used during output to the terminal, the log and \write files. The second version makes it possible to convert multi-byte sequences to a single byte or to a control sequence. You can implement up to 256 UTF-8 codes as single bytes and an unlimited number of other UTF-8 codes as control sequences. All internals of 8-bit TeX work the same way as if a "normal one-byte encoding" of input files were used. I think that the UTF-8 encoding will come into more common use. In that situation there is no other way than to modify the input processor of TeX; otherwise 8-bit TeX will die in a short time.
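The idea in the abstract can be illustrated outside TeX with a small Python sketch. It is not encTeX code and does not use the encTeX primitives; it only shows that an accented letter occupies two bytes in UTF-8 and how a conversion table might map multi-byte sequences to one byte or to a control sequence name. The table contents, the choice of ISO-8859-2 as the internal one-byte encoding, the name `\euro`, and the helper names `utf8_bytes` and `convert_line` are all assumptions made for illustration.

```python
# Illustrative sketch only: encTeX performs this kind of conversion inside
# TeX's input processor; nothing here is the actual encTeX implementation.

def utf8_bytes(ch: str) -> list[int]:
    """Return the UTF-8 byte values of a single character."""
    return list(ch.encode("utf-8"))

# Hypothetical conversion table: UTF-8 byte sequence -> replacement bytes.
# Accented letters map to one byte (ISO-8859-2 is an assumed internal encoding).
TABLE: dict[bytes, bytes] = {
    ch.encode("utf-8"): ch.encode("iso-8859-2") for ch in "áčěřšžů"
}
# A code outside the one-byte table maps to a control sequence name.
# In real TeX this would be a token; plain bytes are a simplification here.
TABLE["€".encode("utf-8")] = b"\\euro "

def convert_line(raw: bytes, table: dict[bytes, bytes]) -> bytes:
    """Replace every multi-byte sequence found in the table, longest match first."""
    longest = max((len(key) for key in table), default=1)
    out = bytearray()
    i = 0
    while i < len(raw):
        for n in range(longest, 1, -1):
            seq = raw[i:i + n]
            if seq in table:
                out += table[seq]
                i += n
                break
        else:
            # No multi-byte match: copy the byte unchanged (ASCII stays ASCII).
            out.append(raw[i])
            i += 1
    return bytes(out)
```

For example, `utf8_bytes("č")` yields two byte values, while `convert_line("čaj".encode("utf-8"), TABLE)` yields a three-byte result in which `č` is a single byte, so later processing can treat it as one character.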
@article{10_5300_2003_2_98,
     author = {Ol\v{s}\'ak, Petr},
     title = {Nov\'y {encTeX} - k\'odov\'an{\'\i} {UTF-8} v {TeXu}},
     journal = {Zpravodaj \v{C}eskoslovensk\'eho sdru\v{z}en{\'\i} u\v{z}ivatel\r{u} TeXu},
     pages = {98--106},
     publisher = {mathdoc},
     volume = {13},
     number = {2},
     year = {2003},
     doi = {10.5300/2003-2/98},
     language = {cz},
     url = {http://geodesic.mathdoc.fr/articles/10.5300/2003-2/98/}
}
TY  - JOUR
AU  - Olšák, Petr
TI  - Nový encTeX - kódování UTF-8 v TeXu
JO  - Zpravodaj Československého sdružení uživatelů TeXu
PY  - 2003
SP  - 98
EP  - 106
VL  - 13
IS  - 2
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/articles/10.5300/2003-2/98/
DO  - 10.5300/2003-2/98
LA  - cz
ID  - 10_5300_2003_2_98
ER  - 
%0 Journal Article
%A Olšák, Petr
%T Nový encTeX - kódování UTF-8 v TeXu
%J Zpravodaj Československého sdružení uživatelů TeXu
%D 2003
%P 98-106
%V 13
%N 2
%I mathdoc
%U http://geodesic.mathdoc.fr/articles/10.5300/2003-2/98/
%R 10.5300/2003-2/98
%G cz
%F 10_5300_2003_2_98
Olšák, Petr. Nový encTeX - kódování UTF-8 v TeXu. Zpravodaj Československého sdružení uživatelů TeXu, Volume 13 (2003), no. 2, pp. 98-106. doi: 10.5300/2003-2/98. http://geodesic.mathdoc.fr/articles/10.5300/2003-2/98/
