Representation of Texts in Structured Form
Computer Science and Information Systems, Volume 9 (2012) no. 1.

See the article record from the source Computer Science and Information Systems website

Although existing knowledge representation techniques, ranging from relational databases to the most recent Semantic Web languages, are applied successfully in numerous practical applications, they are still unable to represent the information contained in text documents and web pages in a structured form suitable for productive text processing. Text files can represent text documents with no loss of information; however, this information is represented in an unstructured form. The various knowledge formalisms used in the different phases of Natural Language Understanding, such as lexical, syntactic, semantic, pragmatic, and discourse analysis, are likewise unable to represent texts in structured form with no loss of information. In this paper, we define the crucial requirements for structured text representation and then give a brief introduction to a representation technique that fulfills all of these requirements, including the basic data types and learning techniques used to create, maintain, and interpret the resulting representation formalism.
Keywords: structured representation, learning, text processing, natural language understanding, regular languages
@article{CSIS_2012_9_1_a2,
     author = {Mladen Stanojevic and Sanja Vrane\v{s}},
     title = {Representation of {Texts} in {Structured} {Form}},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {9},
     number = {1},
     year = {2012},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2012_9_1_a2/}
}
TY  - JOUR
AU  - Mladen Stanojevic
AU  - Sanja Vraneš
TI  - Representation of Texts in Structured Form
JO  - Computer Science and Information Systems
PY  - 2012
VL  - 9
IS  - 1
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CSIS_2012_9_1_a2/
ID  - CSIS_2012_9_1_a2
ER  - 
%0 Journal Article
%A Mladen Stanojevic
%A Sanja Vraneš
%T Representation of Texts in Structured Form
%J Computer Science and Information Systems
%D 2012
%V 9
%N 1
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CSIS_2012_9_1_a2/
%F CSIS_2012_9_1_a2
Mladen Stanojevic; Sanja Vraneš. Representation of Texts in Structured Form. Computer Science and Information Systems, Volume 9 (2012) no. 1. http://geodesic.mathdoc.fr/item/CSIS_2012_9_1_a2/