What is the UNL Development Set?
The Universal Networking Language Development Set (UNL DS or UDS) contains the
DeConverter and EnConverter software as well as other UNL tools. This set of
software is used to develop conversion systems from UNL to a native language and
vice versa. The contents of the UNL DS are explained below. The software and
documents included in the UNL DS can be downloaded at
http://www.undl.org/unlsys/ds.html.
A contract is required to use the UNL DS; details are given on the same website.
DeConverter is a language-independent generator that provides a framework for
performing morphological generation, syntactic generation, and word selection
for natural collocation synchronously. DeConverter can deconvert UNL expressions
into a variety of native languages, using a separate set of files for each
language: the Word Dictionary, the Grammatical Rules, and the Co-occurrence
Dictionary.
Outline of the functions in DeConverter: DeConverter first transforms the
sentence represented by a UNL expression (that is, a set of binary relations)
into a directed hypergraph structure called a Node-net. The root node of a
Node-net is called the Entry Node and represents the main predicate of the
sentence. DeConverter then applies generation rules to each node in the
Node-net and generates the word list in the target language. In this process,
the syntactic structure is determined by applying Syntactic Rules, while
morphemes are generated by applying Morphological Rules.
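The first step described above can be illustrated with a minimal sketch. It is not DeConverter's implementation: it assumes the binary relations are already available as (label, source, target) triples, marks the Entry Node with an ".@entry" suffix on the node name, and builds a plain directed graph rather than a full hypergraph. The function names build_node_net and traverse are hypothetical.

```python
from collections import defaultdict

def build_node_net(relations):
    """Build adjacency lists from (label, source, target) triples.

    Returns (entry_node, edges), where edges maps each source node to a
    list of (label, target) pairs. Assumption: the entry node is the one
    whose name carries the ".@entry" attribute.
    """
    edges = defaultdict(list)
    nodes = []
    for label, source, target in relations:
        edges[source].append((label, target))
        for node in (source, target):
            if node not in nodes:
                nodes.append(node)
    entries = [n for n in nodes if ".@entry" in n]
    if len(entries) != 1:
        raise ValueError("expected exactly one entry node")
    return entries[0], dict(edges)

def traverse(entry, edges):
    """Visit every node reachable from the entry node, depth first."""
    order, seen, stack = [], set(), [entry]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Push targets in reverse so they are visited in listed order.
        for _label, target in reversed(edges.get(node, [])):
            stack.append(target)
    return order

# Illustrative relations for a sentence like "John eats a red apple".
relations = [
    ("agt", "eat.@entry", "John"),
    ("obj", "eat.@entry", "apple"),
    ("mod", "apple", "red"),
]
entry, edges = build_node_net(relations)
visit_order = traverse(entry, edges)
```

In a real deconverter the traversal would be driven by the generation rules rather than by a fixed depth-first order; the sketch only shows how a set of binary relations yields a graph rooted at the entry node.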
The generation capability of this system covers context-free languages and
extends to context-sensitive languages. With this expressive power, it is
expected to be able to generate many of the world's languages. Co-occurrence
relations between words contribute to better word selection, which makes it
possible to generate more natural sentences.
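As a rough illustration of how co-occurrence information can guide word selection, the sketch below picks, among several candidate words, the one that co-occurs most strongly with a neighbouring word. The table layout, the scores, and the function name select_word are all assumptions made for this example; they do not reflect the actual Co-occurrence Dictionary format.

```python
def select_word(candidates, relation, neighbour, cooccurrence):
    """Pick the candidate that co-occurs most strongly with the neighbour.

    cooccurrence maps (relation, neighbour, candidate) to a score; missing
    entries count as 0. Ties keep the earlier candidate.
    """
    best, best_score = None, -1
    for word in candidates:
        score = cooccurrence.get((relation, neighbour, word), 0)
        if score > best_score:
            best, best_score = word, score
    return best

# Hypothetical co-occurrence scores, invented for illustration only.
cooccurrence = {
    ("mod", "tea", "strong"): 5,
    ("mod", "tea", "powerful"): 0,
    ("mod", "engine", "powerful"): 4,
}
choice_tea = select_word(["powerful", "strong"], "mod", "tea", cooccurrence)
choice_engine = select_word(["powerful", "strong"], "mod", "engine", cooccurrence)
```

With these invented scores the same pair of candidates resolves differently depending on the neighbouring word, which is the effect the co-occurrence relations are meant to achieve.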
The EnConverter is a language-independent parser that provides a framework for
performing morphological, syntactic, and semantic analysis synchronously. It
would be impossible to resolve all ambiguities, even in morphological analysis,
unless syntactic or semantic analysis is performed at the same time. Likewise,
it would be impossible to resolve every ambiguity in syntactic analysis in the
absence of semantic analysis.
The EnConverter works in the following way. An input string is scanned from
left to right. As the string is scanned, all morphemes in the dictionary that
match its starting characters are retrieved and become the candidate morphemes.
The rules are applied to these candidate morphemes according to rule priority in
order to build the syntactic tree and the semantic network for the sentence. The
remaining character string is then scanned from its beginning, as directed by
the applied rule, and the process continues in the same manner. The output of
the whole process is a semantic network expressed in the UNL format.
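The scanning step can be sketched as follows. This is a simplified illustration, not EnConverter's algorithm: the dictionary is a plain set of strings, and a longest-match preference stands in for the rule priorities, which are not described here. The names candidate_morphemes and scan are hypothetical.

```python
def candidate_morphemes(text, start, dictionary):
    """Return dictionary entries matching text at position start, longest first."""
    rest = text[start:]
    matches = [entry for entry in dictionary if entry and rest.startswith(entry)]
    return sorted(matches, key=len, reverse=True)

def scan(text, dictionary):
    """Segment text left to right, taking the longest candidate at each step.

    Assumption: longest match replaces the real rule-priority mechanism.
    Characters with no dictionary match are passed through one at a time.
    """
    pos, result = 0, []
    while pos < len(text):
        candidates = candidate_morphemes(text, pos, dictionary)
        chosen = candidates[0] if candidates else text[pos]
        result.append(chosen)
        pos += len(chosen)
    return result

# Toy dictionary, invented for illustration.
dictionary = {"un", "unlock", "lock", "ed", "s"}
candidates = candidate_morphemes("unlocked", 0, dictionary)
morphemes = scan("unlocked", dictionary)
```

Here both "unlock" and "un" are candidates at the start of the string; choosing between them is exactly the kind of ambiguity that, in the real system, is resolved by applying rules together with syntactic and semantic information.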
The Word Dictionary Builder is a tool for making an indexed word dictionary from
text data. This indexed word dictionary can be used in both DeConversion and
EnConversion. The Development Set includes a brief explanation of this tool,
covering the format of the word dictionary text data and the usage of the tool.
The CR Dictionary Builder is a tool for making a formatted CR dictionary from
text data. This formatted CR dictionary is used in DeConversion when selecting
equivalents for natural wording. The UNL DS includes a brief explanation of this
tool, covering the format of the CR text data and the usage of the tool.
DOCUMENTS
In the Specifications of DeConverter, the function and the structure of the DeConverter are explained. The DeConverter works based on the deconversion rules, using a word dictionary and a co-occurrence dictionary. The format and types of rules and how the information in dictionaries is used are explained in detail.
In the Specifications of EnConverter, the function and the structure of the EnConverter are explained. The EnConverter works based on the enconversion rules, using a word dictionary and the UNL KB. The format and types of rules and how the information in dictionaries is used are explained in detail.
[In preparation]
This manual explains how to make a Co-occurrence Dictionary and how to use it in deconversion.
For further information, please contact:
UNL Center, UNDL Foundation
53-70, Jingumae 5-chome, Shibuya-ku,
Tokyo 150-8925, Japan
Tel (81) 3-3499-2811 (Ex.2094 / 2105)
Email: unlsys@undl.org