What is the UNL Development Set?

The Universal Networking Language Development Set (UNL DS or UDS) contains the DeConverter and EnConverter software as well as other UNL tools. The systems that convert from UNL to a native language, and vice versa, are developed using this set of software. The contents of the UNL DS are explained below. The software and documents included in the UNL DS can be downloaded at http://www.undl.org/unlsys/ds.html. A contract is required to use the UNL DS; details are given on the same website.

1.        DeConverter

DeConverter is a language-independent generator that provides a framework for morphological generation, syntactic generation, and word selection for natural collocation, all performed synchronously. DeConverter can deconvert UNL expressions into a variety of native languages by using a separate set of files for each language: the Word Dictionary, the Grammatical Rules, and the Co-occurrence Dictionary.

Outline of the functions in DeConverter: DeConverter first transforms the sentence represented by a UNL expression (that is, a set of binary relations) into a directed hypergraph structure called a Node-net. The root node of a Node-net is called the Entry Node and represents the main predicate of the sentence. DeConverter then applies generation rules to each node in the Node-net and generates the word list in the target language. In this process, the syntactic structure is determined by applying Syntactic Rules, while morphemes are generated by applying Morphological Rules.
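The first step described above, turning a set of binary relations into a graph rooted at the Entry Node, can be illustrated with a minimal sketch. The tuple format, the function names `build_node_net` and `visit_order`, and the depth-first visiting order are assumptions made for this illustration; they are not the DeConverter's actual data structures or rule-application order, and the sketch uses a plain directed graph rather than a full hypergraph.

```python
# Hypothetical, simplified input: each binary relation is (label, source, target).
# Real UNL expressions are richer; these tuples are for illustration only.
relations = [
    ("agt", "eat", "John"),
    ("obj", "eat", "apple"),
    ("mod", "apple", "red"),
]


def build_node_net(relations, entry):
    """Return an adjacency map: node -> list of (label, target) arcs."""
    net = {}
    for label, source, target in relations:
        net.setdefault(source, []).append((label, target))
        net.setdefault(target, [])  # leaf nodes also appear in the map
    if entry not in net:
        raise ValueError(f"entry node {entry!r} does not occur in any relation")
    return net


def visit_order(net, entry):
    """Visit every node reachable from the entry node once, depth first."""
    seen, order, stack = set(), [], [entry]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Push arcs in reverse so they are visited in their original order.
        for _label, target in reversed(net[node]):
            stack.append(target)
    return order


net = build_node_net(relations, entry="eat")
print(visit_order(net, "eat"))
```

In a real generator, each visited node would be handed to the generation rules; here the traversal only shows that every node is reachable from the Entry Node.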

The generation capability of this system covers context-free languages, and it can also generate context-sensitive languages. Because of this, it is expected to be able to generate many of the world's languages. Co-occurrence relations between words contribute to better word selection, which makes it possible to generate more natural sentences.
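As a rough illustration of how co-occurrence relations can guide word selection, the sketch below picks, from several candidate words, the one that co-occurs most strongly with a neighbouring word. The table `CO_OCCURRENCE`, its scores, and the function `select_word` are hypothetical; the actual Co-occurrence Dictionary format and the DeConverter's selection procedure are defined in the UNL DS documents, not here.

```python
# Hypothetical co-occurrence table: (candidate word, neighbouring word) -> score.
# The words and scores are invented for illustration.
CO_OCCURRENCE = {
    ("strong", "tea"): 9,
    ("powerful", "tea"): 1,
    ("strong", "engine"): 3,
    ("powerful", "engine"): 8,
}


def select_word(candidates, neighbour, table=CO_OCCURRENCE):
    """Pick the candidate that co-occurs most strongly with the neighbour.

    Unseen pairs score 0; on a tie the earliest candidate in the list wins.
    """
    return max(candidates, key=lambda word: table.get((word, neighbour), 0))


print(select_word(["powerful", "strong"], "tea"))
print(select_word(["powerful", "strong"], "engine"))
```

With this toy table, the same pair of candidates yields a different choice depending on the neighbouring word, which is the effect the co-occurrence relation is meant to capture.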

2.        EnConverter

The EnConverter is a language-independent parser that provides a framework for morphological, syntactic, and semantic analysis, all performed synchronously. Without synchronous syntactic and semantic analysis, it would be impossible to resolve all the ambiguities even in morphological analysis. Likewise, it would be impossible to resolve every ambiguity in syntactic analysis in the absence of semantic analysis.

The EnConverter works in the following way. An input string is scanned from left to right. As the string is scanned, all dictionary morphemes that match its starting characters are retrieved and become the candidate morphemes. The rules are applied to these candidate morphemes according to rule priority in order to build the syntactic tree and the semantic network for the sentence. The remaining character string is then scanned from its beginning, according to the applied rule, and the process continues in the same manner. The output of the whole process is a semantic network expressed in the UNL format.
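The scanning loop described above can be sketched as follows. This is only an illustration of the dictionary-lookup part: the toy `DICTIONARY`, the functions `candidates_at` and `scan`, and the use of "longest match wins" as a stand-in for rule priority are all assumptions. The real EnConverter applies enconversion rules to the candidates and builds a syntactic tree and a semantic network, which this sketch does not attempt.

```python
# Hypothetical toy dictionary: headword -> label (illustration only).
DICTIONARY = {
    "un": "prefix",
    "unl": "noun",
    "able": "suffix",
    "read": "verb",
    "readable": "adjective",
}


def candidates_at(text, start, dictionary=DICTIONARY):
    """Return all headwords that match the text beginning at position start."""
    return [word for word in dictionary if text.startswith(word, start)]


def scan(text, dictionary=DICTIONARY):
    """Scan left to right, choosing one candidate morpheme at each position."""
    position, chosen = 0, []
    while position < len(text):
        candidates = candidates_at(text, position, dictionary)
        if not candidates:
            raise ValueError(f"no dictionary entry matches at position {position}")
        best = max(candidates, key=len)  # stand-in for rule priority
        chosen.append((best, dictionary[best]))
        position += len(best)  # continue with the remaining character string
    return chosen


print(candidates_at("unreadable", 2))
print(scan("unreadable"))
```

At position 2 both "read" and "readable" are candidates; the sketch resolves the choice by length, whereas the EnConverter would resolve it through its prioritized rules.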

3.        Word Dictionary Builder

The Word Dictionary Builder is a tool to make an indexed word dictionary from text data. This indexed word dictionary can be used in both DeConversion and EnConversion. A brief explanation of this tool, covering the format of the word dictionary text data and the usage of the tool, is included in the UNL DS.

4.        Co-occurrence Relation (CR) Dictionary Builder

The CR Dictionary Builder is a tool to make a formatted CR dictionary from text data. This formatted CR dictionary is used in DeConversion when selecting equivalents for natural wording. A brief explanation of this tool, covering the format of the CR text data and the usage of the tool, is included in the UNL DS.

DOCUMENTS

1.        Specifications of DeConverter

The Specifications of DeConverter explain the function and the structure of the DeConverter. The DeConverter works based on the deconversion rules, using a word dictionary and a co-occurrence dictionary. The format and types of rules, and how the information in the dictionaries is used, are explained in detail.

2.        Specifications of EnConverter

The Specifications of EnConverter explain the function and the structure of the EnConverter. The EnConverter works based on the enconversion rules, using a word dictionary and the UNL KB. The format and types of rules, and how the information in the dictionaries is used, are explained in detail.

3.        Manual of Co-occurrence Dictionary

[In preparation]

This manual explains how to make a Co-occurrence Dictionary and how to use it in deconversion.

For further information please contact:


UNL Center, UNDL Foundation
53-70, Jingumae 5-chome, Shibuya-ku,
Tokyo 150-8925, Japan
Tel (81) 3-3499-2811 (Ex.2094 / 2105)
Email: unlsys@undl.org