UNL Development Set

What is the UDS?
Who can use the UDS?
How to start on developing a deconversion module?
How to start on developing an enconversion module?
To download


The UDS is open to members of the UNL Society who have signed the UDS agreement. Further information on the UNL Society is available under "UNL Society" at http://www.undl.org/
For more information or questions, please contact us at unlcenter@undl.org

 

 

What is the UDS?

The UDS (UNL Development Set) is a set of UNL System tools that developers use to build conversion modules between natural languages and UNL. It contains the DeConverter, the EnConverter, the Word Dictionary Builder, and the specifications or manuals of these tools. More information on these tools is available at the UNL System.

 

 

Who can use the UDS?

 

To use the UDS, it is necessary to sign the following agreements:

 

"AGREEMENT TO ENTER THE UNL SOCIETY"

"UNL DEVELOPMENT SET LICENSE OF AGREEMENT"

 

For more information on how to enter the UNL Society, see "how to enter" under "UNL Society" at http://www.undl.org/

 

 

 

How to start on developing a deconversion module?

To develop a deconversion module for a language using the DeConverter provided by the UNL Center, you need to develop a word dictionary and deconversion rules for that language. The word dictionary provides the words of the language that correspond to the UWs appearing in the UNL Expressions given as input to the DeConverter, together with the grammatical attributes (features) of the headwords. The deconversion rules describe the operations that deconvert UNL Expressions into sentences of the language. Detailed information on the DeConverter, deconversion rules, and the word dictionary is given in the specifications of the DeConverter and the manual of the Word Dictionary Builder. All tools, specifications, and manuals of the UDS can be downloaded under "To download".

 

In the following explanation, "d.txt" is a list of example entries of an English word dictionary. "elgexam.txt" is a set of example English deconversion rules, which includes the rules necessary for deconverting "example.unl". "example.unl" is an example of a UNL Expression.

 

To start developing a deconversion module, simply follow the steps below.

 

STEP 1

 

To prepare text data for the word dictionary entries of a language

The description format of the text data for word dictionary entries is given in the manual of the Word Dictionary Builder.

 

STEP 2

 

To convert text data of word dictionary entries into IBAM formatted files

"DicBldL.exe" is used to convert word dictionary data for one-byte code languages.

"DicBldC.exe" is used to convert word dictionary data for two-byte code languages.

Usage of the Dictionary Builder tools is shown in the manual.

"d.dic" and "d.pix" are examples of the IBAM formatted files made from "d.txt" using "DicBldL.exe".

 

STEP 3

 

To write deconversion rules

Information on how to write deconversion rules is given in the specifications of the DeConverter. "elgexam.txt" is an example of English deconversion rules for deconverting "example.unl".

 

STEP 4

 

To deconvert

The "DeCoL" version is used to deconvert UNL into one-byte code languages.

The "DeCoC" version is used to deconvert UNL into two-byte code languages.

"example_decoe.txt" shows the results of deconversion from "example.unl".

Usage of the DeConverter is shown in the specifications.

 

STEP 5

 

To check the results and, if they are not correct, revise the dictionary entries or rules

The DeConverter can output detailed traces of the deconversion process. Problems can be detected by checking the traces. The information included in the traces is explained in the specifications.
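The check in STEP 5 can be partly automated by comparing the deconverted sentences with the sentences you expected. The following is a minimal sketch in Python, not part of the UDS. It assumes the sentences have already been extracted into two lists of strings, because the layout of result files such as "example_decoe.txt" is not described here. The function names report_mismatches and word_diff are hypothetical.

```python
import difflib
from itertools import zip_longest


def report_mismatches(expected, actual):
    """Return (index, expected, actual) for every sentence pair that differs.

    Both arguments are lists of sentences. A missing sentence on either
    side is treated as an empty string, so length differences show up too.
    """
    mismatches = []
    for i, (exp, act) in enumerate(zip_longest(expected, actual, fillvalue="")):
        exp, act = exp.strip(), act.strip()
        if exp != act:
            mismatches.append((i, exp, act))
    return mismatches


def word_diff(expected_sentence, actual_sentence):
    """Return a word-level diff ('- ' removed, '+ ' added, '  ' unchanged)."""
    return list(difflib.ndiff(expected_sentence.split(), actual_sentence.split()))


if __name__ == "__main__":
    # Illustrative sentences only; they are not taken from the UDS examples.
    expected = ["the cat sat", "a dog ran"]
    actual = ["the cat sat", "a dog run"]
    for index, exp, act in report_mismatches(expected, actual):
        print(f"sentence {index} differs:")
        print("\n".join(word_diff(exp, act)))
```

A mismatch only tells you which sentence to look at. The cause, whether a dictionary entry or a rule, still has to be found in the DeConverter traces.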

 

 

 

How to start on developing an enconversion module?

To develop an enconversion module for a language using the EnConverter provided by the UNL Center, you need to develop a word dictionary and enconversion rules for that language. The word dictionary provides the UWs that correspond to the words in the input sentences of the language, together with the grammatical attributes (features) of the words. The enconversion rules describe the operations that enconvert sentences of the language into UNL Expressions. Detailed information on the EnConverter, enconversion rules, and the word dictionary is given in the specifications of the EnConverter and the manual of the Word Dictionary Builder. All tools, specifications, and manuals of the UDS can be downloaded under "To download".

 

To start developing an enconversion module, simply follow the steps below.

 

STEP 1

 

To prepare text data for the word dictionary entries of the words included in the input sentences

Corresponding UWs must be given for meaningful words. The EnConverter uses these UWs to create UNL Expressions.

The description format of the text data for word dictionary entries is given in the manual of the Word Dictionary Builder.

"eng.txt" is an example of English input sentences.

"d.txt" is an example of an English word dictionary, which includes the entries for the words included in "eng.txt".
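Before running the EnConverter, it can help to check that every word of the input sentences has a dictionary entry. The following is a rough sketch in Python, not part of the UDS. It assumes the headwords have already been collected into a set, since the entry format is defined in the manual of the Word Dictionary Builder and is not reproduced here. The function name uncovered_words and the simple English-only tokenizer are assumptions for illustration.

```python
import re


def uncovered_words(sentences, headwords):
    """List input words that have no matching headword, in order of first use.

    sentences: iterable of input sentences (for example, the lines of eng.txt).
    headwords: iterable of dictionary headwords, already extracted.
    Matching is case-insensitive and works on surface forms only.
    """
    known = {h.lower() for h in headwords}
    missing = []
    for sentence in sentences:
        # Assumption: words are runs of ASCII letters and apostrophes.
        for word in re.findall(r"[A-Za-z']+", sentence):
            w = word.lower()
            if w not in known and w not in missing:
                missing.append(w)
    return missing


if __name__ == "__main__":
    # Illustrative data only; not taken from eng.txt or d.txt.
    sentences = ["The cat sat.", "A cat ran."]
    headwords = {"the", "cat", "a"}
    print(uncovered_words(sentences, headwords))
```

This is a surface check. A word reported as missing may still be handled through an entry for a different form of the word, depending on how the dictionary and rules are written.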

 

STEP 2

 

To convert text data of word dictionary entries into IBAM formatted files

"DicBldL.exe" is used to convert word dictionary data for one-byte code languages.

"DicBldC.exe" is used to convert word dictionary data for two-byte code languages.

Usage of the Dictionary Builder tools is shown in the manual.

"d.dic" and "d.pix" are examples of the IBAM formatted files made from "d.txt" using "DicBldL.exe".

 

STEP 3

 

To write enconversion rules

Information on how to write enconversion rules is provided in the specifications of the EnConverter. "elaexam.txt" is an example of English enconversion rules for enconverting "eng.txt".

 

STEP 4

 

To enconvert

The "EnCoL" version is used to enconvert sentences of one-byte code languages.

The "EnCoC" version is used to enconvert sentences of two-byte code languages.

"eng.unl" shows the results of enconversion from "eng.txt".

Usage of the EnConverter is shown in the specifications.

 

STEP 5

 

To check the results and, if they are not correct, revise the dictionary entries or rules

The EnConverter can output detailed traces of the enconversion process. Problems can be detected by checking the traces. The information included in the traces is explained in the specifications.
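The five steps above each produce or require specific files, so progress can be tracked by checking which files exist. The following is a small sketch in Python, not part of the UDS. The file names are the English examples named in the steps; your own module will use different names. The names ENCONVERSION_STEPS and first_incomplete_step are hypothetical.

```python
# Files named in the enconversion steps, using the English examples.
# STEP 5 (checking results) produces no new file, so it is not listed.
ENCONVERSION_STEPS = [
    ("STEP 1", ["eng.txt", "d.txt"]),   # input sentences and dictionary text
    ("STEP 2", ["d.dic", "d.pix"]),     # IBAM formatted dictionary files
    ("STEP 3", ["elaexam.txt"]),        # enconversion rules
    ("STEP 4", ["eng.unl"]),            # enconversion results
]


def first_incomplete_step(existing_files):
    """Return (step, missing_files) for the first step lacking a file, or None."""
    present = set(existing_files)
    for step, files in ENCONVERSION_STEPS:
        missing = [name for name in files if name not in present]
        if missing:
            return step, missing
    return None


if __name__ == "__main__":
    # Example: only the STEP 1 files have been prepared so far.
    print(first_incomplete_step(["eng.txt", "d.txt"]))
```

In practice the list of existing files could come from os.listdir on the working directory.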

 

 

 

To download

There are two versions of the DeConverter, the EnConverter, and the Dictionary Builders: the C-Version and the L-Version. The C-Versions are developed for two-byte code languages such as Chinese (GB code), Korean (KIS code), and Thai. The L-Versions are developed for ASCII and any one-byte code languages such as Arabic, Latin languages, and Hindi.
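The choice between the C and L versions depends on whether the character code of your language data uses more than one byte per character. The following is a minimal sketch in Python, not part of the UDS, that makes this choice from a text sample. The tool names are the ones used in the steps above. The function names and the use of Python codec names such as "gb2312" and "latin-1" are assumptions for illustration.

```python
def needs_two_byte_tools(sample_text, codec):
    """Return True if any character of sample_text takes more than one byte in codec."""
    return any(len(ch.encode(codec)) > 1 for ch in sample_text)


def pick_tools(sample_text, codec):
    """Map a text sample and its character code to the C or L tool names."""
    suffix = "C" if needs_two_byte_tools(sample_text, codec) else "L"
    return {
        "dictionary_builder": f"DicBld{suffix}.exe",
        "deconverter": f"DeCo{suffix}",
        "enconverter": f"EnCo{suffix}",
    }


if __name__ == "__main__":
    # A Chinese sample in GB code needs two bytes per character.
    print(pick_tools("中文", "gb2312"))
    # A Latin-1 sample needs one byte per character.
    print(pick_tools("café", "latin-1"))
```

The result depends on the codec passed in, so use the character code in which your dictionary and rule files are actually stored.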

 

 

DeConverter

Version 2006 C: DOWNLOAD
Version 2006 L: DOWNLOAD
Specifications: DOWNLOAD

EnConverter

Version 3.3 C: DOWNLOAD
Version 2006 L: DOWNLOAD
Specifications: DOWNLOAD

Word Dictionary Builder

DicBldC: DOWNLOAD
DicBldL: DOWNLOAD
Manual: DOWNLOAD