Author Topic: UNL. Universal Networking Language.  (Read 8177 times)

Offline Sirko

  • Posts: 2496
  • Gender: Male
UNL is a language for processing information in a format independent of natural languages.

The UNL is an effort to achieve a simple basis for representing the most central aspects of information and meaning in a human-language-independent form. As a knowledge representation language, the UNL aims at coding, storing, disseminating and retrieving information independently of the original language in which it was expressed. In this sense, the UNL seeks to provide the tools for overcoming the language barrier in a systematic way.
At first glance, the UNL seems to be an “interlingua”, a sort of pivot-language to which the source texts are converted before being translated into the target languages. It can, in fact, be used for such a purpose, but its primary objective is to serve as an infrastructure for handling knowledge rather than individual languages.

In the UNL approach, there are two basic movements: UNL-ization and NL-ization. UNL-ization is the process of representing/mapping/analysing the information conveyed by natural language utterances into UNL; NL-ization, conversely, is the process of realizing/manifesting/generating a natural language document out of a UNL graph. These processes are completely independent. For the time being, the NL-ization process is already fully automatic, whereas the UNL-ization process is still mostly human, even though machine-aided.

Currently, the main goal of the UNL-ization process is to map the information that is verbally elicited in the surface structure of written texts into a language-independent and machine-tractable database. This means that the UNL representation is not committed to replicating the lexical and syntactic choices of the original, but focuses on representing, in a non-ambiguous format, one of its possible readings, preferably the most conventional one. In this sense, the UNL representation is an interpretation rather than a translation of a given text.

Indeed, it is important to note that at this point in time it would be foolish to claim that it is possible to represent the “full” meaning of any word, sentence or text for any language. Subtleties of intention and interpretation make the “full meaning”, whatever concept we might have of it, too variable and subjective for any systematic treatment. The UNL avoids the pitfalls of trying to represent the “full meaning” of sentences or texts, targeting instead the “core” or “consensual” meaning that is most often attributed to them. In this sense, much of the subtlety of poetry, metaphor, figurative language, innuendo and other complex, indirect communicative behaviours is beyond the current scope and goals of the UNL. Instead, the UNL targets direct communicative behaviour and literal meanings as a tangible, concrete basis for much or most of human communication in practical, day-to-day settings.

This is the main reason why UNL has not been exactly an interlingua-based machine translation project, even though machine translation is one of the possible and more obvious and promising uses of UNL. The main problem is that the practice of translation has normally been restricted to the notion of "fidelity" (or faithfulness), i.e., any translated version of a text is expected to be a replica, in another language, of the content and of the form of the original. This transfer process, however, is “all too human”, as Nietzsche said, to be replicated by the currently existing technology, which is not prepared to deal with several linguistic and cultural phenomena, such as vagueness, ambiguities, metaphors, presuppositions, ellipses, implicatures and so on. This does not mean that automatic natural language processing, and therefore machine translation, is impracticable; it just means that it is not yet possible to do it completely without humans or in the same way humans do. The results, in any case, are likely to be different from the ones produced by humans. Several techniques (rule-based, memory-based, corpus-based) have been proposed to decrease the role of humans in natural language analysis tasks, but the results, even though already promising, are not of publishing quality yet, and require substantial human revision.

In addition to translation, the UNL has been exploited for several other different tasks in natural language engineering, such as multilingual document generation, summarization, text simplification, information retrieval and semantic reasoning. Indeed, in UNL-based applications there is no need for the source language to be different from the target one: an English text may be represented in UNL in order to be generated, once again, in English, as a summarized, a simplified or a simply rephrased version of the original.

Finally, it should also be stressed that UNL, unlike other auxiliary languages (such as Esperanto, Interlingua, Volapük, Ido and others), is not intended to be a human language. We do not expect people to speak UNL or to communicate "in" UNL; we do expect them to use UNL and to communicate "through" UNL, but in the same unconscious, invisible and spontaneous way they do with other declarative and procedural languages which are pervasive in everyday applications. Just as no one is required to know HTML to browse the Internet or even to create websites, everyone would be able to UNL-ize documents and to extract from them the information needed without any knowledge of UNL. UNL is therefore a formal language designed for computers, not for humans. Like other logical systems, it seeks to provide the linguistic and semiotic infrastructure for computers (and not for humans) to handle what is meant by natural languages.


UNL Expression of “I can hear a dog barking outside”

{unl}
agt(hear(icl>perceive(agt>person,obj>thing)):06.@ability.@entry,   I(icl>person):00.@topic)
obj(hear(icl>perceive(agt>person,obj>thing)):06.@ability.@entry,   :01)
agt:01(bark(agt>dog):0H.@progress.@entry,   dog(icl>mammal):0D.@indef)
plc:01(bark(agt>dog):0H.@progress.@entry,   outside(icl>area):0P)
{/unl}
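
To make the structure of the expression above easier to see, here is a minimal Python sketch that splits each line into a relation label, an optional scope, and two nodes. It is inferred only from the four lines shown in this post, not from the official UNL specification; the names `Node`, `Relation`, `parse_node` and `parse_relation` are made up for illustration, and reading `:01` as a reference to a sub-graph (scope) is an assumption.

```python
import re
from typing import NamedTuple, Tuple


class Node(NamedTuple):
    uw: str                 # universal word, e.g. "dog(icl>mammal)"; "" for a bare ":01"
    node_id: str            # e.g. "0D", or "01" for the bare reference ":01"
    attrs: Tuple[str, ...]  # e.g. ("@progress", "@entry")


class Relation(NamedTuple):
    label: str    # e.g. "agt", "obj", "plc"
    scope: str    # "01" for "agt:01(...)", "" when no scope is given
    source: Node
    target: Node


def split_top_level(body: str) -> Tuple[str, str]:
    """Split the two arguments at the first comma outside all parentheses."""
    depth = 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return body[:i].strip(), body[i + 1:].strip()
    raise ValueError(f"no top-level comma in {body!r}")


def parse_node(text: str) -> Node:
    """Separate the universal word, the ':ID' suffix and the '.@attr' list."""
    head, *attrs = text.split(".@")
    uw, sep, node_id = head.rpartition(":")
    if not sep:  # the node carries no ":ID" suffix
        uw, node_id = head, ""
    return Node(uw, node_id, tuple("@" + a for a in attrs))


def parse_relation(line: str) -> Relation:
    m = re.fullmatch(r"(\w+)(?::(\w+))?\((.*)\)", line.strip())
    if m is None:
        raise ValueError(f"not a relation: {line!r}")
    source, target = split_top_level(m.group(3))
    return Relation(m.group(1), m.group(2) or "",
                    parse_node(source), parse_node(target))


UNL_TEXT = """
agt(hear(icl>perceive(agt>person,obj>thing)):06.@ability.@entry,   I(icl>person):00.@topic)
obj(hear(icl>perceive(agt>person,obj>thing)):06.@ability.@entry,   :01)
agt:01(bark(agt>dog):0H.@progress.@entry,   dog(icl>mammal):0D.@indef)
plc:01(bark(agt>dog):0H.@progress.@entry,   outside(icl>area):0P)
"""

relations = [parse_relation(line) for line in UNL_TEXT.strip().splitlines()]
for rel in relations:
    print(rel.label, rel.scope or "-", rel.source.node_id, "->", rel.target.node_id)
```

Read this way, the first two lines say that "I" is the agent of "hear" and that the object of "hear" is whatever `:01` refers to; the last two lines, both marked `:01`, describe the barking dog and the place.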

Syntax is the study of the principles and rules for constructing sentences in natural languages.
X-bar theory

The syntactic framework of the UNLarium derives from the X-bar theory (Chomsky, Noam (1970). Remarks on nominalization. In: R. Jacobs and P. Rosenbaum (eds.) Readings in English Transformational Grammar, 184-221. Waltham: Ginn.), which postulates that all human languages share certain structural similarities, including the same underlying syntactic structure, whose abstract configuration is depicted in the diagram below:
    XP
   / \
spec  XB
     / \
    XB  adjt
   / \
  X   comp
  |
head
In the above:
X is the head, the nucleus or the source of the whole syntactic structure, which is actually derived (or projected) out of it.
comp (i.e., complement) is an internal argument, i.e., a word, phrase or clause which is necessary to the head to complete its meaning (e.g., objects of transitive verbs)
adjt (i.e., adjunct) is a word, phrase or clause which modifies the head but which is not syntactically required by it (adjuncts are expected to be extranuclear, i.e., removing an adjunct would leave a grammatically well-formed sentence)
spec (i.e., specifier) is an external argument, i.e., a word, phrase or clause which qualifies (determines) the head
XB (X-bar) is the general name for any of the intermediate projections derived from X
XP (X-bar-bar, X-double-bar, X-phrase) is the maximal projection of X.
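
As a rough illustration of the diagram, the following Python sketch builds the same nesting (X inside XB, XB inside XB with the adjunct, XB inside XP with the specifier) and prints it in bracket notation. The class `XBar` and its `bracketed` method are hypothetical names chosen for this example; they are not part of the UNLarium.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class XBar:
    category: str                # the head category, e.g. "N" or "V"
    head: str                    # the word that projects the structure
    comp: Optional[str] = None   # complement (internal argument)
    adjt: Optional[str] = None   # adjunct (optional modifier)
    spec: Optional[str] = None   # specifier (external argument)

    def bracketed(self) -> str:
        """Render the projection X -> XB -> (XB) -> XP as labelled brackets."""
        x = self.category
        tree = f"[{x} {self.head}]"
        # First intermediate projection: the head plus its complement, if any.
        tree = f"[{x}B {tree} {self.comp}]" if self.comp else f"[{x}B {tree}]"
        # A second XB layer appears only when there is an adjunct.
        if self.adjt:
            tree = f"[{x}B {tree} {self.adjt}]"
        # Maximal projection: the specifier precedes the XB.
        return f"[{x}P {self.spec} {tree}]" if self.spec else f"[{x}P {tree}]"


noun_phrase = XBar("N", "table", adjt="of wood", spec="the")
print(noun_phrase.bracketed())
```

Removing `adjt` from the call still yields a well-formed structure, which matches the remark above that adjuncts are extranuclear.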

Constituents

The head, the complement, the specifier and the adjunct are said to be the constituents of the syntactic representation and define the four general universal syntactic roles.
Heads
In the X-bar diagram depicted above, the letter X is used to signify an arbitrary category. Thus, the X may become an N for noun, a V for verb, and so on. In the UNLarium framework, there are eight different types of heads:
N = nouns and nominals: personal pronouns, demonstrative pronouns, nominalizations, etc
V = verbs
J = adjectives
A = adverbs
P = prepositions
D = determiners: articles, demonstrative determiners, possessive determiners, quantifiers
I = auxiliary verbs
C = conjunctions
The heads define the nature of the phrase structures, thus:
N projects a Noun Phrase (NP)
V projects a Verbal Phrase (VP)
J projects an Adjective Phrase (JP)
A projects an Adverbial Phrase (AP)
P projects a Prepositional Phrase (PP)
D projects a Determiner Phrase (DP)
I projects an Inflectional Phrase (IP)
C projects a Complementizer Phrase (CP)

Specifiers
Specifiers are used to narrow the meaning intended by the head. They include:
articles: the (book), a (book), etc.
possessive determiners: my (book), your (book), etc.
demonstrative determiners: this (book), that (book), etc.
quantifiers: no (answer), every (hour), etc.
intensifiers (emphasizers, amplifiers, downtoners): very (expensive), quite (well), nearly (under), kind of (like), etc.
frequency adverbs: always (go), never (go), usually (go), etc.
negative adverbs: not (go)

Complements
Complements are used to complete the meaning intended by the head. They may be:
direct objects: (do) something, (give) something
indirect objects: (laugh at) something, (give to) someone
complement of deverbals (i.e., nouns deriving from verbs): (construction of) the city, (arrival of) Peter
complement of adjectives: (loyal) to the queen, (interested) in Chemistry
complement of adverbs: (contrarily) to popular belief, (independently) from her
complement of prepositions: (under) the table, (after) today
complement of conjunctions: (and) Peter, (I don't know if) he'll come

Adjuncts
Adjuncts are used to modify the meaning intended by the head:
adjectives: beautiful (table)
manner adverbs: speak (slowly)
prepositional phrases: (table) of wood
etc.

S-rule (syntactic rule) is the formalism used for describing syntactic structures and syntactic operations in the UNLarium framework.
S-rules are not used for affixation (prefixation, infixation, suffixation) or for changes that involve only sequences of words, which must be addressed by A-rules and L-rules, respectively.
Formal Syntax

S-rules comply with the following formal syntax:
<S-RULE>                ::= <CONDITION> ":=" (<RELATION>)+";"
<CONDITION>             ::= <TAG>(","<TAG>)* | (<RELATION>)*
<RELATION>              ::= <SYNTACTIC RELATION> | <SEMANTIC RELATION>
<SEMANTIC RELATION>     ::= <UNL RELATION> "(" <NODE> ";" <NODE> ")"
<SYNTACTIC RELATION>    ::= <NL RELATION> "(" (<NODE>";")? <NODE> ")"
<UNL RELATION>          ::= {one of the semantic relations defined in the UNL Specs}
<NL RELATION>           ::= {one of the head-driven syntactic relations defined in the UNDLF Tagset}
<NODE>                  ::= <FEATURE>(","<FEATURE>)*
<FEATURE>               ::= <ID>|<TAG>|"""<STRING>"""|"["<STRING>"]"|<DIRECTION>|<SYNTACTIC RELATION>|<ACTION>
<ID>                    ::= "%"[a-zA-Z_0-9]+
<TAG>                   ::= {one of the tags defined in the UNDLF Tagset}
<STRING>                ::= [a..Z]+
<DIRECTION>             ::= ">"|">>"|"<"|"<<"
<ACTION>                ::= <PREFIXATION> | <SUFFIXATION> | <INFIXATION> | <REPLACEMENT> (cf. A-rule)
where
<a> = a is a non-terminal symbol
"a" = a is a constant
a | b = a or b
(a)? = a can occur 0 or 1 time
(a)* = a can be repeated 0 or more times
(a)+ = a can be repeated 1 or more times


Examples

Examples of S-rules:
composition
VA("into account"); (add the string "into account" as the adjunct of the verb)
subcategorization
VC(PH("in")); (the complement of the verb is a prepositional phrase headed by the preposition "in")
agreement
VS(ANUM,APER); (the specifier of the verb assigns number (ANUM) and person (APER) to its head)
case marking
VS(NOM); (the specifier of the verb receives the nominative case (NOM))
distribution
VA(>>); (the adjunct of the verb comes at the right side of the verb after a blank space)
adjacency
VA(AJ2); (the adjunct of the verb integrates the second projection of the head)
periphrasis
VH(%vh,FUT):=+IC([will];%vh,+INF);
projection
VS(%head;%spec)VB(%head;%comp):=VP(VB(%head;%comp);%spec); (integrate the two relations on the left side into a single relation)
mapping
agt(%source;%target):=VS(%source;%target); (the agent relation is mapped into a VS relation)
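
To show how the nested relations in these examples can be read, here is a small Python sketch that tokenizes a single relation such as `VC(PH("in"))` and turns it into a tree. It covers only the shapes that appear in the examples above (it does not handle the `:=` part of a full rule), and the function names `tokenize` and `parse_relation` are invented for this illustration rather than taken from any UNL tool.

```python
import re

# One token: a quoted string, a [bracketed] string, a direction,
# a name/tag/%id (optionally with "+"), or a punctuation mark.
TOKEN = re.compile(r'\s*("[^"]*"|\[[^\]]*\]|>>|<<|[><]|[%+\w]+|[();,])')


def tokenize(text):
    text = text.strip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input: {text[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_relation(tokens, i=0):
    """Parse NAME "(" node (";" node)* ")" starting at tokens[i].

    Returns ((name, nodes), next_index). Each node is a list of features;
    a feature is either a plain token or a nested (name, nodes) tuple.
    """
    name = tokens[i]
    if i + 1 >= len(tokens) or tokens[i + 1] != "(":
        raise ValueError(f"expected '(' after {name!r}")
    i += 2
    nodes, current = [], []
    while i < len(tokens) and tokens[i] != ")":
        tok = tokens[i]
        if tok == ";":            # ";" separates the nodes of a relation
            nodes.append(current)
            current = []
            i += 1
        elif tok == ",":          # "," separates the features of one node
            i += 1
        elif i + 1 < len(tokens) and tokens[i + 1] == "(":
            nested, i = parse_relation(tokens, i)
            current.append(nested)
        else:
            current.append(tok)
            i += 1
    if i >= len(tokens):
        raise ValueError("missing ')'")
    nodes.append(current)
    return (name, nodes), i + 1


for text in ['VC(PH("in"))', "VS(ANUM,APER)", "VP(VB(%head;%comp);%spec)"]:
    tree, _ = parse_relation(tokenize(text))
    print(text, "=>", tree)
```

The nested tuple for `VP(VB(%head;%comp);%spec)` makes the projection example explicit: the first node of VP is itself a VB relation, and the second node is the specifier.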

A-rule (affixation rule) is the formalism used for generating affixes (prefixes, suffixes, infixes) in the UNLarium framework.

A-rules are used for prefixation, suffixation and infixation, i.e., for adding morphemes to a given base form. They are used for generating inflections (such as "book">"books", "love">"loved") or derivations (such as "dress">"undress", "write">"writer").

A-rules comply with the following syntax:
<A-RULE>           ::= <CONDITION> ":=" <ACTION> ("," <ACTION>)* ";"
<CONDITION>        ::= <ATAG>("&"("^")?<ATAG>)*
<ATAG>             ::= {one of the tags defined in the UNDLF Tagset}
<ACTION>           ::= <PREFIXATION> | <SUFFIXATION> | <INFIXATION> | <REPLACEMENT>
<PREFIXATION>      ::= <ADDED>    {"<" | "<<"}    (<DELETED>)?
<SUFFIXATION>      ::= (<DELETED>)? {">" | ">>"}    <ADDED>
<INFIXATION>       ::= "[" <DELETED> "]" ">" <ADDED> | <ADDED> "<" "[" <DELETED> "]"
<REPLACEMENT>      ::= ( <STRING> ":" )? <ADDED> | "[" <INTEGER> "-" <INTEGER> "]" ":"  <ADDED>
<ADDED>            ::= <STRING>
<DELETED>          ::= <STRING> | <INTEGER>
<STRING>           ::= """ [a..Z]+ """
<INTEGER>          ::= [0..9]+
where
<a> = a is a non-terminal symbol
"a" = a is a constant
a | b = a or b
{ a | b } = either a or b
(a)? = a can occur 0 or 1 time
(a)* = a can be repeated 0 or more times
(a)+ = a can be repeated 1 or more times

Examples
Prefixation
RULE    BEHAVIOR    BEFORE    AFTER
X:="y"<"z";    if X, replace the string "z" by the string "y" at the beginning of the string    zabc    yabc
X:="y"<1;    if X, replace the first character of the string by "y"    zabc    yabc
X:="y"<0;    if X, add the string "y" to the beginning of the string    zabc    yzabc
X:="y"<;    if X, add the string "y" to the beginning of the string (same as the previous rule)    zabc    yzabc
X:="y"<<0;    if X, add the string "y" and a blank space to the beginning of the string    zabc    y zabc
X:="y"<<;    if X, add the string "y" and a blank space to the beginning of the string (same as the previous rule)    zabc    y zabc

Suffixation
RULE    BEHAVIOR    BEFORE    AFTER
X:="z">"y";    if X, replace the string "z" by the string "y" at the end of the string    abcz    abcy
X:=1>"y";    if X, replace the last character of the string by "y"    abcz    abcy
X:=0>"y";    if X, add the string "y" to the end of the string    abcz    abczy
X:=>"y";    if X, add the string "y" to the end of the string (same as the previous rule)    abcz    abczy
X:=0>>"y";    if X, add a blank space and the string "y" to the end of the string    abcz    abcz y
X:=>>"y";    if X, add a blank space and the string "y" to the end of the string (same as the previous rule)    abcz    abcz y

Infixation
RULE    BEHAVIOR    BEFORE    AFTER
X:=[2]>"y";    if X, add "y" to the right of the second character    abc    abyc
X:="y"<[3];    if X, add "y" to the left of the third character    abc    abyc
X:=["b"]>"y";    if X, add "y" to the right of "b"    abc    abyc
X:="y"<["c"];    if X, add "y" to the left of "c"    abc    abyc

Replacement
RULE    BEHAVIOR    BEFORE    AFTER
X:="y";    if X, replace the whole string by "y"    X    y
X:="z":"y";    if X, replace the string "z" by "y"    azbc    aybc
X:=[2-3]:"y";    if X, replace the second to the third character by "y"    abcz    ayz
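
The four tables can be checked mechanically. Below is a Python sketch of a function that applies a single A-rule action (the part between `:=` and `;`, written with straight double quotes) to a word. It is only an interpretation of the tables in this post: the name `apply_action` is hypothetical, the condition part (`X`) is ignored, and what happens when the string to be deleted is not found (here: the word is returned unchanged) is an assumption, since the post does not say.

```python
import re

STR = r'"([^"]*)"'  # a quoted string; the group captures its content


def apply_action(action: str, word: str) -> str:
    """Apply one A-rule action (text after ':=', without the ';') to word."""
    a = action.strip()

    # Replacement of a character range: [2-3]:"y" (1-based, inclusive)
    m = re.fullmatch(r"\[(\d+)-(\d+)\]:" + STR, a)
    if m:
        return word[: int(m[1]) - 1] + m[3] + word[int(m[2]):]

    # Replacement of a substring: "z":"y" (first occurrence only, an assumption)
    m = re.fullmatch(STR + ":" + STR, a)
    if m:
        return word.replace(m[1], m[2], 1)

    # Replacement of the whole string: "y"
    m = re.fullmatch(STR, a)
    if m:
        return m[1]

    # Infixation to the right of an anchor: [2]>"y" or ["b"]>"y"
    m = re.fullmatch(r"\[(?:(\d+)|" + STR + r")\]>" + STR, a)
    if m:
        if m[1] is not None:
            idx = int(m[1])
        elif m[2] in word:
            idx = word.index(m[2]) + len(m[2])
        else:
            return word
        return word[:idx] + m[3] + word[idx:]

    # Infixation to the left of an anchor: "y"<[3] or "y"<["c"]
    m = re.fullmatch(STR + r"<\[(?:(\d+)|" + STR + r")\]", a)
    if m:
        if m[2] is not None:
            idx = int(m[2]) - 1
        elif m[3] in word:
            idx = word.index(m[3])
        else:
            return word
        return word[:idx] + m[1] + word[idx:]

    # Prefixation: "y"<"z", "y"<1, "y"<0, "y"< and the "<<" variants
    m = re.fullmatch(STR + r"(<<?)(?:(\d+)|" + STR + r")?", a)
    if m:
        sep = " " if m[2] == "<<" else ""
        if m[4] is not None:
            if not word.startswith(m[4]):
                return word
            rest = word[len(m[4]):]
        else:
            rest = word[int(m[3] or 0):]
        return m[1] + sep + rest

    # Suffixation: "z">"y", 1>"y", 0>"y", >"y" and the ">>" variants
    m = re.fullmatch(r"(?:(\d+)|" + STR + r")?(>>?)" + STR, a)
    if m:
        sep = " " if m[3] == ">>" else ""
        if m[2] is not None:
            if not word.endswith(m[2]):
                return word
            rest = word[: len(word) - len(m[2])]
        else:
            rest = word[: len(word) - int(m[1] or 0)]
        return rest + sep + m[4]

    raise ValueError(f"unrecognised action: {action!r}")


print(apply_action('"y"<<', "zabc"))
print(apply_action('0>"s"', "book"))
```

The second call mirrors the inflection example "book" > "books" mentioned earlier: nothing is deleted and "s" is appended without a space.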



Call for Participation in the UNL Programme

The UNDL Foundation is seeking language specialists to provide dictionary entries and grammar rules for the UNL program in their native language. Tasks are distributed upon availability and are carried out in a distance-working environment through a specific web interface. Entries are paid through PayPal according to the expertise. Candidates are not required to have any previous experience in natural language processing but are expected to have some acquaintance with Linguistics and a good knowledge of English. Undergraduate and graduate students of Linguistics, Language Studies and Translation Studies from all over the world are especially welcome.

Please apply http://www.unlweb.net

I'm a language specialist. How much money may I make working in the UNLweb?

This depends largely on your work and expertise, and on the availability of funds for your language and project. Beginners (A0) receive only USD0.25 for each new entry inserted in the dictionary or for each new rule created in the grammar, but experienced users (C2) receive four times more (USD1.00) for the same work. Our average user has been inserting 30 new entries per hour (which makes USD7.50 for a beginner, or USD30.00 for an experienced user), and we have been paying an average of USD300.00 a month per user (but some make more than USD2,000.00). It is important to observe, however, that many projects and languages have not been funded yet.
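
For anyone who wants to redo the arithmetic in that answer, here is a tiny Python sketch using only the figures quoted above (USD0.25 and USD1.00 per entry, 30 entries per hour, USD300.00 per month). The function names are made up for this example, and the intermediate levels between A0 and C2 are left out because the post gives no rates for them.

```python
# Per-entry rates quoted in the answer above; other levels are not given there.
RATE_USD = {"A0": 0.25, "C2": 1.00}


def hourly_earnings(level: str, entries_per_hour: int = 30) -> float:
    """USD earned per hour at the quoted per-entry rate."""
    return RATE_USD[level] * entries_per_hour


def hours_needed(target_usd: float, level: str, entries_per_hour: int = 30) -> float:
    """Hours of work needed to reach target_usd at that pace."""
    return target_usd / hourly_earnings(level, entries_per_hour)


for level in RATE_USD:
    print(level, hourly_earnings(level), hours_needed(300.00, level))
```

At the beginner rate, the quoted monthly average of USD300.00 corresponds to about 40 hours of work at 30 entries per hour.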

Offline myst

  • Posts: 35581
Where are the tools, where is the access to the database, where is the database itself?
I wasted 20 minutes and still couldn't figure it out; it's all just blah-blah, both on the wiki and on the site. Is this some kind of scam?

Offline Sirko

  • Posts: 2496
  • Gender: Male
Where are the tools, where is the access to the database, where is the database itself?
I wasted 20 minutes and still couldn't figure it out; it's all just blah-blah, both on the wiki and on the site. Is this some kind of scam?

Everything is here. Registration is at the top right. The paid projects for Russian and Ukrainian are done.

1. Create an account

The first step to join the UNLweb is registration. Registration is open and free and does not oblige you to work in the programme. Registered users are automatically subscribed to the UNLweb newsletter and are given permission to post in the UNL Open Forum and in the UNLwiki. But you don't have to be a registered user to browse the data of the UNLweb. The resources of the UNLweb are available even to non-registered users, who may browse the system using a guest account.

2. Pursue CLEA250

The UNLweb comprises several different profiles, ranging from observer to manager. The initial level is observer. As an observer, you already have access to several functionalities of the system, but you're not allowed to add or edit data or to gain UNLdots, the currency by which freelancers are remunerated and expertise is measured. In order to start adding data to the UNLweb, you have to be promoted to the trainee level, i.e., you have to pass CLEA250. CLEA is the acronym for Certificate of Language Engineering Aptitude, a certification given for free by VALERIE, the Virtual Learning Environment for UNL. CLEA250 comprises 250 questions about the terminology of descriptive linguistics. It does not involve much specialisation, and is normally completed in less than 3 hours. Upon passing CLEA250, you're automatically promoted to the trainee level and may start adding entries to the UNL-NL Dictionary.

3. Join a project

Dictionaries and grammars are available inside the UNLarium, which is one of the systems that make up the UNLweb. The UNLarium is expected to be a linguist-friendly web-based integrated development environment for creating language resources. It contains three main divisions: dictionaries, grammars and corpora, and has been the cradle of most language data inside the UNL Programme. From the point of view of creating resources, the UNLarium is a corpus-driven environment, which means that you have to subscribe to a project in order to start adding data. You may join as many projects as you want, but you don't need to join projects if you plan only to browse or to export data. In order to join a project, you must have passed CLEA250.

4. Create an assignment

In order to start adding entries, you have to create an assignment. The assignment is actually only a reservation of entries to be treated. You may choose the part of speech and the number of entries (up to 250) to be reserved and the reservation is valid for 30 days. During this period, no one else will have access to these entries. After this period, the entries not addressed return to the main dictionary and become available to other users for reservation. You may close assignments at any time, and they're automatically closed on their deadline.

5. Create entries

The entries that you have reserved are automatically listed in your assignments page in the UNLarium. As a trainee, you will be able to create assignments only to provide entries for the UNL-NL Dictionary. The UNL-NL Dictionary is a bilingual lexical database, where UWs, the words of UNL, are mapped into natural language (NL) words. This is a very simple dictionary, and you'll be expected to provide only the basic information for each entry. The UNL-NL Dictionary is always UW-oriented, which means that you're not allowed to add the entries that you want. You'll have to translate the UWs proposed by the system and to classify them in your own native language. For each new entry created, you'll be credited 1 (one) UNLdot.

6. Close the assignment

After finishing the set of entries that you have reserved, you'll have to close the assignment. You may have only one open assignment at a time. You will be able to reopen closed assignments, but only those within the deadline. In any case, you'll have access only to the entries that you have created; the entries that you have postponed or declined will already have been returned to the main dictionary.

7. Extending your permissions

Trainees may only create entries in the UNL-NL Dictionary in their own native language. You may extend these permissions in four different ways:

7.1. Extending languages

You may ask for a language extension, either by providing an internationally acknowledged language proficiency certificate, or by writing a message to the UNLweb ( info@unlweb.net ) detailing your experience with the language you intend to add to your profile. The message must be written in the intended language, and will be evaluated by native speakers.

7.2. Extending dictionaries

The UNLarium contains three types of dictionaries: the UNL-NL dictionary, which is the simplest one, where UWs are mapped into natural language words; the NL dictionary, which is a monolingual dictionary where natural language entries are further described (according to their morphological and syntactic behaviour); and the UNL dictionary, which is a dictionary in UNL where UWs are treated and classified. CLEA250 grants permission to work only with the UNL-NL dictionary. In order to work with the NL dictionary, you'll have to pass CLEA450. In order to work with the UNL dictionary, you'll have to pass CUP500.

7.3. Extending actions

Trainees and authors may only create entries. But there are two other possible actions inside the UNLarium: verifying and revising entries. Verification is the first check, and is done by editors. Revision is the second check, and is done by revisers. Editors and revisers are defined according to expertise, i.e., according to the number of UNLdots accumulated. Users become editors when they reach 30,000 UNLdots, and revisers after reaching 75,000 UNLdots.
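
The two thresholds can be written down as a simple lookup. This Python sketch uses only the numbers quoted above; the function name `role_for`, the label "creator" for users below the editor threshold, and the decision to treat exactly 30,000 and exactly 75,000 as already qualifying are assumptions made for illustration.

```python
EDITOR_THRESHOLD = 30_000    # UNLdots needed to verify entries (first check)
REVISER_THRESHOLD = 75_000   # UNLdots needed to revise entries (second check)


def role_for(unldots: int) -> str:
    """Map an accumulated UNLdots score to the role described above."""
    if unldots >= REVISER_THRESHOLD:
        return "reviser"
    if unldots >= EDITOR_THRESHOLD:
        return "editor"
    return "creator"  # trainees and authors, who may only create entries


for score in (250, 30_000, 80_000):
    print(score, role_for(score))
```

Since each new UNL-NL dictionary entry is credited with one UNLdot, the editor threshold corresponds to a very large number of entries, which is presumably why managers are exempt from the score requirement.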

7.4. Becoming a manager

If you hold a PhD in Linguistics, you may apply directly for a manager account. Managers have special privileges, such as creating projects and editing language settings. They are also authorized to create, verify and revise entries, independently of their UNLdots score. In order to apply for a manager account, contact info@unlweb.net .

Offline myst

  • Posts: 35581
Everything is here.
I've already been there.

Registration is at the top right.
And what does registration give me?

The paid projects for Russian and Ukrainian are done.
Paid? Are the ones for English paid too?

Offline Sirko

  • Posts: 2496
  • Gender: Male
Paid? Are the ones for English paid too?

The language you participate in is determined by your place of residence.

Offline myst

  • Posts: 35581
I chose English when I registered. I'm being shown information for English.

Offline Sirko

  • Posts: 2496
  • Gender: Male
The UNL MIR project funds the following languages: zh, fr, de, hi, ja, pt, ru, es, sw. They are finished or nearing completion.
The Le Petit Prince project funds all languages. Besides those listed above, the following languages are already done or nearing completion: bn, bg, hr, et, el, id, it, pa, fa, sl, ur, ta, tr, nl, uk, gr, la, sk.
The rest are vacant. Up to USD 2,000 for each.  ;up: :smoke:

Le Petit Prince in UNL
Monday, 15 March 2010 07:43    Ronaldo Martins   

The UNDL Foundation has released a version in UNL of “Le Petit Prince” (The Little Prince), the famous novella by Antoine de Saint-Exupéry, published in 1943. The corpus is available under an Attribution Share Alike (CC-BY-SA) Creative Commons license at the UNLarium, and may be used by researchers and developers interested in semantic annotation of natural language texts.
What is UNL?

The UNL is a knowledge representation language that has been used for several different tasks in natural language engineering, such as machine translation, multilingual document generation, summarization, information retrieval and semantic reasoning. It has been originally proposed by the Institute of Advanced Studies of the United Nations University, in Tokyo, and has been currently promoted by the UNDL Foundation, in Geneva, Switzerland, under a mandate of the United Nations. [read more about UNL]
Why Le Petit Prince?

Le Petit Prince is one of the best-selling books ever (more than 80 million copies), and has been translated into more than 180 languages, thus providing the possibility of contrasting and evaluating a wide range of UNL-based translations. Additionally, the text offers the chance of experimenting with UNL in three situations that have not been explored so often: French original, narrative and literature. Our main goal is to “UNL-plicate” the text in at least three different directions: replication, summarization and simplification, in as many languages as possible. [read more about UNLplication]
How was the text UNLized?

The integral version of Le Petit Prince, which has been released under public domain in Canada, was obtained from http://wikilivres.info/wiki/Le_Petit_Prince. The whole text comprises 15,513 word forms (tokens) and 1,684 sentences. The UNLization of the text was carried out in a fully-manual way through the UNL Editor, a graph-based authoring tool developed by the UNDL Foundation. The sentences have been divided into two main different groups: a) the training corpus, which comprises the first 53 sentences of the book (dedication and first chapter), including the title; and b) the application corpus, which comprises the remaining 1,548 sentences. The training corpus was addressed collectively by a group of four human UNLizers in order to synchronize and normalize the UNLization strategies. The application corpus was organized according to the similarity of sentences (and not to the order of appearance) and was addressed from December 2009 to February 2010 according to the guidelines resulting from the training exercise (and which are available at http://www.unlweb.net/wiki/index.php/UNLization_Guidelines).
Further information

For further information, please contact
Ronaldo MARTINS
Language Resources Manager
UNDL Foundation
48, route de Chancy, CH-1213, Petit-Lancy, Geneva, Switzerland
+41 22 879 8090
About

The UNDL Foundation is a non-profit organization based in Geneva, Switzerland, which has received, from the United Nations, the mandate for implementing the Universal Networking Language (UNL). The UNL Programme is a collaborative effort to create natural language resources and technology to reduce language barriers and strengthen cross-cultural communication in the framework of the United Nations. Participation in the Programme is free and open to individuals and institutions, either as researchers or as developers. Special funds are available for some languages.

Offline Alone Coder

  • Outside linguistics
  • Posts: 23232
  • Gender: Male
    • Орфовики
As I understand it, the project aims to express semantics. So what does Chomsky have to do with it?

Offline myst

  • Posts: 35581
I didn't understand anything there at all. :(
How, for example, do I extract the semantics of a specific sentence?

Offline Sirko

  • Posts: 2496
  • Gender: Male
As I understand it, the project aims to express semantics. So what does Chomsky have to do with it?

Chomsky's syntactic networks are used to describe subcategorization rules (S-rules), including the description of multiword expressions.
For example, the construction VS(NP); indicates that a verb is intransitive, while VS(NP)VC(NP); indicates that it is transitive.
+VA([home],AJ1)VC("the bacon",AJ2); describes the word order in the multiword expression "bring home the bacon".

Offline Alone Coder

  • Outside linguistics
  • Posts: 23232
  • Gender: Male
    • Орфовики
Transitivity is a syntactic criterion that has nothing to do with semantics. In another language the same meaning may be expressed by an intransitive verb, or not by a verb at all.

Offline Sirko

  • Posts: 2496
  • Gender: Male
The UNDL Foundation invites applications for the I UNL Grammar Workshop, to take place in Geneva, Switzerland, from February 6th to 10th, 2012. The workshop is dedicated to the development of the grammatical resources for the automatic processing of the following languages:

Croatian
Czech
Danish
Dutch
Estonian
Finnish
German
Hungarian
Italian
Lithuanian
Norwegian
Romanian
Serbian
Slovak
Slovenian
Swedish

ACTIVITIES
During the workshop the participants are expected to provide the dictionary entries and the morphological, syntactic and semantic modules of the grammar necessary to generate a reference corpus from UNL into their native language, and from their native language into UNL. The grammar is expected to comply with the formalism described at www.unlweb.net/wiki/index.php/Grammar_Specs, and will be provided through the UNLarium (www.unlweb.net/unlarium), a web-based integrated development environment for creating and editing language resources for natural language engineering. The UNDL Foundation will provide all the training and support necessary for the accomplishment of the tasks.

REQUISITES
Candidates must be native speakers of any of the languages referred to above, should be registered at the UNLweb, and should have passed CLEA250, CLEA450, CLEA700 and CUP500. These certificates may be pursued online at VALERIE, the Virtual Learning Environment for UNL, available at www.unlweb.net/valerie. Previous experience in natural language processing, respectable academic records and previous experience in any of the existing projects at the UNLweb (such as Le Petit Prince and UNL MIR) will also be valued.

APPLICATION
In order to apply, candidates must send a CV to r.martins@undlfoundation.org until November 30th, 2011.

SELECTION
The UNDL Foundation will select one candidate per language, according to the following criteria:
a) Compliance with the requisites;
b) Highest academic degree;
c) Strongest experience in natural language processing;
d) Strongest experience in the UNLweb.
The list of selected candidates will be published at the UNLweb until December 15th, 2011.

PROGRAM AND VENUE
The workshop will take 30 hours, from February 6th to 10th, 2012, at the UNDL Foundation office, in Geneva, Switzerland, according to the tentative schedule below:

Feb 06th, 2012 - Monday
09:00-10:00 Introduction
10:00-12:00 I – Corpus
14:00-17:00 II – UNL-NL dictionary
Feb 07th, 2012 - Tuesday
09:00-12:00 III – Morphology (inflectional paradigms)
14:00-17:00 IV – NL dictionary
Feb 08th, 2012 - Wednesday
09:00-12:00 V – UNL-NL grammar (I)
14:00-17:00 V – UNL-NL grammar (II)
Feb 09th, 2012 - Thursday
09:00-12:00 VI – NL-UNL grammar (I)
14:00-17:00 VI – NL-UNL grammar (II)
Feb 10th, 2012 - Friday
09:00-12:00 Evaluation
14:00-17:00 Discussion

SUPPORT
The UNDL Foundation will pay the travel and accommodation expenses for the selected candidates not living in Geneva, Switzerland. These include:
a round-trip plane or train ticket to/from Geneva;
7 nights (from Feb 5th to Feb 11th) at the ETAP hotel Genève Petit-Lancy;
7 per diems of CHF60.00 each (CHF420.00 in total).

PRO LABORE
The selected candidates will receive a pro labore of CHF1,000.00 (one thousand Swiss Francs) in addition to the UNLdots acquired during the workshop. The pro labore will only be paid to candidates who are present at all sessions and actively engaged in producing the intended resources.

CERTIFICATION
The UNDL Foundation will issue a Certificate of Participation, upon evaluation, to all participants.

THE UNDL FOUNDATION
The UNDL Foundation is a non-profit organization based in Geneva, Switzerland, which has received, from the United Nations, the mandate for implementing the Universal Networking Language (UNL). The UNL is an artificial language that has been used for several different tasks in natural language engineering, such as machine translation, multilingual document generation, summarization, information retrieval and semantic reasoning. It has been, since 1996, a unique initiative to reduce language barriers and strengthen cross-cultural communication in the framework of the UN.

FURTHER INFORMATION
For further information, please contact:
Ronaldo Martins
Language Resources Manager
r.martins@undlfoundation.org

Quote
Dear All,

During this Holiday Season our thoughts always turn gratefully to those who have made our progress possible. At the end of 2011, the UNLarium surpassed 1,000,000 UNLdots, in 25 languages, and this would not have been conceivable without your participation. In this spirit we say, once again, simply but sincerely, thank you very much for your key contribution. Our best wishes for the Holiday Season. And a happy New Year. May it see our hopes for a better world fulfilled: a world where languages unite rather than isolate.

On behalf of the UNDL Foundation,

Ronaldo.

Quote
I UNL Olympiad
Saturday, 17 November 2012 14:28    Ronaldo Martins   
The UNDL Foundation has launched the I UNL Olympiad, which is devoted to the development of grammars for the corpus UC-A1, comprising 100 sentences. The competition is open to any participant, and the deadline is February 15th, 2013.



Modalities

The competition is organised in two modalities:

Best UNLization Grammar for <LANGUAGE>
Best NLization Grammar for <LANGUAGE>
Where <LANGUAGE> is one of the languages participating in this Olympiad (see the complete list below).

Candidates may participate in one or two modalities, i.e., they may work with the UNLization grammar, with the NLization grammar, or with both.

Candidates may also participate in one or more languages, provided that they belong to the list below.

Prizes

Prizes are awarded to the best grammars of each modality (UNLization and NLization) for each language:

1st place: Gold Medal and USD500.00
2nd place: Silver Medal
3rd place: Bronze Medal
Additionally, the authors of the three best UNLization Grammars among all languages and the authors of the three best NLization Grammars among all languages will also be invited to participate in the next intermediate-level grammar workshop, to be held in Geneva, Switzerland, in May 2013.

Registration

Candidates must be registered at the UNLweb. Participation is open and free, and registration for the Olympiad is done by sending the corresponding files to r.martins@undlfoundation.org by 23:59:59 (UTC) on February 15th, 2013. The list of files is available at www.unlweb.net/wiki/Olympiad.

Evaluation

Grammars will be evaluated and ranked according to the following criteria:

Best F-measure
Scalability (i.e., extensibility, or the capacity to be reused for other corpora), in case of grammars with the same F-measure
Date of submission, in case of grammars with the same F-measure and equal scalability
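
The three criteria above amount to a sort with two tie-breakers. The sketch below is only an illustration and is not the Foundation's evaluation code. The announcement does not say how the F-measure is computed or how scalability is scored, so this assumes the balanced F-measure (harmonic mean of precision and recall), a hypothetical numeric scalability score where higher is better, and that an earlier submission date wins a tie. All names (`Submission`, `f_measure`, `rank`) are invented for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Submission:
    author: str
    precision: float
    recall: float
    scalability: float  # hypothetical score; the announcement defines no scale
    submitted: date


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (assumed): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def rank(submissions):
    """Sort by F-measure (descending), then scalability (descending),
    then submission date (earlier first, an assumption)."""
    return sorted(
        submissions,
        key=lambda s: (
            -round(f_measure(s.precision, s.recall), 6),  # rounding makes ties explicit
            -s.scalability,
            s.submitted,
        ),
    )
```

Rounding the F-measure before comparing is a design choice of this sketch: without it, floating-point noise would almost never produce the exact ties that the second and third criteria rely on.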

Languages

The I UNL Olympiad will be dedicated to the development of grammars for the following languages:

Assamese
Baatonum
Bengali
Bulgarian
Chinese
Croatian
Dutch
German
Gujarati
Hindi
Hungarian
Indonesian
Italian
Japanese
Kashmiri
Malayalam
Marathi
Oriya
Persian
Polish
Romanian
Russian
Sanskrit
Sindhi
Slovak
Swahili
Swedish
Tamil
Telugu
Thai
Turkish
Ukrainian

Candidates may participate in one or more of the languages above.

Instructions

For detailed instructions, see www.unlweb.net/wiki/Olympiad.

Further information

For further information, please contact:

Ronaldo Martins, PhD

Language Resources Manager

UNDL Foundation

r.martins@undlfoundation.org

(please distribute, and our apologies for multiple postings)

==============================
XI UNL School - Macau 2013
www.unlweb.net/school
==============================
The UNDL Foundation and the University of Macau invite applications
for the XI UNL School, to take place at the University of Macau, in
Macau, from March 11th to 15th, 2013. The workshop is dedicated to the
development of the grammatical resources for the processing of the
languages below. The UNDL Foundation will pay the travel and
accommodation expenses for the selected candidates not living in
Macau.
==============================
IMPORTANT DATES
03/Feb/2013: Deadline for the applications
10/Feb/2013: Notification of accepted candidates
11-15/Mar/2013: XI UNL School
==============================
LANGUAGES
The workshop is dedicated to the development of the grammatical
resources for the processing of the languages below:
*Bahasa Indonesia
*Bengali
*Burmese
*Cantonese
*Chinese
*Japanese
*Khmer
*Korean
*Laotian
*Malay
*Mongolian
*Nepalese
*Portuguese (Macau)
*Sinhalese
*Tagalog
*Tetum
*Thai
*Vietnamese
==============================
REQUISITES
Candidates must be native speakers of one of the languages referred to
above and must have obtained the CLEA250, CLEA500 and CUP500
certificates. These certificates may be pursued online at VALERIE - the
Virtual Learning Environment for UNL - available at
www.unlweb.net/valerie. Previous experience in natural language
processing, a strong academic record and previous experience in any of
the existing projects at the UNLweb (such as Le Petit Prince and UNL
MIR) will also be valued.
==============================
APPLICATION
In order to apply, candidates must send a CV to
r.martins@undlfoundation.org before February 3rd, 2013.
==============================
SELECTION
The UNDL Foundation will select only one candidate per language,
according to the following criteria:
*Compliance with the requisites;
*Highest academic degree;
*Strongest experience in natural language processing;
*Strongest experience at the UNLweb.
==============================
VENUE
University of Macau, Macau.
==============================
SUPPORT
The UNDL Foundation will pay the travel and accommodation expenses for
the selected candidates not living in Macau. These include:
*a round-trip plane, bus or train ticket from/to Macau
*7 nights at a mid-range hotel in Macau
*7 per diems of USD50.00 each (USD350.00 in total)
==============================
WORKSHOP ACTIVITIES (11-15/Mar/2013)
During the workshop, the participants are expected to provide the
syntactic and semantic modules of the grammar necessary to generate
the workshop corpus from UNL into their native language, and from
their native language into UNL. The grammar is expected to comply with
the formalism described at
www.unlweb.net/wiki/index.php/Grammar_Specs, and will be provided
through the UNLdev, a web-based integrated development environment for
creating and editing dictionary entries and grammar rules for natural
language processing. The UNDL Foundation will provide all the training
and support necessary for the accomplishment of the tasks.
==============================
CERTIFICATION
The UNDL Foundation will issue a Certificate of Participation, upon
evaluation, to all participants.
==============================
THE UNL AND THE UNDL FOUNDATION
The UNDL Foundation is a non-profit organization based in Geneva,
Switzerland, which has received, from the United Nations, the mandate
for implementing the Universal Networking Language (UNL). The UNL is
an artificial language that has been used for several different tasks
in natural language engineering, such as machine translation,
multilingual document generation, summarization, information retrieval
and semantic reasoning. It has been, since 1996, a unique initiative
to reduce language barriers and strengthen cross-cultural
communication in the framework of the UN.
==============================
LOCAL ORGANIZATION
Ana Luísa Varani Leal
Assistant Professor
University of Macau - FSH - Department of Portuguese
==============================
FURTHER INFORMATION
For further information, please contact:
Ronaldo Martins, PhD
Language Resources Manager
UNDL Foundation
r.martins@undlfoundation.org

 
