DMRS representation #21

Open
arademaker opened this issue Jun 30, 2021 · 8 comments

Comments

@arademaker
Member

Considering the graph

image

We want the RDF to be as close as possible to its graphical representation:

image

@arademaker
Member Author

  1. We need rdfs:label on the nodes, holding the predText value.
  2. We need rdfs:label on the links, holding the edge labels such as ARG1/EQ.
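
As a rough illustration of the two points above, the sketch below emits one N-Triples line per label. The helper name `label_triple` is hypothetical, and the node and link URIs simply reuse the URI pattern discussed later in this thread; which link gets the id 8 is an assumption made only for the example.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def label_triple(subject_uri: str, label: str) -> str:
    """Return one N-Triples line attaching an rdfs:label to subject_uri."""
    # Escape backslashes and double quotes so the literal stays valid.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject_uri}> <{RDFS_LABEL}> "{escaped}" .'


# Node label: the predicate text of the node (here the gpred udef_q).
node_line = label_triple("http://ibm.com/sick/b/33/4/nodes/10000", "udef_q")
# Link label: role and post joined by a slash, as in ARG1/EQ.
link_line = label_triple("http://ibm.com/sick/b/33/4/links/8", "ARG1/EQ")

print(node_line)
print(link_line)
```

A real implementation would more likely build the graph with an RDF library; plain strings are used here only to keep the sketch dependency-free.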

@arademaker
Member Author

arademaker commented Jun 30, 2021

Eventually, we would like to be as close as possible to the XML serialization. See delph-in/pydelphin#329.

<dmrs cfrom="-1" cto="-1" top="10003" index="10003">
  <node nodeid="10000" cfrom="0" cto="13">
    <gpred>udef_q</gpred>
    <sortinfo />
  </node>
  <node nodeid="10001" cfrom="0" cto="6">
    <realpred lemma="mask" pos="v" sense="1" />
    <sortinfo SF="prop" TENSE="untensed" MOOD="indicative" PROG="bool" PERF="-" cvarsort="e" />
  </node>
  <node nodeid="10002" cfrom="7" cto="13">
    <realpred lemma="people" pos="n" sense="of" />
    <sortinfo PERS="3" NUM="pl" IND="+" cvarsort="x" />
  </node>
  <node nodeid="10003" cfrom="18" cto="25">
    <realpred lemma="look" pos="v" sense="1" />
    <sortinfo SF="prop" TENSE="pres" MOOD="indicative" PROG="+" PERF="-" cvarsort="e" />
  </node>
...  
  <node nodeid="10010" cfrom="51" cto="52">
    <realpred lemma="a" pos="q" />
    <sortinfo />
  </node>
  <node nodeid="10011" cfrom="53" cto="59">
    <realpred lemma="forest" pos="n" sense="of" />
    <sortinfo PERS="3" NUM="sg" IND="+" cvarsort="x" />
  </node>
  <link from="10000" to="10002">
    <rargname>RSTR</rargname>
    <post>H</post>
  </link>
  <link from="10001" to="10002"><rargname>ARG2</rargname><post>EQ</post></link>
  <link from="10003" to="10002"><rargname>ARG1</rargname><post>NEQ</post></link>
  <link from="10004" to="10003"><rargname>ARG1</rargname><post>EQ</post></link>
  <link from="10004" to="10008"><rargname>ARG2</rargname><post>NEQ</post></link>
  <link from="10005" to="10008"><rargname>RSTR</rargname><post>H</post></link>
  <link from="10006" to="10008"><rargname>ARG1</rargname><post>EQ</post></link>
  <link from="10007" to="10006"><rargname>ARG1</rargname><post>EQ</post></link>
  <link from="10009" to="10003"><rargname>ARG1</rargname><post>EQ</post></link>
  <link from="10009" to="10011"><rargname>ARG2</rargname><post>NEQ</post></link>
  <link from="10010" to="10011"><rargname>RSTR</rargname><post>H</post></link>
</dmrs>
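
To see how little the graphical view needs from this serialization, here is a minimal sketch that reads a trimmed copy of the XML above with the standard library and extracts one label per node and per link. The function `node_label` is hypothetical, and joining lemma, pos and sense with underscores is an assumption about how the predicate text is spelled, not something taken from the converter.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the serialization above: two nodes and one link.
DMRS_XML = """
<dmrs cfrom="-1" cto="-1" top="10003" index="10003">
  <node nodeid="10000" cfrom="0" cto="13">
    <gpred>udef_q</gpred>
    <sortinfo />
  </node>
  <node nodeid="10002" cfrom="7" cto="13">
    <realpred lemma="people" pos="n" sense="of" />
    <sortinfo PERS="3" NUM="pl" IND="+" cvarsort="x" />
  </node>
  <link from="10000" to="10002">
    <rargname>RSTR</rargname>
    <post>H</post>
  </link>
</dmrs>
"""


def node_label(node: ET.Element) -> str:
    """Build a display label from a <gpred> or <realpred> child."""
    gpred = node.find("gpred")
    if gpred is not None:
        return gpred.text
    real = node.find("realpred")
    # Assumed spelling: _lemma_pos_sense, skipping a missing sense.
    parts = [real.get("lemma"), real.get("pos"), real.get("sense")]
    return "_" + "_".join(p for p in parts if p)


root = ET.fromstring(DMRS_XML)
nodes = {n.get("nodeid"): node_label(n) for n in root.findall("node")}
links = [
    (l.get("from"), l.get("to"), f"{l.findtext('rargname')}/{l.findtext('post')}")
    for l in root.findall("link")
]

print(nodes)
print(links)
```

The resulting dictionary and list are exactly the labelled vertices and edges of the picture, which is the information the rdfs:label triples would carry.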

@arademaker
Member Author

arademaker commented Jun 30, 2021

After the hash (#), we should not use / separators. Note that the hash identifies a specific part of something. For instance, URLs use it to indicate sections of pages: https://github.com/delph-in/docs/wiki/AceInstall#building-ace means the section building-ace of the page AceInstall, located at the path /delph-in/docs/wiki/ on the server https://github.com. So in http://ibm.com/sick/b/33/4/nodes/10014#predicate we are talking about the predicate part/section of 10014, an item in the collection (path, folder) sick/b/33/4/nodes. This is how we normally think, right?
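
The path/fragment split described here can be checked with Python's standard `urllib.parse`; this is only a small sketch of the reading above, and the variable names are illustrative.

```python
from urllib.parse import urldefrag, urlsplit

uri = "http://ibm.com/sick/b/33/4/nodes/10014#predicate"

# urldefrag separates the document URI from the part after '#'.
document, fragment = urldefrag(uri)
# The path is the hierarchical "collection/folder" portion.
path = urlsplit(uri).path

print(document)
print(fragment)
print(path)
```

Under this reading, `document` names the item 10014 and `fragment` names a part of it, so anything placed after the hash is not treated as a further path level.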

URIs, some initial thoughts:

One idea is to make the namespace of a profile as flat as possible. Note that in the current situation, items 2 and 3 are not compatible with 1 and 4.

  1. from http://ibm.com/sick/b/33/4/dmrsi#dmrs to http://ibm.com/sick/b/result-33-4
  2. from http://ibm.com/sick/b/33/4/nodes/10012 to http://ibm.com/sick/b/node-33-4-10012
  3. from http://ibm.com/sick/b/33/4/links/8 to http://ibm.com/sick/b/link-33-4-8
  4. from http://ibm.com/sick/b/33/4/nodes/10014#predicate to http://ibm.com/sick/b/predicate-33-4-10014

We need a new node http://ibm.com/sick/b/item-33 (or sentence-XX), with the property text going from that object to the string. All results would then be connected to that sentence by hasResult (I am trying to follow the terminology from the profiles).

One alternative would be to think of a hierarchical structure of collections. The profile is sick/b, which in turn has another collection 33 with an item 4 that has its parts:

  1. from http://ibm.com/sick/b/33/4/dmrsi#dmrs to http://ibm.com/sick/b/33/4
  2. from http://ibm.com/sick/b/33/4/nodes/10012 to http://ibm.com/sick/b/33/4#node-10012
  3. from http://ibm.com/sick/b/33/4/links/8 to http://ibm.com/sick/b/33/4#link-8
  4. from http://ibm.com/sick/b/33/4/nodes/10014#predicate to http://ibm.com/sick/b/33/4#predicate-10014

Another alternative would be to push the hierarchical structure as far as possible. We would need to mix the identifiers with names (like function names and arguments). The b profile has an item 33 that has a result 4 ...

  1. from http://ibm.com/sick/b/33/4/dmrsi#dmrs to http://ibm.com/sick/b/item/33/res/4
  2. from http://ibm.com/sick/b/33/4/nodes/10012 to http://ibm.com/sick/b/item/33/res/4/node/10012
  3. from http://ibm.com/sick/b/33/4/links/8 to http://ibm.com/sick/b/item/33/res/4/link/8
  4. from http://ibm.com/sick/b/33/4/nodes/10014#predicate to http://ibm.com/sick/b/item/33/res/4/node/10014/predicate
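
To compare the three proposals side by side, the sketch below writes each one as a small function. The function names and argument order are hypothetical; only the resulting URI shapes come from the lists above.

```python
PREFIX = "http://ibm.com/sick/b"


def flat_uri(prefix, kind, item, result, ident=None):
    """Flat namespace: everything is encoded in the last path segment."""
    parts = [str(item), str(result)]
    if ident is not None:
        parts.append(str(ident))
    return f"{prefix}/{kind}-" + "-".join(parts)


def fragment_uri(prefix, item, result, kind=None, ident=None):
    """Collections in the path, parts of a result after the hash."""
    base = f"{prefix}/{item}/{result}"
    return base if kind is None else f"{base}#{kind}-{ident}"


def hierarchical_uri(prefix, item, result, *segments):
    """Fully hierarchical: names and identifiers alternate in the path."""
    tail = "".join(f"/{s}" for s in segments)
    return f"{prefix}/item/{item}/res/{result}{tail}"


print(flat_uri(PREFIX, "node", 33, 4, 10012))
print(fragment_uri(PREFIX, 33, 4, "link", 8))
print(hierarchical_uri(PREFIX, 33, 4, "node", 10014, "predicate"))
```

Only the second scheme uses a fragment, so it is the only one where a node or link is formally a part of the result document rather than a resource of its own.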

@arademaker
Member Author

arademaker commented Jul 1, 2021

image

Using the rdfs:label! I am not showing the dmrs node that links to all the nodes and links in the graph above.

@arademaker
Member Author

If http://ibm.com/sick/a/1/0#dmrs is the #dmrs section of a document 0 in collection 1, which is part of collection a, then http://ibm.com/sick/a/1/0#link-10 is a section of the same document 0. But #link-10 is actually part of #dmrs, right?

@yfaria yfaria mentioned this issue Jul 11, 2021
@yfaria
Contributor

yfaria commented Jul 12, 2021

In fact, this version of the code has a flaw in URI construction: it does not take into account which semantic representation the node belongs to. Since the element names differ among the three semantic representations being converted (EDS, MRS, DMRS), this doesn't create problems, but it is better to have this indication.
The construction of the URIs in #26 considers those parts. There are now three other URIs as well: the one for the profile, the one for the profile item, and the one for the result of the item.
Considering http://ibm.com/sick/b as the prefix given by the user, the profile URI would be the prefix itself, http://ibm.com/sick/b; the item with id 33 would have the URI http://ibm.com/sick/b/33; the fourth result would be http://ibm.com/sick/b/33/4; the DMRS URI of that result would be http://ibm.com/sick/b/33/4/dmrs; and, finally, an element of that DMRS would be preceded by a hash, so we would have http://ibm.com/sick/b/33/4/dmrs#link-10.
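
The chain of URIs described above can be sketched as a single function. The name `build_uris` and the dictionary keys are hypothetical and are not the code from #26; only the URI shapes follow the description.

```python
def build_uris(prefix, item_id, result_id, representation, element):
    """Derive each URI level from the user-supplied profile prefix."""
    profile = prefix
    item = f"{profile}/{item_id}"
    result = f"{item}/{result_id}"
    # The representation (dmrs, eds or mrs) becomes its own path level...
    rep = f"{result}/{representation}"
    # ...and its elements are addressed as fragments of it.
    elem = f"{rep}#{element}"
    return {
        "profile": profile,
        "item": item,
        "result": result,
        "representation": rep,
        "element": elem,
    }


uris = build_uris("http://ibm.com/sick/b", 33, 4, "dmrs", "link-10")
for level, uri in uris.items():
    print(level, uri)
```

Because the representation name is part of the path, a DMRS link and an EDS or MRS element with the same local name can no longer collide.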

@arademaker
Member Author

yes, it seems reasonable.

@arademaker
Member Author

@yfaria is this issue closed? I am not sure.

To be concrete, I believe we may have applications that need the simplest possible RDF, that is, the RDF closest to the graphical representation of DMRS.

We could potentially provide that function in this library, right?

Another question is how close we are to the XML representation of DMRSs, and whether we should try to fix any deviation from it that we may have introduced.
