Constituency Parsing

Constituency parsing is the task of parsing a sentence to its phrase structure grammar which represents its syntax in the form of nested constituent structure.

Users can load a MTL component and specify tasks='con' to have sentences parsed by ELIT. E.g.,

import elit
nlp = elit.load('LEM_POS_NER_DEP_SDP_CON_AMR_ROBERTA_BASE_EN')
print(nlp([['Emory', 'NLP', 'is', 'a', 'research', 'lab', 'in', 'Atlanta', ',', 'GA', '.'], 
           ['It', 'is', 'founded', 'by', 'Jinho', 'D.', 'Choi', 'in', '2014', '.'], 
           ['Dr.', 'Choi', 'is', 'a', 'professor', 'at', 'Emory', 'University', '.']]
, tasks='con'))

Outputs:

{
  "con": [
    ["TOP", [["S", [["NP", [["_", ["Emory"]], ["_", ["NLP"]]]], ["VP", [["_", ["is"]], ["NP", [["NP", [["_", ["a"]], ["_", ["research"]], ["_", ["lab"]]]], ["PP", [["_", ["in"]], ["NP", [["NP", [["_", ["Atlanta"]]]], ["_", [","]], ["NP", [["_", ["GA"]]]]]]]]]]]], ["_", ["."]]]]]],
    ["TOP", [["S", [["NP", [["_", ["It"]]]], ["VP", [["_", ["is"]], ["VP", [["_", ["founded"]], ["PP", [["_", ["by"]], ["NP", [["_", ["Jinho"]], ["_", ["D."]], ["_", ["Choi"]]]]]], ["PP", [["_", ["in"]], ["NP", [["_", ["2014"]]]]]]]]]], ["_", ["."]]]]]],
    ["TOP", [["S", [["NP", [["_", ["Dr."]], ["_", ["Choi"]]]], ["VP", [["_", ["is"]], ["NP", [["NP", [["_", ["a"]], ["_", ["professor"]]]], ["PP", [["_", ["at"]], ["NP", [["_", ["Emory"]], ["_", ["University"]]]]]]]]]], ["_", ["."]]]]]]
  ],
  "tok": [
    ["Emory", "NLP", "is", "a", "research", "lab", "in", "Atlanta", ",", "GA", "."],
    ["It", "is", "founded", "by", "Jinho", "D.", "Choi", "in", "2014", "."],
    ["Dr.", "Choi", "is", "a", "professor", "at", "Emory", "University", "."]
  ]
}

con stores the constituency trees, specifically (label, child-constituents) for the non-terminal constituents and form for the terminals. Note that we designed a nested list representation with the round brackets replaced by square brackets to avoid ambiguity and make it compatible with JSON. When not being printed out, the Document class will convert our nested list structure to the conventional bracketed tree.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

constituency_parsing.md

constituency_parsing.md

Constituency Parsing

Files

constituency_parsing.md

Latest commit

History

constituency_parsing.md

File metadata and controls

Constituency Parsing