Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge markups #90

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
Open

Conversation

jillianchang
Copy link
Collaborator

@jillianchang jillianchang commented Oct 11, 2021

This PR integrates the markup branch into the master branch.

@jillianchang
Copy link
Collaborator Author

Can #70 be visited in order to apply the variable markup?

@kylebgorman
Copy link
Contributor

Can #70 be visited in order to apply the variable markup?

I'm not sure what you mean re: #70. It seems like everything is working with respect to that and you updated the tests.

@kylebgorman
Copy link
Contributor

This all looks good to me. My only thought is that do we want to keep the old and new behavior in separate grammars so users can choose to have variable markup or not? Just thinking out loud.

Even if we do this maybe we should just merge now.

It doesn't seem like the CircleCI tests are running when you submit (they run for me). I'm not sure what that's about.

@jillianchang
Copy link
Collaborator Author

Can #70 be visited in order to apply the variable markup?

I'm not sure what you mean re: #70. It seems like everything is working with respect to that and you updated the tests.

__________________________ ScansionTest.test_aen_1_1 ___________________________

self = <tests.scansion_test.ScansionTest testMethod=test_aen_1_1>

    def test_aen_1_1(self):
        text = "Arma virumque canō, Trojae quī prīmus ab ōris"
        verse = self.scan_verse(text, 1)
        self.assertEqual(verse.number, 1)
        self.assertEqual(verse.text, text)
        self.assertEqual(
            verse.norm, "arma virumque canō trojae quī prīmus ab ōris"
        )
        self.assertEqual(
            verse.raw_pron, "arma wirũːkwe kanoː trojjaj kwiː priːmus ab oːris"
        )
        self.assertEqual(
            verse.var_pron, "arma wirũːkwe kanoː trojjaj kwiː priːmus‿ab‿oːris"
        )
    
        # Tests foot structures.
        self.assertEqual(verse.foot[0].type, latin_scansion.Foot.DACTYL)
        self.assertEqual(verse.foot[1].type, latin_scansion.Foot.DACTYL)
        self.assertEqual(verse.foot[2].type, latin_scansion.Foot.SPONDEE)
        self.assertEqual(verse.foot[3].type, latin_scansion.Foot.SPONDEE)
        self.assertEqual(verse.foot[4].type, latin_scansion.Foot.DACTYL)
        self.assertEqual(verse.foot[5].type, latin_scansion.Foot.SPONDEE)
    
        # Tests syllable weights.
        self.assertEqual(
            verse.foot[0].syllable[0].weight, latin_scansion.Syllable.HEAVY
        )
        self.assertEqual(
            verse.foot[0].syllable[1].weight, latin_scansion.Syllable.LIGHT
        )
        self.assertEqual(
            verse.foot[0].syllable[2].weight, latin_scansion.Syllable.LIGHT
        )
        self.assertEqual(
            verse.foot[1].syllable[0].weight, latin_scansion.Syllable.HEAVY
        )
        self.assertEqual(
            verse.foot[1].syllable[1].weight, latin_scansion.Syllable.LIGHT
        )
        self.assertEqual(
            verse.foot[1].syllable[2].weight, latin_scansion.Syllable.LIGHT
        )
        self.assertEqual(
            verse.foot[2].syllable[0].weight, latin_scansion.Syllable.HEAVY
        )
        self.assertEqual(
            verse.foot[2].syllable[1].weight, latin_scansion.Syllable.HEAVY
        )
        self.assertEqual(
            verse.foot[3].syllable[0].weight, latin_scansion.Syllable.HEAVY
        )
        self.assertEqual(
            verse.foot[3].syllable[1].weight, latin_scansion.Syllable.HEAVY
        )
        self.assertEqual(
            verse.foot[4].syllable[0].weight, latin_scansion.Syllable.HEAVY
        )
        self.assertEqual(
            verse.foot[4].syllable[1].weight, latin_scansion.Syllable.LIGHT
        )
        self.assertEqual(
            verse.foot[4].syllable[2].weight, latin_scansion.Syllable.LIGHT
        )
        self.assertEqual(
            verse.foot[2].syllable[0].weight, latin_scansion.Syllable.HEAVY
        )
        self.assertEqual(
            verse.foot[5].syllable[1].weight, latin_scansion.Syllable.HEAVY
        )
        self.assertEqual(
            verse.foot[5].syllable[0].weight, latin_scansion.Syllable.HEAVY
        )
        self.assertEqual(
            verse.foot[5].syllable[1].weight, latin_scansion.Syllable.HEAVY
        )
    
        # Tests subsyllabic units.
        self.assertEqual(verse.foot[0].syllable[0].nucleus, "a")
        self.assertEqual(verse.foot[0].syllable[0].coda, "r")
        self.assertEqual(verse.foot[0].syllable[1].onset, "m")
        self.assertEqual(verse.foot[0].syllable[1].nucleus, "a")
        self.assertEqual(verse.foot[0].syllable[2].onset, "w")
        self.assertEqual(verse.foot[0].syllable[2].nucleus, "i")
        self.assertEqual(verse.foot[1].syllable[0].onset, "r")
        self.assertEqual(verse.foot[1].syllable[0].nucleus, "ũː")
        self.assertEqual(verse.foot[1].syllable[1].onset, "kw")
        self.assertEqual(verse.foot[1].syllable[1].nucleus, "e")
        self.assertEqual(verse.foot[1].syllable[2].onset, "k")
        self.assertEqual(verse.foot[1].syllable[2].nucleus, "a")
        self.assertEqual(verse.foot[2].syllable[0].onset, "n")
        self.assertEqual(verse.foot[2].syllable[0].nucleus, "oː")
        self.assertEqual(verse.foot[2].syllable[1].onset, "tr")
        self.assertEqual(verse.foot[2].syllable[1].nucleus, "o")
        self.assertEqual(verse.foot[2].syllable[1].coda, "j")
        self.assertEqual(verse.foot[3].syllable[0].onset, "j")
        self.assertEqual(verse.foot[3].syllable[0].nucleus, "a")
        self.assertEqual(verse.foot[3].syllable[0].coda, "j")
        self.assertEqual(verse.foot[3].syllable[1].onset, "kw")
        self.assertEqual(verse.foot[3].syllable[1].nucleus, "iː")
        self.assertEqual(verse.foot[4].syllable[0].onset, "pr")
        self.assertEqual(verse.foot[4].syllable[0].nucleus, "iː")
        self.assertEqual(verse.foot[4].syllable[1].onset, "m")
        self.assertEqual(verse.foot[4].syllable[1].nucleus, "u")
>       self.assertEqual(verse.foot[4].syllable[2].onset, "s")
E       AssertionError: '‿' != 's'
E       - ‿
E       + s

tests/scansion_test.py:131: AssertionError

As per above, testing the feet and syllable structure fails because the backward composition doesn't yet use the variable markups.

@kylebgorman
Copy link
Contributor

kylebgorman commented Oct 12, 2021 via email

@jillianchang
Copy link
Collaborator Author

jillianchang commented Oct 12, 2021 via email

@kylebgorman
Copy link
Contributor

The test results are saying that the tie is the onset, but wouldn’t we want the onset to be “s?”

We'd want that. How the markup characters are "syllabified" is kind of up to you, I suppose.

@jillianchang
Copy link
Collaborator Author

I'm actually quite stuck on why it's displaying the tie as the onset, instead of the letter itself. Would you be able to look at that when you get the chance?

@kylebgorman
Copy link
Contributor

I'm actually quite stuck on why it's displaying the tie as the onset, instead of the letter itself. Would you be able to look at that when you get the chance?

I took a look and it's not at all clear to me either. Do any of the intermediate representations we generate in the textproto give a clue?

I am wondering if we should pause this branch dev and focus on other stuff. I should be able to get back to web stuff shortly---I made some progress earlier and just need a half-day of focused time...sorry for my slow response

@jillianchang
Copy link
Collaborator Author

Yeah, the textproto shows the tie as on the onset anywhere where there's resyllabification.

Okay, that's probably a good idea. This branch seems fine with regards to the actual scanning- it's just the parsing of the specific syllable elements where it's not quite accurate. I'm not sure how big of a big deal that is?

In the meantime, I can start adding comments to the textproto files.

@kylebgorman
Copy link
Contributor

kylebgorman commented Oct 23, 2021 via email

@jillianchang
Copy link
Collaborator Author

Should we come back to this so that the web app can use markup pronunciation?

@kylebgorman
Copy link
Contributor

kylebgorman commented Nov 27, 2021 via email

@jillianchang
Copy link
Collaborator Author

I'm starting to investigate this further. My question is, how does it exactly determine the onset, nucleus, and coda of a syllable? (as generated by the textprotos). If we can look into the code that does that, maybe we can get a clue.

@kylebgorman
Copy link
Contributor

See here. I have no clue how to translate this into useful pure-FST logic. It is some of the hardest code I've ever written.

One of the many motivations for creating Pynini is that sometimes we need to (or at least very much want to) mix declarative grammar-defining logic with general-purpose imperative logic that works on specific inputs, like you see there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants