CorpusChunking #130

Closed
Eumaeus opened this issue May 8, 2019 · 2 comments

@Eumaeus
Contributor

Eumaeus commented May 8, 2019

Useful both for analysis and presentation…

Given a corpus, chunk it into a Vector[Corpus] according to:

  • distinct texts
  • citation values (with a @groupLevel param)
    • groupLevel = 1 would give a Vector[Corpus] with 1 citable node in each… not so useful
    • groupLevel = 2 would give a Vector[Corpus] of 24 Corpora for the Iliad, e.g.
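
A minimal sketch of the first criterion (distinct texts), under stated assumptions: `CitableNode`, `Corpus`, `textId` and `chunkByText` here are simplified, hypothetical stand-ins rather than the library's own types or method names, and every URN is assumed to be a CTS URN with a passage component after its last colon.

```scala
object TextChunkSketch {
  // Hypothetical, simplified stand-ins for the library's types.
  final case class CitableNode(urn: String, text: String)
  final case class Corpus(nodes: Vector[CitableNode])

  // Assumption: everything before the last colon identifies the text,
  // and everything after it is the passage reference.
  def textId(urn: String): String = urn.substring(0, urn.lastIndexOf(':'))

  // One Corpus per distinct text, in order of first appearance.
  def chunkByText(c: Corpus): Vector[Corpus] = {
    val order = c.nodes.map(n => textId(n.urn)).distinct
    val grouped = c.nodes.groupBy(n => textId(n.urn))
    order.map(id => Corpus(grouped(id)))
  }
}
```

Nodes keep their original relative order inside each chunk, so a corpus that interleaves two texts comes back as two corpora, each in document order.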
@Eumaeus Eumaeus self-assigned this May 8, 2019
@Eumaeus
Contributor Author

Eumaeus commented May 10, 2019

Implemented in 10.13.0.

The parameter is @levelsToGroup. If you have, e.g., an Iliad, set @levelsToGroup=1 and you will get 24 Corpus objects. If you have a tokenized Iliad, set @levelsToGroup=2 to get 24 Corpus objects, one for each book, with the lines and their tokens included.
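
A rough sketch of how grouping by citation might behave, not the code that shipped in 10.13.0. It reads `levelsToGroup` as the number of citation levels dropped from the right of the passage reference, which is an interpretation of the comment above. `CitableNode`, `Corpus`, `groupKey` and `chunkByCitation` are hypothetical, simplified names, and URNs are assumed to end in a dot-separated passage after the last colon.

```scala
object CitationChunkSketch {
  // Hypothetical, simplified stand-ins for the library's types.
  final case class CitableNode(urn: String, text: String)
  final case class Corpus(nodes: Vector[CitableNode])

  // Build the grouping key: the text identifier plus the passage
  // reference with its rightmost `levelsToGroup` levels removed.
  def groupKey(urn: String, levelsToGroup: Int): String = {
    val cut = urn.lastIndexOf(':')
    val work = urn.substring(0, cut)
    val levels = urn.substring(cut + 1).split('.').toVector
    work + ":" + levels.dropRight(levelsToGroup).mkString(".")
  }

  // One Corpus per distinct key, in order of first appearance.
  def chunkByCitation(c: Corpus, levelsToGroup: Int = 1): Vector[Corpus] = {
    val keyed = c.nodes.map(n => (groupKey(n.urn, levelsToGroup), n))
    val grouped = keyed.groupBy(_._1)
    keyed.map(_._1).distinct.map(k => Corpus(grouped(k).map(_._2)))
  }
}
```

Under these assumptions, a tokenized text cited as book.line.token yields one chunk per book with `levelsToGroup = 2` and one chunk per line with `levelsToGroup = 1`.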

This is in v.10.13.0, on the May_2019 branch.

@neelsmith
Contributor

Closing since this has been merged into master.
