Tests for Danish translation #3

Open

bertfrees
Member

Includes:

  • uncontracted braille
  • 8-dot braille
  • contracted braille
  • emphasis indicators

Addresses requirements:

Related issues:

@bertfrees bertfrees added this to the Priority 1 milestone Jan 13, 2015
@bertfrees bertfrees modified the milestones: danish (1), (1) Mar 16, 2015
@bertfrees
Member Author

@stesk You said you had already started looking into liblouis tests: you looked at how Jukka did it and you think you can do the same for Danish. That would be great! But note that it is perfectly fine if the test data is not in the JSON format.

Jukka only did tests for uncontracted braille; 8-dot braille and contracted braille were not requirements for Finnish. We also have test data for emphasis indicators (see http://snaekobbi.github.io/requirements/finnish#4.3:38). Note that this test is not written in the liblouis JSON format (because this feature is not implemented with liblouis alone), but the principle remains the same: input + expected output.
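
The "input + expected output" principle does not depend on any particular test format. Here is a minimal Python sketch of it. The names `run_cases` and `fake_translate` are hypothetical and not part of liblouis; the stand-in translator simply upper-cases its input so that the example runs without a braille table. In a real setup, `translate` would wrap whatever translator is under test.

```python
from typing import Callable, Iterable, List, Tuple

Case = Tuple[str, str]  # (input text, expected output)


def run_cases(
    translate: Callable[[str], str], cases: Iterable[Case]
) -> List[Tuple[str, str, str]]:
    """Run every case and return (input, expected, actual) for each failure."""
    failures = []
    for text, expected in cases:
        actual = translate(text)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


def fake_translate(text: str) -> str:
    # Hypothetical stand-in for a real braille translator (assumption:
    # any callable taking a string and returning a string will do).
    return text.upper()


# The third case is deliberately wrong to show how a failure is reported.
cases = [("abc", "ABC"), ("hej", "HEJ"), ("dansk", "dansk")]
failures = run_cases(fake_translate, cases)
```

Because the translator is passed in as a plain callable, the same cases could be run against a liblouis table or against a pipeline that combines liblouis with other processing steps.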

For Danish, I think contracted braille is a requirement, so we probably need some additional data for that. However, implementing and testing contracted braille can be tricky: if the braille code is more dictionary-based than rule-based, perfect coverage may require a list of thousands of correctly translated words. I see several possibilities:

  1. The braille code for Danish has very strict rules and doesn't require a dictionary-based implementation, which means we don't need a large amount of data.
  2. A dictionary-based implementation is required, but you already have all the data we need.
  3. You create the required data with the help of transcribers and/or by analyzing transcribed books. This can be a huge job.
  4. You work closely together with the current maintainer of the liblouis table for Danish (Bue Vester-Andersen) who I believe has gathered a lot of data throughout the years and maybe wants to share it with you.
  5. You don't care about test data, and you realize in this case we can't really guarantee the correctness of the implementation.

From what I know about Danish braille, the rules for contracted braille are related to hyphenation. Bue's liblouis table is largely based on hyphenation data, so this issue is possibly closely related to issue #8 (Tests for Danish hyphenation).

It would be great if any official documentation you have about the Danish braille code could be made publicly available, so that it can be referenced from code, test data, etc. For example, these are documents from NLB about the Norwegian braille code: https://github.com/liblouis/liblouis/tree/formal_braille_spec/norwegian. Do you think we can have a similar page for Danish? (See issue snaekobbi/liblouis#7.)

In case you want to discuss anything I'm always on Skype and IRC (channel #snaekobbi).

@bertfrees bertfrees assigned stesk and unassigned oleholstandersen May 18, 2015
@stesk

stesk commented May 19, 2015

@bertfrees I'm on it. For now I've contacted Bue regarding the state of the Danish translation tables in liblouis. I will then try to figure out, probably by asking more knowledgeable colleagues, how to collect representative test data.

@bertfrees
Member Author

@stesk will convert the examples in his documentation (https://github.com/stesk/danishbraille) to liblouis harness tests.

In addition, we already have a large amount of test data (dictionary tests) thanks to @BueVest.

Nota has enough confidence in this data.
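
As a rough illustration of such a conversion, the sketch below turns tab-separated "text, braille" pairs into a harness-style JSON document. Everything specific here is an assumption: the key names `tables`, `tests`, `input` and `output` are assumed to match the JSON harness layout and should be checked against the liblouis version in use, the table name is a hypothetical placeholder, and the sample rows are placeholders, not real Danish braille.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_pairs(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Parse 'text<TAB>braille' lines, skipping blank lines and # comments."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        text, braille = line.split("\t", 1)
        yield text, braille


def to_harness(table: str, pairs: Iterable[Tuple[str, str]]) -> Dict[str, object]:
    """Build a harness-style dict (key names are an assumed layout)."""
    tests: List[Dict[str, str]] = [
        {"input": text, "output": braille} for text, braille in pairs
    ]
    return {"tables": [table], "tests": tests}


# Placeholder rows only; real data would come from the documentation
# examples or the dictionary tests.
raw = [
    "# placeholder data, not real Danish braille\n",
    "input-1\toutput-1\n",
    "\n",
    "input-2\toutput-2\n",
]
harness = to_harness("hypothetical-danish-table.ctb", parse_pairs(raw))
harness_json = json.dumps(harness, ensure_ascii=False, indent=2)
```

Keeping the parsing step separate from the output step means the same pairs could be written to a different test format later without touching the source data.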

bertfrees added a commit to snaekobbi/liblouis that referenced this pull request Jul 1, 2015
@bertfrees
Member Author

There is currently no definition of 8-dot braille. We need to find out whether defining and/or implementing it is in scope and, if so, what the priority is.

@bertfrees
Member Author

Correction: of course there is a definition of 8-dot braille. Bue even made an implementation in liblouis, see above.

bertfrees added a commit to liblouis/liblouis that referenced this pull request Jul 13, 2015
3 participants