Conversion Error: found 0 References section(s) but expected one and only one #65

jkrybicki · 2018-04-13T18:32:52Z

Uploading an ODT file (attached in the zip) produces the above error; uploading a DOCX file (attached) produces this error instead: ERROR: nu.xom.ParsingException: cvc-datatype-valid.1.2.1: '30j0zll' is not a valid value for 'NCName'.
Rybicki_Polysystem.docx
Rybicki_Polysystem.zip
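
For readers unfamiliar with the DOCX error message: an XML Schema NCName must start with a letter or an underscore, and '30j0zll' starts with a digit, which is why the validator rejects it. The check below is only a rough sketch of that rule. It is an ASCII-only approximation (the real NCName definition also admits many non-ASCII letters), and the names `NCNAME_ASCII` and `is_ncname` are made up for this illustration.

```python
import re

# Simplified, ASCII-only approximation of the XML Schema NCName type:
# the first character is a letter or underscore; later characters may
# also be digits, hyphens, or periods; colons are never allowed.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_ncname(value: str) -> bool:
    """Return True if value matches the simplified NCName pattern."""
    return NCNAME_ASCII.fullmatch(value) is not None


print(is_ncname("30j0zll"))   # False: starts with a digit
print(is_ncname("_30j0zll"))  # True: starts with an underscore
```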

dodinh · 2018-09-28T13:27:50Z

I also stumbled upon this problem (at least the DOCX part). The problem can easily be solved on the user side, as follows:

1. Open the DOCX in Word
2. Click somewhere editable
3. Ribbon-band > Insert > Bookmarks
4. Activate "Show hidden bookmarks"
5. Remove all bookmarks
6. Save and upload

On the dev side: I have no idea how hard it is to exclude bookmarks from validation, or if they even produce problems with the Convalidator.
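
For anyone who has many files or no copy of Word, the manual steps above could probably be scripted. The following is a minimal sketch, not something tested against the attached file. It assumes the bookmarks are stored as empty `w:bookmarkStart` / `w:bookmarkEnd` elements with the conventional `w` prefix in `word/document.xml`, and it ignores other parts such as footnotes. The function names are made up for this illustration. Internal cross-references that point at a removed bookmark would be left dangling.

```python
import re
import zipfile

# Assumption: bookmarks are empty elements using the conventional "w" prefix,
# e.g. <w:bookmarkStart w:id="0" w:name="_30j0zll"/> and <w:bookmarkEnd w:id="0"/>.
BOOKMARK = re.compile(r"<w:bookmark(?:Start|End)\b[^>]*/>")


def strip_bookmarks(xml: str) -> str:
    """Delete every bookmark start/end marker from a WordprocessingML string."""
    return BOOKMARK.sub("", xml)


def strip_docx_bookmarks(src: str, dst: str) -> None:
    """Copy the DOCX at src to dst, removing bookmarks from the main document part."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "word/document.xml":
                # Only the main body is cleaned; all other parts are copied as-is.
                data = strip_bookmarks(data.decode("utf-8")).encode("utf-8")
            zout.writestr(item, data)
```

Usage would look like `strip_docx_bookmarks("Rybicki_Polysystem.docx", "cleaned.docx")`, where `cleaned.docx` is just a placeholder output name. A regular expression is used instead of an XML parser so that the rest of the file stays byte-for-byte unchanged.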

jkrybicki · 2018-09-29T09:25:54Z

Thanks!!! Jan

mpetris · 2018-10-11T10:28:27Z

> On the dev side: I have no idea how hard it is to exclude bookmarks from validation, or if they even produce problems with the Convalidator.

Since the error occurs during validation, I don't think it is a conversion problem. The bookmarks should be excluded from the results nevertheless.
There are three places where this could happen:

1. in the docx XSLT of the conversion profile of the dhconvalidator in the TEI Stylesheets
2. in the DocxInputConverter
3. in the DocxOutputConverter

For an experienced XSLT programmer, option 1 would be a good choice. The advantage is that the bookmark handling would also be available when working directly with the Stylesheets or the OxGarage API. The disadvantage is that this XSLT development can get pretty complex, and one would need strong tool support, like the Oxygen debugger, to handle it. My impression was that the DOCX Stylesheet is designed to convert as much as possible from the original DOCX to TEI; validation was not a major concern. This has the advantage that even seldom-used features get converted, but the disadvantage that anything unwanted, even if seldom used, needs to be excluded afterwards.
To decide between options 2 and 3, one would need to look at the XML to see which conversion stage would be easier to tweak. With option 2, one would work on the DOCX XML directly, which can be awful. With option 3, one would work on the TEI XML, which is generally easier.
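
To give a feel for what option 3 might involve, here is a rough sketch in Python rather than the project's Java. It assumes, without having checked the actual converter output, that bookmarks arrive in the TEI as empty `anchor` elements carrying the bookmark name in `xml:id`. The function `drop_invalid_anchors` and the ASCII-only NCName pattern are made up for this illustration. Any reference pointing at a removed anchor would be left dangling.

```python
import re
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
# ASCII-only approximation of NCName; the real type allows more characters.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def drop_invalid_anchors(root: ET.Element) -> int:
    """Remove TEI <anchor> elements whose xml:id is not a valid NCName.

    Returns the number of anchors removed. Text that followed a removed
    anchor is kept by attaching it to the preceding sibling or the parent.
    """
    removed = 0
    for parent in list(root.iter()):
        previous = None  # last sibling that was kept
        for child in list(parent):
            xml_id = child.get(XML_ID)
            if (child.tag == f"{{{TEI_NS}}}anchor"
                    and xml_id is not None
                    and NCNAME_ASCII.fullmatch(xml_id) is None):
                if child.tail:
                    if previous is None:
                        parent.text = (parent.text or "") + child.tail
                    else:
                        previous.tail = (previous.tail or "") + child.tail
                parent.remove(child)
                removed += 1
            else:
                previous = child
    return removed
```

The extra bookkeeping around `tail` is needed because ElementTree stores the text after an element on that element, so simply removing the anchor would also drop the text that follows it.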
