Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spurious irefs added after section when abbrev links to heading #203

Open
CxRes opened this issue Sep 11, 2023 · 17 comments
Open

Spurious irefs added after section when abbrev links to heading #203

CxRes opened this issue Sep 11, 2023 · 17 comments

Comments

@CxRes
Copy link

CxRes commented Sep 11, 2023

This is a complicated one to explain. Please see the attached test case. When the *[Accept-Events]: abbrev links to #foo (L94), an id attached to a section heading, a spurious <iref> of a previously defined definition is introduced after the section heading Foo. Replace the #foo with an inline id, say #a-e (at the end of the previous para), the <iref> vanishes.

Further, instances of Accept-Events are not wrapped unlike other definitions. Again, this is remedied if the id is changed from #foo to #a-e.

Original:
draft-gupta-httpbis-per-resource-events-test-2.md

Output:
draft-gupta-httpbis-per-resource-events-test-2.xml.txt


Also, it seems the conversion wraps code elements (inside backticks) are wrapped with <spanx> and not <tt> even with -3 flag. Thus, the engine does not replace any Accept-Event instances that are wrapped in back ticks (even when linking to a non-heading id).

Actual output <spanx style="verb">Accept-Events</spanx>

Desired output <tt><iref item="Accept-Events"/><xref format="none" target="a-e">Accept-Events</xref></tt>

@CxRes
Copy link
Author

CxRes commented Sep 22, 2023

@cabo If you are back at work, can you please have a look at this issue. Thanks!

@cabo
Copy link
Owner

cabo commented Sep 22, 2023

Also, it seems the conversion wraps code elements (inside backticks) are wrapped with <spanx> and not <tt> even with -3 flag. Thus, the engine does not replace any Accept-Event instances that are wrapped in back ticks (even when linking to a non-heading id).

Actual output <spanx style="verb">Accept-Events</spanx>

Desired output <tt><iref item="Accept-Events"/><xref format="none" target="a-e">Accept-Events</xref></tt>

Unfortunately, code elements (backticks) are protected from the *[...] support that kramdown provides.
I don't think this is easy to change, as backticks have several such protection semantics, and disabling the protection just for this processing may not be easy.

Looking at the other part of the report now; I'm not sure yet I understand what you expect to see.

@cabo
Copy link
Owner

cabo commented Sep 22, 2023

You can work around the protection afforded by backticks by manually adding the <tt> and </tt> tags around the phrase, as in

<tt>Accept-Event</tt>

@CxRes
Copy link
Author

CxRes commented Sep 22, 2023

Looking at the other part of the report now; I'm not sure yet I understand what you expect to see.

Corresponding to line 94 to 96 in the markdown: In line 105 of the XML, the following <iref item='application client' primary='true'/> should not exist.

You can work around the protection afforded by backticks by manually adding the and tags around the phrase

That is exactly what I am doing in my actual draft. Just makes the text unmarkdownish!

@cabo
Copy link
Owner

cabo commented Sep 22, 2023

Please try 1.6.42 for the weird shift-around.

I'll continue to look for ways to get code spans into autolink/iref, but I'm not optimistic.

@CxRes
Copy link
Author

CxRes commented Sep 22, 2023

Unfortunately, its not fixed.

New output on the same markdown
draft-gupta-httpbis-per-resource-events-test-2.2.xml.txt

My understanding is than an <iref> on line 105 should not exist. Further, Accept-Events are no longer getting replaced at all. See line 107 and 109 in the new draft and diff that with the old.

@cabo
Copy link
Owner

cabo commented Sep 22, 2023

So you don't want the functionality of autolink-iref-cleanup: true at all?
Set it to false then.

@CxRes
Copy link
Author

CxRes commented Sep 22, 2023

What does autolink-iref-cleanup do?

@cabo
Copy link
Owner

cabo commented Sep 22, 2023

What does autolink-iref-cleanup do?

It essentially suppresses auto-irefs inside the section that is also auto-linked (the #foo section in your case).
We considered it confusing for an occurrence inside a section to reference that very section in an auto-link.
Instead, the iref is moved up to that section, and the links are removed.

@CxRes
Copy link
Author

CxRes commented Sep 22, 2023

Interesting... you really ought to document that!

It did not occur to me to put Accept-Events in another section to test.

@cabo
Copy link
Owner

cabo commented Sep 22, 2023

Interesting... you really ought to document that!

Absolutely, and I'll keep this ticket open until that is done.

@CxRes
Copy link
Author

CxRes commented Sep 23, 2023

I'll continue to look for ways to get code spans into autolink/iref, but I'm not optimistic.

In that case, I would need your help finding some kind of hack. I am defining a field called "Events", which I am wrapping in <tt>, which I want indexed. However, I do not want every instance of the word Events to be indexed.

What I would like to do is this, but it does not work:

*[<tt>Events</tt>] #events-header (((Events)))

What I have tried is to use <tt>{{&zwsp}}Events</tt> and then define the index replacement as:

*[{{&zwsp}}Events] #events-header (((Events)))
*[&zwsp;Events] #events-header (((Events)))
*[<zero-space-typed-out-in-unicode>Events] #events-header (((Events)))

None of these have worked! Any ideas?

(I am happy to take this discussion elsewhere if you feel it is not appropriate here)

@cabo
Copy link
Owner

cabo commented Sep 23, 2023

The abbrev feature of upstream kramdown does not mix well with entity references. If you actually type out the zwsp (or any other character), it works.

Of course, with many editors these native zwsps are hard to see in the manuscript, and they get in the way in the HTML form (copy/paste gets them, searches don't always work, etc.). So I'm thinking of providing a way to designate certain Unicode characters as to be suppressed in the output, so you can use them for such a purpose and remove them from the final XML.

So then you would

*[⁜Events]: #events-header (((Events)))

This sentence is about the ⁜Events that need to be cited in the index, and other Events that don't.

and declare:

character:
  ⁜: [delete]

in the YAML header so the ⁜ is not actually shown after processing.
(This would go with a general Unicode character feature, which will allow to warn about unexpected characters (e.g., typographic quotes) etc. , which needs a way to declare non-ASCII-characters as "expected".)

Feedback welcome...

(I'm also still thinking about a way to have abbrevs reach into backticks, but my simple prototype for this doesn't work yet.)

@CxRes
Copy link
Author

CxRes commented Sep 23, 2023

IMHO, the cleanest way to solve this would be to preserve Kramown's abbreviation feature. So then you could do:

*[=Events=]: #events-header (((Events))) Events

This sentence is about the <tt>=Events=</tt> that need to be cited in the index, and other Events that don't.

or even

*[=Events=]: <tt> #events-header (((Events))) Events </tt>

This sentence is about the =Events= that need to be cited in the index, and other Events that don't.

I don't know if this is feasible to implement. Also, this would be a breaking change, but it could be possibly be implemented under a flag! It also circumvents the backticks problem. What do you think?

@cabo
Copy link
Owner

cabo commented Sep 23, 2023

Experimental feature:
In 1.6.43, you can now use

*[=Events=]: #header-fields (((Events))) `Events`

Any text added to an abbrev definition beyond what is already allowed now
becomes replacement text (in markdown form) for what was matched.

Known bugs: Combining the <nobr> flag with complex XML replacements
can crash kramdown-rfc or generate invalid XML output at this point.

@CxRes
Copy link
Author

CxRes commented Sep 23, 2023

Just tested it out. This is AWESOME!

Thank you so much for fixing these issues. Much Appreciated!!!!!!!

@CxRes
Copy link
Author

CxRes commented Sep 23, 2023

I have added documentation for the flag in the Syntax2 wiki. Once you finish, I will add the experimental part as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants