Fix penultimate sections of multiline fragments being ignored by the Markdown renderer #201

Lordfirespeed · 2023-11-21T19:33:49Z

When using the Markdown renderer, if a fragment containing newline characters is encountered, it is split by \n and each line is yielded in turn.

 for fragment in fragments:
    if "\n" in fragment.text:
        lines = fragment.text.split("\n")
        yield current_line + lines[0]
        for inner_line in lines[1:-2]:
            yield inner_line
        current_line = lines[-1]
    else:
        current_line += fragment.text

However, with a quick check in the python REPL we find:

>>> "abcde"[0] 
'a'
>>> "abcde"[1:-2]
'bc'
>>> "abcde"[-1]
'e'

Notably, d, the penultimate element, is being skipped.
This can be fixed by changing the bounds of the 'inner' slice from [1:-2] to [1:-1].

coveralls · 2023-11-24T19:53:41Z

coverage: 94.113% (+0.04%) from 94.073%
when pulling 7aaad0d on Lordfirespeed:patch-1
into 70e8e8d on miyuchina:master.

pbodnar · 2023-11-24T20:10:21Z

@Lordfirespeed, good catch, thank you. Would you mind to also add a unit test on this? I'm a little bit struggling to find a use case where a Fragment would be created with text including \n, i.e. when the problematic part of code would be called "in real life". (Or maybe @anderskaplan as the author of MarkdownRenderer will know? :))

Lordfirespeed · 2023-11-24T22:43:10Z

My case is a bit of an odd one, apologies!

I wanted to parse LaTeX so that I could use regex inside latex blocks to replace command sequences e.g.

\R -> \mathbb{R}
\rArr -> \Rightarrow

Because I was trying to migrate my notes from KaTeX to MathJax.

I 'wrote' this very basic class:

from mistletoe.markdown_renderer import MarkdownRenderer
from mistletoe.latex_token import Math
from itertools import chain


class MathMarkdownRenderer(MarkdownRenderer):
    def __init__(self, *extras, **kwargs):
        super().__init__(
            *chain(
                (Math,),
                extras,
            ),
            **kwargs
        )

    def render_math(self, token):
        return self.render_raw_text(token)

And used it like so:

from mistletoe import Document as MarkdownDocument
from math_markdown_renderer import MathMarkdownRenderer

with MathMarkdownRenderer() as renderer:
    with open(file, "r") as infile:
        document = MarkdownDocument(infile)
    # I made changes to the parsed document here
    render = renderer.render(document)

I found that multiline LaTeX would almost always contain newline characters, e.g. the following:

$$
a=b
$$

would yield a Math token with the following content:

"$$\na=b\n$$"

So, in short, yes, I can add a unit test; but not without also adding functionality to the library 😅

anderskaplan · 2023-11-25T19:11:22Z

Good catch!

The easiest way to trigger the bug in a unit test would be to modify one of the existing tests. For example:

    def test_images_and_links(self):
        input = [
            "[a link](#url (title))\n",
            "[another link](<url-in-angle-brackets> '*emphasized\n",
            "multi-line\n",
            "title*')\n",
            '![an \\[*image*\\], escapes and emphasis](#url "title")\n',
            "<http://auto.link>\n",
        ]
        output = self.roundtrip(input)
        self.assertEqual(output, "".join(input))

(The only change is that the line "multi-line" has been added. The bug will only trigger when there are more than two rows.)

pbodnar · 2023-11-25T19:36:15Z

@anderskaplan, thanks a lot, great to have you back! :)

@Lordfirespeed, so I guess you can just add "multi-line\n", to the existing unit test as Anders suggests, and that's it. I would merge the PR afterwards.

Lordfirespeed · 2023-11-25T21:33:29Z

@anderskaplan, thanks a lot, great to have you back! :)

@Lordfirespeed, so I guess you can just add "multi-line\n", to the existing unit test as Anders suggests, and that's it. I would merge the PR afterwards.

I'm not sure that particular example would cause the bug to happen as [0] and [-1] are both handled specifically, so the multiline fragment needs to contain at least two newlines - but I get what you mean! I'll get on that.

Edit: Oh! sorry, never mind. I was confused.

Lordfirespeed · 2023-11-25T21:52:39Z

All done! I have not confirmed the unit test functions as intended as I don't have a codespace open at the moment.

pbodnar · 2023-11-26T16:00:05Z

Thank you!

Fix penultimate sections of multiline fragments being ignored

d49199d

Lordfirespeed changed the title ~~Fix penultimate sections of multiline fragments being ignored~~ Fix penultimate sections of multiline fragments being ignored by the Markdown renderer Nov 21, 2023

pbodnar added the bug label Nov 24, 2023

unit test for ensuring multiline fragments are rendered correctly

7aaad0d

pbodnar merged commit 5f4c550 into miyuchina:master Nov 26, 2023
8 checks passed

pbodnar added this to the 1.3.0 milestone Nov 26, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix penultimate sections of multiline fragments being ignored by the Markdown renderer #201

Fix penultimate sections of multiline fragments being ignored by the Markdown renderer #201

Lordfirespeed commented Nov 21, 2023

coveralls commented Nov 24, 2023 •

edited

Loading

pbodnar commented Nov 24, 2023

Lordfirespeed commented Nov 24, 2023 •

edited

Loading

anderskaplan commented Nov 25, 2023

pbodnar commented Nov 25, 2023

Lordfirespeed commented Nov 25, 2023 •

edited

Loading

Lordfirespeed commented Nov 25, 2023

pbodnar commented Nov 26, 2023

Fix penultimate sections of multiline fragments being ignored by the Markdown renderer #201

Fix penultimate sections of multiline fragments being ignored by the Markdown renderer #201

Conversation

Lordfirespeed commented Nov 21, 2023

coveralls commented Nov 24, 2023 • edited Loading

pbodnar commented Nov 24, 2023

Lordfirespeed commented Nov 24, 2023 • edited Loading

anderskaplan commented Nov 25, 2023

pbodnar commented Nov 25, 2023

Lordfirespeed commented Nov 25, 2023 • edited Loading

Lordfirespeed commented Nov 25, 2023

pbodnar commented Nov 26, 2023

coveralls commented Nov 24, 2023 •

edited

Loading

Lordfirespeed commented Nov 24, 2023 •

edited

Loading

Lordfirespeed commented Nov 25, 2023 •

edited

Loading