Issue with line-breaks tags #58
Comments
Hi isaring, interestingly, Beautiful Soup, the HTML parser we use, parses your code as
Keep us updated if you learn something new!
Did you notice? There's an extra "<" before the fives.
Oops, that's just a typo of my own!
@isaring: you can do the conversion yourself: parse your HTML with the html5lib parser, then use markdownify.MarkdownConverter to convert it to Markdown (see https://replit.com/@mathieud/DependentThinDowngrade#main.py).
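For readers who cannot open the link, here is a minimal sketch of that approach. It assumes the bs4, html5lib, and markdownify packages are installed and that your markdownify version provides MarkdownConverter.convert_soup; the helper name md_with_html5lib is made up for illustration and is not part of markdownify.

```python
def md_with_html5lib(html, **options):
    """Convert HTML to Markdown, parsing with html5lib instead of html.parser.

    Hypothetical helper, not part of markdownify itself.
    """
    # Third-party imports are local so a missing package only fails at call time.
    from bs4 import BeautifulSoup
    from markdownify import MarkdownConverter

    # "html5lib" selects the html5lib tree builder (needs the html5lib package).
    soup = BeautifulSoup(html, "html5lib")
    # convert_soup takes an already-parsed tree, so markdownify does not
    # re-parse the markup with its own parser choice.
    return MarkdownConverter(**options).convert_soup(soup)
```

Calling md_with_html5lib('11111<br/>22222<br>33333<br/>44444<br>55555') should then keep all five lines, though the exact whitespace may vary between markdownify versions.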
@AlexVonB: So it's not a bs4 issue, it's a parser problem. So should the parser at https://github.com/matthewwithanm/python-markdownify/blob/develop/markdownify/__init__.py#L96 be upgraded to
There indeed seems to be some kind of bug in the
But if a mix is used, then
whereas the other parsers do not:
Beautiful Soup tries to choose the best available HTML parser by default:
It might be best to keep Beautiful Soup's own parser selection as the default, but add a Markdownify option that lets a particular parser be requested explicitly.
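A rough sketch of what such an option could look like from the caller's side. This is only an illustration: the function name markdownify_with_parser and its parser argument are assumptions, not an existing markdownify API, and it relies on MarkdownConverter.convert_soup being available.

```python
def markdownify_with_parser(html, parser=None, **options):
    """Convert HTML to Markdown with an optional, explicitly requested parser.

    Hypothetical wrapper; `parser` is any name Beautiful Soup accepts,
    e.g. "html.parser", "lxml" or "html5lib".
    """
    from bs4 import BeautifulSoup
    from markdownify import MarkdownConverter

    # With parser=None Beautiful Soup picks the best parser it finds installed
    # (and warns that the choice was guessed); otherwise the named one is used.
    soup = BeautifulSoup(html, features=parser)
    return MarkdownConverter(**options).convert_soup(soup)
```

Leaving parser unset keeps Beautiful Soup's default behavior, while callers hit by the <br/> problem could pass parser="html5lib".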
Hi,
I'm facing an issue with line-break tags when they are written like <br/> instead of <br>.
Considering this simple example:
Expected:
'11111 \n22222 \n33333 \n44444 \n55555 \n\n'
Actual:
'11111 \n22222 \n33333 \n\n\n'
My workaround is to call .replace('<br/>','<br>') on the input first, but that's a bit of a pity... Could you fix this in a future release?
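A slightly more general form of this workaround, as a sketch using only the standard library. The name normalize_br is made up for illustration. It also catches variants such as <br /> and <BR/>, but not tags carrying attributes, and it rewrites matches everywhere in the string, including inside code samples.

```python
import re

# Matches <br/>, <br /> and <BR/>: a self-closing br tag with optional
# whitespace before the slash and no attributes.
_SELF_CLOSING_BR = re.compile(r"<br\s*/>", re.IGNORECASE)


def normalize_br(html):
    """Rewrite self-closing line-break tags to plain <br> before conversion."""
    return _SELF_CLOSING_BR.sub("<br>", html)
```

The normalized string can then be passed to markdownify as usual.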
Regards,