Hi there,
I was trying to parse the PubMed baseline XML files from https://ftp.ncbi.nlm.nih.gov/pubmed/baseline/ with the pp.parse_medline_xml function, but for every second file I get a syntax error:
File "/home/xxx/.local/lib/python3.12/site-packages/pubmed_parser/medline_parser.py", line 751, in parse_medline_xml
for event, element in etree.iterparse(f, events=("end",)):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "src/lxml/iterparse.pxi", line 210, in lxml.etree.iterparse.__next__
File "src/lxml/iterparse.pxi", line 195, in lxml.etree.iterparse.__next__
File "src/lxml/iterparse.pxi", line 230, in lxml.etree.iterparse._read_more_events
File "src/lxml/parser.pxi", line 1379, in lxml.etree._FeedParser.feed
File "src/lxml/parser.pxi", line 609, in lxml.etree._ParserContext._handleParseResult
File "src/lxml/parser.pxi", line 618, in lxml.etree._ParserContext._handleParseResultDoc
File "src/lxml/parser.pxi", line 728, in lxml.etree._handleParseResult
File "src/lxml/parser.pxi", line 657, in lxml.etree._raiseParseError
File "/home/xxx/Downloads/xml_files_baseline/pubmed24n0007.xml.gz", line 1538867
lxml.etree.XMLSyntaxError: Opening and ending tag mismatch: ArticleTitle line 1538866 and ArticleId, line 1538867, column 39
If I parse the same file with xmltodict, it works.
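To narrow this down, here is a minimal sketch that stream-parses a gzipped file outside of pubmed_parser, once with lxml and once with the standard library's expat-based parser. The helper name `find_parse_error` is hypothetical and not part of pubmed_parser. The sketch only reports whether each parser accepts the file; it does not explain the cause.

```python
import gzip
import xml.etree.ElementTree as stdlib_etree

try:
    # lxml is the parser that appears in the traceback above.
    from lxml import etree as lxml_etree
except ImportError:
    lxml_etree = None


def find_parse_error(path, etree_module):
    """Stream-parse a gzipped XML file with the given etree module.

    Hypothetical helper for this issue. Returns None if the whole file
    parses, otherwise the parser's error message as a string.
    """
    try:
        with gzip.open(path, "rb") as f:
            for _event, element in etree_module.iterparse(f, events=("end",)):
                # Free each element once it is complete to keep memory low.
                element.clear()
    except etree_module.ParseError as exc:
        # Both xml.etree.ElementTree and lxml.etree expose ParseError;
        # lxml's XMLSyntaxError is a subclass of it.
        return str(exc)
    return None
```

For example, `find_parse_error("/home/xxx/Downloads/xml_files_baseline/pubmed24n0007.xml.gz", lxml_etree)` could be compared with the same call using `stdlib_etree`. If lxml alone reports an error on a file that the standard-library parser accepts, the problem is probably in how lxml reads the file rather than in pubmed_parser itself.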