We can customize parsing by providing different options to the parser:
const {XMLParser} = require('fast-xml-parser');
const options = {
ignoreAttributes : false
};
const parser = new XMLParser(options);
let jsonObj = parser.parse(xmlDataStr);
Let's look at each option in detail, with examples.
Set allowBooleanAttributes: true to allow attributes without a value.
By default, boolean attributes are ignored.
const xmlDataStr = `<root a="nice" checked><a>wow</a></root>`;
const options = {
ignoreAttributes: false,
attributeNamePrefix : "@_"
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"root": {
"@_a": "nice",
"a": "wow"
}
}
Once you allow boolean attributes, they are parsed and set to true.
const xmlDataStr = `<root a="nice" checked><a>wow</a></root>`;
const options = {
ignoreAttributes: false,
attributeNamePrefix : "@_",
allowBooleanAttributes: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"root": {
"@_a": "nice",
"@_checked": true,
"a": "wow"
}
}
FXP always creates a text node when preserveOrder: true. Otherwise, it creates a property with the tag name and assigns the value directly.
You can force FXP to render a tag with a text node by setting alwaysCreateTextNode: true.
Eg
const xmlDataStr = `
<root a="nice" checked>
<a>wow</a>
<a>
wow again
<c> unlimited </c>
</a>
<b>wow phir se</b>
</root>`;
const options = {
ignoreAttributes: false,
// alwaysCreateTextNode: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output when alwaysCreateTextNode: false
{
"root": {
"a": [
"wow",
{
"c": "unlimited",
"#text": "wow again"
}
],
"b": "wow phir se",
"@_a": "nice"
}
}
Output when alwaysCreateTextNode: true
{
"root": {
"a": [
{
"#text": "wow"
},
{
"c": {
"#text": "unlimited"
},
"#text": "wow again"
}
],
"b": {
"#text": "wow phir se"
},
"@_a": "nice"
}
}
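Because the shape of a tag depends on this option, code that consumes the result may need to handle both forms. The following is a minimal sketch of a hypothetical helper, textOf (not part of FXP), that reads the text of a node in either shape. It assumes the default text node name "#text" and uses node shapes copied from the two outputs above.

```javascript
// Hypothetical helper (not part of FXP): return the text of a parsed node,
// whether it is a plain value or an object carrying a text node property.
function textOf(node, textNodeName = "#text") {
  if (node !== null && typeof node === "object") {
    return node[textNodeName];
  }
  return node;
}

// Shapes taken from the outputs above.
const withoutOption = { a: ["wow", { c: "unlimited", "#text": "wow again" }] };
const withOption = {
  a: [{ "#text": "wow" }, { c: { "#text": "unlimited" }, "#text": "wow again" }]
};

console.log(withoutOption.a.map((n) => textOf(n))); // [ 'wow', 'wow again' ]
console.log(withOption.a.map((n) => textOf(n)));    // [ 'wow', 'wow again' ]
```

With alwaysCreateTextNode: true such a helper is unnecessary, since every tag value is reachable through the text node property.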
Use attributesGroupName to group all the attributes of a tag under the given property name.
const xmlDataStr = `<root a="nice" b="very nice" ><a>wow</a></root>`;
const options = {
ignoreAttributes: false,
attributeNamePrefix : "@_",
attributesGroupName : "@_"
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"root": {
"@_": {
"@_a": "nice",
"@_b": "very nice"
},
"a": "wow"
}
}
To distinguish attributes from tags in the JS object, you can use attributeNamePrefix to prepend a string to each attribute name.
const xmlDataStr = `<root a="nice" ><a>wow</a></root>`;
const options = {
ignoreAttributes: false,
attributeNamePrefix : "@_"
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"root": {
"@_a": "nice",
"a": "wow"
}
}
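The prefix is what lets consuming code tell the attribute a apart from the child tag a in the output above. As a rough illustration, the hypothetical helper below (splitByPrefix, not part of FXP) separates the keys of a parsed node into attributes and children, assuming the "@_" prefix used in the example.

```javascript
// Hypothetical helper (not part of FXP): split a parsed node into
// attributes (keys starting with the prefix) and everything else.
function splitByPrefix(node, prefix = "@_") {
  const attributes = {};
  const children = {};
  for (const [key, value] of Object.entries(node)) {
    if (key.startsWith(prefix)) {
      attributes[key.slice(prefix.length)] = value; // drop the prefix
    } else {
      children[key] = value;
    }
  }
  return { attributes, children };
}

// The "root" object from the output above.
const parsedRoot = { "@_a": "nice", a: "wow" };
const { attributes, children } = splitByPrefix(parsedRoot);
console.log(attributes); // { a: 'nice' }
console.log(children);   // { a: 'wow' }
```

Note that with an empty prefix every key would match, so this kind of separation only works when the prefix is non-empty.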
attributeValueProcessor is similar to tagValueProcessor, but it is applied to attributes.
Eg
:
const options = {
ignoreAttributes: false,
attributeValueProcessor: (name, val, jPath) => {
:
}
};
:
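The processor body is elided above. As one possible filling, here is a sketch of a processor that normalizes the value of a hypothetical lang attribute and leaves every other value unchanged. The attribute name and the normalization are assumptions made for illustration, and the sketch also assumes that name is passed without the attributeNamePrefix, which is worth verifying against your FXP version. The function is plain JavaScript, so it can be exercised on its own before being handed to the parser.

```javascript
// Hypothetical processor: trim and lower-case the value of "lang" attributes.
// Assumption: `name` is the attribute name without attributeNamePrefix.
const attributeValueProcessor = (name, val, jPath) => {
  if (name === "lang") return val.trim().toLowerCase();
  return val; // other attributes keep their original value
};

// The options object that would be passed to `new XMLParser(options)`.
const options = {
  ignoreAttributes: false,
  attributeValueProcessor
};

console.log(attributeValueProcessor("lang", " EN-us ", "root.a")); // en-us
console.log(attributeValueProcessor("id", " 42 ", "root.a") === " 42 "); // true
```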
If cdataPropName is not set to a property name, CDATA values are merged into the tag's text value.
Eg
const xmlDataStr = `
<a>
<name>name:<![CDATA[<some>Jack</some>]]><![CDATA[Jack]]></name>
</a>`;
const parser = new XMLParser();
const output = parser.parse(xmlDataStr);
Output
{
"a": {
"name": "name:<some>Jack</some>Jack"
}
}
When cdataPropName is set to a property name, the text value and CDATA are parsed into different properties.
Example 2
const options = {
cdataPropName: "__cdata"
}
Output
{
"a": {
"name": {
"__cdata": [
"<some>Jack</some>",
"Jack"
],
"#text": "name:"
}
}
}
If commentPropName is set to a property name, comments are parsed from the XML.
Eg
const xmlDataStr = `
<!--Students grades are uploaded by months-->
<class_list>
<student>
<!--Student details-->
<!--A second comment-->
<name>Tanmay</name>
<grade>A</grade>
</student>
</class_list>`;
const options = {
commentPropName: "#comment"
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"#comment": "Students grades are uploaded by months",
"class_list": {
"student": {
"#comment": [
"Student details",
"A second comment"
],
"name": "Tanmay",
"grade": "A"
}
}
}
By default (processEntities: true), FXP parses XML entities. You can set htmlEntities: true to parse HTML entities as well. Check the entities section for more information.
By default, ignoreAttributes is set to true, which means the parser ignores attributes. Any attribute-related configuration has no effect unless you also set ignoreAttributes: false.
You can specify an array of strings, regular expressions, or a callback function to selectively ignore specific attributes during parsing or building.
<tag
ns:attr1="a1-value"
ns:attr2="a2-value"
ns2:attr3="a3-value"
ns2:attr4="a4-value">
value
</tag>
You can use the ignoreAttributes option in three different ways:
- Array of Strings: Ignore specific attributes by name.
- Array of Regular Expressions: Ignore attributes that match a pattern.
- Callback Function: Ignore attributes based on custom logic.
const options = {
attributeNamePrefix: "$",
ignoreAttributes: ['ns:attr1', 'ns:attr2'],
parseAttributeValue: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlData);
Result:
{
"tag": {
"#text": "value",
"$ns2:attr3": "a3-value",
"$ns2:attr4": "a4-value"
}
}
const options = {
attributeNamePrefix: "$",
ignoreAttributes: [/^ns2:/],
parseAttributeValue: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlData);
Result:
{
"tag": {
"#text": "value",
"$ns:attr1": "a1-value",
"$ns:attr2": "a2-value"
}
}
const options = {
attributeNamePrefix: "$",
ignoreAttributes: (aName, jPath) => aName.startsWith('ns:') || jPath === 'tag.tag2',
parseAttributeValue: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlData);
Result:
{
"tag": {
"$ns2:attr3": "a3-value",
"$ns2:attr4": "a4-value",
"tag2": "value"
}
}
To leave the XML declaration tag out of the parsing output, set ignoreDeclaration: true.
Eg
const xmlDataStr = `<?xml version="1.0"?>
<?elementnames <fred>, <bert>, <harry> ?>
<h1></h1>`;
const options = {
ignoreDeclaration: true,
attributeNamePrefix : "@_"
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"?elementnames": "",
"h1": ""
}
To leave PI (processing instruction) tags out of the parsing output, set ignorePiTags: true.
Eg
const xmlDataStr = `<?xml version="1.0"?>
<?elementnames <fred>, <bert>, <harry> ?>
<h1></h1>`;
const options = {
ignoreDeclaration: true,
ignorePiTags: true,
attributeNamePrefix : "@_"
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"h1": ""
}
FXP can't decide on its own whether a single tag should be parsed as an array or an object. The isArray method lets you decide whether a tag should be parsed as an array.
Eg
const xmlDataStr = `
<root a="nice" checked>
<a>wow</a>
<a>
wow again
<c> unlimited </c>
</a>
<b>wow phir se</b>
</root>`;
const alwaysArray = [
"root.a.c",
"root.b"
];
const options = {
ignoreAttributes: false,
//name: is either tagname, or attribute name
//jPath: upto the tag name
isArray: (name, jpath, isLeafNode, isAttribute) => {
if( alwaysArray.indexOf(jpath) !== -1) return true;
}
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"root": {
"@_a": "nice",
"a": [
"wow",
{
"#text": "wow again",
"c": [
"unlimited"
]
}
],
"b": [
"wow phir se"
]
}
}
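The callback in the example returns undefined for paths that are not listed. A small variation, sketched below, keeps the paths in a Set and always returns an explicit boolean. This is only an alternative way to write the same callback, using the same paths as the example, and it can be checked on its own without the parser.

```javascript
// Paths that should always be parsed as arrays (same as the example above).
const alwaysArray = new Set(["root.a.c", "root.b"]);

// Same signature as the isArray option; returns an explicit boolean.
const isArray = (name, jpath, isLeafNode, isAttribute) => alwaysArray.has(jpath);

// The options object that would be passed to `new XMLParser(options)`.
const options = {
  ignoreAttributes: false,
  isArray
};

console.log(isArray("c", "root.a.c", true, false)); // true
console.log(isArray("a", "root.a", false, false));  // false
```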
FXP uses the strnum library to parse strings into numbers. The numberParseOptions property lets you configure strnum.
Eg
const xmlDataStr = `
<root>
<a>-0x2f</a>
<a>006</a>
<a>6.00</a>
<a>-01.0E2</a>
<a>+1212121212</a>
</root>`;
const options = {
numberParseOptions: {
leadingZeros: true,
hex: true,
skipLike: /\+[0-9]{10}/,
// eNotation: false
}
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"root": {
"a": [ -47, 6, 6, -100, "+1212121212" ]
}
}
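In the output above, "+1212121212" stays a string because it matches the skipLike pattern. Since skipLike is an ordinary regular expression, a candidate pattern can be tried against sample values before it is put into the parser options. The snippet below is only such a standalone check using the values from the example; it does not call strnum or FXP.

```javascript
// The skipLike pattern and tag values from the example above.
const skipLike = /\+[0-9]{10}/;
const values = ["-0x2f", "006", "6.00", "-01.0E2", "+1212121212"];

// Values matching the pattern are the ones left unparsed.
const skipped = values.filter((v) => skipLike.test(v));
console.log(skipped); // [ '+1212121212' ]
```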
Set parseAttributeValue: true to parse attribute values. This option uses the strnum package to parse numeric values. For more controlled parsing, check the numberParseOptions option.
Eg
const xmlDataStr = `
<root a="nice" checked enabled="true" int="32" int="34">
<a>wow</a>
</root>`;
const options = {
ignoreAttributes: false,
// parseAttributeValue: true,
allowBooleanAttributes: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"root": {
"a": "wow",
"@_a": "nice",
"@_checked": true,
"@_enabled": "true",
"@_int": "34"
}
}
Boolean attributes are always parsed to true. The XML parser doesn't report an error when validation is off, so the value of a repeated attribute is overwritten.
When parseAttributeValue: true
{
"root": {
"a": "wow",
"@_a": "nice",
"@_checked": true,
"@_enabled": true,
"@_int": 34
}
}
When validation options (or true) are passed as the second argument to parse():
//..
const output = parser.parse(xmlDataStr, {
allowBooleanAttributes: true
});
//..
Output
Error: Attribute 'int' is repeated.:1:48
Set parseTagValue: true to parse tag values. This option uses the strnum package to parse numeric values. For more controlled parsing, check the numberParseOptions option.
Eg
const xmlDataStr = `
<root>
35<nested>34</nested>
</root>`;
const options = {
parseTagValue: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"root": 35
}
Example 2 Input
<root>
35<nested>34</nested>
</root>
Output
{
"root": {
"nested": 34,
"#text": 35
}
}
Example 3 Input
<root>
35<nested>34</nested>46
</root>
Output
{
"root": {
"nested": 34,
"#text": "3546"
}
}
With preserveOrder: true, the result is long and verbose. That's because this option keeps the order of tags in the resulting JS object. It also helps the XML builder rebuild similar XML from the JS object without losing information.
const XMLdata = `
<!--Students grades are uploaded by months-->
<class_list standard="3">
<student>
<!--Student details-->
<!--A second comment-->
<name>Tanmay</name>
<grade>A</grade>
</student>
</class_list>`;
const options = {
commentPropName: "#comment",
preserveOrder: true
};
const parser = new XMLParser(options);
let result = parser.parse(XMLdata);
[
{
"#comment": [
{ "#text": "Students grades are uploaded by months" }
]
},
{
"class_list": [
{
"student": [
{
"#comment": [
{ "#text": "Student details" }
]
},
{
"#comment": [
{ "#text": "A second comment" }
]
},
{
"name": [
{ "#text": "Tanmay" }
]
},
{
"grade": [
{ "#text": "A" }
]
}
]
}
],
":@": {
"standard" : "3"
}
}
]
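The ordered result is an array of single-tag objects, where each tag maps to an array of its children and attributes sit beside it under ":@". To give a feel for consuming that structure, here is a sketch of a hypothetical helper, listTagPaths (not part of FXP), that walks it and collects dotted tag paths in document order. It assumes the default "#text" text node name and uses a trimmed copy of the result above.

```javascript
// Hypothetical helper (not part of FXP): collect dotted tag paths, in
// document order, from a preserveOrder result.
function listTagPaths(nodes, parentPath = "", out = []) {
  for (const node of nodes) {
    for (const key of Object.keys(node)) {
      // ":@" holds attributes and "#text" holds text; neither is a tag.
      if (key === ":@" || key === "#text") continue;
      const path = parentPath ? `${parentPath}.${key}` : key;
      out.push(path);
      if (Array.isArray(node[key])) listTagPaths(node[key], path, out);
    }
  }
  return out;
}

// Trimmed copy of the result shown above.
const sample = [
  { "#comment": [{ "#text": "Students grades are uploaded by months" }] },
  {
    class_list: [
      {
        student: [
          { "#comment": [{ "#text": "Student details" }] },
          { name: [{ "#text": "Tanmay" }] },
          { grade: [{ "#text": "A" }] }
        ]
      }
    ],
    ":@": { standard: "3" }
  }
];

const paths = listTagPaths(sample);
console.log(paths);
```

Comments appear in the list under the configured commentPropName, here "#comment", because they are stored the same way as tags.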
Set processEntities to true (default) to process default and DOCTYPE entities. Check the Entities section for more detail. If your XML document has no entities, it is recommended to disable this option (processEntities: false) for better performance.
removeNSPrefix removes the namespace string from tag and attribute names.
Default is removeNSPrefix: false
const xmlDataStr = `<root some:a="nice" ><any:a>wow</any:a></root>`;
const options = {
ignoreAttributes: false,
attributeNamePrefix : "@_"
//removeNSPrefix: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"root": {
"@_some:a": "nice",
"any:a": "wow"
}
}
Setting removeNSPrefix: true
const xmlDataStr = `<root some:a="nice" ><any:a>wow</any:a></root>`;
const options = {
ignoreAttributes: false,
attributeNamePrefix : "@_",
removeNSPrefix: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"root": {
"@_a": "nice",
"a": "wow"
}
}
If you don't want to parse a tag and its nested tags at a particular point, you can set their path in stopNodes.
Eg
const xmlDataStr = `
<root a="nice" checked>
<a>wow</a>
<a>
wow again
<c> unlimited </c>
</a>
<b>wow phir se</b>
</root>`;
const options = {
stopNodes: ["root.a"]
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"root": {
"a": [
"wow",
"\n wow again\n <c> unlimited </c>\n "
],
"b": "wow phir se"
}
}
You can also specify a tag that should not be processed irrespective of its path, e.g. <pre> or <script> in an HTML document.
const options = {
stopNodes: ["*.pre", "*.script"]
};
Note that a stop node should not contain its own closing tag in its contents. Eg
<stop>
invalid </stop>
</stop>
Nested stop nodes are also not allowed.
<stop>
<stop> invalid </stop>
</stop>
With tagValueProcessor you can control how and which tag values should be parsed.
- If tagValueProcessor returns undefined or null, the original value is set without parsing.
- If it returns a different value, or a value with a different data type, the new value is set without parsing.
- Otherwise, the original value is set after parsing (if parseTagValue: true).
- If the tag value is empty, tagValueProcessor is not called.
Eg
const xmlDataStr = `
<root a="nice" checked>
<a>wow</a>
<a>
wow again
<c> unlimited </c>
</a>
<b>wow phir se</b>
</root>`;
const options = {
ignoreAttributes: false,
tagValueProcessor: (tagName, tagValue, jPath, hasAttributes, isLeafNode) => {
if(isLeafNode) return tagValue;
return "";
}
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"root": {
"a": [
{
"#text": "wow",
"@_a": "2"
},
{
"c": {
"#text": "unlimited",
"@_a": "2"
},
"@_a": "2"
}
],
"b": {
"#text": "wow phir se",
"@_a": "2"
},
"@_a": "nice"
}
}
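To make the return-value rules listed above concrete, here is a sketch of a processor that returns null for selected paths, so their values stay as original strings, and returns the value unchanged everywhere else, so normal parsing applies. The path "root.zip" is a hypothetical example and does not come from the XML above.

```javascript
// Hypothetical list of paths whose values must stay unparsed strings.
const keepAsString = new Set(["root.zip"]);

const tagValueProcessor = (tagName, tagValue, jPath, hasAttributes, isLeafNode) => {
  // null (or undefined): the original value is set without parsing.
  if (keepAsString.has(jPath)) return null;
  // Same value, same type: the parser goes on to parse it (if parseTagValue: true).
  return tagValue;
};

// The options object that would be passed to `new XMLParser(options)`.
const options = {
  parseTagValue: true,
  tagValueProcessor
};

console.log(tagValueProcessor("zip", "00123", "root.zip", false, true)); // null
console.log(tagValueProcessor("age", "35", "root.age", false, true));    // 35
```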
The text value of a tag is parsed to the #text property by default. You can change this name with textNodeName.
Eg
const xmlDataStr = `
<a>
text<b>alpha</b>
</a>`;
const options = {
ignoreAttributes: false,
textNodeName: "$text"
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"a": {
"b": "alpha",
"$text": "text"
}
}
The transformTagName function lets you change the name of a tag as needed. E.g. 'red-code' can be transformed to 'redCode', or 'redCode' can be transformed to 'redcode'.
transformTagName: (tagName) => tagName.toLowerCase()
The transformAttributeName function lets you modify the name of an attribute.
transformAttributeName: (attributeName) => attributeName.toLowerCase()
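The 'red-code' to 'redCode' transformation mentioned above could be written as follows. This is a minimal sketch: toCamelCase is a hypothetical helper name, and it only handles hyphen-separated names.

```javascript
// Hypothetical helper: convert hyphen-separated names to camelCase.
const toCamelCase = (name) =>
  name.replace(/-([a-zA-Z0-9])/g, (match, ch) => ch.toUpperCase());

// The same function can serve both options.
const options = {
  ignoreAttributes: false,
  transformTagName: toCamelCase,
  transformAttributeName: toCamelCase
};

console.log(toCamelCase("red-code"));      // redCode
console.log(toCamelCase("first-name-en")); // firstNameEn
```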
trimValues removes surrounding whitespace from tag and attribute values.
Eg
const xmlDataStr = `
<root attri=" ibu te ">
35 <nested> 34</nested>
</root>`;
const options = {
ignoreAttributes: false,
parseTagValue: true, //default
trimValues: false
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"root": {
"nested": 34,
"#text": "\n 35 \n ",
"@_attri": " ibu te "
}
}
If the tag value consists of whitespace only and trimValues: false, the value will not be parsed even if parseTagValue: true. Similarly, if trimValues: true and parseTagValue: false, surrounding whitespace will still be removed.
const options = {
ignoreAttributes: false,
parseTagValue: false,
trimValues: true //default
};
Output
{
"root": {
"nested": "34",
"#text": "35",
"@_attri": "ibu te"
}
}
Unpaired tags are tags that don't have a matching closing tag, e.g. <br> in HTML. You can parse unpaired tags by providing their list to the parser, validator, and builder.
Eg
const xmlDataStr = `
<rootNode>
<tag>value</tag>
<empty />
<unpaired>
<unpaired />
<unpaired>
</rootNode>`;
const options = {
unpairedTags: ["unpaired"]
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);
Output
{
"rootNode": {
"tag": "value",
"empty": "",
"unpaired": [ "", "", ""]
}
}
Note: An unpaired tag can't be used as a closing tag, e.g. </unpaired> is not allowed.
The updateTag property lets you change a tag name by returning a different name, skip a tag from the parsing result by returning false, or change its attributes.
const xmlDataStr = `
<rootNode>
<a at="val">value</a>
<b></b>
</rootNode>`;
const options = {
attributeNamePrefix: "",
ignoreAttributes: false,
updateTag(tagName, jPath, attrs){
attrs["At"] = "Home";
delete attrs["at"];
if(tagName === "a") return "A";
else if(tagName === "b") return false;
}
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);