Skip to content

Latest commit

 

History

History
1004 lines (884 loc) · 20.5 KB

2.XMLparseOptions.md

File metadata and controls

1004 lines (884 loc) · 20.5 KB

We can customize parsing by providing different options to the parser;

const {XMLParser} = require('fast-xml-parser');

const options = {
    ignoreAttributes : false
};

const parser = new XMLParser(options);
let jsonObj = parser.parse(xmlDataStr);

Let's understand each option in detail with necessary examples

allowBooleanAttributes

To allow attributes without value.

By default boolean attributes are ignored

    const xmlDataStr = `<root a="nice" checked><a>wow</a></root>`;

    const options = {
        ignoreAttributes: false,
        attributeNamePrefix : "@_"
    };
    const parser = new XMLParser(options);
    const output = parser.parse(xmlDataStr);

Output

{
    "root": {
        "@_a": "nice",
        "a": "wow"
    }
}

Once you allow boolean attributes, they are parsed and set to true.

const xmlDataStr = `<root a="nice" checked><a>wow</a></root>`;

const options = {
    ignoreAttributes: false,
    attributeNamePrefix : "@_",
    allowBooleanAttributes: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "root": {
        "@_a": "nice",
        "@_checked": true,
        "a": "wow"
    }
}

alwaysCreateTextNode

FXP creates text node always when preserveOrder: true. Otherwise, it creates a property with the tag name and assign the value directly.

You can force FXP to render a tag with textnode using this option.

Eg

const xmlDataStr = `
    <root a="nice" checked>
        <a>wow</a>
        <a>
          wow again
          <c> unlimited </c>
        </a>
        <b>wow phir se</b>
    </root>`;  

const options = {
    ignoreAttributes: false,
    // alwaysCreateTextNode: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output when alwaysCreateTextNode: false

{
    "root": {
        "a": [
            "wow",
            {
                "c": "unlimited",
                "#text": "wow again"
            }
        ],
        "b": "wow phir se",
        "@_a": "nice"
    }
}

Output when alwaysCreateTextNode: true

{
    "root": {
        "a": [
            {
                "#text": "wow"
            },
            {
                "c": {
                    "#text": "unlimited"
                },
                "#text": "wow again"
            }
        ],
        "b": {
            "#text": "wow phir se"
        },
        "@_a": "nice"
    }
}

attributesGroupName

To group all the attributes of a tag under given property name.

const xmlDataStr = `<root a="nice" b="very nice" ><a>wow</a></root>`;

const options = {
    ignoreAttributes: false,
    attributeNamePrefix : "@_",
    attributesGroupName : "@_"
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "root": {
        "@_": {
            "@_a": "nice",
            "@_b": "very nice"
        },
        "a": "wow"
    }
}

attributeNamePrefix

To recognize attributes in the JS object separately. You can prepend some string with each attribute name.

const xmlDataStr = `<root a="nice" ><a>wow</a></root>`;

const options = {
    ignoreAttributes: false,
    attributeNamePrefix : "@_"
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "root": {
        "@_a": "nice",
        "a": "wow"
    }
}

attributeValueProcessor

Similar to tagValueProcessor but applied to attributes.

Eg

:
const options = {
    ignoreAttributes: false,
    attributeValueProcessor: (name, val, jPath) => {
        :
    }
};
:

cdataPropName

If cdataPropName is not set to some property name, then CDATA values are merged to tag's text value.

Eg

const xmlDataStr = `
    <a>
        <name>name:<![CDATA[<some>Jack</some>]]><![CDATA[Jack]]></name>
    </a>`;  

const parser = new XMLParser();
const output = parser.parse(xmlDataStr);

Output

{
    "a": {
        "name": "name:<some>Jack</some>Jack"
    }
}

When cdataPropName is set to some property then text value and CDATA are parsed to different properties.

Example 2

const options = {
    cdataPropName:     "__cdata"
}

Output

{
    "a": {
        "name": {
            "__cdata": [
                "<some>Jack</some>",
                "Jack"
            ],
            "#text": "name:"
        }
    }
}

commentPropName

If commentPropName is set to some property name, then comments are parsed from XML.

Eg

const xmlDataStr = `
    <!--Students grades are uploaded by months-->
    <class_list>
       <student>
         <!--Student details-->
         <!--A second comment-->
         <name>Tanmay</name>
         <grade>A</grade>
       </student>
    </class_list>`;  

const options = {
    commentPropName: "#comment"
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "#comment": "Students grades are uploaded by months",
    "class_list": {
        "student": {
            "#comment": [
                "Student details",
                "A second comment"
            ],
            "name": "Tanmay",
            "grade": "A"
        }
    }
}

htmlEntities

FXP by default parse XML entities if processEntities: true. You can set htmlEntities to parse HTML entities. Check entities section for more information.

ignoreAttributes

By default, ignoreAttributes is set to true. This means that attributes are ignored by the parser. If you set any configuration related to attributes without setting ignoreAttributes: false, it will not have any effect.

Selective Attribute Ignoring

You can specify an array of strings, regular expressions, or a callback function to selectively ignore specific attributes during parsing or building.

Example Input XML

<tag
    ns:attr1="a1-value"
    ns:attr2="a2-value"
    ns2:attr3="a3-value"
    ns2:attr4="a4-value">
        value
</tag>

You can use the ignoreAttributes option in three different ways:

  1. Array of Strings: Ignore specific attributes by name.
  2. Array of Regular Expressions: Ignore attributes that match a pattern.
  3. Callback Function: Ignore attributes based on custom logic.

Example: Ignoring Attributes by Array of Strings

const options = {
    attributeNamePrefix: "$",
    ignoreAttributes: ['ns:attr1', 'ns:attr2'],
    parseAttributeValue: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlData);

Result:

{
    "tag": {
        "#text": "value",
        "$ns2:attr3": "a3-value",
        "$ns2:attr4": "a4-value"
    }
}

Example: Ignoring Attributes by Regular Expressions

const options = {
    attributeNamePrefix: "$",
    ignoreAttributes: [/^ns2:/],
    parseAttributeValue: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlData);

Result:

{
    "tag": {
        "#text": "value",
        "$ns:attr1": "a1-value",
        "$ns:attr2": "a2-value"
    }
}

Example: Ignoring Attributes via Callback Function

const options = {
    attributeNamePrefix: "$",
    ignoreAttributes: (aName, jPath) => aName.startsWith('ns:') || jPath === 'tag.tag2',
    parseAttributeValue: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlData);

Result:

{
    "tag": {
        "$ns2:attr3": "a3-value",
        "$ns2:attr4": "a4-value",
        "tag2": "value"
    }
}

ignoreDeclaration

As many users want to ignore XML declaration tag to be ignored from parsing output, they can use ignoreDeclaration: true.

Eg

const xmlDataStr = `<?xml version="1.0"?>
      <?elementnames <fred>, <bert>, <harry> ?>
      <h1></h1>`;

const options = {
    ignoreDeclaration: true,
    attributeNamePrefix : "@_"
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "?elementnames": "",
    "h1": ""
}

ignorePiTags

As many users want to ignore PI tags to be ignored from parsing output, they can use ignorePiTags: true.

Eg

const xmlDataStr = `<?xml version="1.0"?>
      <?elementnames <fred>, <bert>, <harry> ?>
      <h1></h1>`;

const options = {
    ignoreDeclaration: true,
    ignorePiTags: true,
    attributeNamePrefix : "@_"
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "h1": ""
}

isArray

Whether a single tag should be parsed as an array or an object, it can't be decided by FXP. Hence isArray method can help users to take the decision if a tag should be parsed as an array.

Eg

const xmlDataStr = `
    <root a="nice" checked>
        <a>wow</a>
        <a>
          wow again
          <c> unlimited </c>
        </a>
        <b>wow phir se</b>
    </root>`;

const alwaysArray = [
    "root.a.c",
    "root.b"
];
      

const options = {
    ignoreAttributes: false,
    //name: is either tagname, or attribute name
    //jPath: upto the tag name
    isArray: (name, jpath, isLeafNode, isAttribute) => { 
        if( alwaysArray.indexOf(jpath) !== -1) return true;
    }
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "root": {
        "@_a": "nice",
        "a": [
            "wow",
            {
                "#text": "wow again",
                "c": [
                    "unlimited"
                ]
            }
        ],
        "b": [
            "wow phir se"
        ]
    }
}

numberParseOptions

FXP uses strnum library to parse string into numbers. This property allows you to set configuration for strnum package.

Eg

const xmlDataStr = `
    <root>
        <a>-0x2f</a>
        <a>006</a>
        <a>6.00</a>
        <a>-01.0E2</a>
        <a>+1212121212</a>
    </root>`;  

const options = {
    numberParseOptions: {
        leadingZeros: true,
        hex: true,
        skipLike: /\+[0-9]{10}/,
        // eNotation: false
    }
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "root": {
        "a": [ -47, 6, 6, -100, "+1212121212" ],
    }
}

parseAttributeValue

parseAttributeValue: true option can be set to parse attributes value. This option uses strnum package to parse numeric values. For more controlled parsing check numberParseOptions option.

Eg

const xmlDataStr = `
    <root a="nice" checked enabled="true" int="32" int="34">
        <a>wow</a>
    </root>`;  

const options = {
    ignoreAttributes: false,
    // parseAttributeValue: true,
    allowBooleanAttributes: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "root": {
        "a": "wow",
        "@_a": "nice",
        "@_checked": true,
        "@_enabled": "true",
        "@_int": "34"
    }
}

Boolean attributes are always parsed to true. XML Parser doesn't give error if validation is kept off so it'll overwrite the value of repeated attributes.

When parseAttributeValue: true

{
    "root": {
        "a": "wow",
        "@_a": "nice",
        "@_checked": true,
        "@_enabled": true,
        "@_int": 34
    }
}

When validation options are set or true.

//..
const output = parser.parse(xmlDataStr, {
    allowBooleanAttributes: true
});
//..

Output

Error: Attribute 'int' is repeated.:1:48

parseTagValue

parseTagValue: true option can be set to parse tags value. This option uses strnum package to parse numeric values. For more controlled parsing check numberParseOptions option.

Eg

const xmlDataStr = `
    <root>
        35<nested>34</nested>
    </root>`;  

const options = {
    parseTagValue: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "root": 35
}

Example 2 Input

<root>
    35<nested>34</nested>
</root>

Output

{
    "root": {
        "nested": 34,
        "#text": 35
    }
}

Example 3 Input

<root>
    35<nested>34</nested>46
</root>

Output

{
    "root": {
        "nested": 34,
        "#text": "3546"
    }
}

preserveOrder

When preserveOrder: true , you get some long ugly result. That's because this option is used to keep the order of tags in result js Object. It also helps XML builder to build the similar kind of XML from the js object without losing information.

const XMLdata = `
    <!--Students grades are uploaded by months-->
    <class_list standard="3">
       <student>
         <!--Student details-->
         <!--A second comment-->
         <name>Tanmay</name>
         <grade>A</grade>
       </student>
    </class_list>`;

    const options = {
        commentPropName: "#comment",
        preserveOrder: true
    };
    const parser = new XMLParser(options);
    let result = parser.parse(XMLdata);
[
    {
        "#comment": [
            { "#text": "Students grades are uploaded by months" }
        ]
    },
    {
        "class_list": [
            {
                "student": [
                    {
                        "#comment": [
                            { "#text": "Student details" }
                        ]
                    },
                    {
                        "#comment": [
                            { "#text": "A second comment" }
                        ]
                    },
                    {
                        "name": [
                            { "#text": "Tanmay" }
                        ]
                    },
                    {
                        "grade": [
                            { "#text": "A" }
                        ]
                    }
                ]
            }
        ],
        ":@": {
            "standard" : "3"
        }
    }
]

processEntities

Set it to true (default) to process default and DOCTYPE entities. Check Entities section for more detail. If you don't have entities in your XML document then it is recommended to disable it processEntities: false for better performance.

removeNSPrefix

Remove namespace string from tag and attribute names.

Default is removeNSPrefix: false

const xmlDataStr = `<root some:a="nice" ><any:a>wow</any:a></root>`;

const options = {
    ignoreAttributes: false,
    attributeNamePrefix : "@_"
    //removeNSPrefix: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "root": {
        "@_some:a": "nice",
        "any:a": "wow"
    }
}

Setting removeNSPrefix: true

const xmlDataStr = `<root some:a="nice" ><any:a>wow</any:a></root>`;

const options = {
    ignoreAttributes: false,
    attributeNamePrefix : "@_",
    removeNSPrefix: true
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "root": {
        "@_a": "nice",
        "a": "wow"
    }
}

stopNodes

At particular point, if you don't want to parse a tag and it's nested tags then you can set their path in stopNodes.

Eg

const xmlDataStr = `
    <root a="nice" checked>
        <a>wow</a>
        <a>
          wow again
          <c> unlimited </c>
        </a>
        <b>wow phir se</b>
    </root>`;  

const options = {
    stopNodes: ["root.a"]
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "root": {
        "a": [
            "wow",
            "\n            wow again\n            <c> unlimited </c>\n          "
        ],
        "b": "wow phir se",
    }
}

You can also mention a tag which should not be processed irrespective of their path. Eg. <pre> or <script> in an HTML document.

const options = {
    stopNodes: ["*.pre", "*.script"]
};

Note that a stop node should not have same closing node in contents. Eg

<stop>
invalid </stop>
</stop>

nested stop notes are also not allowed

<stop>
    <stop> invalid </stop>
</stop>

tagValueProcessor

With tagValueProcessor you can control how and which tag value should be parsed.

  1. If tagValueProcessor returns undefined or null then original value would be set without parsing.
  2. If it returns different value or value with different data type then new value would be set without parsing.
  3. Otherwise original value would be set after parsing (if parseTagValue: true)
  4. if tag value is empty then tagValueProcessor will not be called.

Eg

const xmlDataStr = `
    <root a="nice" checked>
        <a>wow</a>
        <a>
          wow again
          <c> unlimited </c>
        </a>
        <b>wow phir se</b>
    </root>`;  

const options = {
    ignoreAttributes: false,
    tagValueProcessor: (tagName, tagValue, jPath, hasAttributes, isLeafNode) => {
        if(isLeafNode) return tagValue;
        return "";
    }
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "root": {
        "a": [
            {
                "#text": "wow",
                "@_a": "2"
            },
            {
                "c": {
                    "#text": "unlimited",
                    "@_a": "2"
                },
                "@_a": "2"
            }
        ],
        "b": {
            "#text": "wow phir se",
            "@_a": "2"
        },
        "@_a": "nice"
    }
}

textNodeName

Text value of a tag is parsed to #text property by default. You can always change this.

Eg

const xmlDataStr = `
    <a>
            text<b>alpha</b>
    </a>`;  

const options = {
    ignoreAttributes: false,
    textNodeName: "$text"
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "a": {
        "b": "alpha",
        "$text": "text"
    }
}

transformTagName

transformTagName property function let you change the name of a tag as per your need. Eg 'red-code' can be transformed to 'redCode' or 'redCode' can be transformed to 'redcode'.

transformTagName: (tagName) => tagName.toLowerCase()

transformAttributeName

transformAttributeName property function let you modify the name of attribute

transformAttributeName: (attributeName) => attributeName.toLowerCase()

trimValues

Remove surrounding whitespace from tag or attribute value.

Eg

const xmlDataStr = `
    <root attri="  ibu  te   ">
        35       <nested>   34</nested>
    </root>`;  

const options = {
    ignoreAttributes: false,
    parseTagValue: true, //default
    trimValues: false
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "root": {
        "nested": 34,
        "#text": "\n      35       \n    ",
        "@_attri": "  ibu  te   "
    }
}

If the tag value consists of whitespace only and trimValues: false then value will not be parsed even if parseTagValue:true. Similarly, if trimValues: true and parseTagValue:false then surrounding whitespace will be removed.

const options = {
    ignoreAttributes: false,
    parseTagValue: false,
    trimValues: true //default
};

Output

{
    "root": {
        "nested": "34",
        "#text": "35",
        "@_attri": "ibu  te"
    }
}

unpairedTags

Unpaired Tags are the tags which don't have matching closing tag. Eg <br> in HTML. You can parse unpaired tags by providing their list to the parser, validator and builder.

Eg

const xmlDataStr = `
    <rootNode>
        <tag>value</tag>
        <empty />
        <unpaired>
        <unpaired />
        <unpaired>
    </rootNode>`;  

const options = {
    unpairedTags: ["unpaired"]
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

Output

{
    "rootNode": {
        "tag": "value",
        "empty": "",
        "unpaired": [ "", "", ""]
    }
}

Note: Unpaired tag can't be used as closing tag e.g. </unpaired> is not allowed.

updateTag

This property allow you to change tag name when you send different name, skip a tag from the parsing result when you return false, or to change attributes.

const xmlDataStr = `
    <rootNode>
        <a at="val">value</a>
        <b></b>
    </rootNode>`;  

const options = {
    attributeNamePrefix: "",
    ignoreAttributes:    false,
    updateTag(tagName, jPath, attrs){
        attrs["At"] = "Home";
        delete attrs["at"];
        
        if(tagName === "a") return "A";
        else if(tagName === "b") return false;
    }
};
const parser = new XMLParser(options);
const output = parser.parse(xmlDataStr);

> Next: XmlBuilder