XML Processor

The XML Processor reads, queries, validates, and cleans XML data. XML (Extensible Markup Language) is a format for structured data such as product details, customer information, or financial records.

Use this processor for tasks like:

  • Find specific information within XML files so that you can extract exactly what you need.

  • Check whether the XML data meets specific rules (for example, that all required fields are present). This helps maintain accuracy.

  • Clean up errors or unnecessary information so that only the important information remains in the output XML.

  • Organize data stored in complex structures, such as converting lists of items into a more structured format that’s easier to work with.

In short, the XML Processor helps you retrieve the right information from XML data in the correct format.


Configure the XML Processor parameters as explained below.

Evaluation

This field determines how the XML is processed and validated. Choose one of the following options:

XPATH

Specifies a path within the XML document using XPath to extract specific data or nodes.

Example: /root/person/name (this XPath extracts the name of each person element found under /root).
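
As a rough illustration of what an XPath evaluation like this returns, the sketch below uses Python's standard xml.etree.ElementTree module, which supports only a subset of XPath. The sample document and the helper name extract_names are assumptions made for illustration; the processor's own XPath engine is not shown here. Because ElementTree evaluates paths relative to the parsed root element, /root/person/name is written as ./person/name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the product.
SAMPLE_XML = """
<root>
  <person><name>Asha</name><age>31</age></person>
  <person><name>Ben</name><age>45</age></person>
</root>
"""

def extract_names(xml_text):
    """Return the text of every node matched by /root/person/name."""
    root = ET.fromstring(xml_text)
    # The path is relative to the <root> element returned by fromstring.
    return [node.text for node in root.findall("./person/name")]

print(extract_names(SAMPLE_XML))
```

When the path matches several nodes, as it does here, every match is returned rather than only the first.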

XSD Validation

Validates the XML against an XSD (XML Schema Definition) file uploaded by the user. Optionally, an XPath can be provided to specify where in the XML structure the validation should focus.

Example: If the XSD specifies that an element must be an integer, the validation checks whether the value of that element in the XML conforms to this rule.

XSD Validation And Drop Invalid XML Tags

Performs XSD validation and removes any XML tags that do not comply with the XSD. An XPath can also be provided to specify the portion of the XML where invalid tags should be removed.

Example: If the XSD specifies that an element must contain three particular child elements, and the XML has an instance of that element with one of the required children missing, this option removes the incomplete element.
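
The removal step can be pictured with a small sketch. Python's standard library has no XSD validator, so the rule "this element must contain these children" is hard-coded below as a stand-in for a real schema; it is not how the processor itself validates. The element names (orders, order, id, item, price) and the function name drop_invalid are hypothetical.

```python
import xml.etree.ElementTree as ET

# Stand-in for an XSD rule: every <order> must contain these children.
# Element names are hypothetical; a real setup would take them from the XSD.
REQUIRED_CHILDREN = {"order": ["id", "item", "price"]}

def drop_invalid(xml_text, rules=REQUIRED_CHILDREN):
    """Remove elements that lack any of their required child elements."""
    root = ET.fromstring(xml_text)
    # Snapshot the tree first so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            required = rules.get(child.tag)
            if required and any(child.find(name) is None for name in required):
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")

SAMPLE_XML = (
    "<orders>"
    "<order><id>1</id><item>pen</item><price>2</price></order>"
    "<order><id>2</id><item>ink</item></order>"
    "</orders>"
)

print(drop_invalid(SAMPLE_XML))
```

In this sample the second order has no price child, so only the first order remains in the output.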


XML Data Column

Select the input column that contains the XML data to be processed.

If your input data contains XML in a column named xml_data, select xml_data in this field.


Include Input XML Column

This field determines when to include the original XML data in the output.

  • Always:

    Includes the original XML data column in every output row, regardless of whether the XML is valid or invalid.

  • Only With Invalid XML:

    Includes the original XML data column in the output only when the XML is found to be invalid after processing.

    If the XML fails XSD validation or contains errors that prevent processing, the original XML data column will be included in the output.

  • Only With Valid XML:

    Includes the original XML data column in the output only when the XML is successfully validated and processed without errors.

    If the XML passes XSD validation and is processed successfully, the original XML data column will be included in the output.

  • Never:

    Excludes the original XML data column from the output entirely. Regardless of validation or processing outcomes, the original XML data column will not appear in the output.
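
The four options above amount to a small per-row decision. The sketch below is one way to express it, assuming each row carries a flag saying whether its XML was valid; the function names, the is_valid flag, and the xml_data column name are illustrative and not part of the product.

```python
def include_input_xml(mode, is_valid):
    """Decide whether the original XML column is kept for one row."""
    if mode == "Always":
        return True
    if mode == "Never":
        return False
    if mode == "Only With Invalid XML":
        return not is_valid
    if mode == "Only With Valid XML":
        return is_valid
    raise ValueError(f"Unknown mode: {mode}")

def build_output_row(parsed, original_xml, mode, is_valid, xml_column="xml_data"):
    """Return the parsed fields, plus the original XML when the mode allows it."""
    row = dict(parsed)  # copy so the caller's dict is not modified
    if include_input_xml(mode, is_valid):
        row[xml_column] = original_xml
    return row

print(build_output_row({"name": "Asha"}, "<root/>", "Only With Invalid XML", False))
```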


Conflicted Array Columns

Enter the complete path of conflicted columns that need to be changed from string to array type.

Example: parentNode.childNode.siblingNode
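
To show what changing a column from string to array type can mean, the sketch below wraps the value at a dotted path in a list. It assumes, as an illustration only, that a parsed record is a nested dictionary and that the conflict arises when the same element holds a single value in one record and several values in another. The function name coerce_to_array and the record layout are hypothetical.

```python
def coerce_to_array(record, dotted_path):
    """Wrap the value at dotted_path in a list unless it already is one.

    The record is modified in place and also returned.
    """
    *parents, leaf = dotted_path.split(".")
    node = record
    for key in parents:
        node = node.get(key)
        if not isinstance(node, dict):
            return record  # path is absent in this record; leave it unchanged
    if leaf in node and not isinstance(node[leaf], list):
        node[leaf] = [node[leaf]]
    return record

PATH = "parentNode.childNode.siblingNode"
single = {"parentNode": {"childNode": {"siblingNode": "a"}}}
multiple = {"parentNode": {"childNode": {"siblingNode": ["a", "b"]}}}

print(coerce_to_array(single, PATH))
print(coerce_to_array(multiple, PATH))
```

After the call, both records hold a list at the same path, so the column has one consistent type.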


Keep Columns String

Enable this option to keep the data type of all parsed columns as string.


You can add further configuration using the ADD CONFIGURATION button.
