XML Processor
The XML Processor extracts, validates, and cleans XML data. XML (Extensible Markup Language) is commonly used for structured data such as product details, customer information, or financial records.
Use this processor for tasks like:
Find specific information within XML files, making it easier to extract exactly what you need.
Check if the XML data meets specific rules (like ensuring all required fields are present). This helps maintain accuracy.
Clean up errors or unnecessary information so that only the important information remains in the output XML.
Organize data stored in complex structures, such as converting lists of items into a more structured format that’s easier to work with.
Together, these capabilities let you retrieve the information you need from XML data in the format you expect.
Configure the XML processor parameters as explained below.
Evaluation
This field determines how the XML is processed and validated. Choose one of the following options:
XPATH
Specifies a path within the XML document using XPath to extract specific data or nodes.
Example: /root/person/name (This XPath selects the name element of every person element directly under the top-level root element.)
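As a rough illustration of what an XPath evaluation does, the sketch below runs the example path against a small made-up document using Python's standard xml.etree.ElementTree module. The sample XML and the extract_names helper are assumptions for illustration and are not part of Gathr. ElementTree supports only a subset of XPath and evaluates paths relative to the root element, so /root/person/name is written as ./person/name here.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; the element names follow the example path.
xml_data = (
    "<root>"
    "<person><name>Ada</name><age>36</age></person>"
    "<person><name>Grace</name><age>45</age></person>"
    "</root>"
)


def extract_names(xml_text):
    """Return the text of every name element matched by the path."""
    root = ET.fromstring(xml_text)  # 'root' is the top-level <root> element
    # Relative form of /root/person/name, as required by ElementTree.
    return [node.text for node in root.findall("./person/name")]


names = extract_names(xml_data)
print(names)  # ['Ada', 'Grace']
```

A path that matches nothing yields an empty list rather than an error.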
XSD Validation
Validates the XML against an XSD (XML Schema Definition) file uploaded by the user. Optionally, an XPath can be provided to specify where in the XML structure the validation should focus.
Example: If the XSD specifies that an element
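The idea of validating a document, optionally limited to the part selected by an XPath, can be sketched as follows. Python's standard library has no XSD validator, so this sketch swaps in a much simpler required-child check: the REQUIRED_CHILDREN table is a hypothetical stand-in for one kind of rule an XSD can express, and the validate function and its scope parameter are likewise assumptions made for illustration, not how Gathr implements the option.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for an XSD: element name -> child tags it must contain.
REQUIRED_CHILDREN = {"person": ["name", "age"]}


def validate(xml_text, scope="."):
    """Return a list of error messages; an empty list means the check passed.

    'scope' is an ElementTree path (relative to the root element) that limits
    the check to the selected elements and their descendants.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    errors = []
    for start in root.findall(scope):
        for element in start.iter():  # the selected element and everything below it
            for child_tag in REQUIRED_CHILDREN.get(element.tag, []):
                if element.find(child_tag) is None:
                    errors.append(f"<{element.tag}> is missing <{child_tag}>")
    return errors


valid_xml = "<root><person><name>Ada</name><age>36</age></person></root>"
invalid_xml = "<root><person><name>Ada</name></person></root>"

print(validate(valid_xml))    # []
print(validate(invalid_xml))  # ['<person> is missing <age>']
```

A real XSD can also constrain data types, ordering, and cardinality, which this simplified check does not attempt.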
XSD Validation And Drop Invalid XML Tags
Performs XSD validation and removes any XML tags that do not comply with the XSD. An XPath can also be provided to specify the portion of the XML where invalid tags should be removed.
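Removing non-compliant tags can be pictured with the sketch below. It again uses a simplified stand-in instead of a real XSD: ALLOWED_CHILDREN is a hypothetical allow-list of child tags per element, and drop_invalid_tags and its scope parameter are illustrative names, not Gathr's implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for an XSD: element name -> child tags it may contain.
ALLOWED_CHILDREN = {"root": {"person"}, "person": {"name", "age"}}


def drop_invalid_tags(xml_text, scope="."):
    """Remove child elements that the allow-list does not permit.

    'scope' is an ElementTree path (relative to the root element) selecting
    the part of the document to clean.
    """
    root = ET.fromstring(xml_text)
    for start in root.findall(scope):
        # Snapshot the elements first so removals do not disturb iteration.
        for element in list(start.iter()):
            allowed = ALLOWED_CHILDREN.get(element.tag)
            if allowed is None:
                continue  # no rule for this element, leave its children alone
            for child in list(element):
                if child.tag not in allowed:
                    element.remove(child)
    return ET.tostring(root, encoding="unicode")


sample = (
    "<root>"
    "<person><name>Ada</name><nickname>A</nickname></person>"
    "<note>x</note>"
    "</root>"
)
print(drop_invalid_tags(sample))
# <root><person><name>Ada</name></person></root>
```

Passing a narrower scope such as "./person" cleans only the person elements and leaves the rest of the document untouched.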
Example: If the XSD specifies that
must contain
XML Data Column
Select the input column that contains the XML data to be processed.
If your input data contains XML in a column named xml_data, select xml_data in this field.
Include Input XML Column
This field determines when to include the original XML data in the output.
Always:
Includes the original XML data column in every output row.
Regardless of whether the XML is valid or invalid, the original XML data column will always be included in the output.
Only With Invalid XML:
Includes the original XML data column in the output only when the XML is found to be invalid after processing.
If the XML fails XSD validation or contains errors that prevent processing, the original XML data column will be included in the output.
Only With Valid XML:
Includes the original XML data column in the output only when the XML is successfully validated and processed without errors.
If the XML passes XSD validation and is processed successfully, the original XML data column will be included in the output.
Never:
Excludes the original XML data column from the output entirely. Regardless of validation or processing outcomes, the column (for example, xml_data) will not appear in the output.
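The four options above amount to a small decision rule, sketched below. The function names and the dictionary-based row are assumptions made for illustration; only the option labels and the xml_data column name come from this page.

```python
def include_input_xml(mode, is_valid):
    """Decide whether the original XML column belongs in the output row."""
    if mode == "Always":
        return True
    if mode == "Only With Invalid XML":
        return not is_valid
    if mode == "Only With Valid XML":
        return is_valid
    if mode == "Never":
        return False
    raise ValueError(f"unknown mode: {mode}")


def build_output_row(parsed_columns, xml_text, is_valid, mode):
    """Return the parsed columns, plus xml_data when the mode calls for it."""
    row = dict(parsed_columns)  # copy so the caller's dict is not modified
    if include_input_xml(mode, is_valid):
        row["xml_data"] = xml_text
    return row


row = build_output_row({"name": "Ada"}, "<person/>", True, "Only With Invalid XML")
print(row)  # {'name': 'Ada'}
```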
Conflicted Array Columns
Enter the complete path of each conflicted column whose type should be changed from string to array.
Example: parentNode.childNode.siblingNode
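One plausible reading of a conflicted column is an element that occurs once in some records (and so parses as a string) and several times in others (and so parses as an array). Under that assumption, the sketch below shows what forcing the array type at a dotted path could look like on records already parsed into nested dictionaries. The force_array helper and the sample records are hypothetical and only illustrate the idea.

```python
def force_array(record, dotted_path):
    """Wrap the value at dotted_path in a list unless it already is one.

    The record is modified in place and also returned. A path that does not
    exist in the record leaves it unchanged.
    """
    keys = dotted_path.split(".")
    node = record
    for key in keys[:-1]:
        node = node.get(key)
        if not isinstance(node, dict):
            return record  # path not present in this record
    last = keys[-1]
    if last in node and not isinstance(node[last], list):
        node[last] = [node[last]]
    return record


path = "parentNode.childNode.siblingNode"
single = {"parentNode": {"childNode": {"siblingNode": "a"}}}
repeated = {"parentNode": {"childNode": {"siblingNode": ["a", "b"]}}}

print(force_array(single, path))    # {'parentNode': {'childNode': {'siblingNode': ['a']}}}
print(force_array(repeated, path))  # {'parentNode': {'childNode': {'siblingNode': ['a', 'b']}}}
```

After the conversion both records carry the same type at that path, which is what removes the conflict.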
Keep Columns String
Enable this option to keep the data type of all parsed columns as string.
You can add further configuration using the ADD CONFIGURATION button.