Forum Discussion

Qrew Member
5 months ago

Could not parse XML input. XML Parsing Error. unclosed CDATA section at line 9605 column 81

I had a script that was working fine on one Remote Desktop: it extracts XML content into a CSV and uploads it to one of our Quickbase sites.
However, I was given another remote desktop hosted by a client in Canada, and when I imported exactly the same SDK and script there, it fails.

My data only has 13 columns. I even checked the cell the error points to and found nothing unusual. I'm confused how the same script runs perfectly on one machine and throws this error on the other. Could you please let me know what the cause might be and suggest a workaround?

I'm getting the error below.

<?xml version="1.0" ?>
<qdbapi>
    <errtext>Could not parse XML input</errtext>
    <errdetail>XML Parsing Error. unclosed CDATA section at line 9604 column 19 (which is byte 2043742)</errdetail>
</qdbapi>

Please help me fix the error. 

Below is the script for your reference.

import os
import shutil
import zipfile
import xml.etree.ElementTree as ET
import csv
from datetime import datetime
import requests

def conversion(xml_file, csv_file):
    tree = ET.parse(xml_file)
    documents_root = tree.getroot()
    data = []
    header_mapping = {
        "DocumentNo": "Document Number",
        "Status": "Status",
        "DocTypeCode": "Document Type",
        "SheetNumber": "Sheet",
        "RevSub": "Revision/ Submittal",
        "ClientCode": "Client Code",
        "DocumentTitle": "Title",
        "CPackage": "Commit. Package",
        "PoContractNo": "PO/Contract",
        "PmpCreatedDate": "Received",
        "AltRefNo": "Alternate Reference",
        "Discipline": "Discipline",
        "ClientCodeRev": "Client Revision"
    }
    header = list(header_mapping.values())
    uni_document_numbers = set()
    for document in documents_root.findall("Document"):
        doc_type_code = document.get("DocTypeCode", "")
        document_number = document.get("DocumentNo", "")
        if doc_type_code in ["T", "V"] and document_number not in uni_document_numbers:
            uni_document_numbers.add(document_number)
            row = {header_mapping[field]: document.get(field, "") for field in header_mapping}
            data.append(row)
    with open(csv_file, 'w', newline='', encoding='utf-8-sig') as csv_out:
        csv_writer = csv.DictWriter(csv_out, fieldnames=header)
        csv_writer.writeheader()
        csv_writer.writerows(data)

g_drive_folder = r"B:"
desktop_folder = os.path.join(os.path.expanduser("~"), "Desktop")
new_folder_name = "ExtractedFiles"
new_folder_path = os.path.join(desktop_folder, new_folder_name)
if not os.path.exists(new_folder_path):
    os.makedirs(new_folder_path)

# Pick the most recently modified file from the shared drive
files = sorted(os.listdir(g_drive_folder),
               key=lambda x: os.path.getmtime(os.path.join(g_drive_folder, x)),
               reverse=True)
latest_file = files[0]
shutil.copy(os.path.join(g_drive_folder, latest_file), new_folder_path)
zip_file_path = os.path.join(new_folder_path, latest_file)
with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
    zip_ref.extractall(new_folder_path)

xml_file_name = [f for f in os.listdir(new_folder_path) if f.endswith(".xml")][0]
xml_file_path = os.path.join(new_folder_path, xml_file_name)
csv_file_name = os.path.splitext(xml_file_name)[0] + ".csv"
csv_file_path = os.path.join(new_folder_path, csv_file_name)
conversion(xml_file_path, csv_file_path)

with open(csv_file_path, 'r', newline='', encoding='utf-8-sig') as csv_in:
    csv_content = csv_in.read()

url1 = ""
headers1 = {
    'Accept': 'application/json',
    'Cookie': 'commonVars=~qlb=env~; luid=65522889.c8bb; ptest=1703152789037'
}
response1 = requests.request("GET", url1, headers=headers1)
xml_response = ET.fromstring(response1.text)
ticket = xml_response.find('.//ticket').text

url2 = ""
payload = f"<qdbapi>\r\n   <udata>mydata</udata>\r\n   <ticket>{ticket}</ticket>\r\n   <apptoken>bdu5ekuchufyzvmfvce6d4834i7</apptoken>\r\n   <records_csv>\r\n      <![CDATA[{csv_content.encode('utf-8').decode('latin-1')}]]>\r\n   </records_csv>\r\n   <clist></clist>\r\n   <skipfirst>1</skipfirst>\r\n</qdbapi>\r\n"
headers2 = {
    'Content-Type': 'application/xml',
    'Cookie': 'commonVars=~qlb=env~; luid=65522889.c8bb; ptest=1703152789037'
}
response2 = requests.request("POST", url2, headers=headers2, data=payload)
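One detail that may matter here (my observation, not a confirmed diagnosis): the csv_content.encode('utf-8').decode('latin-1') step in the payload turns every non-ASCII character into two or more characters. If the client's export contains accented characters (plausible for Canadian data) while the original machine's data was pure ASCII, the same script would produce different payload bytes on the two machines. A quick sketch of the effect:

```python
# Round-tripping UTF-8 bytes through a Latin-1 decode changes every
# non-ASCII character into multiple characters ("mojibake"), which
# shifts every byte offset after it in the payload.
sample = "Montréal"
mangled = sample.encode("utf-8").decode("latin-1")
print(mangled)  # Montréal
```

ASCII-only input passes through unchanged, which would explain why the script works on one machine and not the other.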

Basanth P R

1 Reply

  • I'm not an expert in Python, but your error specifically indicates that the CDATA section is not being closed out correctly. Have you printed the content from csv_content.encode('utf-8').decode('latin-1') to ensure that the result is a string? ImportFromCSV requires the data to be in string CSV format, so first off I would ensure that the result of that conversion is being handled and treated properly as a string. 

    Chayce Duncan
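
Building on the point above: a minimal sketch (the helper name cdata_safe is mine, not part of the Quickbase SDK) that both checks the value is a str and escapes any literal "]]>" in the data. A "]]>" sequence inside the CSV would terminate the CDATA section early and leave the rest of it unclosed, which is exactly the parser error reported:

```python
def cdata_safe(text):
    """Prepare a string for embedding inside a CDATA section.

    A literal "]]>" in the data would end the CDATA section early and
    leave the remainder unclosed. The standard workaround is to split
    the sequence across two CDATA sections.
    """
    if not isinstance(text, str):
        raise TypeError(f"expected str, got {type(text).__name__}")
    return text.replace("]]>", "]]]]><![CDATA[>")

# Hypothetical usage inside the payload:
fragment = "<records_csv><![CDATA[" + cdata_safe('a,b\n1,"x]]>y"\n') + "]]></records_csv>"
```

Printing repr(csv_content) and searching it for "]]>" would confirm or rule this out quickly.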