Discussions

 View Only
  • 1.  Transitioning to UTF-8

    Posted 03-01-2023 11:50

    Hi everyone,

    Can any of you share with me how you navigated migrating your realm to UTF-8?

    We've only had QB for 2 years, but apparently, UTF-8 wasn't enabled when we started.  Over the past year, I'm watching errors pile up because a lot of our data is in written form (vs. hard numbers).

    For example, we'll have survey data and even the questions being asked contain apostrophe's and parentheses.  QB isn't handling them well, and all of these characters are generating errors.

    To correct this, QB Tech Support had me install and run an encoding scanner to identify the errors in our realm, and it looks to be numbering in the thousands; nearly every record and a majority of our fields are flagged as problematic.

    Some of the issues I see are that the encoding scanner doesn't tell me "what's" wrong with our data, just that this record or that field is an issue.

    QB Tech Support said that to turn UTF-8 on will affect the entire realm AND I need to have all the errors fixed before they do this.

    Our staff is inputting data daily; so the number of errors grows daily.

    We cannot afford to spend thousands on an external company to help us scrub the special characters.  I was debating telling our staff to pause for a long weekend, so I can download the data to excel and do a find and replace function and then re-upload the data...but I'm just worried about breaking or losing data that we need.

    Can any of you recommend a better way?  What did you do to manage this?

    Thank you!!!

    Jen



    ------------------------------
    Jennifer Juhasz
    FamilySmart
    BC, Canada
    ------------------------------


  • 2.  RE: Transitioning to UTF-8

    Posted 03-02-2023 09:33

    Hi Jennifer,

    We are currently on a similar journey. When UTF-8 is enabled and you are using another 3rd party cloud service, like Workato, any non-acsii characters sent to the third party service will cause errors, this is the reason to clear the non-ascii characters. Here is what we are doing to repair the data. 

    I made a rich text formula field and entered the formula in the link below. The link below describes the formula and also the reference ascii Website where this information can be found. I have since removed several of the ascii characters in that formula that do not throw the error but this will get you started.

    What the formula does is display all the characters that will cause an error. You then can build a report and use one of three tools to clean up the data namely: Grid Edit, Find/Replace or (Export-correct-Import). You can test by turning UTF-8 on one App at a time and looking for the black diamond with the question mark. This process has to be done for each table.

    Of course if you're not using third party cloud services UTF-8 can be enabled anytime and then the data can be cleaned up when it is discovered but you won't know is hiding behind the black diamond question mark thingy. 

    Finding non- windows-1252 Characters in Text fields using a formula

    https://community.quickbase.com/discussion/finding-non-windows-1252-characters-in-text-fields-using-a-formula#bm7085aeaa-a2e5-46e1-98f5-0185cb476521

    Good luck.



    ------------------------------
    Jim Harrison
    transparency = knowledge + understanding : The Scrum Dudes
    ------------------------------



  • 3.  RE: Transitioning to UTF-8

    Posted 03-06-2023 21:38

    Hi Jim,

    Thank you so much for replying!  I think your reply was buried in my inbox with all of the other QB discussion threads, and I apologize for the delay in responding to you.

    I wasn't aware I can implement UTF-8 on an app-by-app basis; QB Tech Support told me they had to do it at the realm level - which made the project seem far more daunting (because then everything has to be perfect with no new imperfect data coming in).

    While I'm not using 3rd party apps that are directly importing to QB; we do use forms on our website to collect data, that I am building into Sync Files in Google Drive and syncing that data weekly.  

    Once I have enabled UTF-8, when importing data that contains non-ascii characters such as quotation marks, apostrophe's, or parentheses (these seem to be the biggest challenges in our data), will they continue to cause issues (in your experience and opinion)?

    Thank you again! This gives me the ability to develop a clear plan of action on how to tackle what otherwise felt like an overwhelming and impossible task.

    Take care,

    Jen



    ------------------------------
    Jennifer Juhasz
    FamilySmart
    BC, Canada
    ------------------------------



  • 4.  RE: Transitioning to UTF-8

    Posted 03-08-2023 09:23

    Hi Jennifer,

    Enable UTF-8 per App by going to the App properties=>Advanced Settings and check the UTF-8 checkbox.

    Once UTF-8 is enabled all "newly entered" non-ascii characters display normally. The problem is dealing with the "old" non-ascii characters currently in the text fields of your Apps. All those characters need to be replaced or removed. 



    ------------------------------
    Jim Harrison
    transparency = knowledge + understanding : The Scrum Dudes
    ------------------------------



  • 5.  RE: Transitioning to UTF-8

    Posted 04-14-2023 15:18

    In case someone finds this thread, here is the solution I am using so far.

    First search the App Exchange for: "Mixed Encoding Scanner". Which will take a App DBID and scan the entire App for non-ascii characters.

    Once I know where to go, I can decide to either update each record one at a time or use something like the "Special Character Finder" text-formula field and "Find/replace in these records" all the different non-ascii characters identified by the "Special Character Finder" in the field. When the Mixed Encoding Scanner finds a symbol I haven't included in the formula I add another line. 

    I use the manual update when there are only a few records that need to be updated. There is a button that opens all the fields or one field at a time in the Mixed Encoding Scanner. Then Grid Edit and click four times just to activate edit to change the record. One thing I have noticed is when I save, Quickbase asks me to review every change like someone else is editing the record, which I know is not possible. So using Grid Edit takes a lot of effort and can be very frustrating.

    The second method of using Find/replace in these records is a lot faster. It's annoying that the pop up window is at the opposite corner of the screen so I have to do a lot of mouse traveling. I can edit a lot of records with minimal effort. 

    Improvements I would make to the second option are to locate the pop-up window closer to the more drop down menu, auto refresh the report page on close of the pop-up, allow the User to go back after the Find/replace is complete to perform the next operation instead of having to start at the more drop down menu.

    I have tried QuNect Razor and that identifies one field at a time one table at a time, I cannot recommend using this tool as it is since it takes considerable time to work through one record if there are multiple fields. 



    ------------------------------
    Jim Harrison
    transparency = knowledge + understanding : The Scrum Dudes
    ------------------------------



  • 6.  RE: Transitioning to UTF-8

    Posted 29 days ago

    Just thinking out loud, but if I were to approach this, I would probably use the API to loop through all my apps, and within each app, loop through all my tables, and just convert the string fields of each table to utf-8 on the fly before inserting back into the table. If you have someone on your team that's familiar with python or could hire someone for cheap, they could probably whip something up in an hour or two.  Depends on how many tables you have and many records are in each, but the runtime for this could potentially be pretty quick.




  • 7.  RE: Transitioning to UTF-8

    Posted 28 days ago
    Edited by Jim Harrison 27 days ago

    Hi Tyler,

    I think this is a great idea!
    Here are all the Quickbase objects that may contain non-ascii 1252 characters we have located so far. Some of these cannot be modified via API.

    Field values
    Field names
    Text-Multiple Choice (values even after the type has changed)
    Report names
    Report Descriptions
    Personal Report names
    Table names
    Notification customized text

    I wonder how the proposed script handles non-breaking spaces? 
    We have found the only way to locate and clear non-breaking spaces is to export to excel and then use ctrl+0160 in Find and Replace.

    There is a lot of hand work but I would guess the process could potentially take care of a bulk of the problems. 

    I will add that in some cases we found we could remove the non-ascii value (double-spaces) while others we replaced the value (single-space). It is hard to know when programmatically to choose one over the other without visually confirming the change.

    Lastly one fun filled example we used Quickbase Find/Replace to remove a non-ascii character that removed every space in the field value. A quick table restore put it back but it was a surprise to expect 200 fields (which is what the display said) to get updated and then watch as 200000 values were updated with no way to stop the process. In cases like this export/import is the most efficient.

    Anyhow, we talked with Data Collaborative about why their tool in the QB Exchange doesn't replace and the response was it was too complex.

    I'd be interested to try out a bulk removal tool if you have one, let me know and I'll give it a go. But keep in mind we are 4 Apps away from done and then hopefully we never look back.



    ------------------------------
    Jim Harrison
    transparency = knowledge + understanding : The Scrum Dudes
    ------------------------------



  • 8.  RE: Transitioning to UTF-8

    Posted 12 days ago

    Hi Jennifer & others. I wanted to jump in and let everyone know that we've developed a tool which can significantly reduce the effort required to standardize data encoding. This tool will convert from Windows-1252 to UTF-8 in one sweep at an application level. While we can't guarantee it will have a 100% match rate, it should reduce the burden massively. The reason Data Collaborative couldn't do this was indeed complexity. To develop this against our APIs would require management of triggers (like notifications or integrations) as well as last modified/date modified and could cause issues for users. We've built this tool in a special way to bypass those issues. This is available in a private beta right now and should go GA next month. If you're interested in helping test it, please ensure you are registered in our beta app and reach out to me directly.



    ------------------------------
    Harrison Hersch
    ------------------------------