Quickbase Discussions


Does QuickBase support Unicode or other multi-byte character encodings like Big5, GB, HZ, Shift-JIS, JIS?

By Brian Cafferelli posted 01-23-2018 20:40

  



[The Quick Base Knowledge Base is your library of frequently-asked questions that help you better customize your apps to solve your business problems.]


If everyone using a particular Quick Base table or application package sets their browser's character encoding to the same setting (in Internet Explorer, open the "View" menu and select "Encoding"), then everything will be displayed and edited consistently, whether that setting is a single-byte character encoding like ISO-8859-1 or Western European (Windows), or a multi-byte character encoding like Unicode (UTF-8), Big5, GB, HZ, Shift-JIS, or JIS.

Quick Base stores all characters as a sequence of bytes. Most character sets commonly used in Europe and the United States require only one byte per character to encode the entire character set. The most popular of these character encodings are "ISO-8859-1" (also called "ISO-LATIN-1") and the closely related "Western European (Windows)".

Under these circumstances, each byte stored by Quick Base represents a single character.

Suppose you change your browser's character encoding to "Unicode (UTF-8)" and then create a record containing characters like these:

¼½¾

You can get these in Windows by typing:

Alt-0188 Alt-0189 Alt-0190

(hold down the Alt key while typing the four digits on the
numeric pad with NumLock on, then release the Alt key)

When you save this record, Quick Base stores the three characters ¼½¾ as six bytes, instead of the three bytes it would store if you were using the "Western European (Windows)" character encoding.

If you stick with the "Unicode (UTF-8)" character encoding, these three characters will display exactly as you typed them.

However, if you switch your browser to the "Western European (Windows)" character encoding, you will see six characters instead of three, because each stored byte is then displayed as a character of its own.
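
The byte counts above can be checked outside Quick Base. The following is a minimal Python sketch, assuming that "Western European (Windows)" corresponds to Python's cp1252 codec and that the bytes sent by the browser are stored unchanged. It only reproduces the encoding arithmetic and does not call Quick Base.

```python
# Illustration only: reproduces the encoding arithmetic, not Quick Base itself.
text = "\u00bc\u00bd\u00be"  # the three fraction characters typed with Alt-0188..0190

# Assumption: "Western European (Windows)" maps to Python's cp1252 codec.
western_bytes = text.encode("cp1252")
utf8_bytes = text.encode("utf-8")

print(len(western_bytes))  # one byte per character
print(len(utf8_bytes))     # two bytes per character for these three

# Reading the UTF-8 bytes back under the wrong encoding turns every byte
# into a character of its own.
misread = utf8_bytes.decode("cp1252")
print(len(misread))

# Reading them back under the encoding used to write them restores the text.
roundtrip = utf8_bytes.decode("utf-8")
print(roundtrip == text)
```

The misread string has twice as many characters as the original, which matches the six characters described above.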

So if everyone using a particular Quick Base table or application package sets their browser's character encoding to the same setting (in Internet Explorer, open the "View" menu and select "Encoding"), then everything will be displayed and edited consistently, whether that setting is a single-byte character encoding like ISO-8859-1 or Western European (Windows), or a multi-byte character encoding like Unicode (UTF-8), Big5, GB, HZ, Shift-JIS, or JIS.

The story for the Quick Base API is a little more complicated.

The Quick Base HTTP API outputs XML, and the default XML character set is Unicode. However, if you are using Shift-JIS as the character encoding for a particular Quick Base table, you'll want to specify the "encoding" parameter on every call you make to the Quick Base HTTP API to ensure that the character encoding of the XML response is set properly. So to get the schema of a Quick Base table that has been created and maintained with Shift-JIS encoding, you'll want to request the following URL (where dbid stands for the table's ID):

https://www.quickbase.com/db/dbid?act=API_GetSchema&encoding=shift_jis
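
Because the parameter has to be present on every call, it can help to build API URLs in one place. The sketch below is one way to do that in Python; the helper name build_api_url is hypothetical, "dbid" is the same placeholder used in the URL above, and the code only constructs the URL without sending a request.

```python
from urllib.parse import urlencode


def build_api_url(dbid: str, action: str, encoding: str) -> str:
    """Return a Quick Base HTTP API URL that carries the "encoding" parameter.

    Hypothetical helper for illustration; it is not part of any Quick Base SDK.
    """
    query = urlencode({"act": action, "encoding": encoding})
    return f"https://www.quickbase.com/db/{dbid}?{query}"


# "dbid" is the placeholder from the example URL, not a real table ID.
url = build_api_url("dbid", "API_GetSchema", "shift_jis")
print(url)
```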

There is one addendum to this story that involves the Quick Base formula language. The "Length" function returns the length of a text string in bytes, not in characters. So in some cases a three-character Unicode string will return a length of six.

The "Left", "Mid" and "Right" functions may chop characters in half, yielding incorrect results, including characters that are not part of the original character set.

The "Lower" and "Upper" functions will not work properly with multi-byte character sets.
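
The behavior described for these functions is what byte-oriented string operations produce on multi-byte data. The Python sketch below imitates it with byte strings, assuming UTF-8 storage. The helper names byte_length and byte_left are hypothetical stand-ins, not Quick Base functions, and the real formula engine may differ in detail.

```python
def byte_length(text: str, encoding: str = "utf-8") -> int:
    """Hypothetical stand-in for a Length function that counts bytes."""
    return len(text.encode(encoding))


def byte_left(text: str, count: int, encoding: str = "utf-8") -> str:
    """Hypothetical stand-in for a Left function that counts bytes."""
    chopped = text.encode(encoding)[:count]
    # A partial character cannot be decoded; show it as U+FFFD instead.
    return chopped.decode(encoding, errors="replace")


fractions = "\u00bc\u00bd\u00be"  # the three fraction characters

print(len(fractions))          # length in characters
print(byte_length(fractions))  # length in UTF-8 bytes

# Three bytes cover the first character plus half of the second one.
left3 = byte_left(fractions, 3)
print(left3 == "\u00bc\ufffd")

# Byte-wise upper-casing only touches ASCII letters, so the accented
# final letter of "cafe" with an acute accent stays in lower case.
word = "caf\u00e9"
byte_upper = word.encode("utf-8").upper().decode("utf-8")
print(byte_upper == "CAF\u00e9", word.upper() == "CAF\u00c9")
```

Counting and slicing by character, as Python's own str type does, avoids both problems, which is why the two results differ in the last line.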

NOTE: Any apps which use this workaround will not be able to import or export data. As a result, we are not able to restore data in such applications.

Comments

05-17-2023 15:30

Will unicode/emoji be supported in Quickbase text formulas?

02-24-2021 16:37

Is there an update or plan to improve support for Unicode encoding in text fields?