Does QuickBase support Unicode or other multi-byte character encodings like Big5, GB, HZ, Shift-JIS, JIS?

  • Updated 10 months ago





If everyone using a particular Quick Base table or application package sets their browser's character encoding to the same setting (in Internet Explorer, click the "View" menu and select "Encoding"), then everything will be displayed and edited consistently. This holds whether the setting is a single-byte character encoding like ISO-8859-1 or Western European (Windows), or a multi-byte character encoding like Unicode (UTF-8), Big5, GB, HZ, Shift-JIS or JIS.

Quick Base stores all characters as a sequence of bytes. Most character sets commonly used in Europe and the United States require only one byte per character to encode the entire character set. The most popular of these character encodings are called "ISO-8859-1" or "Western European (Windows)" or "ISO-LATIN-1".

Under these circumstances, each byte stored by Quick Base represents a single character.

Suppose you change the character encoding of your browser to "Unicode (UTF-8)" and then create a record containing characters like these:

¼½¾

You can get these in Windows by typing:

Alt-0188 Alt-0189 Alt-0190

(hold down the Alt key while typing the four digits on the
numeric pad with NumLock on, then release the Alt key)

When you save this record, Quick Base stores the three characters ¼½¾ as six bytes, instead of the three bytes it would store if you were using the "Western European (Windows)" character encoding.

If you stick with the "Unicode (UTF-8)" character encoding, these three characters will display as you saw them when you typed them in.

However, if you switch your browser to the "Western European (Windows)" character encoding, you will see six characters instead of just three, because each of the six stored bytes is then displayed as a separate character.
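The byte counts above can be checked outside Quick Base. The following is a minimal Python sketch that only illustrates the encodings themselves; it assumes that "Western European (Windows)" corresponds to Python's `cp1252` codec, and it does not call Quick Base at all.

```python
# The three characters typed with Alt-0188, Alt-0189 and Alt-0190.
text = "\u00bc\u00bd\u00be"  # ¼½¾

# Encode the same text the way a browser would under each setting.
utf8_bytes = text.encode("utf-8")      # two bytes per character here
cp1252_bytes = text.encode("cp1252")   # one byte per character

print(len(utf8_bytes))    # 6
print(len(cp1252_bytes))  # 3

# Viewing the UTF-8 bytes with the single-byte setting turns every
# stored byte into its own character.
misread = utf8_bytes.decode("cp1252")
print(misread)       # Â¼Â½Â¾
print(len(misread))  # 6
```

The last two lines show the mismatch described above: the data is unchanged, but the viewer interprets six bytes as six characters.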

So as long as everyone using a particular Quick Base table or application package uses the same browser character encoding, whether single-byte (ISO-8859-1, Western European (Windows)) or multi-byte (Unicode (UTF-8), Big5, GB, HZ, Shift-JIS, JIS), everything will be displayed and edited consistently.

The story for the Quick Base API is a little more complicated.

The Quick Base HTTP API outputs XML, and the default XML character encoding is Unicode (UTF-8). However, if you are using Shift-JIS as the character encoding for a particular QuickBase table, specify the "encoding" parameter on every call you make to the QuickBase HTTP API so that the character encoding of the XML response is set properly. For example, to get the schema of a QuickBase table that has been created and maintained with Shift-JIS encoding, request the following URL:

https://www.quickbase.com/db/dbid?act=API_GetSchema&encoding=shift_jis
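If you build these URLs in a script, a small helper can make sure the "encoding" parameter is never forgotten. The sketch below is only an illustration: `build_api_url` is a hypothetical helper name, "dbid" is the same placeholder used in the URL above, and the code only constructs the URL without sending a request.

```python
from urllib.parse import urlencode

def build_api_url(dbid, action, encoding=None):
    """Build a Quick Base HTTP API URL (hypothetical helper).

    The optional encoding is passed as the "encoding" parameter
    described above.
    """
    params = {"act": action}
    if encoding is not None:
        params["encoding"] = encoding
    return "https://www.quickbase.com/db/" + dbid + "?" + urlencode(params)

# "dbid" is a placeholder for the real table identifier.
url = build_api_url("dbid", "API_GetSchema", encoding="shift_jis")
print(url)
# https://www.quickbase.com/db/dbid?act=API_GetSchema&encoding=shift_jis
```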

There is one addendum to this story, and it involves the Quick Base formula language. The "Length" function returns the length of a text string in bytes, not in characters. So in some cases a three-character Unicode string will return a length of six.

The "Left", "Mid" and "Right" functions may chop a multi-byte character in half, yielding incorrect results, including characters that are not part of the original character set.
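The effect of byte-oriented "Length" and "Left" can be modeled in Python. This is a sketch of the behavior described above, not Quick Base's actual implementation; `byte_length` and `left_bytes` are hypothetical names, and UTF-8 is assumed as the browser encoding.

```python
def byte_length(s, encoding="utf-8"):
    """Model of a length function that counts bytes, not characters."""
    return len(s.encode(encoding))

def left_bytes(s, n, encoding="utf-8"):
    """Model of a Left function that keeps the first n bytes.

    Bytes that no longer form a complete character are shown as
    U+FFFD (the replacement character) when decoded.
    """
    return s.encode(encoding)[:n].decode(encoding, errors="replace")

text = "\u00bc\u00bd\u00be"  # ¼½¾

print(len(text))          # 3 characters
print(byte_length(text))  # 6 bytes

print(left_bytes(text, 2))  # ¼ (cut falls on a character boundary)
print(left_bytes(text, 3))  # ¼ followed by a replacement character
```

Asking for three bytes cuts the second character in half, which is the kind of incorrect result the formula functions can produce.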

The "Lower" and "Upper" functions will not work properly with multi-byte character sets.
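One way such case functions can go wrong is sketched below in Python. It assumes, purely for illustration, a case conversion that works byte by byte on ASCII letters; Quick Base's real implementation may differ. In Shift-JIS the second byte of a character can fall in the ASCII letter range, so a byte-wise conversion can change it into a different character.

```python
original = "\u30a2"  # the katakana letter A

# In Shift-JIS this character is two bytes, and the second byte
# has the same value as the ASCII letter "A".
sjis = original.encode("shift_jis")
print(sjis)  # b'\x83A'

# A byte-wise lowercase conversion changes that second byte.
lowered = sjis.lower()
print(lowered)  # b'\x83a'

# Decoded again, the text is no longer the character that was stored.
result = lowered.decode("shift_jis", errors="replace")
print(result == original)  # False

# A character-aware conversion leaves the caseless letter unchanged.
print(original.lower() == original)  # True
```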

NOTE: Any apps which use this workaround will not be able to import or export data. As a result, we are not able to restore data in such applications.

Brian Cafferelli, Quick Base Technical Marketing Manager


Posted 10 months ago
