Get data from website table

  • Question
  • Updated 11 months ago
  • In Progress
OK, here is what I am trying to do, compared to Excel. I have a sheet that pulls rates from various banking sites. In Excel it's very easy to accomplish this: I click "From Web" on the Data tab, enter the website in the URL field, and select the table in the window below. This copies the table to my sheet. To update the data I simply click "Refresh All" on the Data tab and it grabs the data again. Is there an easy way to do this in Quickbase?
Theodore Tomita III

Posted 11 months ago

Ⲇanom the ultimate (Dan Diebolt), Champion

The answer depends on the site you are pulling the data from and how you access it. Complicating factors include (1) whether you need to be logged into the site, (2) whether navigation or interaction (e.g. CAPTCHAs) is required to reach the URL where the information resides, and (3) how the web site allows its information to be accessed (both legally and technically). There probably is a solution, but how you do it depends entirely on the particular site you want to access. So the primary question is what URL you are trying to access. The secondary question is how the data is identified (which table) and formatted on the page (this is the easier issue to address).
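To make the "which table" question concrete, here is a minimal sketch in Python (standard library only) of picking one table out of a page by its position. The sample HTML, the rates in it, and the names `TableExtractor` and `extract_table` are all made up for illustration; a real bank page will be messier, and this sketch does not handle nested tables.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> on a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one entry per <table>, in page order
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_table(html, index=0):
    """Return the table at position `index` (0-based) as rows of strings."""
    parser = TableExtractor()
    parser.feed(html)
    return parser.tables[index]


# Hypothetical page: a layout table first, then the rates table we want.
SAMPLE_HTML = """
<table><tr><th>Menu</th></tr></table>
<table>
  <tr><th>Term</th><th>Rate</th></tr>
  <tr><td>12 months</td><td>4.10%</td></tr>
  <tr><td>24 months</td><td>4.35%</td></tr>
</table>
"""

rates = extract_table(SAMPLE_HTML, index=1)
for row in rates:
    print(row)
```

Selecting by position is the simplest approach but breaks if the site adds or removes a table; selecting by an `id` attribute or a header cell is sturdier when the page offers one.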

You can stop reading here if you are non-technical but the following information might be helpful to a large group of different users. Let me explain.

When you access a site there are three categories of information that can be in the request and response. Two of the categories everyone is familiar with, namely (1) the URL or address and (2) the content or body. The URL is the long string of characters you see in the address bar, and the content is what is displayed in your browser or downloaded to your computer.

The third category of information, which you don't normally see, is called the header information. The headers describe in great detail (1) the size, type, encoding and encryption of the content, (2) technical details about how the content can be accessed, such as cookies, tickets, session ids, and caching information, and (3) a bewildering variety of other header information including Easter eggs, jokes, job solicitations and other obscure information. Nope, I am not joking. Enjoy some fun reading:
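As a rough illustration of those three groups of headers, the Python sketch below parses a block of response headers and sorts them into "content", "access" and "other". The header block is a made-up sample, and the grouping sets and the name `group_headers` are my own simplification for this example, not any standard classification.

```python
from email.parser import Parser

# Hypothetical response headers, written out as a server might send them.
SAMPLE_HEADERS = (
    "Content-Type: text/html; charset=utf-8\n"
    "Content-Length: 5120\n"
    "Content-Encoding: gzip\n"
    "Set-Cookie: session=abc123; HttpOnly\n"
    "Cache-Control: max-age=3600\n"
    "X-Clacks-Overhead: GNU Terry Pratchett\n"
)

# Illustrative groupings only; real responses carry many more headers.
CONTENT_HEADERS = {"content-type", "content-length", "content-encoding"}
ACCESS_HEADERS = {"set-cookie", "cache-control"}


def group_headers(raw):
    """Split a raw header block into content, access and other headers."""
    message = Parser().parsestr(raw, headersonly=True)
    groups = {"content": {}, "access": {}, "other": {}}
    for name, value in message.items():
        key = name.lower()  # header names are case-insensitive
        if key in CONTENT_HEADERS:
            groups["content"][name] = value
        elif key in ACCESS_HEADERS:
            groups["access"][name] = value
        else:
            groups["other"][name] = value
    return groups


groups = group_headers(SAMPLE_HEADERS)
for category, headers in groups.items():
    print(category, headers)
```

You can see the real headers for any page in your browser's developer tools, under the network tab.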

Fun and Unusual HTTP Response Headers

Also, browsers are not the only programs that access a web site. Installed applications such as Excel can access web sites, and servers like those at Google acting as crawlers or robots can access web sites. When an installed program or a server accesses a web site it can pretty much do anything it wants (subject to security and authentication issues) with the information. But when a browser accesses a web site it must respect the restrictions specified by the headers, because all the browser manufacturers more or less comply with a common body of standards and best practices.
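For example, a small script running outside the browser can request a page directly, much as Excel's "From Web" does. The sketch below uses only Python's standard library; the user agent string and the function names are hypothetical, and it assumes the page needs no login and that the site's terms permit automated access.

```python
import urllib.request

# Hypothetical identifier for this script; many sites expect some User-Agent.
USER_AGENT = "rate-sheet-refresh/0.1"


def build_request(url):
    """Prepare a GET request that identifies the script to the server."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_page(url, timeout=10):
    """Download a page and return its HTML as text.

    No browser is involved, so browser-only restrictions such as the
    same-origin policy do not apply here. Logins and CAPTCHAs still do.
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `fetch_page` with the URL of the rates page would return the raw HTML, which then still has to be parsed to find the table you want.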

By default browsers will not allow a page on one domain to read the content of a web site on another domain unless the response has a header named "Access-Control-Allow-Origin" with an appropriate value. Another common restriction is that some sites will not allow you to put their URL in an <iframe> at all, which they signal with a header named "X-Frame-Options" (for example with the value DENY or SAMEORIGIN). In a variety of ways each site can control if and how you can access their information.
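The two checks can be sketched in Python as below. This is a deliberately simplified model of what a browser does, not the full rules: it ignores credentials, preflight requests and other framing controls, and only looks at "Access-Control-Allow-Origin" and the "X-Frame-Options" values DENY and SAMEORIGIN, which restrict framing. The function names and example origins are made up for illustration.

```python
def _lowercase_keys(headers):
    """Header names are case-insensitive, so normalise them once."""
    return {name.lower(): value for name, value in headers.items()}


def cors_allows_read(response_headers, requesting_origin):
    """Would a browser let a script on `requesting_origin` read this response?

    Simplified: the header must be "*" or match the origin exactly.
    """
    allowed = _lowercase_keys(response_headers).get("access-control-allow-origin")
    return allowed == "*" or allowed == requesting_origin


def framing_allowed(response_headers, same_origin=False):
    """Would a browser display this response inside an <iframe>?

    Simplified: DENY always blocks, SAMEORIGIN blocks other sites,
    and a missing header means framing is not restricted by this header.
    """
    value = _lowercase_keys(response_headers).get("x-frame-options", "")
    value = value.strip().upper()
    if value == "DENY":
        return False
    if value == "SAMEORIGIN":
        return same_origin
    return True


# Hypothetical responses from a bank site and from an open data API.
bank_headers = {"X-Frame-Options": "SAMEORIGIN"}
open_api_headers = {"Access-Control-Allow-Origin": "*"}

print(cors_allows_read(bank_headers, "https://example.quickbase.com"))
print(framing_allowed(bank_headers))
print(cors_allows_read(open_api_headers, "https://example.quickbase.com"))
```

In this sketch the bank response can be neither read by a script on another domain nor framed by it, while the open API response can be read from anywhere.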

There are all sorts of details, exceptions, workarounds and tricks that can be used but the bottom line is that the answer to your question depends on the web site you wish to get the information from.