Forum Discussion

RyanBroderick's avatar
RyanBroderick
Qrew Trainee
4 months ago

Building an Importer - Pipeline

I have data (Individual Units/Jobs) scrapped from builders' websites and want to import it into a specific table. I have built an "Import" table because I want my pipeline to check for duplicates before it imports into my "Units" table. The data that I scrap often is the same + new units.

After checking for duplicates I want it to only import the unique or new units.

Does anybody have a good reference to help, or know how I should build my pipeline? I assume I need to start with an import then run a query on the import table and the units table, and then create based on not finding any that match a specific field that is the "SKU" for the unit.



------------------------------
Ryan Broderick
------------------------------

5 Replies

  • Are you trying to avoid importing if the unit already exists in your target table or are you looking for duplicates within the import set itself or both situations?



    ------------------------------
    Mark Shnier (Your Quickbase Coach)
    mark.shnier@gmail.com
    ------------------------------
    • RyanBroderick's avatar
      RyanBroderick
      Qrew Trainee

      Thanks for the reply! 

      I am trying to avoid importing if the unit already exists. I was planning on using that unique id to be the indicator to know if it is in my "Units" table or not. I know I will end up having duplicates in my "Import" table but only want one in the target table.



      ------------------------------
      Ryan Broderick
      ------------------------------
      • MarkShnier__You's avatar
        MarkShnier__You
        Icon for Qrew Legend rankQrew Legend

        The technique is to create an admin record in a table with just one record and it will be record ID number one.  That gives you a convenient place to build buttons so that you can do various actions with just a click.  

        Then you would also create a scratchpad table where you can stage your scraped data before importing. You make sure that the field names match exactly with the column headings for the file you are importing.  

        Then on your admin record you can make a formula URL button for example to clear the scratchpad table and another button to open up that import window reload the scratch pad.  

        You then construct a formula field on your scratchpad table to build the Key field of your target table. Then I like to make a formula checkbox field on your target table called like [Unit Exists?] Formula of true.   Look that up down to the scratchpad table. That gives you a convenient hook to know whether or not you want to import the any individual scraped data record.

        So that deals with the issue of not overriding existing records.  

        Then build a saved table to table import to import from the scratchpad and map the fields into your target table. Then make a button on the admin table to run that import.  

        OK that's all great but you are bound to have duplicates in your scraped data table and the import will fail if you have duplicates.  That is where you need a formula query to detect duplicates and only import the first duplicate.  

        If you have not been exposed to formula queries they're very powerful but the syntax is a bit more challenging.  

        Not fully tested but you can make this field called for example [Record ID#s of all Units similar to this one] as a formula text field type to return a semi colon delimited list of all the [Record IDs] for the siblings of this Unit. 

        ToText(GetFieldValues(
        GetRecords("{99.EX." & [unit Identifier] & "}"),3))

        Replace the 99 with the field containing the unit identifier.

        Observe the results and where you have duplicates it should be a list of the Record IDs in Record ID sequence where the unit is duplicated.  

        Then you could make another field which will identify if the record is the first duplicate.

        ToNumber(Trim(Left([Record IDs for this unit],";"))) = [Record ID#]

        So then in your saved table to table import you would have a filter that the unit doesn't already exist and also filter that it is the first duplicate.  

           



        ------------------------------
        Mark Shnier (Your Quickbase Coach)
        mark.shnier@gmail.com
        ------------------------------