Forum Discussion

MalcolmMcDonald
Qrew Cadet
30 days ago

Using Pipelines to update and load a CSV into tables

I am having a tough time figuring out the most efficient way to handle this use case. My pipeline has now been running for 12 hours on about 45,000 records -- so my current method clearly does not work!

I will regularly receive a CSV file of about 35,000-40,000 records, or roughly 5 MB of data. Each row is basically a contact transaction with a handful of fields, including my employee's unique employee ID.

Using Pipelines, I want to create a key ID by concatenating three of the fields together, and add a Related Employee ID looked up from my employee table.

So far I've created a bulk upsert, fetched the CSV (which alone took two hours, as it was loading about 10 rows at a time), and added a search step to pull the related employee ID for each row. The pipeline adds the row to the upsert, then loops back for the next one.

I only have about 200 employees -- is there a way to loop through just the unique employee IDs in the CSV and update in bulk, or is there a whole better approach I'm missing? I need non-technical folks to be able to drop the file and have it run.
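
To make it concrete, here is roughly the per-row transform I'm after, sketched in plain Python rather than in Pipelines. The field names are placeholders from my file, and the up-front employee map is exactly the part I don't know how to do efficiently in Pipelines:

```python
import csv

def build_upsert_rows(csv_path, employee_map):
    """employee_map maps an employee ID to its Record ID# on the
    employee table. With only ~200 employees, that map could be built
    once up front instead of running a search step per CSV row."""
    rows = []
    with open(csv_path, newline="") as f:
        for rec in csv.DictReader(f):
            # Key ID: three fields concatenated. These three are
            # stand-ins for whichever fields actually make a row unique.
            rec["Key ID"] = (rec["Employee_ID"]
                             + rec["Store Code"]
                             + rec["Created DateTime"])
            # Related Employee: an in-memory dictionary lookup rather
            # than a per-row search against the employee table.
            rec["Related Employee"] = employee_map.get(rec["Employee_ID"])
            rows.append(rec)
    return rows
```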

thanks!

 

malcolm

  • My suggestion is to bring the data in on a Connected Sync Table. That will only take a few minutes to load, maybe 5 or 10 at most.

    Then you can use a relationship to bring in the values you need and build that Key field as a formula. From there you can run a saved table-to-table (T2T) copy to copy the Sync table to the target table. A T2T copy runs basically instantly, even for 45,000 records, and a Pipeline can trigger that saved copy every hour.

  • Thanks Mark, as always. 

    I think the first question to solve is -- given data like this -- what's the quickest way to do lookups and updates in bulk?

    Rep Name     Employee_ID  FSACOID  Store Code  Knock Status      Created DateTime
    Joe Smart    MH7J1012     ACRC     DICI089     Successful Order  2024-11-07T22:08:01
    Jim Smarter  MG2C         LON      DICI171     Successful Order  2024-11-07T22:08:01

     

    In this example, the employee ID is not a key field -- long story, but my folks might have any number of IDs depending on what contract they're on. What I need to do is look up each employee ID and pull back the related Record ID from the person table.

    Using Pipelines -- mostly because I like to keep things consistent -- what is the most efficient way to update this staging table, matching each employee ID to its related Person ID (i.e. the Record ID from the Employee table)?

    I do this rather a lot for various things, but it's slow -- Search Records where the Employee ID matches, return the Record ID, loop over the search results, update the record, and do it again. The API seems to take 1-2 seconds per lookup.

    I just tried a bucket -- defined a Pipeline table, searched records from the Client ID table, pulled my employee IDs, and added the rows in memory -- but looking up the bucket table takes the same steps as the Quickbase table, and it's not appreciably faster.
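
    Stepping outside Pipelines for a second, the pattern I keep reaching for looks like this against the Quickbase JSON API -- one query for all ~200 employees, an in-memory map, then the upsert sent in large pages. This is just a sketch: the realm, token, table IDs, field IDs, and page size are all placeholders of mine.

    ```python
    import requests

    BASE = "https://api.quickbase.com/v1"
    HEADERS = {
        "QB-Realm-Hostname": "myrealm.quickbase.com",  # placeholder realm
        "Authorization": "QB-USER-TOKEN my_token",     # placeholder token
    }
    EMPLOYEE_TABLE = "bxxxxxxxx"  # placeholder table IDs
    STAGING_TABLE = "byyyyyyyy"
    EMP_ID_FID = 6  # placeholder field IDs; 3 is Record ID# by convention
    KEY_FID = 7

    def fetch_employee_map():
        """One query for all ~200 employees, instead of a search per row."""
        r = requests.post(f"{BASE}/records/query", headers=HEADERS,
                          json={"from": EMPLOYEE_TABLE,
                                "select": [3, EMP_ID_FID]})
        r.raise_for_status()
        return {rec[str(EMP_ID_FID)]["value"]: rec["3"]["value"]
                for rec in r.json()["data"]}

    def upsert(rows, page=1000):
        """rows are {field-id: {"value": ...}} dicts. Merging on the key
        field makes the call an upsert, and sending pages avoids one
        API call per record."""
        for i in range(0, len(rows), page):
            r = requests.post(f"{BASE}/records", headers=HEADERS,
                              json={"to": STAGING_TABLE,
                                    "data": rows[i:i + page],
                                    "mergeFieldId": KEY_FID})
            r.raise_for_status()
    ```

    Whether a Pipelines bucket can be made to behave like that in-memory map is really what I'm asking.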

     

  • ... so to clarify, the Key field of your Persons table is just [Record ID#] and you need to somehow coax that number down to your 45,000-record import table.

    Could you change the Key field of the Person table to be that Employee ID#?

    If not, I have had some success in the past with making a new Connected Sync table that duplicates certain fields in your Persons table, and then changing the Key field of that Sync table to be the unique Employee ID#. The Persons "mirror" table would then be the parent to the 45,000-record table, and you would just look up the [Record ID#] of the Persons table down to the 45,000-record table.