EdwardHefter
3 years ago
Qrew Cadet
Speeding up a pipeline that creates 15K records
I need help speeding up a pipeline.
I have an app that tracks Printed Circuit Board Assemblies (PCBAs) by serial number and the components that are on them. The system lets the user barcode scan each PCBA serial number into a field and the scanner adds a linefeed to the end, so at the end of a run, there may be 100 serial numbers in the field. The system also has a list of which components go on the PCBA, and there may be as many as 150 of them.
When the pipeline executes, it:
1) Splits apart the serial numbers
2) Starts a loop for each serial number that:
2a) Creates a "top level" record for the PCBA with some master data as well as that specific serial number
2b) Searches for all of the component records in the master data table associated with that PCBA, then creates a record in the history table for each one, with the component info and the PCBA serial number tied to it
3) Loops back for the next serial number and does it all again
With 100 serial numbers and 150 components, that is 15K records. Needless to say, it takes a long time for the pipeline to run (3 hours last time I tried). I am using a bulk upsert for step 2b, which I am sure is helping, but I am looking for other ways to speed this up.
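To make the shape of the work concrete, here is a rough sketch of the steps above in plain Python. It is not pipeline syntax and makes no Quickbase calls; the function names and field names (`serial_number`, `pcba_serial_number`, `component`, `pcba_part_number`) are made up for illustration. It only shows how the serial numbers and components multiply into the full set of history records.

```python
def split_serials(scanned):
    """Step 1: split the scanned field on linefeeds, skipping blank lines."""
    return [line.strip() for line in scanned.splitlines() if line.strip()]


def build_records(scanned, pcba_master, components):
    """Steps 2 and 3: build the top-level and history records for every serial."""
    top_level = []
    history = []
    for serial in split_serials(scanned):
        # Step 2a: one top-level record per PCBA serial number.
        top_level.append({**pcba_master, "serial_number": serial})
        # Step 2b: one history record per component, tied to this serial.
        for comp in components:
            history.append({**comp, "pcba_serial_number": serial})
    return top_level, history


# Hypothetical run: 100 scanned serials, each ending in a linefeed, 150 components.
scanned = "".join(f"SN{i:04d}\n" for i in range(100))
components = [{"component": f"C{i:03d}"} for i in range(150)]
top, hist = build_records(scanned, {"pcba_part_number": "PCBA-EXAMPLE"}, components)
print(len(top), len(hist))
```

Laid out this way, all of the history rows can be computed before anything is written, so in principle they could be sent in one batch or a few large ones rather than one bulk upsert per serial number. Whether the pipeline can be arranged that way is exactly what I am unsure about.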
There is a step in the Bulk Record Sets called Copy Records, but I can't find any documentation on it, and it looks like it just makes a one-for-one copy of each record without allowing changes such as adding a serial number.
Any suggestions?
------------------------------
Edward Hefter
www.Sutubra.com
------------------------------