Forum Discussion
It is probably possible to do this with formula queries, but they will be somewhat complex and they may not scale well if you have many tens of thousands of records. So my suggestion is to do it the old-fashioned way, with relationships. That will scale very well.
The first step will be to create a table for the unique CardTokenNumber values. So make a table, create that field, and set it to be the key field. Then make a relationship back to the transactions table, where the reference field on the right side is a formula field equal to [CardTokenNumber]. (Driving the relationship from a formula field will make sure that you can still enter records manually.)
Initialize this table by making a summary report on your transactions table and using the option to copy the results to another table; that will let you copy the unique values into the new index table.
Create a formula checkbox field on the CardTokenNumber index table called [CardTokenNumber Exists?] with a formula of true.
Look up that field down to the transactions table so that you can tell when a transaction has an orphan CardTokenNumber that still needs to be created in the index table.
But as I type/dictate this, I'm realizing that you may want to consider having the index table include the vendor in the key field, so maybe the format is something like
Vendor–CardTokenNumber
because I imagine you could have incorrect card token numbers that are duplicated across multiple vendors.
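For that combined key, the formula field would just concatenate the two values. A minimal sketch in Quickbase formula language, assuming your fields are named [Vendor] and [CardTokenNumber] (substitute your actual field names):

```
[Vendor] & "-" & [CardTokenNumber]
```

You would use the same formula on both sides: as the key field on the index table and as the formula reference field on the transactions table, so the two always match.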
I believe you should be able to use that relationship to get summary counts of the valid and invalid transactions, and the summary totals you need, up on that index table.
Then we have to figure out how to maintain the index table. One thing you should do right off the bat is to make a safety net summary report of index table records that need to be created and subscribe yourself to that report.
Then you will want to build a pipeline to detect when a transaction record is created whose CardTokenNumber is missing from the index table, and have the pipeline add it.

One subtlety: if transactions are imported in batches, several orphan children can need the same index record at the same moment. If the pipeline simply adds a record to the index table whenever one is missing, you can get a race condition where the pipeline is triggered twice, and the second pipeline instance, the one that loses the race, will fail on a duplicate key on the index table. Those errors won't hurt anything, but they will annoy you.

To avoid them, have the pipeline use a bulk upsert even though it's only one record being upserted: create the bulk upsert, add the single row, and then commit the upsert. That way, if two index records need to be created in the same split second, the second one will simply merge on the key rather than try to create a duplicate index record.
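In Pipelines you would do this with the bulk upsert steps on the Quickbase channel, no code required. But to make the merge behavior concrete, here is a minimal sketch of the equivalent request body for Quickbase's RESTful upsert endpoint (`POST /v1/records`). The table ID, field ID, and key value below are placeholders for illustration, not real IDs:

```python
import json

# Placeholder IDs for illustration only -- substitute your own
# index-table ID and the field ID of your key field.
INDEX_TABLE_ID = "bxxxxxxxx"   # hypothetical table ID
KEY_FIELD_ID = 6               # hypothetical field ID of the Vendor-CardTokenNumber key

def build_upsert_payload(table_id: str, merge_field_id: int, key_value: str) -> dict:
    """Build the body for a single-row upsert to POST /v1/records.

    Because mergeFieldId points at the table's key field, a second
    call with the same key value merges into the existing record
    instead of failing with a duplicate-key error.
    """
    return {
        "to": table_id,
        "data": [
            {str(merge_field_id): {"value": key_value}},
        ],
        "mergeFieldId": merge_field_id,
    }

payload = build_upsert_payload(INDEX_TABLE_ID, KEY_FIELD_ID, "AcmeCorp-1234567890")
print(json.dumps(payload, indent=2))
```

The same "merge on the key field" idea is what the pipeline's prepare/add-row/commit upsert steps give you: two near-simultaneous runs both target the key, so the loser of the race updates rather than collides.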
- dyarosh (Qrew Trainee), 15 days ago
Thank you for the idea. I will have to think about it, as there are hundreds of thousands of records in the table. We are in the process of re-architecting the process that uses this table, and when that is complete the table will no longer be needed, so I was hoping to get the data without having to create more tables and pipelines.