Before posting here, I read through QB help on Summary reports and filtering and searched the Community and then the Internet. Here is the problem. We have a Summary Report that groups a field by e...
Dan, Typically what people are looking for is a report of just the duplicates so they can go in and scrub their data. But they would like it to be delivered as an exception subscription report so that if there are no dupes then there is no email received. That is the real challenge.
>But they would like it to be delivered as an exception subscription report
Maybe - I would like to hear various opinions. My experience is that it is better to clean up data before it enters a new system. So the ideal place to deal with dups would be to prevent the entry of duplicates during the initial form entry or import. But in reality I have found that it often takes human judgement to reconcile which of two "duplicate" records has the more recent information or which specific fields in the two records are stale.
In any case, if users can clarify what they want, I am sure there are easy script solutions for detecting duplicate records, including ones that account for typos.
There is an algorithm called the Levenshtein distance, which measures how far apart two strings are by counting the single-character insertions, deletions, and substitutions needed to turn one into the other. It has a very simple implementation. Here are some examples that could easily be applied to duplicate detection of records:
Examples:
(1) "there" and "their" are two letters apart
(2) "general motors" and "GeneralMotors" are three letters apart (two case changes plus the removed space)
(3) "Mark" and "Dan" are three letters apart
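To make the idea concrete, here is a rough sketch in Python of the standard dynamic-programming version of the Levenshtein distance, plus a small helper that flags near-duplicate values. This is only an illustration, not QuickBase code: the helper name find_near_duplicates, the max_distance threshold of 2, and the sample company names are all my own assumptions, and the comparison is case-sensitive.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    # prev[j] = distance between the prefix of a seen so far and b[:j]
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (free if equal)
            ))
        prev = curr
    return prev[-1]


def find_near_duplicates(values, max_distance=2):
    """Hypothetical helper: list pairs of values within max_distance edits."""
    pairs = []
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            d = levenshtein(values[i], values[j])
            if d <= max_distance:
                pairs.append((values[i], values[j], d))
    return pairs


print(levenshtein("there", "their"))                   # 2
print(levenshtein("general motors", "GeneralMotors"))  # 3
print(levenshtein("Mark", "Dan"))                      # 3

# Made-up sample data, just to show the helper in use
print(find_near_duplicates(["Acme Corp", "Acme Corp.", "Widgets Inc"]))
```

The helper compares every pair, so it is fine for a few thousand records but would get slow on a large table. Lowercasing and stripping spaces before comparing would make example (2) come out as an exact match, if that is the behavior users want.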