misc/migration_tools/dedup_authorities.pl - Deduping authorities script
dedup_authorities.pl [ -h ] [ -where="authid < 5000" ] -c [ -v ] [ -m date ] [ -a PERSO_NAME ]
Options:
-h --help display usage statement
-v --verbose increase verbosity, can be repeated for greater verbosity
-m --method method for choosing the reference authority, can be: date, used, or ppn (UNIMARC)
can be repeated
-w --where a SQL WHERE statement to limit the authority records checked
-c --confirm without this parameter no changes will be made; the script runs in test mode
-a --authtypecode check only specified auth type, repeatable
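The option list above could be parsed along these lines. This is only a sketch, assuming the standard Getopt::Long module; the helper name parse_options and the hash keys are hypothetical, and the script's real parsing code is not shown here.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse the documented options from a list of arguments.
sub parse_options {
    my @args = @_;
    my %opt  = (
        help         => 0,
        verbose      => 0,
        confirm      => 0,
        where        => undef,
        method       => [],
        authtypecode => [],
    );

    GetOptionsFromArray(
        \@args,
        'h|help'            => \$opt{help},
        'v|verbose+'        => \$opt{verbose},      # counts repetitions
        'c|confirm'         => \$opt{confirm},
        'w|where=s'         => \$opt{where},
        'm|method=s@'       => $opt{method},        # repeatable, order is kept
        'a|authtypecode=s@' => $opt{authtypecode},  # repeatable
    ) or die "Invalid options\n";

    # Documented default: fall back to 'used' when no method is given.
    push @{ $opt{method} }, 'used' unless @{ $opt{method} };

    # Only the three documented methods are accepted in this sketch.
    my %known = map { $_ => 1 } qw(date used ppn);
    for my $method ( @{ $opt{method} } ) {
        die "Unknown method: $method\n" unless $known{$method};
    }

    return \%opt;
}
```

Calling parse_options(@ARGV) would return a hash reference whose method entry lists the chosen methods in command-line order.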
-m --method
Method(s) used to choose which authority to keep when duplicates are found. The option can be repeated; methods are applied in the order given, each one breaking the ties left by the previous one. Possible values:
date: keep the most recent authority (based on the 005 field)
used: keep the most used authority
ppn: (UNIMARC only) keep the authority that has a PPN (based on the 009 field), for when some authorities don't have one
Example: -m ppn -m date -m used
Among the authorities that have a PPN, keep the most recent, and if two (or more) have the same date in 005, keep the most used.
Default is 'used'
-w --where
limit the deduplication to SOME authorities only
Example: -where="authid < 5000" will only check authorities with a low authid (old records)
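One way the WHERE restriction and the authority type filter might be combined into a query is sketched below. The helper build_authority_query is hypothetical, and the table and column names (auth_header, authid, authtypecode) are assumptions rather than something this documentation states.

```perl
use strict;
use warnings;

# Hypothetical helper: build the SELECT that picks candidate authorities.
# Assumes a table auth_header with authid and authtypecode columns.
sub build_authority_query {
    my ( $where, @authtypecodes ) = @_;
    my ( @conditions, @bind );

    if (@authtypecodes) {
        # One placeholder per requested authority type.
        my $placeholders = join ', ', ('?') x @authtypecodes;
        push @conditions, "authtypecode IN ($placeholders)";
        push @bind, @authtypecodes;
    }

    # The --where text is inserted verbatim, so it must come from a
    # trusted operator, never from untrusted input.
    push @conditions, "($where)" if defined $where && length $where;

    my $sql = 'SELECT authid FROM auth_header';
    $sql .= ' WHERE ' . join( ' AND ', @conditions ) if @conditions;

    return ( $sql, @bind );
}
```

The returned SQL string and bind values could then be handed to a DBI statement handle.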
-v --verbose
display verbose logging, can be given twice for more info
-h --help
show usage information.
@record_ids = _choose_records(@record_ids);
This function sorts the list of record ids, based on the passed
methods (see script options).
By default, it sorts on usage count.
It is used in the main loop to decide which record to keep.
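The ordering described above can be sketched as a chained sort. This is not the script's actual implementation: the name choose_records_sketch, the %compare table, and the in-memory $info structure (005 date string, usage count, optional PPN per record id) are all assumptions made for illustration, as is the idea that the first id in the result is the one to keep.

```perl
use strict;
use warnings;

# One comparator per method; each sorts the preferred record first.
# Arguments are the metadata hashes of the two records being compared.
my %compare = (
    # Most recent 005 value first (assumes 005 strings sort chronologically).
    date => sub { $_[1]{date} cmp $_[0]{date} },
    # Highest usage count first.
    used => sub { $_[1]{used} <=> $_[0]{used} },
    # Records that have a PPN come before those that do not.
    ppn => sub {
        ( defined $_[1]{ppn} ? 1 : 0 ) <=> ( defined $_[0]{ppn} ? 1 : 0 );
    },
);

# Hypothetical stand-in for _choose_records: $methods is an array ref of
# method names, $info maps each record id to its metadata hash.
sub choose_records_sketch {
    my ( $methods, $info, @record_ids ) = @_;
    return sort {
        my $result = 0;
        for my $method (@$methods) {
            $result = $compare{$method}->( $info->{$a}, $info->{$b} );
            last if $result;    # a later method only breaks ties
        }
        $result;
    } @record_ids;
}
```

With the methods ppn, date, used, a record with a PPN sorts ahead of one without, and among records sharing the same 005 date the most used one comes first.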