
NAME

misc/migration_tools/dedup_authorities.pl - Deduping authorities script

SYNOPSIS

dedup_authorities.pl [ -h ] [ --where="authid < 5000" ] [ -c ] [ -v ] [ -m date ] [ -a PERSO_NAME ]

 Options:
     -h --help          display usage statement
     -v --verbose       increase verbosity, can be repeated for greater verbosity
     -m --method        method for choosing the reference authority, can be: date, used, or ppn (UNIMARC)
                        can be repeated
     -w --where         a SQL WHERE clause to limit the authority records checked
     -c --confirm       without this parameter no changes will be made, script will run in test mode
     -a --authtypecode  check only specified auth type, repeatable

OPTIONS

--method

Method(s) used to choose which authority to keep when duplicates are found. The option can be repeated; the methods are applied in the order given. Possible values are:

    date: keep the most recent authority (based on the 005 field)
    used: keep the most used authority
    ppn:  keep the authority that has a PPN when some authorities do not have one (UNIMARC only, based on the 009 field)

Example: -m ppn -m date -m used

Among the authorities that have a PPN, keep the most recent, and if two (or more) have the same date in 005, keep the most used.

Default is 'used'
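
As an illustration of how repeated -m and -v options can be collected in the order given, here is a minimal sketch using Getopt::Long. The option specifications and the argument list are assumptions made for this example; the script's actual option handling may differ.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical argument list, mirroring the example above.
my @args = qw(-m ppn -m date -m used -v -v);

my @methods;
my $verbose = 0;

GetOptionsFromArray(
    \@args,
    'method|m=s' => \@methods,    # repeatable: each -m appends to the list
    'verbose|v+' => \$verbose,    # repeatable: each -v increments the counter
) or die "Invalid options\n";

# Fall back to the documented default when no method was given.
@methods = ('used') unless @methods;

print "methods: @methods\n";      # methods: ppn date used
print "verbosity: $verbose\n";    # verbosity: 2
```

Because the methods end up in an array, their order on the command line is preserved and can be used as the order of precedence.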

--where

limit the deduplication to a subset of the authorities

Example: --where="authid < 5000" will only check authorities with a low authid (old records)
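
One way such a filter could be combined with the --authtypecode option is sketched below. This is only an illustration: the table and column names (auth_header, authid, authtypecode) and the helper build_query_sketch are assumptions, not taken from the script itself. Note that in this sketch the WHERE fragment is inserted into the query verbatim, so it should only come from a trusted operator.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the SELECT used to pick the authorities to check.
# Returns the SQL string followed by the bind values for its placeholders.
sub build_query_sketch {
    my ( $where, @authtypecodes ) = @_;

    my $sql = 'SELECT authid FROM auth_header';    # assumed table and column
    my @conditions;

    # The --where fragment is used as-is, wrapped in parentheses.
    push @conditions, "($where)" if defined $where && length $where;

    # One placeholder per requested authority type code.
    if (@authtypecodes) {
        my $placeholders = join ', ', ('?') x @authtypecodes;
        push @conditions, "authtypecode IN ($placeholders)";
    }

    $sql .= ' WHERE ' . join( ' AND ', @conditions ) if @conditions;
    return ( $sql, @authtypecodes );
}

my ( $sql, @bind ) = build_query_sketch( 'authid < 5000', 'PERSO_NAME' );
print "$sql\n";
# SELECT authid FROM auth_header WHERE (authid < 5000) AND authtypecode IN (?)
```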

--verbose

display verbose logging, can be given twice for more info

--help

show usage information.

_choose_records

    @record_ids = _choose_records(@record_ids);

    This function sorts the list of record ids according to the methods
    given with --method (see OPTIONS).
    By default, it sorts on usage count.
    It is used in the main loop to decide which record to keep.
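
The ordering described here can be pictured as a sort with several keys tried in turn. The following is a minimal sketch, not the script's actual implementation: the %info data, the comparator table, and the name choose_records_sketch are all hypothetical, and it assumes the preferred record should sort first. A final comparison on the id is added only to make the result deterministic.

```perl
use strict;
use warnings;

# Hypothetical per-record data. A real run would take these values from
# the 005 and 009 fields and from usage counts.
my %info = (
    101 => { date => '20200101120000.0', used => 3, ppn => '' },
    102 => { date => '20210615093000.0', used => 1, ppn => '027341234' },
    103 => { date => '20210615093000.0', used => 7, ppn => '02734567X' },
);

# One comparator per method; each one puts the preferred record first.
my %cmp = (
    # most recent 005 date first
    date => sub { $info{ $_[1] }{date} cmp $info{ $_[0] }{date} },
    # highest usage count first
    used => sub { $info{ $_[1] }{used} <=> $info{ $_[0] }{used} },
    # records that have a PPN first
    ppn => sub {
        ( $info{ $_[1] }{ppn} ne '' ? 1 : 0 )
            <=> ( $info{ $_[0] }{ppn} ne '' ? 1 : 0 );
    },
);

sub choose_records_sketch {
    my ( $methods, @ids ) = @_;
    my @sorted = sort {
        my $res = 0;
        for my $m (@$methods) {
            $res = $cmp{$m}->( $a, $b );
            last if $res;    # the first method that distinguishes them wins
        }
        $res || ( $a <=> $b );    # assumed tie-break on id, for determinism
    } @ids;
    return @sorted;
}

print join( ', ', choose_records_sketch( [qw(ppn date used)], 101, 102, 103 ) ), "\n";
# 103, 102, 101
print join( ', ', choose_records_sketch( ['used'], 101, 102, 103 ) ), "\n";
# 103, 101, 102
```

With -m ppn -m date -m used, record 101 comes last because it has no PPN; 102 and 103 share the same date, so the usage count decides between them.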
