MARC21_parse_test.pl - Try parsing and optionally fixing biblioitems.marcxml, report errors


MARC21_parse_test.pl [ -h | -m ] [ -v ] [ -d ] [ -s ] [ -l=N ] [ -o=N ] [ -L ] [ -f ] [ -A | filename ...]

 Help Options:
   -h --help -?   Brief help message
   -m --man       Full documentation, same as --help --verbose
      --version   Prints version info

 Feedback Options:
   -d --dump      Dump MARCXML of biblioitems processed, default OFF
   -s --summary   Print initial and closing summary of good and bad biblioitems counted, default ON
   -L --Lint      Show any warnings from MARC::Lint, default OFF
   -v --verbose   Increase verbosity of output, default OFF

 Run Options:
   -f --fix       Replace biblioitems.marcxml from data in marc field, default OFF
   -A --All       Use the whole biblioitems table as target set, default OFF
   -l --limit     Number of biblioitems to display or fix
   -o --offset    Number of biblioitems to skip (not displayed or fixed)



With --All, the script targets the entire biblioitems table. Beware: on a large table, --All can carry a heavy performance cost.


Without --fix, no changes are made to any records. With --fix, the script attempts to reconstruct biblioitems.marcxml from biblioitems.marc.


Like a LIMIT clause in SQL, --limit constrains the number of records targeted by the script to an integer N. This applies whether the target records are determined by user input, filenames or --All.


Like an OFFSET clause in SQL, --offset tells the script to skip the first N of the targeted records. The default is 0, i.e. skip none of them.
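
The way --limit and --offset combine can be pictured with ordinary shell tools. This is only an illustration of the windowing described above, not the script's own code: select_window is a hypothetical helper name, and the numbers produced by seq merely stand in for positions in the target set.

```shell
# Hypothetical helper: skip OFFSET lines of input, then keep at most LIMIT lines.
# This mirrors the SQL-like meaning of --offset and --limit described above.
select_window() {
    offset=$1
    limit=$2
    tail -n +"$((offset + 1))" | head -n "$limit"
}

# Stand-in target set: positions 1..20. With offset 15 and limit 3,
# the window covers the 16th, 17th and 18th entries.
seq 1 20 | select_window 15 3
```

The offset is applied first and the limit second, so --limit counts only the records that remain after skipping.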

The binary ON/OFF options can be negated like:

   --nosummary    Do not display summary.
   --nodump       Do not dump MARCXML.
   --noLint       Do not show MARC::Lint warnings.
   --nofix        Do not change any records. This is the default mode.


Any number of filepath arguments can be given. They are read in order and used to select the target set of biblioitems. The file format is simply one biblionumber per line. The --limit and --offset options can still be used with biblionumbers read from files. Files are ignored under the --All option.
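
As a rough sketch of the expected input, the following writes a small file in the one-biblionumber-per-line format and checks it before use. The biblionumbers 101, 102 and 103 are made up, bad_lines is a hypothetical helper, and the check assumes biblionumbers are plain positive integers.

```shell
# Write three example biblionumbers, one per line (the values are made up).
printf '%s\n' 101 102 103 > bibnumbers1.txt

# Hypothetical check: count the lines that are not a bare run of digits.
# Assumes biblionumbers are plain positive integers.
bad_lines() {
    # grep exits non-zero when it selects no lines; "|| true" keeps the
    # helper usable under "set -e" while still printing the count.
    grep -Evc '^[0-9]+$' "$1" || true
}

if [ "$(bad_lines bibnumbers1.txt)" -eq 0 ]; then
    echo "bibnumbers1.txt: one biblionumber per line"
fi
```

A file that passes this check can then be given as a filename argument, for example together with --limit.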


This script checks for data corruption or otherwise unparsable data in biblioitems.marcxml. As the name suggests, it is only useful for MARC21 and will die if marcflavour is UNIMARC.

Run MARC21_parse_test.pl the first time with no options and type in individual biblionumbers to test. Or run with --All to go through the entire table. Run the script again with --fix to attempt repair of the same target set.

After fixing any records, you will need to rebuild your index, e.g. rebuild_zebra -b -r -x.
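
The check, fix and reindex sequence described above might be wrapped as follows. This is a sketch under assumptions: it supposes MARC21_parse_test.pl and rebuild_zebra are both on the PATH, the run and check_fix_reindex helpers and the DRY_RUN variable are hypothetical, and the file bibnumbers1.txt is only an example target set. With DRY_RUN set, the commands are printed rather than executed.

```shell
# Print each command; execute it only when DRY_RUN is unset or empty.
run() {
    echo "+ $*"
    if [ -z "${DRY_RUN:-}" ]; then
        "$@"
    fi
}

# Check the target set, attempt repair of the same set, then rebuild the index.
check_fix_reindex() {
    run MARC21_parse_test.pl "$@"
    run MARC21_parse_test.pl --fix "$@"
    run rebuild_zebra -b -r -x
}

# Preview the three steps for one file of biblionumbers without running them.
DRY_RUN=1
check_fix_reindex bibnumbers1.txt
```

Passing the same arguments to both runs keeps the target set identical between the check and the repair.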



MARC21_parse_test.pl

In the most basic form, the script lets you type in biblionumbers and checks them individually.

MARC21_parse_test.pl --fix

Same thing but fixes them if they fail to parse.

MARC21_parse_test.pl --fix --limit=15 bibnumbers1.txt

Fixes biblioitems from the first 15 biblionumbers in file bibnumbers1.txt. Multiple file arguments can be used.

MARC21_parse_test.pl --All --limit=3 --offset=15 --nosummary --dump

Dumps MARCXML from the 16th, 17th and 18th records found in the database.

MARC21_parse_test.pl -A -l=3 -o=15 -s=0 -d

Same thing as previous example in terse form.


Add more documentation for OPTIONS.

Update zebra status so rebuild of index is not necessary.


MARC::Lint C4::Biblio