RFC for Koha 3.2: Stop copying items to bib MARC and MARCXML

Currently, item record data exists both in the items table and as MARC fields in the bib record (in biblioitems.marc and biblioitems.marcxml). The MARC serialization of the item record is controlled by the MARC framework, and unless one wants a maintenance headache, the same mapping of item fields to MARC subfields must be used for all frameworks.

This has several deleterious effects:

  • updating an item record (e.g., in circulation) forces the ISO 2709 and MARCXML versions of the bib record to be recreated. This is an expensive operation, and if a bib record has several hundred items, a checkout or return can take several seconds (or longer) to complete.
  • if a bib has a large number of items, the ISO 2709 version can exceed the format's maximum record length of 99,999 bytes (the leader stores the length in five digits) - see bug 2453.
  • unless one makes a special effort, one is stuck with the same MARC item mapping for both import and export. Some libraries need to import bib records from more than one source and don't always have the ability to dictate that their Koha item field mapping be used by the record supplier.
  • maintaining two versions of the same data can lead, and has led, to bugs.
  • changing the item field mapping requires one to change Zebra (and NoZebra) index definitions.
  • this is a denormalization of the database schema, and one that doesn't provide any performance gains.
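
To make the record-length problem concrete, here is a rough sketch of the ISO 2709 size arithmetic. The structural constants (24-byte leader, 12-byte directory entries, five-digit record length) come from the format itself. The function names and the sample item and bib sizes are assumptions made up for illustration, not measurements from a real Koha database.

```python
LEADER_LEN = 24            # fixed-length leader
DIRECTORY_ENTRY_LEN = 12   # tag (3) + field length (4) + start position (5)
MAX_RECORD_LEN = 99_999    # leader positions 00-04: five decimal digits


def field_length(subfields):
    """Bytes used by one data field: 2 indicators, then for each subfield a
    delimiter byte, a code byte, and the UTF-8 data, then a field terminator."""
    body = sum(2 + len(value.encode("utf-8")) for value in subfields.values())
    return 2 + body + 1


def record_length(field_lengths):
    """Total record length for fields of the given byte lengths."""
    directory = DIRECTORY_ENTRY_LEN * len(field_lengths) + 1  # + directory terminator
    return LEADER_LEN + directory + sum(field_lengths) + 1    # + record terminator


def max_items(bib_field_lengths, item_field_length):
    """How many item fields fit before the record exceeds MAX_RECORD_LEN."""
    base = record_length(bib_field_lengths)
    per_item = DIRECTORY_ENTRY_LEN + item_field_length
    return max(0, (MAX_RECORD_LEN - base) // per_item)


# Hypothetical item subfields and bib field sizes, for illustration only.
sample_item = {"a": "MAIN", "b": "MAIN", "o": "QA76.73.P22 W35 2008",
               "p": "31000000012345", "y": "BOOK"}
item_len = field_length(sample_item)
print(item_len, max_items([400, 60, 120, 80], item_len))
```

With item fields of roughly 60 to 100 bytes each, the limit is reached somewhere around a thousand items, and every additional item subfield lowers that ceiling.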

Consequently, I propose the following changes, in order of priority:

  1. Take item data out of biblioitems.marc*. Instead, whenever item records need to be embedded in a MARC bib record (primarily during Zebra indexing or record export), do it on the fly. If a bib record has lots of items, this will still be an expensive operation, but it will take place in a batch job or indexing daemon.
  2. Separate the item field mapping from the rest of the MARC framework. Allow multiple item field mappings, each identified by a code, and add the ability to specify a mapping to be used during record import, export, or indexing.
  3. Switch to Zebra's DOM filter and supply item data for indexing outside of the MARC bib record: define an XML schema that wraps bib, item, and summary records without forcing the last two to be embedded in the MARC. This is similar to an idea that I believe Tümer Garip proposed back in 2006.
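
The three changes above could be sketched as follows. This is only an illustration under stated assumptions, not a proposed implementation: the mapping codes, the tag and subfield choices in ITEM_MAPPINGS, the function names, and the wrapper element names (koha_record, bib, items, item) are all hypothetical. Items are represented as plain dictionaries keyed by items-table column names, and the bib record as an in-memory MARCXML element.

```python
import copy
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)

# Hypothetical item field mappings, each identified by a code (change 2).
# "subfields" maps an items-table column name to a MARC subfield code.
ITEM_MAPPINGS = {
    "koha_default": {
        "tag": "952",
        "subfields": {"homebranch": "a", "barcode": "p", "itemcallnumber": "o"},
    },
    "vendor_x": {
        "tag": "949",
        "subfields": {"homebranch": "l", "barcode": "i", "itemcallnumber": "c"},
    },
}


def item_to_datafield(item, mapping):
    """Build one MARCXML datafield from an item row using the given mapping."""
    field = ET.Element(f"{{{MARC_NS}}}datafield",
                       {"tag": mapping["tag"], "ind1": " ", "ind2": " "})
    for column, code in mapping["subfields"].items():
        value = item.get(column)
        if value:  # skip NULL or empty columns
            sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
            sub.text = str(value)
    return field


def embed_items(bib_record, items, mapping_code):
    """Return a copy of the bib with item fields added on the fly (change 1).
    The stored bib record is left untouched."""
    mapping = ITEM_MAPPINGS[mapping_code]
    record = copy.deepcopy(bib_record)
    for item in items:
        record.append(item_to_datafield(item, mapping))
    return record


def wrap_for_indexing(bib_record, items):
    """Wrap the bib and its items side by side, without embedding the items
    in the MARC (change 3). Element names here are placeholders."""
    wrapper = ET.Element("koha_record")
    ET.SubElement(wrapper, "bib").append(copy.deepcopy(bib_record))
    items_el = ET.SubElement(wrapper, "items")
    for item in items:
        item_el = ET.SubElement(items_el, "item")
        for column, value in item.items():
            if value:
                ET.SubElement(item_el, column).text = str(value)
    return wrapper


# Usage with made-up data: the same items, exported under two mappings.
bib = ET.Element(f"{{{MARC_NS}}}record")
items = [
    {"homebranch": "MAIN", "barcode": "1001", "itemcallnumber": None},
    {"homebranch": "EAST", "barcode": "1002", "itemcallnumber": "QA76"},
]
for_export = embed_items(bib, items, "vendor_x")
for_zebra = wrap_for_indexing(bib, items)
print(ET.tostring(for_export, encoding="unicode"))
print(ET.tostring(for_zebra, encoding="unicode"))
```

Because the embedding happens in a copy at export or indexing time, a circulation update only has to touch the items table, and the choice of mapping becomes a parameter of the export or import job rather than a property of the stored bib record.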
 
en/development/rfcs3.2/rfc32_take_items_out_of_bib.txt · Last modified: 2008/09/29 15:18 by gmc
 