search - a search script for finding records in a Koha system (Version 3.0)


This script contains a new search API for Koha 3.0. It is designed to be simple to use and configure, yet capable of stemming, field weighting, relevance ranking, multiple query language formats (CCL, CQL, PQF), full or nearly full support for the bib1 attribute set, extended attribute sets defined in Zebra profiles, access to the full range of Z39.50 query options, federated searches on Z39.50 targets, etc.

I believe the API as represented in this script is mostly sound, even if the individual functions in Search.pm and Koha.pm need to be cleaned up. Of course, you are free to disagree :-)

I will attempt to describe what is happening at each part of this script. -- JF


This script performs two functions:

1. interacts with Koha to retrieve and display the results of a search
2. loads the advanced search page

These two functions share many of the same variables and modules, so the first task is to load what they have in common and determine which template to use. Once that is determined, the script loads only the variables and procedures necessary for that function.
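As a rough illustration of that flow (written in Python for brevity, not taken from the script itself), the dispatch might look like the sketch below. The parameter name `q`, the template file names, and the function names are all assumptions made for the example.

```python
# Hypothetical template names; the real script's templates may differ.
TEMPLATES = {"results": "results.tmpl", "advsearch": "advsearch.tmpl"}


def choose_mode(params):
    """Return 'results' when the request carries a query, else 'advsearch'."""
    return "results" if params.get("q", "").strip() else "advsearch"


def handle_request(params):
    # Shared setup: both functions need to know which template to use.
    mode = choose_mode(params)
    context = {"mode": mode, "template": TEMPLATES[mode]}

    # Function-specific setup: load only what this mode needs.
    if mode == "advsearch":
        context["search_boxes"] = 3
    else:
        context["query"] = params["q"].strip()
    return context
```

A request with no query falls through to the advanced search page; anything with a non-empty `q` is treated as a search.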


If we're loading the advanced search page, this script calls a number of display* routines which populate objects that are sent to the template to display things like search indexes, languages, search limits, branches, etc. These are not stored in the template for two reasons:

1. Efficiency - we have more control over objects inside the script, and it's possible to avoid duplicating things like indexes (if the search indexes were stored in the template they would need to be repeated)
2. Customization - if these elements were moved to the SQL database, a librarian could simply choose which fields to display on the page without editing any HTML (and also how the fields should behave when being searched).

However, they create one problem: the strings aren't translated. I have an idea for how to do this that I will pursue soon.
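To make the efficiency point concrete, here is a minimal Python sketch of a display-style routine. The index list is defined once and reused for every search row instead of being repeated in the template. The index codes, labels, and function names are illustrative assumptions, not the script's actual data.

```python
# Defined once in the script (hypothetical values for illustration).
SEARCH_INDEXES = [
    {"value": "kw", "label": "Keyword"},
    {"value": "ti", "label": "Title"},
    {"value": "au", "label": "Author"},
]


def display_indexes(selected=None):
    """Return a fresh copy of the index list with one entry marked selected."""
    return [dict(ix, selected=(ix["value"] == selected)) for ix in SEARCH_INDEXES]


def display_search_boxes(count=3):
    """Build one row per search box; every row reuses the same index list."""
    return [
        {"row": n, "indexes": display_indexes("kw" if n == 0 else None)}
        for n in range(count)
    ]
```

Because the labels live in script data rather than in the template, they also bypass whatever translates the template, which is the problem noted above.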


If we're performing a search, this script performs three primary operations:

1. builds query strings (yes, plural)
2. performs the search and returns the results array
3. builds the HTML for output to the template

There are several additional secondary functions performed that I will not cover in detail.

1. Building Query Strings

There are several types of queries needed in the process of search and retrieve:

1 Koha query - the query that is passed to Zebra

This is the most complex query that needs to be built. The original design goal was to use a custom CCL2PQF query parser to translate an incoming CCL query into a multi-leaf query to pass to Zebra. It needs to be multi-leaf to allow field weighting, Koha-specific relevance ranking, and stemming. When I have a chance I'll try to flesh out this section to explain it better.

This query incorporates query profiles that aren't compatible with non-Zebra Z39.50 targets to accomplish the field weighting and relevance ranking.
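To give a feel for what "multi-leaf" means here, the Python sketch below expands one search term into several PQF leaves, one per field, each with its own weight, joined by `@or`. It is an illustration only, not the script's query builder. The bib-1 use attributes (4 for title, 1003 for author, 1016 for any) are standard, but the choice of fields, the weights, and the use of attribute type 9 as a rank weight alongside relation 102 (relevance) are assumptions about the Zebra configuration.

```python
# (bib-1 use attribute, assumed rank weight) pairs; values are illustrative.
FIELD_WEIGHTS = [(4, 30), (1003, 20), (1016, 10)]


def quote(term):
    """Quote a term for PQF, escaping backslashes and double quotes."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_weighted_pqf(term, field_weights=FIELD_WEIGHTS):
    """Expand one term into a right-nested @or tree with one leaf per field."""
    leaves = [
        f"@attr 1={use} @attr 9={weight} {quote(term)}"
        for use, weight in field_weights
    ]
    query = leaves[-1]
    for leaf in reversed(leaves[:-1]):
        query = f"@or {leaf} {query}"
    # Relation 102 asks for relevance ranking over the whole tree.
    return "@attr 2=102 " + query


print(build_weighted_pqf("dogs"))
# @attr 2=102 @or @attr 1=4 @attr 9=30 "dogs" @or @attr 1=1003 @attr 9=20 "dogs" @attr 1=1016 @attr 9=10 "dogs"
```

A target that does not understand the weighting attribute would reject or misinterpret a query like this, which is why a separate federated query is needed.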

2 Federated query - the query that is passed to other Z39.50 targets

This query is just the user's query expressed in CCL, CQL, or PQF for passing to a non-Zebra Z39.50 target (one that doesn't support the extended profile that Zebra does).

3 Search description - passed to the template / saved for future refinements of the query (by user)

This is a simple string that completely expresses the query in a way that can be parsed by Koha for future refinements of the query or as a part of a history feature. It differs from the human search description:

1. it does not contain commas or = signs

4 Human search description - what the user sees in the search_desc area

This is a simple string nearly identical to the search description, but more human-readable. It may contain = signs, commas, etc.
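The text doesn't specify the exact format of either description, so the Python sketch below uses two made-up encodings purely to show the distinction: a machine form with no commas or = signs, and a human form that uses them freely. The clause structure, the `index:term` encoding, and the function names are all hypothetical, and the sketch assumes the terms themselves contain no commas or = signs.

```python
def machine_description(clauses):
    """Encode clauses without commas or = signs (hypothetical 'index:term' form)."""
    parts = []
    for clause in clauses:
        if clause["op"]:
            parts.append(clause["op"])
        parts.append(f'{clause["index"]}:{clause["term"]}')
    return " ".join(parts)


def human_description(clauses):
    """Encode the same clauses in a more readable form using = signs."""
    parts = []
    for clause in clauses:
        if clause["op"]:
            parts.append(clause["op"])
        parts.append(f'{clause["index"]}={clause["term"]}')
    return " ".join(parts)


clauses = [
    {"op": "", "index": "ti", "term": "dogs"},
    {"op": "and", "index": "au", "term": "smith"},
]
print(machine_description(clauses))  # ti:dogs and au:smith
print(human_description(clauses))    # ti=dogs and au=smith
```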

2. Perform the Search

This section takes the query strings and performs searches on the named servers, including the Koha Zebra server, stores the results in a deeply nested object, builds 'faceted results', and returns these objects.
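The shape of that step can be sketched in Python as follows. Each named server is modelled as a callable that returns a list of records, the results are stored in a nested structure keyed by server name, and facets are built by counting field values across all records. The server names, record fields, and function names are assumptions for the example; the real code talks to Zebra and Z39.50 targets rather than to plain functions.

```python
from collections import Counter


def search_servers(query, servers):
    """Run the query on each named server and nest the results by server name.

    `servers` maps a server name to a callable taking the query and
    returning a list of record dicts (a stand-in for a real connection).
    """
    results = {}
    for name, search in servers.items():
        records = search(query)
        results[name] = {"hits": len(records), "records": records}
    return results


def build_facets(results, fields=("author", "subject")):
    """Count how often each value of each facet field occurs across servers."""
    facets = {field: Counter() for field in fields}
    for server in results.values():
        for record in server["records"]:
            for field in fields:
                for value in record.get(field, []):
                    facets[field][value] += 1
    # most_common() lists the most frequent values first.
    return {field: counts.most_common() for field, counts in facets.items()}
```

Both objects would then be handed on to the HTML-building step.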

3. Build HTML

The final major section of this script takes the objects collected thus far and builds the HTML for output to the template and user.
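As a small illustration of this last step, the Python sketch below flattens a nested results object into a list of rows that a template loop could iterate over, escaping text on the way. The input shape (server name mapped to a dict with a `records` list) and the row keys are assumptions, not the script's actual data structures.

```python
from html import escape


def build_result_rows(results):
    """Flatten nested per-server results into template-ready, HTML-safe rows."""
    rows = []
    for server_name, server in results.items():
        for position, record in enumerate(server["records"], start=1):
            rows.append({
                "server": escape(server_name),
                "position": position,
                "title": escape(record.get("title", "")),
            })
    return rows
```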

Additional Notes

Not yet completed...


There are many; most are documented in the code. The one that isn't fully documented, but is referred to, is the need for a full query parser.