Table of Contents

check-url.pl

This tool is available in /misc/cronjobs/check-url.pl.

Documentation

The script contains its own documentation: perldoc check-url.pl.

NAME

check-url.pl - Check URLs from 856$u field.

USAGE

check-url.pl [--verbose|--help] [--host=http://default.tld]

Scan all URLs found in 856$u of bib records and display if resources are available or not.

PARAMETERS

--host=http://default.tld

Server host used when a URL doesn't have one, i.e. doesn't begin with 'http:'. For example, if --host=http://www.mylib.com, then when 856$u contains 'img/image.jpg', the URL checked is http://www.mylib.com/img/image.jpg.
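The resolution described above can be sketched as follows (a Python illustration of the idea; the script itself is Perl, and its exact joining logic may differ slightly):

```python
from urllib.parse import urljoin

def resolve(url, host="http://www.mylib.com"):
    """Prefix relative 856$u values with the --host value; leave absolute URLs alone."""
    if url.startswith("http:") or url.startswith("https:"):
        return url
    return urljoin(host + "/", url)

print(resolve("img/image.jpg"))           # http://www.mylib.com/img/image.jpg
print(resolve("http://other.tld/a.pdf"))  # absolute URL, returned unchanged
```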

--verbose|-v

Outputs both successful and failed URLs.

--html

Formats output in HTML. The result can be redirected to a file accessible over HTTP. This way, it's possible to link directly to the biblio record in edit mode. With this parameter, --host-pro is required.

--host-pro=http://koha-pro.tld

Server host used to link to the biblio record editing page.

--htmldir

Where to place the HTML file generated by the script for someone to retrieve and review.

--help|-h

Print this help page.

Example command line that only outputs the "bad" URLs (assuming the standard Perl dependencies are installed): perl check-url.pl --html --htmldir=/path/to/docs/koha-tmpl --host=http://koha.xxx.xxx:8080 (use your server's host name here, and port 8080 or 80 as required for staff access).

Example output:

Comments

Questions from David Schuster, Plano ISD, answers from Frédéric Demians.

We've added the ability to run it through a proxy if needed, but you will have to edit the .pl with your proxy information.

Q – It is designed as a cron job that, I believe, would email the results to the cron email address? I have not tested it in production yet.

A – It's your choice, depending on your needs. See below.

Q – Depending on the number of URLs you have in your database, this may take 1-3 minutes per URL to run.

A – It depends on the time required to fetch a URL. For local URLs, the response is obviously very quick. For remote resources, it depends. There is a solution: parallelization! See a module like LWP::Parallel.
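The parallelization idea can be sketched like this (a Python illustration using a thread pool, standing in for Perl's LWP::Parallel; the checker function is injected here so the sketch stays self-contained and offline):

```python
from concurrent.futures import ThreadPoolExecutor

def check_urls(urls, fetch_status, workers=8):
    """Check many URLs concurrently.

    fetch_status(url) should return a status string ("OK", "404 Not Found", ...).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(urls, pool.map(fetch_status, urls)))

# In a real checker, fetch_status would issue an HTTP request
# (e.g. with urllib.request); a stub keeps this sketch offline.
def fake_fetch(url):
    return "OK" if url.endswith(".jpg") else "404 Not Found"

results = check_urls(["http://a.tld/x.jpg", "http://a.tld/gone.pdf"], fake_fetch)
```

With real HTTP fetches, the wall-clock time approaches the slowest response per batch of workers rather than the sum of all response times.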

Q – This tool runs an SQL query on the existing database for 856 URL links and checks each link, reporting back the biblio number, the URL from the biblio, and the status: "OK" or the response from the server ("404…", "500…", etc.).

A – It doesn't write its result into a file itself. You have to redirect the output to a file if you want, or to an MTA. If you use the --html option with --host-pro, the result can be redirected into an HTML file, for example in the koha-tmpl directory: this way, librarians can open this file and get instant access, in modification mode, to biblio records with invalid URLs.

The output of the file provides: the biblio number (hotlinked), the URL from the 856 field, and the result (OK or an error message).
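The report rows described above could be generated along these lines (a hypothetical Python sketch; the real script is Perl, and its actual markup may differ — the staff-client edit URL shown is Koha's cataloguing editor, with --host-pro supplying the host):

```python
def report_rows(results, host_pro="http://koha-pro.tld"):
    """One HTML table row per checked URL: hotlinked biblionumber, URL, status."""
    rows = []
    for biblionumber, url, status in results:
        # Link into the staff client's biblio editor (host taken from --host-pro).
        edit = f"{host_pro}/cgi-bin/koha/cataloguing/addbiblio.pl?biblionumber={biblionumber}"
        rows.append(f"<tr><td><a href='{edit}'>{biblionumber}</a></td>"
                    f"<td>{url}</td><td>{status}</td></tr>")
    return "\n".join(rows)

html = report_rows([(42, "http://a.tld/x.jpg", "OK"),
                    (43, "http://a.tld/gone.pdf", "404 Not Found")])
```

Redirecting such output into a file under koha-tmpl, as suggested above, makes it reachable from a browser.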

On my test system with a proxy set correctly (behind a firewall with a proxy server), I have 16,000 URLs, and it takes about 2 hours to check them all and complete the output.

 
en/development/check-url_enhancements.txt · Last modified: 2010/02/11 06:46 by dschust
 
Except where otherwise noted, content on this wiki is licensed under the following license:CC Attribution-Noncommercial-Share Alike 3.0 Unported