Posted by janusman on October 31, 2009 at 12:01am
| Project: | Millennium OPAC Integration |
| Version: | 6.x-2.x-dev |
| Component: | Miscellaneous |
| Category: | feature request |
| Priority: | normal |
| Assigned: | janusman |
| Status: | closed (fixed) |
Issue Summary
Some libraries maintain featured lists of items; it would be nice if the module could import and update items from those lists.
Crawling the results of a search would also be useful, e.g. importing the items returned by a search for "branch:branch123 mattype:mattypea", or by a plain keyword search.
It'd also be nice to configure a maximum number of pages to crawl (or a maximum total number of items), since a single search could potentially yield up to 32,000 items (the limit for ftlists is unknown).
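To make the two limits concrete, here is a minimal Python sketch of a bounded crawl loop. It is not module code: `crawl`, `fetch_page`, and the default limits are hypothetical names and values chosen for illustration, and the page fetcher is passed in so the capping logic can be read on its own.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical fetcher type: given a URL, return the record ids found on that
# page plus the URL of the next page (or None when there is no next page).
FetchPage = Callable[[str], Tuple[List[str], Optional[str]]]


def crawl(start_url: str, fetch_page: FetchPage,
          max_pages: int = 10, max_items: int = 500) -> List[str]:
    """Collect record ids, stopping at whichever limit is reached first."""
    seen = set()
    items: List[str] = []
    url: Optional[str] = start_url
    pages = 0
    while url is not None and pages < max_pages and len(items) < max_items:
        ids, url = fetch_page(url)
        pages += 1
        for rec in ids:
            if rec in seen:
                continue  # skip ids already collected from an earlier page
            seen.add(rec)
            items.append(rec)
            if len(items) >= max_items:
                break
    return items
```

With both limits in place, an admin could cap a harvest by page count, by item count, or both, whichever is easier to reason about for a given catalog.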
Comments
#1
This would be a good feature. Importing items from an RSS feed would be useful too: a simple regexp could extract bib ids from an RSS feed, or from any other web source for that matter.
I'm not sure how well Millennium supports crawling from a search, though. At least, the record ids are not visible in the search results. Maybe the cart could be used somehow to enable this.
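As a rough illustration of the regexp idea for feeds, here is a short Python sketch. It assumes bib ids look like "b" followed by seven digits (the shape of the sample id b4751054 that appears later in this thread); real catalogs may differ, and `extract_bib_ids` is a hypothetical helper name.

```python
import re
from typing import List

# Assumption: a bib id is "b" plus exactly seven digits, e.g. b4751054.
BIB_ID_RE = re.compile(r"\bb\d{7}\b")


def extract_bib_ids(text: str) -> List[str]:
    """Return the unique bib ids in text, in order of first appearance."""
    # dict.fromkeys drops duplicates while preserving order.
    return list(dict.fromkeys(BIB_ID_RE.findall(text)))
```

Because it works on raw text, the same function would apply to an RSS feed, an HTML page, or any other source that mentions record ids.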
#2
On first look, search seems simple enough, too. The "Add to cart" button or checkbox contains the required bib number, so again, it's just a matter of a regexp. Finding the "Next page" link also seems [relatively] straightforward. =)
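A hedged Python sketch of that approach follows. The markup patterns are invented for illustration (a checkbox whose value holds the bib number, and a link whose text is "Next"); actual Millennium result pages would need their own patterns, and `parse_results_page` is a hypothetical name.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical markup: <input type="checkbox" ... value="b1234567">
CHECKBOX_RE = re.compile(
    r'<input[^>]*type="checkbox"[^>]*value="(b\d+)"', re.IGNORECASE)
# Hypothetical markup: <a href="...">Next</a>
NEXT_RE = re.compile(
    r'<a[^>]*href="([^"]+)"[^>]*>\s*Next\s*</a>', re.IGNORECASE)


def parse_results_page(html: str) -> Tuple[List[str], Optional[str]]:
    """Return (bib numbers on the page, next-page URL or None)."""
    ids = CHECKBOX_RE.findall(html)
    match = NEXT_RE.search(html)
    return ids, (match.group(1) if match else None)
```

Returning `None` for the next-page URL gives a crawler a natural stopping condition on the last page of results.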
Extra points: I'd love a bookmarklet to say, "import all items on the current page", or maybe "Import my current bookcart's contents" =)
#3
A first patch to kick things off. Rough, a bit slow, but works in my testing =)
#4
Committed following patch.
TODO: import from just any given URL (which could include an FTList)
#5
This still has some issues with non-UTF-8 characters getting mangled during import.
For instance, if this record is in the import results:
http://sabio.library.arizona.edu/search~S9/?searchtype=X&searcharg=educa...
which is titled
Jovens lideranças comunitárias e direitos humanos.
this message appears in watchdog; note that "lideranças" is cut off:
Batch import error: no record number given in row: array ( 'id' => '32636', 'session' => '126688338312', 'data' => 'a:3:{s:10:"bib_recnum";s:8:"b4751054";s:5:"title";s:50:"Jovens lideran', )
#6
Fixed this by routing all requests through a new function, millennium_http_request(), which wraps drupal_http_request() and converts the response body to UTF-8 based on the response's charset.
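The general idea of converting by charset can be sketched in Python as below. This is not the code of millennium_http_request(); `to_utf8` is a hypothetical helper, and falling back to ISO-8859-1 when no charset is declared is an assumption made for this example.

```python
import re

# Matches charset=... in a Content-Type header, with or without quotes.
CHARSET_RE = re.compile(r'charset="?([\w.-]+)', re.IGNORECASE)


def to_utf8(body: bytes, content_type: str,
            default: str = "iso-8859-1") -> bytes:
    """Re-encode a response body as UTF-8 using its declared charset."""
    match = CHARSET_RE.search(content_type or "")
    charset = match.group(1) if match else default
    try:
        text = body.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset label: fall back to the assumed default.
        text = body.decode(default, errors="replace")
    return text.encode("utf-8")
```

For example, `to_utf8("lideranças".encode("iso-8859-1"), "text/html; charset=ISO-8859-1")` yields the UTF-8 bytes for the full word, so the title is no longer cut off at the "ç" when stored.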
TODO: Add UI to allow admins to import from an arbitrary URL.
#7
Committed this patch.
This is a screenshot for the UI.

#8
#9
Automatically closed -- issue fixed for 2 weeks with no activity.