This would let Drupal serve as a central search index for different OPACs.

Bonus points: the "same item" held by different libraries would occupy one node only (instead of one node per OPAC). The user would then see the holdings from all crawled OPACs within that single node. We would have to determine how to detect the "same item" and which MARC record prevails (most recently changed? a priority for each OPAC?).

Comments

janusman’s picture

Probably "same record" would only be feasible by using LC or OCLC numbers in MARC.
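
To make the idea concrete, here is a minimal sketch of deriving a match key from those identifiers. It assumes the OCLC number sits in MARC 035 $a with an "(OCoLC)" prefix and the LCCN in 010 $a; the function name and the shape of $fields (MARC tag mapped to a list of $a values) are hypothetical and not part of the module.

```php
<?php
/**
 * Hypothetical helper: derive a cross-catalog match key from a bib record.
 * $fields maps a MARC tag to a list of subfield $a values (assumed shape).
 */
function example_match_key(array $fields) {
  // Prefer an OCLC number from 035 $a, e.g. "(OCoLC)ocm00012345".
  foreach (isset($fields['035']) ? $fields['035'] : array() as $value) {
    if (preg_match('/^\(OCoLC\)\D*0*(\d+)/i', trim($value), $m)) {
      return 'oclc:' . $m[1];
    }
  }
  // Fall back to the LCCN in 010 $a, with whitespace removed.
  foreach (isset($fields['010']) ? $fields['010'] : array() as $value) {
    $lccn = preg_replace('/\s+/', '', $value);
    if ($lccn !== '') {
      return 'lccn:' . strtolower($lccn);
    }
  }
  // No shared identifier: the record cannot be matched this way.
  return NULL;
}
```

Two records from different OPACs that yield the same key could then be treated as candidates for the same item; records with no key would stay as separate nodes.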

janusman’s picture

As of today, the underlying code and database tables are almost finished to allow this.

TODO:
* reimporting items, checking availability, etc. are still not looking at the originating URL from {millennium_node_bib} and instead assume the current URL.
* it should be possible to configure different OPACs in the settings page, and then just pick which one to import from during manual import.
* "This is the same item" (a.k.a. FRBR?) algorithms. Perhaps not in the scope for this project, but could be an add-on module that could intervene during the import process?

tituomin’s picture

Looks interesting! About FRBR-ish algorithms: I've come to the conclusion in my own project that each FRBR entity type needs its own CCK-enabled node type. For example, I have a new node type for works, containing nodereferences to the different manifestations (which are actually the very nodes generated by Millennium Integration). I think this is a good solution: you can use different heuristics to derive works from manifestations, comparing titles, authors and other things (maybe even using WorldCat / LibraryThing APIs). The results so far are promising, though not quite finished yet. The same approach might work for this feature.

janusman’s picture

Attachment (new): 8.48 KB

I'm committing this patch for now; mainly it includes an optional base_url argument in lots of functions so that the information is fetched from the WebOPAC the record was imported from instead of the current module's settings.
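
As a rough illustration of that fallback (not the committed code), a record that carries its own source URL would use it, and anything else would fall back to the configured default. The function name and the 'base_url' array key below are assumptions made for this sketch.

```php
<?php
/**
 * Hypothetical helper: choose which WebOPAC to query for a record.
 * $record is assumed to carry the URL it was imported from under 'base_url'.
 */
function example_resolve_base_url(array $record, $default_base_url) {
  // Use the record's own source when known; otherwise the current setting.
  return !empty($record['base_url']) ? $record['base_url'] : $default_base_url;
}
```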

janusman’s picture

Status: Active » Needs review

Setting to needs review

janusman’s picture

Attachment (new): 2.44 KB

Missed a few.

janusman’s picture

Status: Needs review » Active

Committed last patch.

Need more testing.

janusman’s picture

I think the only thing missing is to explicitly let the admin configure different OPACs on the settings screen.

janusman’s picture

Uh, no; the mass import functions are missing quite a bit. For instance, refreshing records will default to the currently-set WebOPAC instead of the source for that record.

janusman’s picture

Working on an extensive patch that would get us a LOT closer to having each imported record "know" where it was imported from.

After that, the next step would be some way to have a single node pointing to the same item in its different locations.

janusman’s picture

Mega-kitten-killer patch is in: http://drupal.org/cvs?commit=318164

This broke some things afterwards, but I think the current DEV version is stable enough for testing; you do need to run update.php if you want to test.

Things pending:
* Expose the innards to a UI which lets admins specify settings for different OPACs, and also report back on status of each (e.g. total number of items from each OPAC, report fetch time independently, etc.)

I'm thinking I won't dare try to FRBR-ize stuff with some quick-but-badly-thought-out mechanism of my own... for now I guess providing hooks to let other modules sort out whether things are equivalent at import time would be a start. FRBR-izing also requires rethinking the DB architecture, figuring out how the holdings information would be shown, etc. So, each bib record from each OPAC would still be a separate node for now.

tituomin’s picture

Attachment (new): 703 bytes

I got some mild warnings from update_6003:

warning: array_merge() [function.array-merge]: Argument #2 is not an array in /var/www/musa/update.php on line 174.
warning: Invalid argument supplied for foreach() in /var/www/musa/update.php on line 338.

Patch included. I'll try to do some testing.

tituomin’s picture

Status: Active » Needs review

janusman’s picture

Status: Needs review » Active

Committed #13. Thanks!

janusman’s picture

Ok, TODO:

* We need the user to be able to enable one OPAC fairly quickly and then enter additional ones (or just fill in names for the base URLs of items already imported).

* I think we will only enable auto-crawling for ONE of the OPACs for now. In the future we could use a simple algorithm like round-robin on each cron run (OPAC 1 on cron run 1, OPAC 2 on cron run 2, OPAC 3 on cron run 3, back to OPAC 1 on run #4, etc.)

* The module should let admins map OPAC base URLs to the actual names of the libraries/catalogs behind them. Then the items could inherit that name too in taxonomy. Idea: Right now the "Mappings" settings tab lets one map to an "availability" vocabulary... maybe the library name could be the parent term for the availability. I think this also means moving this particular setting out of the mapping tab (which is global for all MARC->Taxonomy) and into the settings for each OPAC.

* The base url widget on all settings screens could now change to an autocomplete.

* The status report could benefit from splitting up reports for each OPAC. For now maybe it should just say the number of different OPACs and each line on data tables should mention the base URL or name.
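
The round-robin idea from the list above could be sketched roughly like this. The function name, the example URLs and the notion of a run counter are assumptions for illustration; in the module the counter would presumably be persisted between cron runs (e.g. with variable_get()/variable_set()).

```php
<?php
/**
 * Hypothetical helper: pick which OPAC to crawl on a given cron run.
 * $sources is a list of base URLs; $run_count is how many cron runs have
 * already happened (0 for the first run). Both names are assumptions.
 */
function example_pick_source(array $sources, $run_count) {
  if (empty($sources)) {
    return NULL;
  }
  // Re-index so the modulo maps onto positions 0..n-1.
  $sources = array_values($sources);
  return $sources[$run_count % count($sources)];
}

// Four consecutive cron runs over three OPACs wrap back to the first one.
$sources = array(
  'http://opac1.example.org',
  'http://opac2.example.org',
  'http://opac3.example.org',
);
foreach (range(0, 3) as $run) {
  print example_pick_source($sources, $run) . "\n";
}
```

Every OPAC gets crawled once per cycle, at the cost of each one being refreshed less often as more sources are added.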

janusman’s picture

Attachment (new): 349.04 KB

Mockups for what this would look like...

janusman’s picture

Attachment (new): 31.56 KB

Ok, part 1 of a big patch to make this work.

Done:
* Add/remove multiple OPACs and their names.
* Switched crawl settings to own tab; a select element lets one pick an OPAC from the source table.
* update.php code

After applying patch please empty caches. Run update.php to migrate the existing configured OPAC (and from all imported nodes) into the new source table.

Todo (to come, hopefully, in the next patch):
* Taxonomy handling upon source add/remove.
* Taxonomy options
* Taxonomy mapping on node import/update.
* OPAC name display in holdings table (easy)
* Put back functionality (removed here) for millennium_filter called with only a record # (no base url) and also preview/import records one-by-one.

Buggy:
* AHAH in conjunction with autocomplete in Batch Import. Thinking of switching to just the select box instead of autocomplete text field.

Wishlist:
* Option to check "remove" in source table that will also delete nodes.
* Show number of imported nodes per source.

janusman’s picture

Forgot to mention this also disables the millennium_auth module, as it needs some way to define a single OPAC to associate with logins.

janusman’s picture

Status: Active » Needs review
Attachment (new): 50.54 KB

New patch. Now working:
- taxonomy terms for OPAC names
- rename taxonomy term when an OPAC name is renamed
- authentication (must set up using new "authentication" tab)

Yet to do:
* preview/import records one-by-one (needs to accept a full URL or a base URL along with a record number... or receive a record number and ask for a source OPAC before importing)

Wishlist:
* Option to check "remove" in source table that will also delete nodes.

janusman’s picture

Attachment (new): 53.21 KB

Ok, final patch for review.

Only thing left would be the wishlist item, an option to check "remove" in source table that will also delete nodes. =) However, this belongs in another issue.

janusman’s picture

This is a self-review =)

+++ contributions/modules/millennium/millennium.admin.inc Locally Modified (Based On 1.1.2.31)
@@ -56,7 +42,165 @@
+/**
+ * Submit handler for settings form; handles special values that are not
+ */
+function millennium_admin_settings_form_submit($form, &$form_state) {

Comment was truncated.

+++ contributions/modules/millennium/millennium.admin.inc Locally Modified (Based On 1.1.2.31)
@@ -56,7 +42,165 @@
+  #dpm($form_state);

Remove this debug code.

+++ contributions/modules/millennium/millennium.import.inc Locally Modified (Based On 1.1.2.20)
@@ -163,19 +158,13 @@
+function millennium_fetch_records_via_bookcart($recnums, $complete_holdings = false, $base_url) {

These args should probably also be rearranged.

+++ contributions/modules/millennium/millennium.module Locally Modified (Based On 1.13.2.33.2.2.2.86)
@@ -647,12 +654,19 @@
+  $items['millennium_autocomplete_js'] = array(
+    'page callback' => 'millennium_autocomplete_js',
+    'type' => MENU_CALLBACK,
+    'access arguments' => array('administer millennium'),
+    'file' => 'millennium.pages.inc',
+  );

I think this is no longer needed.

+++ contributions/modules/millennium/millennium.module Locally Modified (Based On 1.13.2.33.2.2.2.86)
@@ -1563,7 +1576,7 @@
+function millennium_fetch_recordpage($recnum, $mode = "plain", $base_url) { // TODO Fix argument order

Fix argument order

+++ contributions/modules/millennium/millennium.module Locally Modified (Based On 1.13.2.33.2.2.2.86)
@@ -1590,15 +1603,7 @@
+function millennium_permalink($recnum, $mode = 'plain', $base_url) { // TODO fix argument order

Fix the argument order here too.
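
For reference, a sketch of what the reordering could look like: in PHP a required parameter placed after an optional one makes the default unusable in practice, so the required $base_url moves ahead of $mode. The function name and the URL pattern below are illustrative assumptions only, not the module's actual implementation.

```php
<?php
/**
 * Hypothetical signature sketch: required arguments first, optional last.
 */
function example_permalink($recnum, $base_url, $mode = 'plain') {
  // Illustrative URL pattern only; the real module builds its own links.
  $url = rtrim($base_url, '/') . '/record=' . $recnum;
  return $mode === 'plain' ? $url : $url . '?mode=' . $mode;
}

// Callers can now omit $mode and still pass the source OPAC.
print example_permalink('b1234567', 'http://opac.example.org/') . "\n";
```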

+++ contributions/modules/millennium/millennium.pages.inc Locally Modified (Based On 1.1.2.5)
@@ -313,11 +324,29 @@
+
+/**
+ * Callback function for base_url autocomplete form elements
+ */
+function millennium_autocomplete_js() {
+  $suggestions = array();
+  $search_parts = explode('/', trim($_GET['q']));
+  $search_string = implode('/', array_slice($search_parts, 1));
+  $sources = variable_get("millennium_sources", array());
+  // Look for $search_string in all sources
+  foreach ($sources as $base_url => $source_data) {
+    if (strpos($base_url, $search_string) === 0) {
+      $suggestions[$base_url] = $base_url;
+    }
+  }
+  drupal_json($suggestions);
+  exit;
+}

This is no longer needed I think?

+++ contributions/modules/millennium/millennium_auth.module Locally Modified (Based On 1.1.2.6)
@@ -303,11 +352,22 @@
+  $millennium_baseurl = variable_get('millennium_auth_default_base_url', '');
+  // Use HTTPs if settings indicate so. TODO make this automatic?
+  if (variable_get('millennium_auth_use_https', FALSE)) {
+    $millennium_baseurl = str_replace("http://", "https://", $millennium_baseurl);
+  }
+
   // Connect to Millennium and get the patron's information
-  $patroninfo_data = patroninfo_start_session(millennium_get_real_baseurl(), $username, $lastname, $pin);
+  $patroninfo_data = patroninfo_start_session($millennium_baseurl, $username, $lastname, $pin);
   

Change variable from $millennium_baseurl into just $base_url


janusman’s picture

Status: Needs review » Needs work

janusman’s picture

Status: Needs work » Needs review
Attachment (new): 60.96 KB

This looks like it has a shot =)

janusman’s picture

Status: Needs review » Fixed
Attachments (new):
* 33.3 KB
* 11.9 KB
* 25.33 KB
* 44.24 KB
* 59.77 KB

Yay! Committed this patch: minimal changes from the one in #23.

#355602 by janusman, tituomin: Changed Allow importing from different OPACs.

See attached screenshots to see how the interface looks =)

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.