[Blacklight-development] changing solrmarc indexing in demo app?
Jonathan Rochkind
rochkind at jhu.edu
Tue Jun 23 10:44:05 EDT 2009
If it's in the app, I don't know where it is. Like I said in my original
email, the properties file I found is at:
bl-demo/rails/vendor/plugins/blacklight/config/demo_index.properties
That's in the plugin.
If I'm confused, then I'd definitely appreciate someone un-confusing me,
but so far I'm just getting more confused.
Jonathan
Ross Singer wrote:
> It is? Mine is in the app, although maybe I modified that at some point for Jangle.
>
> The rake tasks make some assumptions about where to find files (that I think can be overridden -- if not, they should be), which might be a way to go.
>
> -Ross.
>
> On Tue, Jun 23, 2009 at 10:33 AM, Jonathan Rochkind <rochkind at jhu.edu> wrote:
> Right, of course I'll make local modifications, but I thought the point of the plug-in was that my local modifications would be in the app, NOT in the plugin.
>
> But the demo_index.properties file is in the plugin.
> Modifying the rake task to take the properties file from somewhere else seems like a reasonable approach. Would like to get feedback from main BL developers to see if this is what is recommended, or what.
> Ross Singer wrote:
> Jonathan, there is going to have to be some local modification somewhere... the notion of a 'pristine' Blacklight doesn't make sense, since 'out of the box' Blacklight isn't going to have any notion of your local context, data or needs.
>
> Either you'll have to change the demo_index.properties file or the solr_marc.rake file. The former seems simpler and more convenient to me. Alternately, I suppose you could clone the solr_marc.rake, put it in your main Rails app and modify it to use some copy of your *.properties.
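
The override idea discussed here (take the properties file from the app when one exists there, otherwise fall back to the copy shipped in the plugin) can be sketched roughly as follows. This is only an illustration of the lookup order, written in Python rather than as a change to solr_marc.rake. The plugin path is the one quoted in this thread; the app-level path config/demo_index.properties and the function name find_properties are assumptions made up for the example.

```python
from pathlib import Path

# Path quoted in the thread, relative to the Rails root (bl-demo/rails).
PLUGIN_RELATIVE = Path("vendor/plugins/blacklight/config/demo_index.properties")

# Hypothetical app-level location; nothing in the thread says this path exists.
APP_RELATIVE = Path("config/demo_index.properties")


def find_properties(rails_root, override=None):
    """Return the first properties file found, most specific first.

    Lookup order (an assumption, not Blacklight's actual behaviour):
      1. an explicit override path, if one is given
      2. a copy inside the local app
      3. the copy shipped inside the plugin
    """
    root = Path(rails_root)
    candidates = []
    if override:
        candidates.append(Path(override))
    candidates.append(root / APP_RELATIVE)
    candidates.append(root / PLUGIN_RELATIVE)

    for candidate in candidates:
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(
        "no properties file found in: " + ", ".join(str(c) for c in candidates)
    )
```

With a lookup like this the plugin copy stays untouched, and dropping a file into the app is enough to change what the indexing task picks up.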
>
> After all, all of this is solely for the BL /demo/. Stanford and, I assume, UVA probably have fairly different indexing routines than this for their actual instances. They are also probably fairly different from each other.
>
> -Ross.
>
> On Tue, Jun 23, 2009 at 10:11 AM, Jonathan Rochkind <rochkind at jhu.edu> wrote:
> So, now I'm a bit confused about where I change the indexing configuration for SolrMarc in the demo app.
>
> The likely file I find that I think is controlling solrmarc when I run a "rake app:index:marc" is at:
>
> bl-demo/rails/vendor/plugins/blacklight/config/demo_index.properties
>
> But it doesn't seem right to me to have to change a file in the plugin source itself -- I want to leave the plugin pristine, so it can be easily updated when new versions come out, and make changes somewhere in my local app, right?
>
> What's the "right" way to make changes to the SolrMarc configuration, such that when I run "rake app:index:marc" in the demo app, they'll be used?
>
> Thanks for any help. Still curious if anyone has feedback on the below post about whether the 'standard' demo app solrmarc config should be changed?
>
> Jonathan
>
>
> Jonathan Rochkind wrote:
> Sweet, thanks a lot Robert, that's just what I needed to know.
>
> I believe that the 'right' thing to do, then, is:
>
> title_t = 245aa, first
>
>
> In the cases you were previously expecting, where there is only one
> 245$a, it will be indexed as before. In cases where there are multiple
> 245's or multiple $a's, then all of the "a"'s from the first 245 will be
> concatenated, and subsequent 245's will be ignored. I believe this is a
> better way to recover from that unexpected data than crashing entirely.
>
> If I submit a patch to the 'out of the box' Blacklight SOLRmarc config
> to change it like so, would you guys agree? Not sure if that patch goes
> to Blacklight project or SolrMARC project?
>
> Except now I'm confused as to which solrmarc.properties file is actually
> included with the Blacklight demo. The "vanilla blacklight demo"
> properties file in SolrMARC actually has:
>
> title_t = custom, removeTrailingPunct(245a)
>
> Is it possible to have
>
> title_t = custom, removeTrailingPunct(245aa), first ??
>
> Should the properties file you get from a demo checkout be the SolrMarc
> "vanilla blacklight" example? I'm not sure what it is now...?
>
> Also, I could try to enhance the documentation on SolrMarc config to
> explain what "first" does (currently in there by example, but not
> actually described; are any other parameters valid there other than
> 'first'?), and to explain that you can double-up a subfield to mean
> "concatenate all such subfields from a given field" -- but I'm not sure
> I understand it well enough to write those docs? But maybe I'll just go
> add that, as far as I do understand it, to the wiki?
> http://code.google.com/p/solrmarc/wiki/ConfiguringSolrMarc I guess I'd
> need to get added to the SolrMarc project to have permission to edit
> that wiki though. If anyone would like me to help enhance those docs,
> I'm _happy_ to do so, if someone can get me access to do so.
>
> Jonathan
>
> Robert Haschart wrote:
>
> Jonathan,
>
> Rather than changing the schema to not specify multiValued, you also
> have a number of other options:
>
> change the spec for getting the title value from:
>
> title_t = 245a
>
> to :
>
> title_t = 245a, first
>
> which will get only the first occurrence of a given field/subfield.
>
> or to:
>
> title_t = 245aa
>
> which will concatenate all 'a' subfields for a given 245 field and
> return them as a single entry.
>
> or even:
>
> title_t = 245aa, first
>
> which will handle instances with multiple 'a' subfields in a single 245
> field as above, but still not die if multiple 245 fields are present.
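
To make the difference between these four specs concrete, here is a toy model of the behaviour as described in this message. It is not SolrMarc code: the function name extract, the record layout, and the choice to join concatenated subfields with a single space are all assumptions made for the illustration.

```python
# Toy model of the field specs described above; NOT SolrMarc's implementation.

def extract(record, tag, subfield, concat=False, first=False):
    """Return the index values a spec would yield for one record.

    record -- list of (tag, [(code, value), ...]) tuples (made-up layout)
    concat -- True models a doubled subfield such as 245aa
    first  -- True models the ", first" modifier
    """
    values = []
    for field_tag, subfields in record:
        if field_tag != tag:
            continue
        matches = [value for code, value in subfields if code == subfield]
        if not matches:
            continue
        if concat:
            # 245aa: all $a of this one field become a single entry
            # (joined with a space here, which is an assumption)
            values.append(" ".join(matches))
        else:
            # 245a: every $a is its own entry
            values.extend(matches)
    # ", first": keep only the first entry, drop the rest
    return values[:1] if first else values


# A made-up record: one 245 with two $a subfields, plus a second 245.
record = [
    ("245", [("a", "First title,"), ("a", "continued")]),
    ("245", [("a", "Second title")]),
]

print(extract(record, "245", "a"))                           # 245a: three values
print(extract(record, "245", "a", first=True))               # 245a, first: one value
print(extract(record, "245", "a", concat=True))              # 245aa: one value per 245
print(extract(record, "245", "a", concat=True, first=True))  # 245aa, first: one value
```

Only the specs ending in ", first" are guaranteed to hand a single value to a field that is not multiValued, which is why the last form copes with both kinds of unexpected data.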
>
> Lastly, two features that were added to Solrmarc very recently (as in the
> demo code does not yet have them) could help your situation. One is that
> errors such as a missing required field, duplicate values in a
> non-multiValued field, or a field with a name unknown to solr, will
> generate an error message for that record, but allow the indexing to continue.
> Second there is a new standard "custom" index function called
> getSingleIndexEntry which could be used like this:
>
> title_t = custom, getSingleIndexEntry(245aa, true)
>
> It would process the 245aa field spec as above, and then if multiple
> entries result it would select the longest result to use for the
> title_t index entry, and if the second parameter to the function is
> true (and if marc.include_errors is enabled) it will generate a
> marc_error index entry containing the additional erroneous 245 field
> data.
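
The selection rule described for getSingleIndexEntry can be sketched as below. This models only the description in this message, not the actual SolrMarc function: the helper name single_index_entry, the tie-breaking (the first of several equally long entries wins), and returning the leftover entries as a plain list are assumptions.

```python
# Sketch of the "keep the longest entry" rule; NOT the SolrMarc function itself.

def single_index_entry(entries, report_errors, include_errors=True):
    """Pick one value for a single-valued field.

    entries        -- values produced by a spec such as 245aa
    report_errors  -- models the second parameter of the custom function
    include_errors -- models the marc.include_errors setting
    Returns (chosen, errors): the longest entry, and the entries left over
    that would be reported in a marc_error field.
    """
    if not entries:
        return None, []
    chosen = max(entries, key=len)  # on a tie, the first longest entry wins
    errors = []
    if report_errors and include_errors:
        leftovers = list(entries)
        leftovers.remove(chosen)    # drop one occurrence of the chosen value
        errors = leftovers
    return chosen, errors


chosen, errors = single_index_entry(["Short title", "A much longer title"], True)
print(chosen)  # the longer of the two entries
print(errors)  # the entry that was not chosen
```

Unlike ", first", which keeps whichever entry comes first, this rule keeps the longest entry and can record the discarded ones, so the extra data stays visible for later cleanup.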
>
> -Bob Haschart
>
>
> Jonathan Rochkind wrote:
>
>
>
> Cool. I'm not sure multiple 245$a's even technically _is_ "bad data".
> It's not necessarily AACR2, but I found out the hard way recently on
> another project that we have LOTS of non-AACR2 data in our catalog,
> including AACR1 data, pre-AACR data, and data cataloged to rare books
> and manuscripts standards, none of which our catalogers actually
> consider 'bad'.
> So Ross suggests one answer, changing it to:
>
> title_t = custom, removeTrailingPunct(245a), first
>
> So it'll ignore the second 245. Another option might be figuring out
> how to set the solrmarc setup to concatenate two 245$a's, rather than
> ignore the second one, which would seem to me to be the actually
> appropriate thing to do in this case, barring letting title_t take
> multiple values. Is it possible to do that somehow?
>
> Anyone have an opinion on if either of these two things should be done
> in the standard out-of-the-box demo setup, to accommodate this kind of
> data?
>
> Jonathan
>
>
>
> Naomi Dushay wrote:
>
>
>
> Jonathan,
>
> We have a TON of bad data. Lots of records with multiple 245a; lots
> of records with other similar problems.
>
> We use solrmarc, and it nicely steps around these problems. There are
> solrmarc lists;
>
> solrmarc-general at googlegroups.com
> solrmarc-technical at googlegroups.com
>
>
> - Naomi
>
>
> On Jun 22, 2009, at 1:44 PM, Jonathan Rochkind wrote:
>
>
>
>
>
> Trying to take one baby step at a time in getting a little demo of
> Blacklight with our demo, I'm trying to index a sample of our own
> local MARC records.
>
> I get an error from "rake app:index:marc", not sure exactly why,
> error below. Maybe errors in my MARC? There definitely _will_ be
> illegalities in my marc, would rather the indexer recovered from
> them somehow (or just skipped that record, if nothing else is
> possible), rather than punted the entire input.
>
> Should I switch to SolrMARC instead, is it a bit more forgiving? At
> one point I know I saw a pointer in the documentation to SolrMarc
> docs, to help you get started with solrmarc instead of the bundled
> ruby indexer... but now I can't seem to find it.
>
> Here's what "rake app:index:marc" is telling me (giving me an html
> error message even though it's a command-line rake task, which is a
> bit odd).
> rake aborted!
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html;
> charset=ISO-8859-1"/>
> <title>Error 400 </title>
> </head>
> <body><h2>HTTP ERROR: 400</h2><pre>ERROR: [53567] multiple values
> encountered for non multiValued field title_t: [Honan's Handbook to
> medical Europe, A ready reference book to the universities,
> hospitals, clinics, laboratories and general medical work of the
> principal cities of Europe]</pre>
> <p>RequestURI=/solr/update</p><p><i><small><a
> href="http://jetty.mortbay.org/
> ">Powered by Jetty://</a></small></i></p><br/>
> _______________________________________________
> Blacklight-development mailing list
> Blacklight-development at rubyforge.org
>
> http://rubyforge.org/mailman/listinfo/blacklight-development
> Blacklightopac Blog http://blacklightopac.org/