[Blacklight-development] more little importer tricks - a fix?
Naomi Dushay
ndushay at stanford.edu
Fri May 2 14:11:16 EDT 2008
The java marc importer uses a different version of lucene than
the .tar.gz of blacklight, and the index format is incompatible.
Unpacking the solr.war, I found that the lucene jar files are all named
lucene-blah-2007-05-20.
The lucene jars in the java importer are lucene version 2.3.1, which
seem to date from 2008-02-22.
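A quick way to check which lucene jars a given solr.war carries, without unpacking it by hand, might look like the Python sketch below. The function name lucene_jars is made up for illustration, and the only assumption is that the jar file names start with "lucene-", as they do in the names above.

```python
import zipfile


def lucene_jars(archive_path):
    """Return the sorted names of Lucene jar entries inside a war/zip archive."""
    with zipfile.ZipFile(archive_path) as war:
        return sorted(
            name
            for name in war.namelist()
            # Match on the file name only, whatever directory the jar sits in.
            if name.rsplit("/", 1)[-1].startswith("lucene-")
            and name.endswith(".jar")
        )
```

Calling lucene_jars("jetty/webapps/solr.war") and printing the result should show the dated jar names; the same check on a repacked war would show the replacement jars.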
I swapped out the old lucene jar files for the new ones and made a new
solr.war. Putting my new solr.war into jetty, then firing up solr ...
makes everything happy!
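For anyone who wants to script the jar swap, here is a minimal Python sketch of the same unpack/substitute/repack idea. It is not what I actually ran; repack_war and its parameters are hypothetical names, and it assumes the war keeps its libraries under WEB-INF/lib/ (the usual webapp layout) and that the old jars can be recognized by a "lucene-" file name prefix.

```python
import os
import zipfile


def repack_war(old_war, new_jar_paths, new_war):
    """Copy old_war to new_war, swapping its Lucene jars for new_jar_paths.

    Returns the entry names of the old Lucene jars that were left out.
    """
    dropped = []
    with zipfile.ZipFile(old_war) as src, \
            zipfile.ZipFile(new_war, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            base = item.filename.rsplit("/", 1)[-1]
            if base.startswith("lucene-") and base.endswith(".jar"):
                dropped.append(item.filename)  # skip the old Lucene jar
                continue
            # Everything else is copied across unchanged.
            dst.writestr(item, src.read(item.filename))
        for jar in new_jar_paths:
            # Assumption: libraries live under WEB-INF/lib/ inside the war.
            dst.write(jar, "WEB-INF/lib/" + os.path.basename(jar))
    return dropped
```

The new war is written to a separate file so the original solr.war stays around as a fallback; copy the result into jetty/webapps once it looks right.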
So, to summarize:
1. java importer: index.sh line breaks are not in (lin)ux format.
(thanks Jonathan!)
2. java importer: sample file name in GettingStarted.txt has a typo
(thanks Jonathan!)
3. java importer: sample data has a record with multiple 020
subfield a values. To get the data to index cleanly, you must change
a line in the blacklight.properties file:
from
field_list_25a = isbn_display, all, 020a
to
field_list_25a = isbn_display, first, 020a
4. blacklight itself: solr.war in the jetty/webapps directory has an
older version of lucene jars which is incompatible with the lucene
jars in the marc importer. The jars need to be the same. I changed
the jars in the solr.war by unpacking the war, substituting the new
lucene jar files, then repacking the war and putting the new solr.war
in jetty/webapps. It might also work to just put the lucene jars from
the solr.war into the java importer lib, replacing the ones that are
there.
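Fixes 1 and 3 in the list above are both small file edits, so they could be scripted roughly as in the Python sketch below. The function names are hypothetical, and the properties edit assumes the line is spelled and spaced exactly as quoted in item 3; if your copy differs, it simply reports that nothing changed.

```python
def to_unix_line_endings(path):
    """Rewrite the file at path with LF line endings; return True if it changed."""
    with open(path, "rb") as f:
        data = f.read()
    # Convert Windows (CRLF) first, then any leftover bare CR.
    fixed = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    if fixed != data:
        with open(path, "wb") as f:
            f.write(fixed)
    return fixed != data


def use_first_isbn(path):
    """Switch field_list_25a from 'all' to 'first'; return True if it changed."""
    old = b"field_list_25a = isbn_display, all, 020a"
    new = b"field_list_25a = isbn_display, first, 020a"
    with open(path, "rb") as f:
        data = f.read()
    if old not in data:
        return False  # line not found as quoted; leave the file untouched
    with open(path, "wb") as f:
        f.write(data.replace(old, new))
    return True
```

Typical use would be to_unix_line_endings("index.sh") followed by use_first_isbn("blacklight.properties"), run from the directory that holds those files.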
Jonathan -- would you like to test these fixes on your system?
hope this helps!
- Naomi
On May 2, 2008, at 9:20 AM, Naomi Dushay wrote:
> Hi Matt,
>
> I got my blacklight code from the .tar.gz download file, not
> directly from svn. Do I need to pull some updates from the trunk?
>
> - Naomi
>
>
> On May 1, 2008, at 6:04 PM, Matt Mitchell wrote:
>> Hi Naomi,
>>
>> Thanks for the detailed report! The _display fields should be
>> multiValued and the current schema is here:
>>
>> http://blacklight.rubyforge.org/svn/trunk/rails/solr/conf/schema.xml
>>
>> Are you using the trunk version or the last release?
>>
>> Matt
>>
>> On Thu, May 1, 2008 at 8:50 PM, Naomi Dushay <ndushay at stanford.edu>
>> wrote:
>> I tried the importer just now (just pulled it from svn, too.) and
>> hit a few bumps also. I concur with the index.sh problems ... I
>> just ended up executing the java command directly from the command
>> line.
>>
>> I believe the sample data has a record with two 020 subfield a
>> values. From the output of the importer on the sample file:
>>
>> Adding record 8: u89
>> Error indexing
>> org.apache.solr.common.SolrException: ERROR: multiple values
>> encountered for non multiValued field isbn_display:
>> first='0877663637' second='0877663343 (pbk.)'
>> at org.apache.solr.update.DocumentBuilder.addSingleField(DocumentBuilder.java:67)
>> at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:88)
>> at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:118)
>> at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:101)
>> at SolrIndexer.addField(Unknown Source)
>> at SolrIndexer.addFields(Unknown Source)
>> at SolrIndexer.indexRecord(Unknown Source)
>> at MarcImporter.addToIndex(Unknown Source)
>> at MarcImporter.importRecords(Unknown Source)
>> at MarcImporter.main(Unknown Source)
>> Adding record 9: u144
>>
>> The SOLR schema.xml file in blacklight/solr-home/conf directory
>> says that all *_display fields are NOT multiValued.
>>
>> To get the sample data to index without an error, it's just a
>> matter of changing one line in the blacklight.properties file:
>>
>> from:
>> field_list_25a = isbn_display, all, 020a
>>
>> to
>> field_list_25a = isbn_display, first, 020a
>>
>> HOWEVER, I am still getting a
>>
>> java.lang.RuntimeException:
>> org.apache.lucene.index.CorruptIndexException: Unknown format
>> version: -4
>>
>> when trying to view the index with solr (http://yer.path:8983/solr/admin).
>>
>> But I can look at the same index with luke without a problem.
>>
>>
>> - Naomi Dushay
>> Stanford University Libraries
>> ndushay at stanford.edu
>>
>>
>>
>>
>> _______________________________________________
>> Blacklight-development mailing list
>> Blacklight-development at rubyforge.org
>> http://rubyforge.org/mailman/listinfo/blacklight-development
>
> Naomi Dushay
> ndushay at stanford.edu
>
>
>
Naomi Dushay
ndushay at stanford.edu