[Blacklight-development] more indexing woes

Jonathan Rochkind rochkind at jhu.edu
Thu May 1 16:31:04 EDT 2008


Okay, now I tried to index the file of ~1000 MARC records that Bess gave 
me once.

When the indexer ran, it produced a LOT of errors of the form:
Error indexing
org.marc4j.MarcException: unable to parse record length
        at org.marc4j.MarcStreamReader.parseLeader(MarcStreamReader.java:317)
        at org.marc4j.MarcStreamReader.next(MarcStreamReader.java:138)
        at MarcTranslatedReader.next(Unknown Source)
        at MarcImporter.importRecords(Unknown Source)
        at MarcImporter.main(Unknown Source)
Caused by: java.lang.NumberFormatException: For input string: "or, R"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
        at java.lang.Integer.parseInt(Integer.java:447)
        at java.lang.Integer.parseInt(Integer.java:497)

I can't tell whether it got this error for _every_ record or not; the 
output isn't detailed enough to say. When it finished, it did claim that 
it "Indexed 1090 at a rate of about 1302.0per sec". (Yes, it sure is 
fast.) But I don't think I believe it.
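
One way to find out whether every record fails, or only some, is to check 
the file directly, independent of the indexer. The trace shows 
Integer.parseInt choking on "or, R" inside parseLeader, which suggests the 
reader was looking at text where it expected the five ASCII digits of a 
record length. Below is a minimal Python sketch of such a check, not part 
of Blacklight or marc4j. It assumes the file is binary MARC (ISO 2709), 
where each record ends with byte 0x1D and the first five bytes of the 
leader give the total record length including that terminator. The 
function name check_leaders is made up for illustration.

```python
RECORD_TERMINATOR = b"\x1d"  # ISO 2709 record terminator (assumed file format)


def check_leaders(data: bytes):
    """Split raw bytes on the record terminator and test each chunk's
    5-byte length field. Returns (ok_count, bad), where bad is a list of
    (chunk_index, length_field) pairs for chunks that look wrong."""
    ok = 0
    bad = []
    for index, chunk in enumerate(data.split(RECORD_TERMINATOR)):
        if not chunk.strip():
            # Empty piece after the final terminator, or stray whitespace.
            continue
        length_field = chunk[:5]
        # The stated length counts the terminator that split() removed.
        if (length_field.isdigit()
                and int(length_field.decode("ascii")) == len(chunk) + 1):
            ok += 1
        else:
            bad.append((index, length_field))
    return ok, bad
```

To use it, read the MARC file in binary mode and pass the bytes in, for 
example check_leaders(open(path, "rb").read()) with path pointing at the 
file in question. If nearly every chunk lands in the bad list, the file 
is probably not binary MARC at all (or was mangled in transfer); if only 
a few do, those particular records are the ones worth inspecting.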

Looking at my SOLR stats... 

Still, numDocs : 0

When I do the sample set that came with blacklight_importer, it doesn't 
report any errors, but STILL doesn't successfully import anything.

So, I'm stymied. Well, that was my open source day for the week. You'll 
hear from me again next Thursday, when I have another Blacklight day and 
bang my head against this again. But I'm not quite sure what to do next.

Jonathan

-- 
Jonathan Rochkind
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
410.516.8886 
rochkind (at) jhu.edu


