From std5 at nyu.edu Fri Feb 1 22:13:35 2013
From: std5 at nyu.edu (Scot Dalton)
Date: Fri, 1 Feb 2013 17:13:35 -0500
Subject: [Umlaut-general] Rails must be updated
In-Reply-To: <665DBC51D0250A47B4F9306CE71E5FB776C9A96F@JHEMTEBEX1.win.ad.jhu.edu>
References: <665DBC51D0250A47B4F9306CE71E5FB776C9A96F@JHEMTEBEX1.win.ad.jhu.edu>
Message-ID:

Hi,
We attempted to upgrade to Umlaut3/Rails3 today, but had an issue with our mysql connections and had to roll back.

We continuously got this error in our logs:
'Umlaut: Threaded service raised exception. Service:, ActiveRecord::ConnectionTimeoutError could not obtain a database connection within 5.000 seconds (waited 5.000 seconds)'

Jonathan, it looks like the error is coming from your AR monkey patch, which is intimidating to say the least.

Any thoughts?

Thanks,
Scot

On Jan 10, 2013, at Jan 10, 9:18 AM, Jonathan Rochkind wrote:

> There is a significant Rails security vulnerability announced -- the vulnerability is as bad as it gets, possibly allowing attackers to execute arbitrary code on your server.
>
> You must update the version of Rails you are using.
>
> If you are on Umlaut3/Rails3, this is pretty easy. From a source checkout of your app,
>
> bundle update rails
> bundle show rails (make sure it's 3.2.11 or the latest 3.0.x or 3.1.x ; if not you might need to edit your Gemfile to allow the update)
> Commit your Gemfile.lock to your repo, and redeploy your app from your repo
>
>
> But I'm worried some of you guys are still on Rails2/Umlaut2? Umlaut 2 runs on Rails 2.1 or something, not the latest Rails 2.3 branch. There is no security patch release for Rails pre 2.3. There are still ways to change your configuration to close the exploit, but exactly what code may depend on exactly what version of Rails. I can try to help you figure it out if you need it.
>
> (PS: Get off Umlaut2, really, please).
>
> Jonathan
>
>
> _______________________________________________
> Umlaut-general mailing list
> Umlaut-general at rubyforge.org
> http://rubyforge.org/mailman/listinfo/umlaut-general

--
Scot Dalton
Phone: (212) 998-2674
Web Services
Division of Libraries
New York University

From rochkind at jhu.edu Fri Feb 1 23:56:17 2013
From: rochkind at jhu.edu (Jonathan Rochkind)
Date: Fri, 1 Feb 2013 23:56:17 +0000
Subject: [Umlaut-general] Rails must be updated
In-Reply-To:
References: <665DBC51D0250A47B4F9306CE71E5FB776C9A96F@JHEMTEBEX1.win.ad.jhu.edu>,
Message-ID: <665DBC51D0250A47B4F9306CE71E5FB776CCCB6E@JHEMTEBEX1.win.ad.jhu.edu>

Hmm, that's odd, I wonder why I haven't had that problem. Are you using MySQL like I am? (It ought to work either way, just trying to pin down any possible differences).

You can certainly try removing my monkey patch and seeing what happens. But what my monkey patch is meant to do is _avoid_ connection pool timeout errors like you're seeing -- without my monkey patch (which is actually based on code by someone else), threads can get 'starved' out of database connections. With the monkey patch, threads waiting for a connection should get one on a "first waiting, first access to a connection" basis. (This is fixed in Rails4, but we couldn't get the patch into Rails 3, for annoying reasons).

What is your connection pool size set at, with the "pool" key in your database.yml? While it ought to _probably_ work even at the default size of "5", things should work better at 10 or 15 -- I keep mine at 15, and I think the umlaut docs recommend 15. (_without_ my monkey patch, I couldn't get by with even 15).

I could have sworn you guys were already on rails3, or at least had already tried it out -- did you not see this error in dev/staging, but only in production under load? Trying to get a fuller account of the context here.

From std5 at nyu.edu Sat Feb 2 02:10:57 2013
From: std5 at nyu.edu (Scot Dalton)
Date: Fri, 1 Feb 2013 21:10:57 -0500
Subject: [Umlaut-general] Rails must be updated
In-Reply-To: <665DBC51D0250A47B4F9306CE71E5FB776CCCB6E@JHEMTEBEX1.win.ad.jhu.edu>
References: <665DBC51D0250A47B4F9306CE71E5FB776C9A96F@JHEMTEBEX1.win.ad.jhu.edu>, <665DBC51D0250A47B4F9306CE71E5FB776CCCB6E@JHEMTEBEX1.win.ad.jhu.edu>
Message-ID:

Only in prod under load. Didn't see it in test or staging at all.
Connection pool is set at 10, but I can try bumping it up to 15.
Are you running passenger? If so, how many instances do you use?

We're using MySQL but the MySQL server seems to get bogged down by some selects. Not sure if this is what is causing the timeout. Very confusing.

Thanks,
Scot

--
Scot Dalton
Phone: (212) 998-2674
Web Services
Division of Libraries
New York University

From rochkind at jhu.edu Sat Feb 2 14:22:48 2013
From: rochkind at jhu.edu (Jonathan Rochkind)
Date: Sat, 2 Feb 2013 14:22:48 +0000
Subject: [Umlaut-general] Rails must be updated
In-Reply-To:
References: <665DBC51D0250A47B4F9306CE71E5FB776C9A96F@JHEMTEBEX1.win.ad.jhu.edu> <665DBC51D0250A47B4F9306CE71E5FB776CCCB6E@JHEMTEBEX1.win.ad.jhu.edu>,
Message-ID: <665DBC51D0250A47B4F9306CE71E5FB776CCCD8F@JHEMTEBEX1.win.ad.jhu.edu>

I am using passenger; the number of instances in passenger is unlikely to be related to this bug, although I think I have
only 5.

Hmm, another thought -- in any ActiveRecord code that runs in threads, you need to wrap all ActiveRecord-touching code in #with_connection blocks. If you don't, connections will be implicitly checked out of the connection pool, but never checked in -- that is, leaked connections, which would result in the error you see. (There are other ways to avoid leaked connections than using #with_connection, but IMO they are even more insane).

In Umlaut, it's pretty much just the 'services' that run in alternate threads. (The 'main' Rails request-response thread has active record checkouts taken care of for it by Rails, it's just extra threads you create yourself where it's an issue).

If your services use the Umlaut API to access the underlying database, then the #with_connection is taken care of for you. But I wonder if your primo/aleph services (which I have not tested extensively myself, not having primo or aleph) make some direct ActiveRecord calls, without wrapping them in #with_connection? That could cause the problems you are seeing.

Dealing with ActiveRecord and concurrency is one of the most challenging things I've had to do in Umlaut. I had hoped I had taken care of it for everyone (after much work), sad to see you still having problems with it.

Here's what I'd do if I were you.

First, you're going to have to find a way to 'simulate' the production load on a non-production server, because that's the only way you're going to be able to tell if you've solved the problem anyway, right? And it will be very helpful in identifying the problem by process of elimination. So you need to come up with some way to have some software automatically sending requests against your app -- in a way that _does_ result in the error condition, so you know you have a reproduction case to begin with.

Then remove the aleph and primo services, and test again -- does the problem go away? (Obviously you couldn't do this in production, which is why you have to reproduce it in demo somehow).
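
A minimal sketch of the leak described earlier in this message, assuming a toy pool class rather than ActiveRecord itself. ToyPool, its method names and the "conn-N" strings are all hypothetical; the real call in Rails is ActiveRecord::Base.connection_pool.with_connection, which follows the same check out, yield, check in shape. The toy pool also fails immediately when empty instead of waiting 5 seconds, to keep the example short.

```ruby
# Toy stand-in for a connection pool (hypothetical, NOT ActiveRecord's code).
class ToyPool
  class TimeoutError < StandardError; end

  def initialize(size)
    @available = Queue.new
    size.times { |i| @available << "conn-#{i}" }
  end

  # Hand out a free connection, or raise straight away if none is left.
  def checkout
    @available.pop(true) # non-blocking pop raises ThreadError when empty
  rescue ThreadError
    raise TimeoutError, "could not obtain a database connection"
  end

  def checkin(conn)
    @available << conn
  end

  # Check out, yield, and always check back in, even if the block raises.
  def with_connection
    conn = checkout
    yield conn
  ensure
    checkin(conn) if conn
  end

  def free_count
    @available.size
  end
end

# Leaky style: each thread checks out a connection and never returns it.
leaky_pool = ToyPool.new(2)
2.times.map { Thread.new { leaky_pool.checkout } }.each(&:join)

leak_error =
  begin
    leaky_pool.checkout
    nil
  rescue ToyPool::TimeoutError => e
    e.message
  end

# Wrapped style: five threads, one after another, share the same two connections.
safe_pool = ToyPool.new(2)
results = 5.times.map do |i|
  Thread.new { safe_pool.with_connection { |conn| "job #{i} used #{conn}" } }.value
end

puts "leaky pool: #{leak_error.inspect}, free = #{leaky_pool.free_count}"
puts "safe pool:  #{results.size} jobs done, free = #{safe_pool.free_count}"
```

In the leaky case the pool is empty once the two threads finish, so the next caller fails even though nothing is using a connection any more. In the wrapped case the pool is back to full size after every job.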

I suspect it will. If it does go away, then you could try putting in just one of aleph or primo to see if the problem can be isolated to just one of them.

But I'd also then wrap the _entire_ execution block (do_request) of the aleph and/or primo connectors in a #with_connection block -- wrapping the entire method in with_connection should result in a situation that requires fewer connections in the pool than without doing anything, but still more connections to be avail in the pool than if you had actually granularly wrapped each ActiveRecord database call in a #with_connection.

And/or, then scour the code for lines that either explicitly or implicitly are making a database call (select or save, anything using a db connection), but are doing so 'directly' instead of using Umlaut API.

At one point I had some trick that monkey patched ActiveRecord to actually raise on using a db connection without a checkout, so you could find these in debugging (I think AR probably ought to do this by default, ugh, AR). So actually, that'd be a whole separate option -- cut right to that, I can try to figure out how to do that again and share it with you. If we were pretty sure it was Aleph/Primo connector logic accidentally using an AR connection without a checkout -- but it would be nice to confirm the problem was in aleph/primo adapters first.

Phew. Hope this helps. Glad to provide more assistance as I can. Sorry ActiveRecord concurrency is such a mess.

(Another 'nuclear' option would be turning off multi-threaded concurrency in Umlaut, which is possible, but would probably have disastrous effects on your end-user response time).

From riwi at dtic.dtu.dk Wed Feb 6 15:20:46 2013
From: riwi at dtic.dtu.dk (Rikke Willer)
Date: Wed, 6 Feb 2013 15:20:46 +0000
Subject: [Umlaut-general] Service type request parameter?
Message-ID: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk>

Hi Umlaut people,

I am wondering whether it is possible to specify which service types to include in an Umlaut response? For instance only include 'fulltext' service responses or only include 'cover_image' and 'abstract' service responses in the response.

If not, one option would be to override the create_collection method in Umlaut::ControllerBehavior and filter services based on a custom url parameter. Would this be a recommended way to go about it?

Thanks!


Rikke Willer
Programmer
DTU Library
---------------------------------------
Technical University of Denmark
Technical Information Center of Denmark
Anker Engelunds Vej 1
P.O. Box 777
Building 101D
2800 Kgs. Lyngby
Direct +45 45257254
riwi at dtic.dtu.dk
http://www.dtic.dtu.dk/

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rossfsinger at gmail.com Wed Feb 6 15:45:23 2013
From: rossfsinger at gmail.com (Ross Singer)
Date: Wed, 6 Feb 2013 10:45:23 -0500
Subject: [Umlaut-general] Service type request parameter?
In-Reply-To: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk>
References: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk>
Message-ID: <5FB9E8F6-2094-4430-83B8-68CF31C605A8@gmail.com>

Hi Rikke,

This is definitely doable, although I'm not sure Umlaut at the moment has handlers for specific service responses (it might, Jonathan does a lot of stuff like this in his 'embedded Umlaut' stuff). It used to. This is exactly the sort of thing I had in mind, back in the day.

I would definitely use the 'serviceType' entity (svc.foo) in your requests.

Here's an example, using the "scholarly community service types" community format:
http://alcme.oclc.org/openurl/servlet/OAIHandler/extension?verb=GetMetadata&metadataPrefix=mtx&identifier=info:ofi/fmt:kev:mtx:sch_svc

So you would have keys like:
svc.fulltext=yes&svc.abstract=yes
etc.

It's not illegal to include what OpenURL refers to as "private keys" in an entity (basically they're not illegal, they're just non-standard, so your results will be dependent upon the resolver), so you could also include a svc.coverimage=yes, as well.

Good luck!
-Ross.

From rochkind at jhu.edu Wed Feb 6 16:51:36 2013
From: rochkind at jhu.edu (Jonathan Rochkind)
Date: Wed, 06 Feb 2013 11:51:36 -0500
Subject: [Umlaut-general] Service type request parameter?
In-Reply-To: <5FB9E8F6-2094-4430-83B8-68CF31C605A8@gmail.com>
References: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk> <5FB9E8F6-2094-4430-83B8-68CF31C605A8@gmail.com>
Message-ID: <51128A18.8000308@jhu.edu>

Ah, that's probably a good way to do it. Although I wouldn't be opposed to a shortcut umlaut.only_service_type=fulltext either, which meant "yes to fulltext, no to everything else." Rather than or in addition to the API rsinger suggests that is more granular. It depends on the actual use cases.

It's definitely possible to get Umlaut to do either one; I don't think Umlaut currently does either one. I don't think Umlaut has 'handlers for specific service responses' (Not even sure what part of currently existing Umlaut architecture, if any, is the 'handler' ross mentions -- I don't think Umlaut works that way). But yeah, we can make umlaut do that, either with a local hack or adding it to Umlaut.

Something I'm still not sure about:

If you tell Umlaut to display only 'fulltext' -- should Umlaut still _fetch_ everything, but only _display_ the fulltext? This way the other stuff is still there cached if you re-load asking for everything, not just fulltext, although we'd have to make sure the cache reloading works right.

OR, should Umlaut attempt to only execute services involving 'fulltext' in the first place? That would possibly lead to faster response times.

However, as I think about this, I think the former (fetch everything but only _display_ the requested types) is probably easier to implement in Umlaut anyway, and probably makes the most sense in Umlaut's architecture. Still curious if anyone has any opinions there.

From rochkind at jhu.edu Wed Feb 6 16:44:46 2013
From: rochkind at jhu.edu (Jonathan Rochkind)
Date: Wed, 06 Feb 2013 11:44:46 -0500
Subject: [Umlaut-general] Service type request parameter?
In-Reply-To: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk>
References: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk>
Message-ID: <5112887E.4010603@jhu.edu>

Hi Rikke. I went to look at this wiki page:

https://github.com/team-umlaut/umlaut/wiki/Umlaut-URL-Parameters

To see if there was a documented way to do that. But it's not there, so it's probably not present in Umlaut.

I think that feature does probably make sense, you can file a Github issue asking for it, to put it on our list. I might get to adding it at some point, but I might not, pretty busy at work right now.

If you can figure out how to add it, I'd look at a pull request -- it's been too long since I was down in the dirt with the Umlaut code to be sure if the way you suggest is the right way or not, but it sounds promising.

One question to ask yourself is about how Umlaut deals with reloading an already cached request -- if you reload with a different spec for what service types to include, do you expect the entire request to be reloaded from scratch; do you expect the original request to be re-used, but only _show_ different services, etc. I'm not sure! Might depend on what you're doing.

Can you tell us, for our own curiosity, what use cases you want this feature for?

(I will try to add it myself for you, just can't promise I'll find the time!)

From m.e.phillips at durham.ac.uk Wed Feb 6 17:47:13 2013
From: m.e.phillips at durham.ac.uk (PHILLIPS M.E.)
Date: Wed, 6 Feb 2013 17:47:13 +0000
Subject: [Umlaut-general] Service type request parameter?
In-Reply-To: <51128A18.8000308@jhu.edu>
References: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk> <5FB9E8F6-2094-4430-83B8-68CF31C605A8@gmail.com> <51128A18.8000308@jhu.edu>
Message-ID:

> If you tell Umlaut to display only 'fulltext' -- should Umlaut still _fetch_ everything, but only _display_ the fulltext?
> This way the other stuff is still there cached if you re-load asking for everything, not just fulltext, although we'd
> have to make sure the cache reloading works right.
> OR,
> should Umlaut attempt to only execute services involving 'fulltext' in the first place? That would possibly lead to
> faster response times.

The problem with the latter, as far as I understand it, is that any service can enhance the metadata and therefore affect the services in the next wave. There is no way for Umlaut to tell which services may enhance metadata and so if you only executed 'fulltext' services the metadata might not be as good as it could be and the user might therefore not be offered some opportunities for fulltext they might otherwise have.

--
Matthew Phillips
Electronic Systems Librarian, Durham University
Durham University Library, Stockton Road, Durham, DH1 3LY
+44 (0)191 334 2941

From rochkind at jhu.edu Wed Feb 6 17:56:29 2013
From: rochkind at jhu.edu (Jonathan Rochkind)
Date: Wed, 06 Feb 2013 12:56:29 -0500
Subject: [Umlaut-general] Refworks sending bad DOI's -- fix in Umlaut?
Message-ID: <5112994D.3040307@jhu.edu>

So I just spent about a day diagnosing a weird issue where Refworks, in some cases, generates a bad OpenURL that prevents the link resolver from successfully getting the user to fulltext.

The case that definitely triggers it is when your refworks citation was saved from a Refworks z39.50 search of pubmed, although there may be other cases.

The nature of the problem is that in the OpenURL field for DOI, Refworks doubles the actual DOI resulting in an illegal DOI.

For instance, if the actual DOI for the article is:

"10.1016/j.vaccine.2012.11.026"

Then instead of sending that in the DOI field, Refworks will send:

"10.1016/j.vaccine.2012.11.026; 10.1016/j.vaccine.2012.11.026"

While a human knows, oh, that's the DOI twice... to software this is of course simply a bad/wrong DOI. And it results in SFX generating bad/wrong fulltext links. (To make matters worse, both space and semi-colon are _technically_ legal characters to include in a single DOI).

I've not had much luck getting Refworks to acknowledge this problem or express any interest in fixing it. (It may be Pubmed's "fault" for double-including a DOI in their z39.50 response).

But I realized I can get Umlaut to work around this with Umlaut's 'referent_filter' feature. It would apply the fix just to incoming OpenURLs with Refworks sid's, splitting on "; " to restore the correct DOI.

I can easily do this purely locally -- but I could also include the code in Umlaut, either as an option or as the default that it be applied (still can be configured off). One of the downsides is that since "; " is technically a legal substring of a DOI, it _could_ mess up legitimate DOIs that have that actual substring (presumably unlikely, but possible).

Any opinions on whether I should include this logic in Umlaut, and if so as default behavior or not?

From rochkind at jhu.edu Wed Feb 6 18:19:36 2013
From: rochkind at jhu.edu (Jonathan Rochkind)
Date: Wed, 06 Feb 2013 13:19:36 -0500
Subject: [Umlaut-general] Service type request parameter?
In-Reply-To:
References: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk> <5FB9E8F6-2094-4430-83B8-68CF31C605A8@gmail.com> <51128A18.8000308@jhu.edu>
Message-ID: <51129EB8.5000705@jhu.edu>

True, yeah. I think the safest simplest thing to do is the first one, where Umlaut fetches everything, but only displays the requested services.
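
The "fetch everything, display only what was asked for" idea can be sketched roughly as below. This is plain Ruby, not Umlaut code: requested_service_types, responses_to_display and the :service_type / :label hash keys are hypothetical names for illustration, and the two parameter styles are the svc.* keys and the umlaut.only_service_type shortcut floated earlier in this thread, neither of which Umlaut is known to support yet.

```ruby
# Hypothetical helpers; nothing here is an existing Umlaut API.

# Work out which service types the caller asked for, from either an
# umlaut.only_service_type=<type> shortcut or svc.<type>=yes keys.
def requested_service_types(params)
  only = params["umlaut.only_service_type"]
  return [only] if only && !only.empty?

  params.select { |key, value| key.start_with?("svc.") && value == "yes" }
        .keys
        .map { |key| key.sub(/\Asvc\./, "") }
end

# Every response has already been fetched (and can stay cached);
# this only narrows what gets displayed.
def responses_to_display(all_responses, params)
  wanted = requested_service_types(params)
  return all_responses if wanted.empty?

  all_responses.select { |response| wanted.include?(response[:service_type]) }
end

all_responses = [
  { service_type: "fulltext",    label: "Publisher site" },
  { service_type: "abstract",    label: "Abstract" },
  { service_type: "cover_image", label: "Cover" }
]

granular = responses_to_display(all_responses, { "svc.fulltext" => "yes", "svc.abstract" => "yes" })
shortcut = responses_to_display(all_responses, { "umlaut.only_service_type" => "fulltext" })
everything = responses_to_display(all_responses, {})

puts granular.map { |r| r[:label] }.inspect
puts shortcut.map { |r| r[:label] }.inspect
puts everything.size
```

Because the filter runs after fetching, a reload with different parameters can reuse the same cached responses and simply show a different subset.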
In theory a service should advertise whether it enhances metadata, in a way that Umlaut can use for its own decision-making. However, not all services do this, because I added the feature a bit late, and it usually doesn't matter. (Even services I am the author of haven't all been updated to do this).

Additionally, some services fetch multiple types of content (fulltext AND something else), and only some of them have the ability to be told "Okay, normally you fetch multiple kinds, but THIS time, only fetch the fulltext", and there isn't a clear generic Umlaut api for this.

So yeah, thinking through it by talking it out, I think the first approach is more realistic and maintainable and less error prone -- if you ask Umlaut for only fulltext, it still fetches everything, but only displays fulltext.

(For certain use cases where that's unacceptable and you need a lean mean umlaut response, there ought to be a way to create a custom _set of services_ trimmed down to what you really want, but that feature is not entirely robust/mature/documented either. But it also could be. That is why I'm curious to hear more about the actual use case here).

On 2/6/2013 12:47 PM, PHILLIPS M.E. wrote:
>> If you tell Umlaut to display only 'fulltext' -- should Umlaut still _fetch_ everything, but only _display_ the fulltext?
>> This way the other stuff is still there cached if you re-load asking for everything, not just fulltext, although we'd
>> have to make sure the cache reloading works right.
>> OR,
>> should Umlaut attempt to only execute services involving 'fulltext' in the first place? That would possibly lead to
>> faster response times.
>
> The problem with the latter, as far as I understand it, is that any service can enhance the metadata and therefore affect the services in the next wave.
> There is no way for Umlaut to tell which services may enhance metadata and so if you only executed 'fulltext' services the metadata might not be as good as it could be and the user might therefore not be offered some opportunities for fulltext they might otherwise have.
>
>

From m.e.phillips at durham.ac.uk Thu Feb 7 09:08:22 2013
From: m.e.phillips at durham.ac.uk (PHILLIPS M.E.)
Date: Thu, 7 Feb 2013 09:08:22 +0000
Subject: [Umlaut-general] Refworks sending bad DOI's -- fix in Umlaut?
In-Reply-To: <5112994D.3040307@jhu.edu>
References: <5112994D.3040307@jhu.edu>
Message-ID:

Given the hassle you've had investigating it, I would say definitely include the code in Umlaut and probably turn it on by default.

If the DOI is always perfectly duplicated, you could test for /^(.*); \1$/ with a regexp and you would be very unlikely to get false positives. (Regex untested: it's valid Perl but I'm not yet fluent in Ruby so it may need adjustment.)

Matthew

--
Matthew Phillips
Electronic Systems Librarian, Durham University
Durham University Library, Stockton Road, Durham, DH1 3LY
+44 (0)191 334 2941

-----Original Message-----
From: umlaut-general-bounces at rubyforge.org [mailto:umlaut-general-bounces at rubyforge.org] On Behalf Of Jonathan Rochkind
Sent: 06 February 2013 17:56
To: umlaut-general at rubyforge.org
Subject: [Umlaut-general] Refworks sending bad DOI's -- fix in Umlaut?

So I just spent about a day diagnosing a weird issue where Refworks, in some cases, generates a bad OpenURL that prevents the link resolver from successfully getting the user to fulltext. The case that definitely triggers it is when your refworks citation was saved from a Refworks z39.50 search of pubmed, although there may be other cases.

The nature of the problem is that in the OpenURL field for DOI, Refworks doubles the actual DOI resulting in an illegal DOI.
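The back-reference test suggested in the reply above could be sketched in Ruby roughly as follows. This is only an illustration: the method name `dedupe_doubled_doi` is hypothetical and is not part of Umlaut or its 'referent_filter' feature. It collapses a value only when it is exactly the same string twice, separated by "; ", and leaves everything else alone.

```ruby
# Hypothetical helper, not part of Umlaut: collapse a DOI that arrived as
# "<doi>; <doi>" back to a single "<doi>". Any other value is returned as-is.
def dedupe_doubled_doi(doi)
  # \A and \z anchor the whole string (in Ruby, ^ and $ match per line).
  # (.+) must be followed by "; " and then an exact copy of itself (\1).
  match = /\A(.+); \1\z/.match(doi)
  match ? match[1] : doi
end

# The doubled value from the report is collapsed to one DOI.
puts dedupe_doubled_doi("10.1016/j.vaccine.2012.11.026; 10.1016/j.vaccine.2012.11.026")
# A normal DOI passes through unchanged.
puts dedupe_doubled_doi("10.1016/j.vaccine.2012.11.026")
# A made-up value that merely contains "; " is also left unchanged.
puts dedupe_doubled_doi("10.1000/a; b")
```

Compared with a plain split on "; ", requiring an exact duplicate means a legitimate DOI that happens to contain "; " is not truncated.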
For instance, if the actual DOI for the article is:

"10.1016/j.vaccine.2012.11.026"

Then instead of sending that in the DOI field, Refworks will send:

"10.1016/j.vaccine.2012.11.026; 10.1016/j.vaccine.2012.11.026"

While a human knows, oh, that's the DOI twice... to software this is of course simply a bad/wrong DOI. And it results in SFX generating bad/wrong fulltext links. (To make matters worse, both space and semi-colon are _technically_ legal characters to include in a single DOI).

I've not had much luck getting Refworks to acknowledge this problem or express any interest in fixing it. (It may be Pubmed's "fault" for double-including a DOI in their z39.50 response).

But I realized I can get Umlaut to work around this with Umlaut's 'referent_filter' feature. It would apply the fix just to incoming OpenURLs with Refworks sid's, splitting on "; " to restore the correct DOI.

I can easily do this purely locally -- but I could also include the code in Umlaut, either as an option or as the default that it be applied (still can be configured off). One of the downsides is that since "; " is technically a legal substring of a DOI, it _could_ mess up legitimate DOI's that have that actual substring (presumably unlikely, but possible).

Any opinions on whether I should include this logic in Umlaut, and if so as default behavior or not?

From riwi at dtic.dtu.dk Thu Feb 7 12:04:21 2013
From: riwi at dtic.dtu.dk (Rikke Willer)
Date: Thu, 7 Feb 2013 12:04:21 +0000
Subject: [Umlaut-general] Service type request parameter?
In-Reply-To: <51129EB8.5000705@jhu.edu>
References: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk> <5FB9E8F6-2094-4430-83B8-68CF31C605A8@gmail.com> <51128A18.8000308@jhu.edu> <51129EB8.5000705@jhu.edu>
Message-ID: <8B09A024477AC941AE11850045CFDF781BAAE1@ait-pex02mbx05.win.dtu.dk>

Hi Ross, Jonathan and Matthew,

thank you for the answers. The use case is to use Umlaut from a discovery interface, e.g. Blacklight, to display only fulltext links in a search result list, while showing more details in a single document view.

In general I think it would be a nice feature to be able to specify the service types needed at request time, so you by default could enable all the services you have access to/a possible need for in the Umlaut configuration, and at request time decide which to use.

In this use case I think it would make the most sense performancewise to only execute the services needed. Although I see the point wrt referent enhancements, I think this would be worth the trade off, as long as this is clearly stated in the documentation.

Wrt caching I would suggest just to keep it simple and to regard requests with different service types specified as separate, with separate caching. This would imply that the service types should be specified with the svc parameter rather than an umlaut parameter.

I will implement this in our Umlaut application. If you think this solution makes sense in general, I would be happy to contribute the code changes.

Thanks again,

Rikke Willer
Programmer
DTU Library
---------------------------------------
Technical University of Denmark
Technical Information Center of Denmark
Anker Engelunds Vej 1
P.O. Box 777
Building 101D
2800 Kgs. Lyngby
Direct +45 45257254
riwi at dtic.dtu.dk
http://www.dtic.dtu.dk/

From m.e.phillips at durham.ac.uk Thu Feb 7 17:32:42 2013
From: m.e.phillips at durham.ac.uk (PHILLIPS M.E.)
Date: Thu, 7 Feb 2013 17:32:42 +0000
Subject: [Umlaut-general] Problem with res_id in OpenURL
Message-ID:

I've just been testing Umlaut with Annee Philologique as the source: http://www.annee-philologique.com/

The OpenURLs it produces include a res_id which appears to be the base URL of our resolver (or rather the base URL we have configured: the OpenURLs are routed to us via the UK OpenURL Router service, but that is immaterial). If there is anything in this field, Umlaut throws a wobbly.

I have created some bit.ly shortcuts which show the behaviour on a few Umlaut installations I know of:

Johns Hopkins: http://bit.ly/WwOFqI
Vanderbilt: http://bit.ly/WDg4oV
NYU: http://bit.ly/VJZAPe

I tried Johns Hopkins SFX as well, and it does not have problems: http://bit.ly/WDg7RS

Should I be asking L'Annee Philologique to alter their OpenURL generation, or is the problem somewhere in Umlaut? There is a mention of res_id in section 4.5 of the Z39.88-2004 KEV implementation guidelines at http://alcme.oclc.org/openurl/docs/implementation_guidelines/KEV_Guidelines-20041209.pdf so I suspect it is a problem with Umlaut or the openurl library.

I have had a brief look at the code but I cannot work out what it is trying to do. I may have another go tomorrow.

The problem occurs whatever is in the res_id field.
In our case it was "http://openurl.ac.uk/ukfed:dur.ac.uk" but I tried just with "rr" and that produced the same error. The error screen starts:

NoMethodError in ResolveController#index
undefined method `add_identifier' for nil:NilClass
Rails.root: /home/dul0zz56/ConneXions

There is nothing in the Application Trace. The Framework Trace starts:

openurl (0.4.2) lib/openurl/context_object.rb:387:in `block (2 levels) in import_hash'
openurl (0.4.2) lib/openurl/context_object.rb:386:in `each'
openurl (0.4.2) lib/openurl/context_object.rb:386:in `block in import_hash'
openurl (0.4.2) lib/openurl/context_object.rb:340:in `each'
openurl (0.4.2) lib/openurl/context_object.rb:340:in `import_hash'
openurl (0.4.2) lib/openurl/context_object.rb:233:in `new_from_form_vars'
umlaut (3.0.4) app/models/request.rb:33:in `find_or_create'
umlaut (3.0.4) app/controllers/resolve_controller.rb:169:in `init_processing'
activesupport (3.2.11) lib/active_support/callbacks.rb:418:in `_run__1528741581413148286__process_action__3947138497140214998__callbacks'
activesupport (3.2.11) lib/active_support/callbacks.rb:405:in `__run_callback'
activesupport (3.2.11) lib/active_support/callbacks.rb:385:in `_run_process_action_callbacks'

but you will probably see something similar if you try the example URL for your own servers.

--
Matthew Phillips
Electronic Systems Librarian, Durham University
Durham University Library, Stockton Road, Durham, DH1 3LY
+44 (0)191 334 2941

From rochkind at jhu.edu Thu Feb 7 18:28:05 2013
From: rochkind at jhu.edu (Jonathan Rochkind)
Date: Thu, 7 Feb 2013 18:28:05 +0000
Subject: [Umlaut-general] Service type request parameter?
In-Reply-To: <8B09A024477AC941AE11850045CFDF781BAAE1@ait-pex02mbx05.win.dtu.dk>
References: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk> <5FB9E8F6-2094-4430-83B8-68CF31C605A8@gmail.com> <51128A18.8000308@jhu.edu> <51129EB8.5000705@jhu.edu>, <8B09A024477AC941AE11850045CFDF781BAAE1@ait-pex02mbx05.win.dtu.dk>
Message-ID: <665DBC51D0250A47B4F9306CE71E5FB776CCFA70@JHEMTEBEX1.win.ad.jhu.edu>

Cool, yep, definitely think it makes sense in general, definitely interested in seeing your code if you implement. I think making it work robustly (especially only running relevant services, rather than only showing relevant services; also with regard to umlaut's caching) might be unfortunately tricky, but you will hopefully be able to get something that at least works for you, whether or not it's generalizable enough to submit back to umlaut -- but definitely interested in seeing what you do!

Happy to hear you are considering Umlaut too. Are you still sort of experimenting, or are you pretty committed to using Umlaut at the moment? At what institution?

When you say you are going to use it to "display only fulltext links in a search result list while showing more details in a single document view" -- are you talking about using the Umlaut api to get fulltext results and embed them directly on your discovery results list page? You of course _could_ do this even without giving any other instructions to Umlaut -- Umlaut could keep fetching everything, but your api consuming code could only pay attention to fulltext, only stick fulltext on the host page.

However, there might be performance issues with making 10+ Umlaut requests per results page. They might be ameliorated if you get umlaut to fetch less, but I do worry a bit -- this is not something I myself have done because of worries about performance. Interested in your experiences for sure. Do want to warn you that you are exploring somewhat new territory here.
And do think for this use case, it _might_ make sense to define a different "service list" in Umlaut, the service list used just for the fulltext api results (and anything else supporting that you need for metadata enhancement), and then have your request say "Use the service list labelled X". Rather than try to get Umlaut to "fetch only fulltext services". There is already a feature in Umlaut for defining alternate service lists and mentioning them by name to use -- Scot, I forget, are you currently using that feature?

From rochkind at jhu.edu Thu Feb 7 18:17:06 2013
From: rochkind at jhu.edu (Jonathan Rochkind)
Date: Thu, 7 Feb 2013 18:17:06 +0000
Subject: [Umlaut-general] Problem with res_id in OpenURL
In-Reply-To:
References:
Message-ID: <665DBC51D0250A47B4F9306CE71E5FB776CCFA14@JHEMTEBEX1.win.ad.jhu.edu>

Any time Umlaut is giving you a stack trace, it's definitely a problem with umlaut!

Can you copy and paste this into a Github Issue so we don't forget about it? I'll definitely get on fixing this. It might take me a bit though -- I'm rather sick at the moment, and out at a conference all next week.

From m.e.phillips at durham.ac.uk Fri Feb 8 09:05:17 2013
From: m.e.phillips at durham.ac.uk (PHILLIPS M.E.)
Date: Fri, 8 Feb 2013 09:05:17 +0000
Subject: [Umlaut-general] Problem with res_id in OpenURL
In-Reply-To: <665DBC51D0250A47B4F9306CE71E5FB776CCFA14@JHEMTEBEX1.win.ad.jhu.edu>
References: <665DBC51D0250A47B4F9306CE71E5FB776CCFA14@JHEMTEBEX1.win.ad.jhu.edu>
Message-ID:

> Any time Umlaut is giving you a stack trace, it's definitely a problem with umlaut!
>
> Can you copy and paste this into a Github Issue so we don't forget about it? I'll
> definitely get on fixing this. It might take me a bit though -- I'm rather sick at
> the moment, and out at a conference all next week.
I've opened an issue. Hope you get well again soon.

Matthew

--
Matthew Phillips
Electronic Systems Librarian, Durham University
Durham University Library, Stockton Road, Durham, DH1 3LY
+44 (0)191 334 2941

From riwi at dtic.dtu.dk Fri Feb 8 20:36:32 2013
From: riwi at dtic.dtu.dk (Rikke Willer)
Date: Fri, 8 Feb 2013 20:36:32 +0000
Subject: [Umlaut-general] Service type request parameter?
In-Reply-To: <665DBC51D0250A47B4F9306CE71E5FB776CCFA70@JHEMTEBEX1.win.ad.jhu.edu>
References: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk> <5FB9E8F6-2094-4430-83B8-68CF31C605A8@gmail.com> <51128A18.8000308@jhu.edu> <51129EB8.5000705@jhu.edu>, <8B09A024477AC941AE11850045CFDF781BAAE1@ait-pex02mbx05.win.dtu.dk> <665DBC51D0250A47B4F9306CE71E5FB776CCFA70@JHEMTEBEX1.win.ad.jhu.edu>
Message-ID: <8B09A024477AC941AE11850045CFDF781BC2C2@ait-pex02mbx05.win.dtu.dk>

Hi Jonathan,

> Happy to hear you are considering Umlaut too. Are you still sort of experimenting, or are you pretty committed to using Umlaut at the moment? At what institution?

DTU (Technical University of Denmark). We are still experimenting (leaning towards commitment).

> When you say you are going to use it to "display only fulltext links in a search result list while showing more details in a single document view" -- are you talking about using the Umlaut api to get fulltext results and embed them directly on your discovery results list page? You of course _could_ do this even without giving any other instructions to Umlaut -- Umlaut could keep fetching everything, but your api consuming code could only pay attention to fulltext, only stick fulltext on the host page.
>
> However, there might be performance issues with making 10+ Umlaut requests per results page. They might be ameliorated if you get umlaut to fetch less, but I do worry a bit -- this is not something I myself have done because of worries about performance. Interested in your experiences for sure. Do want to warn you that you are exploring somewhat new territory here.

Thank you for the warning :) I will give it a go, and see how it works out.

> And do think for this use case, it _might_ make sense to define a different "service list" in Umlaut, the service list used just for the fulltext api results (and anything else supporting you need for metadata enhancement), and then have your request say "Use the service list labelled X". Rather than try to get Umlaut to "fetch only fulltext services". There is already a feature in Umlaut for defining alternate service lists and mentioning them by name to use -- Scot, I forget, are you currently using that feature?

That sounds interesting, do you maybe have a code or documentation reference for this? Not quite sure what "service list" refers to.

Thanks,

Rikke

From std5 at nyu.edu Fri Feb 8 22:18:34 2013
From: std5 at nyu.edu (Scot Dalton)
Date: Fri, 8 Feb 2013 17:18:34 -0500
Subject: [Umlaut-general] Service type request parameter?
In-Reply-To: <665DBC51D0250A47B4F9306CE71E5FB776CD0342@JHEMTEBEX1.win.ad.jhu.edu>
References: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk> <5FB9E8F6-2094-4430-83B8-68CF31C605A8@gmail.com> <51128A18.8000308@jhu.edu> <51129EB8.5000705@jhu.edu> <8B09A024477AC941AE11850045CFDF781BAAE1@ait-pex02mbx05.win.dtu.dk> <665DBC51D0250A47B4F9306CE71E5FB776CCFA70@JHEMTEBEX1.win.ad.jhu.edu> <8B09A024477AC941AE11850045CFDF781BC2C2@ait-pex02mbx05.win.dtu.dk> <665DBC51D0250A47B4F9306CE71E5FB776CD0342@JHEMTEBEX1.win.ad.jhu.edu>
Message-ID:

We do use the 'service list' feature based on a variety of criteria including client IP and URL params. In the UmlautController I think you need to override create_collection to return the Services you want to run given the context of the request.

I'm not sure, though, whether a call to the API would return only the service responses associated with the current service list if responses for a given service type, e.g. fulltext, had already been generated for that request in a different context. I'm investigating this for an issue here and will report back.

Thanks,
Scot

On Feb 8, 2013, at 4:31 PM, Jonathan Rochkind wrote:

>
> That sounds interesting, do you maybe have a code or documentation reference for this? Not quite sure what "service list" refers to.
>
> So your list of services in config/services.yml ?
>
> Notice that they all begin with a key `default:`. The idea is that you could then create other named lists of services, in addition to the 'default' one, and specify that you want to use THAT list of services (rather than the 'default' one) in a query parameter.
>
> But I forget how mature or robust this feature is. It's POSSIBLE that it's actually already there and completely done and you can just use it. Scot, do you use this feature, can you provide any more info?
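To make the "named service list" idea above a little more concrete, here is a small standalone Ruby sketch. It is not Umlaut code: the `fulltext_only` group name, the service entries and their `type` values, and the `services_for` helper are all made up for illustration, and both the real layout of config/services.yml and the way Umlaut actually selects a group (for example through an overridden create_collection, as mentioned above) may well differ.

```ruby
require "yaml"

# Hypothetical config in the spirit of config/services.yml: a 'default'
# group plus a second, trimmed-down group. All names here are invented.
SERVICES_YAML = <<~YAML
  default:
    services:
      resolver: { type: ExampleResolver, priority: 3 }
      covers: { type: ExampleCovers, priority: 4 }
  fulltext_only:
    services:
      resolver: { type: ExampleResolver, priority: 3 }
YAML

# Return the service definitions for the named group, falling back to
# 'default' when the name is missing or unknown.
def services_for(config, group_name)
  group = config[group_name] || config.fetch("default")
  group.fetch("services")
end

config = YAML.safe_load(SERVICES_YAML)

# A request that names the trimmed group only gets the trimmed services.
puts services_for(config, "fulltext_only").keys.inspect
# A request that names no group gets the default list.
puts services_for(config, nil).keys.inspect
```

The point of the sketch is only that the choice of which services run is made once, by group name, rather than by asking each service whether it is a 'fulltext' service.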
> > Rikke, I'm going to be at a conference all next week and probably won't have time to look into this -- but feel free to file a Github Issue asking the question, or remind me the week of February 18th to look into it, and I'll figure it out for you. It might already just be working and usable. > > Either way, I'm sure we can figure out a solution to "just show fulltext" -- I'm more worried, however, as I said, about the performance implication of making 10+ umlaut requests at at time for your result list. Note that if you are using hte Umlaut api, as it sounds like you are, then you ALREADY can only place the fulltext results on the page, and ignore the rest. So an Umlaut improvement that still _fetched_ everything, but only _displayed_ the fulltext hits would be of no value to you whatsoever -- you can already easily do that. So you really DO need a feature to allow you to limit what work Umlaut does, in an effort to make it doable for 10+ requests per page. > > So since that's the case, I really do think the "service list group" feature is the one you want. I think it's going to be a lot more straightforward to set a different group of services to be run in this "only the fulltext" case, then it is to get umlaut to reliably automatically detect which services are 'fulltext' and only run those. > > (See, this is why it's good to actually explain your use case instead of asking a very specific technical question! What you REALLY needed was only apparent after we solicited more info). > > From: umlaut-general-bounces at rubyforge.org [umlaut-general-bounces at rubyforge.org] on behalf of Rikke Willer [riwi at dtic.dtu.dk] > Sent: Friday, February 08, 2013 3:36 PM > To: > Subject: Re: [Umlaut-general] Service type request parameter? > > > Hi Jonathan, > >> Happy to here you are considering Umlaut too. Are you still sort of experimenting, or are you pretty committed to using Umlaut at the moment? At what institution? > > DTU (Technical University of Denmark). 
We are still experimenting (leaning towards commitment). > >> When you say you are going to use it to "display only fulltext links in a search result list while showing more details in a single document view" -- are you talking about using the Umlaut api to get fulltext results and embed them directly on your discovery results list page? You of course _could_ do this even without giving any other instructions to Umlaut -- Umlaut could keep fetching everything, but your api consuming code could only pay attention to fulltext, only stick fulltext on the host page. >> >> However, there might be performance issues with making 10+ Umlaut requests per results page. They might be ameliorated if you get umlaut to fetch yes, but I do worry a bit -- this is not something I myself have done because of worries about performance. Interested in your experiences for sure. Do want to warn you that you are exploring somewhat new territory here. > > Thank you for the warning :) I will give it a go, and see how it works out. > >> And do think for this use case, it _might_ make sense to define a different "service list" in Umlaut, the service list used just for the fulltext api results (and anything else supporting you need for metadata enhancement), and then have your request say "Use the service list labelled X". Rather than try to get Umlaut to "fetch only fulltext services". There is already a feature in Umlaut for defining alternate service lists and mentioning them by name to use -- Scot, I forget, are you currently using that feature? > > That sounds interesting, do you maybe have a code or documentation reference for this? Not quite sure what "service list" refers to. > > Thanks, > > Rikke > >> >> From: umlaut-general-bounces at rubyforge.org [umlaut-general-bounces at rubyforge.org] on behalf of Rikke Willer [riwi at dtic.dtu.dk] >> Sent: Thursday, February 07, 2013 7:04 AM >> To: >> Subject: Re: [Umlaut-general] Service type request parameter? 
>> >> >> Hi Ross, Jonathan and Matthew, >> >> thank you for the answers. >> >> The use case is to use Umlaut from a discovery interface, e.g. Blacklight, to display only fulltext links in a search result list, while showing more details in a single document view. >> In general I think it would be a nice feature to be able to specify the service types needed on request time, so you by default could enable all the services you have access to/a possible need for in the Umlaut configuration, and on request time decide which to use. >> >> In this use case I think it would make the most sense performancewise to only execute the services needed. >> >> Although I see the point wrt referent enhancements, I think this would be worth the trade off, as long as this is clearly stated in the documentation. >> >> Wrt caching I would suggest just to keep it simple and to regard requests with different service types specified as separate, with separate caching. This would imply that the service types should be specified with the svc parameter rather than an umlaut parameter. >> >> I will implement this in our Umlaut application. >> If you think this solution makes sense in general, I would be happy to contribute the code changes. >> >> Thanks again, >> >> >> Rikke Willer >> Programmer >> DTU Library >> --------------------------------------- >> Technical University of Denmark >> Technical Information Center of Denmark >> Anker Engelunds Vej 1 >> P.O. Box 777 >> Building 101D >> 2800 Kgs. Lyngby >> Direct +45 45257254 >> riwi at dtic.dtu.dk >> http://www.dtic.dtu.dk/ >> >> >> >> On Feb 6, 2013, at 19:19 , Jonathan Rochkind wrote: >> >>> True, yeah. >>> >>> I think the safest simplest thing to do is the first one, where Umlaut fetches everything, but only displays the requested services. >>> >>> In theory a service should advertise whether they enhance metadata, in a way that Umlaut can use for it's own decision-making. 
However, not all services do this, because I added the feature a bit late, and it usually doesn't matter. (Even services I am the author of haven't all been updated to do this). >>> >>> Additinoally, some services fetch multiple times of content (fulltext AND something else), and only some of them have the ability to be told "Okay, normally you fetch multiple kinds, but THIS time, only fetch the fulltext", and there isn't a clear generic Umlaut api for this. >>> >>> So yeah, thinking through it by talking it out, I think the first approach is more realistic and maintainable and less error prone -- if you ask Umlaut for only fulltext, it still fetches everything, but only displays fulltext. >>> >>> (For certain use cases hwere that's unacceptable and you need a lean mean umlaut response, there ought to be a way to create a custom _set of services_ trimmed down what you really want. but that feature is not entirely robust/mature/documented either. But it also cold be. Is why I'm curious to hear more about the actual use case here). >>> >>> On 2/6/2013 12:47 PM, PHILLIPS M.E. wrote: >>>>> If you tell Umlaut to display only 'fulltext' -- should Umlaut still _fetch_ everything, but only _display_ the fulltext? >>>>> This way the other stuff is still there cached if you re-load asking for everything, not just fulltext, although we'd >>>>> have to make sure the cache reloading works right. >>>>> OR, >>>>> should Umlaut attempt to only execute services involving 'fulltext' in the first place? That would possibly lead to >>>>> faster response times. >>>> >>>> The problem with the latter, as far as I understand it, is that any service can enhance the metadata and therefore affect the services in the next wave. 
There is no way for Umlaut to tell which services may enhance metadata and so if you only executed 'fulltext' services the metadata might not be as good as it could be and the user might therefore not be offered some opportunities for fulltext they might otherwise have. From rochkind at jhu.edu Fri Feb 8 22:30:09 2013 From: rochkind at jhu.edu (Jonathan Rochkind) Date: Fri, 8 Feb 2013 22:30:09 +0000 Subject: [Umlaut-general] Service type request parameter? In-Reply-To: References: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk> <5FB9E8F6-2094-4430-83B8-68CF31C605A8@gmail.com> <51128A18.8000308@jhu.edu> <51129EB8.5000705@jhu.edu> <8B09A024477AC941AE11850045CFDF781BAAE1@ait-pex02mbx05.win.dtu.dk> <665DBC51D0250A47B4F9306CE71E5FB776CCFA70@JHEMTEBEX1.win.ad.jhu.edu> <8B09A024477AC941AE11850045CFDF781BC2C2@ait-pex02mbx05.win.dtu.dk> <665DBC51D0250A47B4F9306CE71E5FB776CD0342@JHEMTEBEX1.win.ad.jhu.edu>, Message-ID: <665DBC51D0250A47B4F9306CE71E5FB776CD041E@JHEMTEBEX1.win.ad.jhu.edu> Awesome. If it's not there now, I want to add a feature where you can supply &umlaut.collection in the URL to specify a 'collection' (or 'service list') manually, without needing to override create_collection. Even if the API response includes responses from a previous context that were already fetched (I think it probably does; I can't decide whether it should, but I wouldn't be surprised if it did), I still think this isn't a problem for Rikke's use case.
If you're using the API anyway, it doesn't really matter what results are there; you can just pull out the fulltext ones you are interested in. The only reason you want to keep from fetching non-fulltext is performance. If they were already fetched in a past request anyway, then for Rikke's use case it doesn't hurt anything if they happen to be included in the API response (you just ignore them), if I understand things right. Scot, if you are having a problem with previously fetched responses from a different context being returned, I might have some ideas on how to fix it. It's all about the caching: we'd need to find a way to override the construction of the cache key (or make it configurable) so it's based on the things you want to determine the context (like IP address), so that Requests aren't re-used from previous contexts. ________________________________ From: Scot Dalton [std5 at nyu.edu] Sent: Friday, February 08, 2013 5:18 PM To: Jonathan Rochkind Cc: umlaut-general at rubyforge.org; scot.dalton at nyu.edu Subject: Re: [Umlaut-general] Service type request parameter? We do use the 'service list' feature based on a variety of criteria including client IP and URL params. In the UmlautController I think you need to override create_collection to return the Services you want to run given the context of the request. I'm not sure though if there were already some responses for that request for a given service type, e.g. fulltext, in a different context if a call to the API would return only those service responses associated with the current service list. I'm investigating this for an issue here and will report back. Thanks, Scot On Feb 8, 2013, at 4:31 PM, Jonathan Rochkind wrote: > That sounds interesting, do you maybe have a code or documentation reference for this? Not quite sure what "service list" refers to. So your list of services in config/services.yml ? Notice that they all begin with a key `default:`.
The idea is that you could then create other named lists of services, in addition to the 'default' one, and specify that you want to use THAT list of services (rather than the 'default' one) in a query parameter. But I forget how mature or robust this feature is. It's POSSIBLE that it's actually already there and completely done and you can just use it. Scot, do you use this feature, can you provide any more info? Rikke, I'm going to be at a conference all next week and probably won't have time to look into this -- but feel free to file a GitHub Issue asking the question, or remind me the week of February 18th to look into it, and I'll figure it out for you. It might already just be working and usable. Either way, I'm sure we can figure out a solution to "just show fulltext" -- I'm more worried, however, as I said, about the performance implication of making 10+ umlaut requests at a time for your result list. Note that if you are using the Umlaut api, as it sounds like you are, then you can ALREADY place only the fulltext results on the page, and ignore the rest. So an Umlaut improvement that still _fetched_ everything, but only _displayed_ the fulltext hits would be of no value to you whatsoever -- you can already easily do that. So you really DO need a feature to allow you to limit what work Umlaut does, in an effort to make it doable for 10+ requests per page. So since that's the case, I really do think the "service list group" feature is the one you want. I think it's going to be a lot more straightforward to set a different group of services to be run in this "only the fulltext" case, than it is to get umlaut to reliably automatically detect which services are 'fulltext' and only run those. (See, this is why it's good to actually explain your use case instead of asking a very specific technical question! What you REALLY needed was only apparent after we solicited more info).
________________________________ From: umlaut-general-bounces at rubyforge.org [umlaut-general-bounces at rubyforge.org] on behalf of Rikke Willer [riwi at dtic.dtu.dk] Sent: Friday, February 08, 2013 3:36 PM To: > Subject: Re: [Umlaut-general] Service type request parameter? Hi Jonathan, Happy to here you are considering Umlaut too. Are you still sort of experimenting, or are you pretty committed to using Umlaut at the moment? At what institution? DTU (Technical University of Denmark). We are still experimenting (leaning towards commitment). When you say you are going to use it to "display only fulltext links in a search result list while showing more details in a single document view" -- are you talking about using the Umlaut api to get fulltext results and embed them directly on your discovery results list page? You of course _could_ do this even without giving any other instructions to Umlaut -- Umlaut could keep fetching everything, but your api consuming code could only pay attention to fulltext, only stick fulltext on the host page. However, there might be performance issues with making 10+ Umlaut requests per results page. They might be ameliorated if you get umlaut to fetch yes, but I do worry a bit -- this is not something I myself have done because of worries about performance. Interested in your experiences for sure. Do want to warn you that you are exploring somewhat new territory here. Thank you for the warning :) I will give it a go, and see how it works out. And do think for this use case, it _might_ make sense to define a different "service list" in Umlaut, the service list used just for the fulltext api results (and anything else supporting you need for metadata enhancement), and then have your request say "Use the service list labelled X". Rather than try to get Umlaut to "fetch only fulltext services". 
There is already a feature in Umlaut for defining alternate service lists and mentioning them by name to use -- Scot, I forget, are you currently using that feature? That sounds interesting, do you maybe have a code or documentation reference for this? Not quite sure what "service list" refers to. Thanks, Rikke ________________________________ From: umlaut-general-bounces at rubyforge.org [umlaut-general-bounces at rubyforge.org] on behalf of Rikke Willer [riwi at dtic.dtu.dk] Sent: Thursday, February 07, 2013 7:04 AM To: > Subject: Re: [Umlaut-general] Service type request parameter? Hi Ross, Jonathan and Matthew, thank you for the answers. The use case is to use Umlaut from a discovery interface, e.g. Blacklight, to display only fulltext links in a search result list, while showing more details in a single document view. In general I think it would be a nice feature to be able to specify the service types needed on request time, so you by default could enable all the services you have access to/a possible need for in the Umlaut configuration, and on request time decide which to use. In this use case I think it would make the most sense performancewise to only execute the services needed. Although I see the point wrt referent enhancements, I think this would be worth the trade off, as long as this is clearly stated in the documentation. Wrt caching I would suggest just to keep it simple and to regard requests with different service types specified as separate, with separate caching. This would imply that the service types should be specified with the svc parameter rather than an umlaut parameter. I will implement this in our Umlaut application. If you think this solution makes sense in general, I would be happy to contribute the code changes. Thanks again, Rikke Willer Programmer DTU Library --------------------------------------- Technical University of Denmark Technical Information Center of Denmark Anker Engelunds Vej 1 P.O. Box 777 Building 101D 2800 Kgs. 
Lyngby Direct +45 45257254 riwi at dtic.dtu.dk http://www.dtic.dtu.dk/ On Feb 6, 2013, at 19:19 , Jonathan Rochkind > wrote: True, yeah. I think the safest simplest thing to do is the first one, where Umlaut fetches everything, but only displays the requested services. In theory a service should advertise whether they enhance metadata, in a way that Umlaut can use for it's own decision-making. However, not all services do this, because I added the feature a bit late, and it usually doesn't matter. (Even services I am the author of haven't all been updated to do this). Additinoally, some services fetch multiple times of content (fulltext AND something else), and only some of them have the ability to be told "Okay, normally you fetch multiple kinds, but THIS time, only fetch the fulltext", and there isn't a clear generic Umlaut api for this. So yeah, thinking through it by talking it out, I think the first approach is more realistic and maintainable and less error prone -- if you ask Umlaut for only fulltext, it still fetches everything, but only displays fulltext. (For certain use cases hwere that's unacceptable and you need a lean mean umlaut response, there ought to be a way to create a custom _set of services_ trimmed down what you really want. but that feature is not entirely robust/mature/documented either. But it also cold be. Is why I'm curious to hear more about the actual use case here). On 2/6/2013 12:47 PM, PHILLIPS M.E. wrote: If you tell Umlaut to display only 'fulltext' -- should Umlaut still _fetch_ everything, but only _display_ the fulltext? This way the other stuff is still there cached if you re-load asking for everything, not just fulltext, although we'd have to make sure the cache reloading works right. OR, should Umlaut attempt to only execute services involving 'fulltext' in the first place? That would possibly lead to faster response times. 
The problem with the latter, as far as I understand it, is that any service can enhance the metadata and therefore affect the services in the next wave. There is no way for Umlaut to tell which services may enhance metadata and so if you only executed 'fulltext' services the metadata might not be as good as it could be and the user might therefore not be offered some opportunities for fulltext they might otherwise have. _______________________________________________ Umlaut-general mailing list Umlaut-general at rubyforge.org http://rubyforge.org/mailman/listinfo/umlaut-general _______________________________________________ Umlaut-general mailing list Umlaut-general at rubyforge.org http://rubyforge.org/mailman/listinfo/umlaut-general -------------- next part -------------- An HTML attachment was scrubbed... URL: From rochkind at jhu.edu Fri Feb 8 21:31:28 2013 From: rochkind at jhu.edu (Jonathan Rochkind) Date: Fri, 8 Feb 2013 21:31:28 +0000 Subject: [Umlaut-general] Service type request parameter? In-Reply-To: <8B09A024477AC941AE11850045CFDF781BC2C2@ait-pex02mbx05.win.dtu.dk> References: <8B09A024477AC941AE11850045CFDF781BA054@ait-pex02mbx05.win.dtu.dk> <5FB9E8F6-2094-4430-83B8-68CF31C605A8@gmail.com> <51128A18.8000308@jhu.edu> <51129EB8.5000705@jhu.edu> <8B09A024477AC941AE11850045CFDF781BAAE1@ait-pex02mbx05.win.dtu.dk> <665DBC51D0250A47B4F9306CE71E5FB776CCFA70@JHEMTEBEX1.win.ad.jhu.edu>, <8B09A024477AC941AE11850045CFDF781BC2C2@ait-pex02mbx05.win.dtu.dk> Message-ID: <665DBC51D0250A47B4F9306CE71E5FB776CD0342@JHEMTEBEX1.win.ad.jhu.edu> > That sounds interesting, do you maybe have a code or documentation reference for this? Not quite sure what "service list" refers to. So your list of services in config/services.yml ? Notice that they all begin with a key `default:`. 
The idea is that you could then create other named lists of services, in addition to the 'default' one, and specify that you want to use THAT list of services (rather than the 'default' one) in a query parameter. But I forget how mature or robust this feature is. It's POSSIBLE that it's actually already there and completely done and you can just use it. Scot, do you use this feature, can you provide any more info? Rikke, I'm going to be at a conference all next week and probably won't have time to look into this -- but feel free to file a GitHub Issue asking the question, or remind me the week of February 18th to look into it, and I'll figure it out for you. It might already just be working and usable. Either way, I'm sure we can figure out a solution to "just show fulltext" -- I'm more worried, however, as I said, about the performance implication of making 10+ umlaut requests at a time for your result list. Note that if you are using the Umlaut api, as it sounds like you are, then you can ALREADY place only the fulltext results on the page, and ignore the rest. So an Umlaut improvement that still _fetched_ everything, but only _displayed_ the fulltext hits would be of no value to you whatsoever -- you can already easily do that. So you really DO need a feature to allow you to limit what work Umlaut does, in an effort to make it doable for 10+ requests per page. So since that's the case, I really do think the "service list group" feature is the one you want. I think it's going to be a lot more straightforward to set a different group of services to be run in this "only the fulltext" case, than it is to get umlaut to reliably automatically detect which services are 'fulltext' and only run those. (See, this is why it's good to actually explain your use case instead of asking a very specific technical question! What you REALLY needed was only apparent after we solicited more info).
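
To make the "named service list" idea above a bit more concrete, here is a rough Ruby sketch. It is not Umlaut's own code: the only thing taken from this thread is the top-level `default:` key. The nested `services:` layout, the `fulltext_only` list name, the individual service entries and the `services_for` helper are all assumptions made up for illustration.

```ruby
require 'yaml'

# Hypothetical services.yml content. Only the 'default' key comes from the
# thread; the 'fulltext_only' list and every service entry are invented.
SERVICES_YAML = <<~YAML
  default:
    services:
      SFX:
        type: Sfx
        priority: 3
      Amazon:
        type: Amazon
        priority: 4
  fulltext_only:
    services:
      SFX:
        type: Sfx
        priority: 3
YAML

# Hypothetical helper: return the service definitions for the requested
# list name, falling back to 'default' when the name is missing or unknown.
def services_for(config, list_name)
  group = config[list_name.to_s] || config.fetch('default')
  group.fetch('services')
end

config = YAML.load(SERVICES_YAML)
puts services_for(config, 'fulltext_only').keys.inspect
puts services_for(config, nil).keys.inspect
```

The fallback to 'default' is a design choice in this sketch only, so that a request naming an unknown list still gets the normal set of services rather than none.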
________________________________ From: umlaut-general-bounces at rubyforge.org [umlaut-general-bounces at rubyforge.org] on behalf of Rikke Willer [riwi at dtic.dtu.dk] Sent: Friday, February 08, 2013 3:36 PM To: Subject: Re: [Umlaut-general] Service type request parameter? Hi Jonathan, Happy to here you are considering Umlaut too. Are you still sort of experimenting, or are you pretty committed to using Umlaut at the moment? At what institution? DTU (Technical University of Denmark). We are still experimenting (leaning towards commitment). When you say you are going to use it to "display only fulltext links in a search result list while showing more details in a single document view" -- are you talking about using the Umlaut api to get fulltext results and embed them directly on your discovery results list page? You of course _could_ do this even without giving any other instructions to Umlaut -- Umlaut could keep fetching everything, but your api consuming code could only pay attention to fulltext, only stick fulltext on the host page. However, there might be performance issues with making 10+ Umlaut requests per results page. They might be ameliorated if you get umlaut to fetch yes, but I do worry a bit -- this is not something I myself have done because of worries about performance. Interested in your experiences for sure. Do want to warn you that you are exploring somewhat new territory here. Thank you for the warning :) I will give it a go, and see how it works out. And do think for this use case, it _might_ make sense to define a different "service list" in Umlaut, the service list used just for the fulltext api results (and anything else supporting you need for metadata enhancement), and then have your request say "Use the service list labelled X". Rather than try to get Umlaut to "fetch only fulltext services". 
There is already a feature in Umlaut for defining alternate service lists and mentioning them by name to use -- Scot, I forget, are you currently using that feature? That sounds interesting, do you maybe have a code or documentation reference for this? Not quite sure what "service list" refers to. Thanks, Rikke ________________________________ From: umlaut-general-bounces at rubyforge.org [umlaut-general-bounces at rubyforge.org] on behalf of Rikke Willer [riwi at dtic.dtu.dk] Sent: Thursday, February 07, 2013 7:04 AM To: > Subject: Re: [Umlaut-general] Service type request parameter? Hi Ross, Jonathan and Matthew, thank you for the answers. The use case is to use Umlaut from a discovery interface, e.g. Blacklight, to display only fulltext links in a search result list, while showing more details in a single document view. In general I think it would be a nice feature to be able to specify the service types needed on request time, so you by default could enable all the services you have access to/a possible need for in the Umlaut configuration, and on request time decide which to use. In this use case I think it would make the most sense performancewise to only execute the services needed. Although I see the point wrt referent enhancements, I think this would be worth the trade off, as long as this is clearly stated in the documentation. Wrt caching I would suggest just to keep it simple and to regard requests with different service types specified as separate, with separate caching. This would imply that the service types should be specified with the svc parameter rather than an umlaut parameter. I will implement this in our Umlaut application. If you think this solution makes sense in general, I would be happy to contribute the code changes. Thanks again, Rikke Willer Programmer DTU Library --------------------------------------- Technical University of Denmark Technical Information Center of Denmark Anker Engelunds Vej 1 P.O. Box 777 Building 101D 2800 Kgs. 
Lyngby Direct +45 45257254 riwi at dtic.dtu.dk http://www.dtic.dtu.dk/ On Feb 6, 2013, at 19:19 , Jonathan Rochkind > wrote: True, yeah. I think the safest simplest thing to do is the first one, where Umlaut fetches everything, but only displays the requested services. In theory a service should advertise whether they enhance metadata, in a way that Umlaut can use for it's own decision-making. However, not all services do this, because I added the feature a bit late, and it usually doesn't matter. (Even services I am the author of haven't all been updated to do this). Additinoally, some services fetch multiple times of content (fulltext AND something else), and only some of them have the ability to be told "Okay, normally you fetch multiple kinds, but THIS time, only fetch the fulltext", and there isn't a clear generic Umlaut api for this. So yeah, thinking through it by talking it out, I think the first approach is more realistic and maintainable and less error prone -- if you ask Umlaut for only fulltext, it still fetches everything, but only displays fulltext. (For certain use cases hwere that's unacceptable and you need a lean mean umlaut response, there ought to be a way to create a custom _set of services_ trimmed down what you really want. but that feature is not entirely robust/mature/documented either. But it also cold be. Is why I'm curious to hear more about the actual use case here). On 2/6/2013 12:47 PM, PHILLIPS M.E. wrote: If you tell Umlaut to display only 'fulltext' -- should Umlaut still _fetch_ everything, but only _display_ the fulltext? This way the other stuff is still there cached if you re-load asking for everything, not just fulltext, although we'd have to make sure the cache reloading works right. OR, should Umlaut attempt to only execute services involving 'fulltext' in the first place? That would possibly lead to faster response times. 
The problem with the latter, as far as I understand it, is that any service can enhance the metadata and therefore affect the services in the next wave. There is no way for Umlaut to tell which services may enhance metadata and so if you only executed 'fulltext' services the metadata might not be as good as it could be and the user might therefore not be offered some opportunities for fulltext they might otherwise have. From rochkind at jhu.edu Wed Feb 20 23:03:05 2013 From: rochkind at jhu.edu (Jonathan Rochkind) Date: Wed, 20 Feb 2013 18:03:05 -0500 Subject: [Umlaut-general] note, Umlaut JS not currently compatible with Jquery 1.9 In-Reply-To: <51254E7C.2010800@jhu.edu> References: <51254E7C.2010800@jhu.edu> Message-ID: <51255629.6090900@jhu.edu> Okay, I released Umlaut 3.0.5, which should be compatible with jQuery 1.9. If anyone else notices any problems, of course let us know. On 2/20/2013 5:30 PM, Jonathan Rochkind wrote: > I just noticed Umlaut's JS is not compatible with JQuery 1.9. > > If you update your jquery-rails gem (on purpose or accidentally) you'll get JQuery 1.9, and umlaut's JS (including background service loading) will stop working. > > I'll get this fixed pretty soon and release a point-release. In the meantime, don't upgrade jquery-rails to a version that supplies JQuery 1.9.
:) From rochkind at jhu.edu Wed Feb 20 23:17:25 2013 From: rochkind at jhu.edu (Jonathan Rochkind) Date: Wed, 20 Feb 2013 18:17:25 -0500 Subject: [Umlaut-general] Q: template change around fulltext coverage Message-ID: <51255985.8080206@jhu.edu> I've been thinking that the default template that shows you fulltext links needs to make the dates of coverage more prominent. This only concerns the "title-level" screens that output dates of coverage. Currently, taking more or less after SFX, it looks something like this:

* JSTOR Life Sciences Collection
  Available from 1880 volume: 1 issue: 1 until 2007 volume: 318 issue: 5858.
* EBSCOhost Academic Search Complete
  Available from 1997 until 2004.

That is, the name of the target is prominent, with a possibly lengthy description of coverage below. But what our users care about is more the coverage as a distinguishing feature of different links, not the name. However, the full description of coverage is LENGTHY, possibly including volumes and issues, possibly including embargo information. Here is what I'm going to try: duplicating JUST the years as a summary, as a prefix of the first line, and leaving the entire statement of coverage there for people who want details, probably with a CSS style greying it out.

* 1880-2007: JSTOR Life Sciences Collection
  Available from 1880 volume: 1 issue: 1 until 2007 volume: 318 issue: 5858.
* 1997-2004: EBSCOhost Academic Search Complete
  Available from 1997 until 2004.
* 1997-present: Something
  Available from 1997
* 1997-2012: Something Embargoed
  Available from 1997 with a one year embargo.

The years will be bold but not part of the hyperlink. That seems to me to be the simplest thing that will improve things. I am probably going to do this, and make it a part of Umlaut. It will work for the SFX plug-in; other possible future plugins will work too if they can provide machine-actionable coverage dates.
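
A rough Ruby sketch of how the year prefix described above could be derived from machine-actionable coverage dates. This is not the actual Umlaut or SFX plug-in code: the method name `coverage_summary` and all of its arguments are hypothetical, and it assumes an embargo is expressed in whole years counted back from the current year.

```ruby
# Hypothetical helper, not Umlaut's actual template code: builds the short
# "1880-2007" style prefix from machine-actionable coverage years.
#
# from_year     - first year of coverage (Integer)
# to_year       - last year of coverage, or nil when coverage is ongoing
# embargo_years - years of moving-wall embargo, 0 if none (assumed unit)
# this_year     - current year, passed in so results are predictable
def coverage_summary(from_year, to_year: nil, embargo_years: 0, this_year: Time.now.year)
  last = if to_year
    to_year.to_s
  elsif embargo_years > 0
    (this_year - embargo_years).to_s
  else
    'present'
  end
  "#{from_year}-#{last}"
end

puts coverage_summary(1880, to_year: 2007)
puts coverage_summary(1997)
puts coverage_summary(1997, embargo_years: 1, this_year: 2013)
```

With the current year taken as 2013, the three calls line up with the "1880-2007", "1997-present" and "1997-2012" examples in the list above.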
Main question: can I go ahead and make this the ordinary baked-in
template in future Umlaut releases, or should I make this an _option_
instead, so you can still get the old one?

Or any other feedback?

Jonathan

From rochkind at jhu.edu Wed Feb 20 22:30:20 2013
From: rochkind at jhu.edu (Jonathan Rochkind)
Date: Wed, 20 Feb 2013 17:30:20 -0500
Subject: [Umlaut-general] note, Umlaut JS not currently compatible with Jquery 1.9
Message-ID: <51254E7C.2010800@jhu.edu>

I just noticed Umlaut's JS is not compatible with JQuery 1.9.

If you update your jquery-rails gem (on purpose or accidentally) you'll
get JQuery 1.9, and umlaut's JS (including background service loading)
will stop working.

I'll get this fixed pretty soon and release a point-release. In the
meantime, don't upgrade jquery-rails to a version that supplies JQuery
1.9. :)

From rochkind at jhu.edu Wed Feb 20 23:34:10 2013
From: rochkind at jhu.edu (Jonathan Rochkind)
Date: Wed, 20 Feb 2013 18:34:10 -0500
Subject: [Umlaut-general] delete rubyforge project, move this list?
Message-ID: <51255D72.40008@jhu.edu>

Does anyone still need the old Umlaut 2.x svn repo on rubyforge? You
might if you're still running umlaut 2.x. While the umlaut 2.x code IS
on our new github repo, some of the umlaut 2.x instructions and
supported utilities assume svn access.

However, I'd really like to delete the rubyforge project entirely, to
avoid confusion. Is anyone still using it?

I would also like to move this listserv away from rubyforge, which isn't
all that reliable and acts weirdly, and because I want to delete the
rubyforge project entirely.

The only good option I know for listserv hosting that works how I'd want
is Google Groups. Others have suggested alternatives before, but none of
them I've looked at have seemed satisfactory to me. But you have one
more chance to suggest alternatives if you know of something you'd
prefer to Google Groups.
:)

From rochkind at jhu.edu Thu Feb 21 22:44:05 2013
From: rochkind at jhu.edu (Jonathan Rochkind)
Date: Thu, 21 Feb 2013 17:44:05 -0500
Subject: [Umlaut-general] New Umlaut/SFX feature for collapsing targets from same vendor platform
Message-ID: <5126A335.70305@jhu.edu>

The way the SFX KB works, you often end up with multiple links that all
go to the same place, say different EBSCO packages that are all really
the same destination.

You may be using the SFX 'display logic' feature to try to deal with
that, and that is often fine. But if you are unhappy with that feature,
and want to configure this in Umlaut instead, there's a new feature in
the Umlaut SFX plugin for this -- in particular, it takes account of
coverages when 'rolling up' apparently duplicate links, so as not to
suppress a link that really reports unique coverage.

This is in Umlaut master, but not yet in an Umlaut release.

More info here:
http://bibwild.wordpress.com/2013/02/21/umlaut-new-feature-for-sfx-target-roll-up/

From rochkind at jhu.edu Wed Feb 27 16:00:06 2013
From: rochkind at jhu.edu (Jonathan Rochkind)
Date: Wed, 27 Feb 2013 11:00:06 -0500
Subject: [Umlaut-general] bye bye 1.8.7?
Message-ID: <512E2D86.5030309@jhu.edu>

So when I released Umlaut 3, I put the somewhat vague statement in the
README: "Only tested with 1.9.3". But Umlaut still worked with 1.8.7 at
that point.

Later Scot awesomely added Continuous Integration running of automated
tests with travis, and included running the tests under both 1.8.7 and
1.9.3, which made the README not entirely accurate. (Note that umlaut
still doesn't have GREAT test coverage; I would not have a huge level of
confidence that everything works just because the tests pass. But we add
more tests all the time, and try to add them for any new code we write.)

But lately I keep getting test failures in 1.8.7 but not 1.9.3.
Usually but not always they are 'testing failures' rather than actual
code failures (something wrong with the testing framework under 1.8.7,
not with the actual code). But always they are an inconvenience for the
developer.

My inclination is to stop the 1.8.7 tests, and change the README to say
Umlaut is 1.9.3 only, for real.

I doubt anyone is running Umlaut 3 under 1.8.7. 1.9.3 was pretty well
established before Umlaut 3.0.0 came out. 1.8.7 will, in only a few
months, be completely end-of-lifed without even security updates. Rails
4 does not run on 1.8.7. Ruby 2.0 just came out. 1.8.7 is dead.

So say something soon if you don't want me to; otherwise, I will.

Now, you ask, what about the future? Ruby 2.0 and the first Rails 4.0
beta just came out. But I'm honestly in no hurry to get Umlaut working
under either one. I've spent way too much time in the past 2 years
migrating Umlaut to newer rails/ruby/etc, compared to adding new
features, and I'm kind of sick of it. I have no desire to be the early
adopter with either ruby 2.0 or rails 4.0; let others work out the kinks
first. So I'm in no hurry, and have no specific timeline or plans.

When I or someone else does get around to testing under Rails 4 and/or
ruby 2.0 and making changes to accommodate them, I would expect/suggest
that one should _first_ deal with Rails 4 while sticking with ruby
1.9.3, and only when that is working flawlessly move on to ruby 2.0. I
expect ruby 1.9.3 will stick around far longer than Rails 3 will.
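As a side note on the 1.9.3-only decision discussed above: besides the
README, a version floor can also be declared in a gemspec, which
RubyGems checks at install time. The sketch below is only an
illustration under stated assumptions. The gem name, version, summary
and author are placeholders, and nothing here is taken from Umlaut's
actual gemspec; only required_ruby_version and the Gem::Requirement /
Gem::Version calls are standard RubyGems API.

```ruby
require "rubygems"

# Hypothetical gemspec fragment; name, version, summary and authors are
# placeholders for illustration, not Umlaut's real metadata.
spec = Gem::Specification.new do |s|
  s.name    = "example_engine"
  s.version = "0.0.1"
  s.summary = "Placeholder gem used only to illustrate a Ruby version floor"
  s.authors = ["Example Author"]

  # Declare the minimum supported Ruby version.
  s.required_ruby_version = ">= 1.9.3"
end

# The declared floor is a Gem::Requirement and can be checked directly.
requirement = spec.required_ruby_version
puts requirement.satisfied_by?(Gem::Version.new("1.8.7"))  # prints "false"
puts requirement.satisfied_by?(Gem::Version.new("1.9.3"))  # prints "true"
```

Whether to add such a declaration is a separate choice from dropping the
1.8.7 test runs; it just makes the stated support level
machine-checkable.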