From scaudill at gmail.com Wed Dec 13 14:14:06 2006
From: scaudill at gmail.com (Stephen Caudill)
Date: Wed, 13 Dec 2006 14:14:06 -0500
Subject: [s3-dev] AWS::S3::SignatureDoesNotMatch error when content-type can't be inferred
Message-ID: <8613E045-44D9-4805-BF00-CBCBDA8DD439@gmail.com>

Marcel,

First off. The lib is nice, a big thanks to you :)

Down to bidness. When AWS::S3 fails to infer the content-type (odd
file extension, no file extension), it throws an
AWS::S3::SignatureDoesNotMatch error, which is misleading. I've
gotten the same results from both 0.2.1 and latest trunk.

Here's a script that demonstrates it:

[02:05:58][caudill at lazuli][Desktop]$ cat aws-script-helper-test.rb
require 'rubygems'
require 'aws/s3'
include AWS::S3

Base.establish_connection!(
  :access_key_id => 'foo',
  :secret_access_key => 'bar'
)

Bucket.create('buckets-r-teh-niftyest')

S3Object.store('s3sh', File.open('/usr/local/bin/s3sh'), 'buckets-r-teh-niftyest')
%

Here's the error, as reported by RubyMate:

AWS::S3::SignatureDoesNotMatch: The request signature we calculated
does not match the signature you provided. Check your key and signing
method.

method raise in error.rb at line 38
method request in base.rb at line 72
method put in base.rb at line 83
method store in object.rb at line 242
at top level in aws-script-helper-test.rb at line 12

And as reported when run by ruby directly:

/usr/local//lib/ruby/gems/1.8/gems/aws-s3-0.2.1.1166035662/lib/aws/s3/error.rb:38:in `raise': The AWS Access Key Id you provided does not exist in our records.
(AWS::S3::InvalidAccessKeyId)
    from /usr/local//lib/ruby/gems/1.8/gems/aws-s3-0.2.1.1166035662/lib/aws/s3/base.rb:72:in `request'
    from /usr/local//lib/ruby/gems/1.8/gems/aws-s3-0.2.1.1166035662/lib/aws/s3/base.rb:83:in `put'
    from /usr/local//lib/ruby/gems/1.8/gems/aws-s3-0.2.1.1166035662/lib/aws/s3/bucket.rb:79:in `create'
    from aws-script-helper-test.rb:10

Still digging into your lib, so I'm not sure where the indirection is
coming from yet, but I'll dig in and see if I can figure it out.

Cheers,

Stephen

From marcel at vernix.org Thu Dec 14 17:59:25 2006
From: marcel at vernix.org (Marcel Molina Jr.)
Date: Thu, 14 Dec 2006 22:59:25 +0000
Subject: [s3-dev] AWS::S3::SignatureDoesNotMatch error when content-type can't be inferred
In-Reply-To: <8613E045-44D9-4805-BF00-CBCBDA8DD439@gmail.com>
References: <8613E045-44D9-4805-BF00-CBCBDA8DD439@gmail.com>
Message-ID: <20061214225925.GJ79397@comox.textdrive.com>

On Wed, Dec 13, 2006 at 02:14:06PM -0500, Stephen Caudill wrote:
> Down to bidness. When AWS::S3 fails to infer the content-type (odd
> file extension, no file extension), it throws an
> AWS::S3::SignatureDoesNotMatch error, which is misleading. I've
> gotten the same results from both 0.2.1 and latest trunk.
>
> here's a script that demonstrates it:
>
> [02:05:58][caudill at lazuli][Desktop]$ cat aws-script-helper-test.rb
> require 'rubygems'
> require 'aws/s3'
> include AWS::S3
>
> Base.establish_connection!(
>   :access_key_id => 'foo',
>   :secret_access_key => 'bar'
> )
>
> Bucket.create('buckets-r-teh-niftyest')
>
> S3Object.store('s3sh', File.open('/usr/local/bin/s3sh'), 'buckets-r-
> teh-niftyest')
> %
>
> Here's the error, as reported by RubyMate:
>
> AWS::S3::SignatureDoesNotMatch: The request signature we calculated
> does not match the signature you provided. Check your key and signing
> method.

Hey Stephen, thanks for reporting this.
Someone brought this up a few weeks ago
(http://developer.amazonwebservices.com/connect/message.jspa?messageID=49153#49156)
and at the time I couldn't recreate it. I still can't recreate it actually,
which is weird:

>> S3Object.store('s3sh', File.open('/opt/local/bin/s3sh'), 'marcel')
=> #

This is the patch that David Hanson proposed:

Index: lib/aws/s3/object.rb
===================================================================
--- lib/aws/s3/object.rb (revision 142)
+++ lib/aws/s3/object.rb (working copy)
@@ -302,6 +302,8 @@
             return if options.has_key?(:content_type)
             if mime_type = MIME::Types.type_for(key).first
               options[:content_type] = mime_type.content_type
+            else
+              options[:content_type] = 'binary/octet-stream'
             end
           end
       end

Since I can't reproduce the problem, I don't know if that fixes it, though it
seems to have worked for him. Unfortunately that breaks other behavior. It
makes it so you can't explicitly set a content type on an instance of an
S3Object. The fix so that both live happily ever after probably isn't that
tough. I plan on looking into the specifics and working up a fix soon.

Thanks again for the report and sorry for the lag time in responding (I was
without internet yesterday),
marcel
--
Marcel Molina Jr.

From metalhead at metalhead.ws Sat Dec 16 15:10:06 2006
From: metalhead at metalhead.ws (Metalhead)
Date: Sat, 16 Dec 2006 21:10:06 +0100
Subject: [s3-dev] patch for 1000 objects limitation
Message-ID: <20061216211006.13ce04f5@huginn.asgard.yggdrasill>

Hi all,

I've included a patch against 0.2.1 that works around the limit of not more
than 1000 objects being returned by the service. I've successfully tested it
with a bucket containing 50000+ items.

I'm not entirely sure that I've added the patch at the right method because I
don't understand the code completely, so please feel free to comment on it and
suggest improvements.
I've also added '+' to the list of unsafe characters for URL escaping because S3
converts '+' to ' ' when used in the value of the marker option.

Regards,
Lars

diff -ru aws-s3-0.2.1/lib/aws/s3/bucket.rb aws-s3-0.2.1-mine/lib/aws/s3/bucket.rb
--- aws-s3-0.2.1/lib/aws/s3/bucket.rb 2006-12-04 08:29:30.000000000 +0100
+++ aws-s3-0.2.1-mine/lib/aws/s3/bucket.rb 2006-12-16 21:00:09.000000000 +0100
@@ -103,7 +103,15 @@
         # There are several options which allow you to limit which objects are retrieved. The list of object filtering options
         # are listed in the documentation for Bucket.objects.
         def find(name = nil, options = {})
-          new(get(path(name, options)).bucket)
+          response = get(path(name, options)).bucket
+          if not options.has_key?(:max_keys) and response['is_truncated']
+            begin
+              options[:marker] = response['contents'].last['key']
+              temp_response = get(path(name, options)).bucket
+              response['contents'] += temp_response['contents']
+            end while(temp_response['is_truncated'])
+          end
+          new(response)
         end

         # Return just the objects in the bucket named name.
diff -ru aws-s3-0.2.1/lib/aws/s3/extensions.rb aws-s3-0.2.1-mine/lib/aws/s3/extensions.rb
--- aws-s3-0.2.1/lib/aws/s3/extensions.rb 2006-12-04 07:36:21.000000000 +0100
+++ aws-s3-0.2.1-mine/lib/aws/s3/extensions.rb 2006-12-16 21:00:09.000000000 +0100
@@ -1,5 +1,9 @@
 #:stopdoc:

+module URI
+  UNSAFE = Regexp.new(URI::UNSAFE.to_s.sub('+', ''))
+end
+
 class Hash
   def to_query_string(include_question_mark = true)
     return '' if empty?

--
4th Law of Hacking: you will find the exit at the entrance.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/amazon-s3-dev/attachments/20061216/d2cf609d/attachment.bin

From marcel at vernix.org Sun Dec 17 16:07:48 2006
From: marcel at vernix.org (Marcel Molina Jr.)
Date: Sun, 17 Dec 2006 21:07:48 +0000
Subject: [s3-dev] patch for 1000 objects limitation
In-Reply-To: <20061216211006.13ce04f5@huginn.asgard.yggdrasill>
References: <20061216211006.13ce04f5@huginn.asgard.yggdrasill>
Message-ID: <20061217210748.GS79397@comox.textdrive.com>

On Sat, Dec 16, 2006 at 09:10:06PM +0100, Metalhead wrote:
> I've included a patch against 0.2.1 that works around the limit of not more
> than 1000 objects being returned by the service. I've successfully tested it with
> a bucket containing 50000+ items.
>
> I'm not entirely sure that I've added the patch at the right method because I
> don't understand the code completely, so please feel free to comment on it and
> suggest improvements.
>
> I've also added '+' to the list of unsafe characters for URL escaping because S3
> converts '+' to ' ' when used in the value of the marker option.

Hey, thanks for the patch. Supporting the retrieval of more than 1000 objects
in a bucket listing has been on my TODO list for a while and has languished
there as I've been on the fence about whether I wanted it implemented.
Personally, I've never had a need for fetching that many objects in a bucket
listing, even though we've got millions of objects in our buckets. It's just
not something I've needed.

Just curious: Do you actually need this in your use of the library or were
you implementing it because you noticed it wasn't supported? If you do
actually need it, I'd be interested to know what scenario calls for it.
That's one of the neat things about throwing the library out there for the
world to use. Finding out all the ways other people are using it that are
different from how you are using it.

I'm still a bit on the fence about it so I'd like to hear more.
I think a better place for it would be in AWS::S3::Bucket::Response which would
transparently handle appending to the response until it got all the objects
that it wanted so that the find method doesn't need to know anything about
the 1000 object limitation, though the response would need to have access to
the request options so it knew what :max_keys was set to.

The feature would definitely also need tests. You may have noticed that I
have a suite of remote tests which aren't mocks and actually hit S3. On the
one hand this is a good candidate for those remote tests (which at present
only I can run, since they go against my personal account). This would likely
need to not be one of those, otherwise it would take too long to run. I think
a huge xml fixture would need to be generated for an initial request of 1000
and then a second for the remainder. They would then be mocked using the
request_returns mocking method I have in the tests.

At present, even with libxml, I haven't seen very good performance from the
library when it needs to parse that much xml. Which isn't to say I wouldn't
be interested in improving performance, but rather just that I'm wary of
diving into adding the capability for slurping down thousands upon thousands
of objects' worth of xml without making sure that doesn't blow away your
system's resources.

Again though, thanks for the work put into the patch. I'm interested to hear
more about what led you to implementing it.

marcel
--
Marcel Molina Jr.

From scaudill at gmail.com Fri Dec 15 11:15:48 2006
From: scaudill at gmail.com (Stephen Caudill)
Date: Fri, 15 Dec 2006 11:15:48 -0500
Subject: [s3-dev] AWS::S3::SignatureDoesNotMatch error when content-type can't be inferred
In-Reply-To: <20061214225925.GJ79397@comox.textdrive.com>
References: <8613E045-44D9-4805-BF00-CBCBDA8DD439@gmail.com>
    <20061214225925.GJ79397@comox.textdrive.com>
Message-ID: <596512B7-22D6-4007-9A85-2EF88E44AC03@gmail.com>

On Dec 14, 2006, at 5:59 PM, Marcel Molina Jr.
wrote:
> Hey Stephen, thanks for reporting this.
>
> Someone brought this up a few weeks ago
> (http://developer.amazonwebservices.com/connect/message.jspa?messageID=49153#49156)
> and at the time I couldn't recreate
> it. I still can't recreate it actually, which is weird:
>
> >> S3Object.store('s3sh', File.open('/opt/local/bin/s3sh'), 'marcel')
> => #

Hmm... that is weird. I wonder if it's an environment thing?

I'm on a MacBook Pro. Here's my Ruby version, straight from the
horse's mouth:

[10:38:37][caudill at lazuli][caudill]$ ruby -v
ruby 1.8.5 (2006-08-25) [i686-darwin8.8.1]

I noted that you have a test that specifically checks to ensure that
you can store a file with no extension and no content-type
specified... which passes on my computer, only further adding to my
befuddlement.

Hmmm... I think I've got a box that's running ruby 1.8.4, let me give
that a whirl.

Okay, so on a Fedora Core 1 box:

[10:54:32][root at toohot][~]$ ruby -v
ruby 1.8.4 (2005-12-24) [i686-linux]

[11:01:03][root at toohot][lib]$ irb
irb(main):001:0> require 'aws/s3'
=> true
irb(main):002:0> include AWS::S3
=> Object
irb(main):003:0> Base.establish_connection!(
irb(main):004:1* :access_key_id => 'foo',
irb(main):005:1* :secret_access_key => 'bar'
irb(main):006:1> )
=> #, @secret_access_key="bar", @options={:secret_access_key=>"bar", :port=>80, :server=>"s3.amazonaws.com", :access_key_id=>"foo"}, @access_key_id="foo">
irb(main):007:0> Bucket.create('buckets-r-teh-niftyest')
=> true
irb(main):010:0> S3Object.store('s3sh', File.open('/usr/local/bin/s3sh'), 'buckets-r-teh-niftyest')
=> #

Works like a charm.

So, from my end, it looks like a difference between ruby 1.8.4 and
ruby 1.8.5. Also, given that the test that checks for nil content-type
being properly inferred (in trunk it's: test/object_test.rb:47) is
passing when the functionality fails, there's probably a bug there too.

> Since I can't reproduce the problem, I don't know if that fixes it
> though it seems to have worked for him. Unfortunately that breaks other
> behavior. It makes it so you can't explicitly set a content type on an
> instance of an S3Object. The fix so that both live happily ever after
> probably isn't that tough. I plan on looking into the specifics and
> working up a fix soon.

I've got some extra time this afternoon, so I'll take a look into it
too. I've got a lot more to go on now than I did before, so maybe
I'll have some luck with it.

> Thanks again for the report and sorry for the lag time in
> responding (I was without internet yesterday),
> marcel

No worries :) thanks a bunch for the lib. It's significantly more
elegant and complete than what I'd cobbled together :) I'm currently
using AWS::S3 on a development server on EC2 to manage our automated
backups, since there's no such thing as persistent storage. Always an
adventure...

Cheers!

- Stephen

From metalhead at metalhead.ws Sun Dec 17 16:43:11 2006
From: metalhead at metalhead.ws (Metalhead)
Date: Sun, 17 Dec 2006 22:43:11 +0100
Subject: [s3-dev] AWS::S3::SignatureDoesNotMatch error when content-type can't be inferred
In-Reply-To: <596512B7-22D6-4007-9A85-2EF88E44AC03@gmail.com>
References: <8613E045-44D9-4805-BF00-CBCBDA8DD439@gmail.com>
    <20061214225925.GJ79397@comox.textdrive.com>
    <596512B7-22D6-4007-9A85-2EF88E44AC03@gmail.com>
Message-ID: <20061217224311.5f502840@huginn.asgard.yggdrasill>

Just as a side note, I've noticed that bug, too. It seems to be an S3 problem
though, specifically with whitespace in HTTP requests. The problem occurred
for me when there was more than one line of whitespace separating the HTTP
method from the rest of the headers, or a line of whitespace after the first
header line after the HTTP method.

I guess that the server side has problems with whitespace...

Lars

--
They say that there once was a fearsome chaotic samurai named Luk No.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/amazon-s3-dev/attachments/20061217/cb9a268a/attachment.bin

From metalhead at metalhead.ws Sun Dec 17 17:04:16 2006
From: metalhead at metalhead.ws (Metalhead)
Date: Sun, 17 Dec 2006 23:04:16 +0100
Subject: [s3-dev] patch for 1000 objects limitation
In-Reply-To: <20061217210748.GS79397@comox.textdrive.com>
References: <20061216211006.13ce04f5@huginn.asgard.yggdrasill>
    <20061217210748.GS79397@comox.textdrive.com>
Message-ID: <20061217230416.4a89738f@huginn.asgard.yggdrasill>

> Just curious: Do you actually need this in your use of the library or were
> you implementing it because you noticed it wasn't supported? If you do
> actually need it, I'd be interested to know what scenario calls for it.
> That's one of the neat things about throwing the library out there for the
> world to use. Finding out all the ways other people are using it that are
> different from how you are using it.

I'm working on a backup solution that uses S3. The initial implementation was
using the everything-at-once approach, so you can guess that there were more
than 1000 files ;) Now I'm processing one directory at a time, but there are
still cases with more than 1000 objects -- e.g. mailing list directories,
browser caches, temporary directories.

> I'm still a bit on the fence about it so I'd like to hear more. I think a
> better place for it would be in AWS::S3::Bucket::Response which would
> transparently handle appending to the response until it got all the objects
> that it wanted so that the find method doesn't need to know anything about
> the 1000 object limitation, though the response would need to have access to
> the request options so it knew what :max_keys was set to.

That's what I thought.
The feature should definitely be handled further down in the application
stack, but that stack doesn't have access to all the information needed.

> The feature would definitely also need tests. You may have noticed that I
> have a suite of remote tests which aren't mocks and actually hit S3. On the
> one hand this is a good candidate for those remote tests (which at present
> only I can run, since they go against my personal account). This would likely
> need to not be one of those, otherwise it would take too long to run. I think
> a huge xml fixture would need to be generated for an initial request of 1000
> and then a second for the remainder. They would then be mocked using the
> request_returns mocking method I have in the tests.

I'm not too sure about the tests. Either it has to be a real test against S3,
which is difficult to realise, or it's a mock. If it's a mock, it's valuable
as a regression test, but as you're mocking the actual interface, it doesn't
really help for the real thing.

> At present, even with libxml, I haven't seen very good performance from the
> library when it needs to parse that much xml. Which isn't to say I wouldn't
> be interested in improving performance, but rather just that I'm wary of
> diving into adding the capability for slurping down thousands upon thousands
> of objects' worth of xml without making sure that doesn't blow away your
> system's resources.

That was quite a problem when I was testing it. Storing several thousand
objects is definitely not something for systems with few resources. Still,
processing time for parsing the XML was about the same as querying the service
and retrieving the XML on my machine (Pentium M 1.4GHz). In my opinion, it
doesn't really matter in that case. If you're putting several thousand objects
into your bucket, you really should know what you're doing, otherwise it's
your own fault. I needed about 170MB of memory for ca. 55,000 objects.

> Again though, thanks for the work put into the patch.
> I'm interested to hear
> more about what led you to implementing it.

There'll (hopefully) soon be an application to follow. In the meantime,
there'll be more patches ;) I've noticed 2 bugs, which I'll be posting patches
for shortly. Also, I've had very good experiences with persistent HTTP
connections with the example lib provided by Amazon (performance gains of
several 100%), so I'm planning to patch your lib to do persistent connections,
too.

Regards,
Lars

--
PLEASE ignore previous rumor.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/amazon-s3-dev/attachments/20061217/52b03e00/attachment-0001.bin

From metalhead at metalhead.ws Mon Dec 18 07:01:15 2006
From: metalhead at metalhead.ws (Metalhead)
Date: Mon, 18 Dec 2006 13:01:15 +0100
Subject: [s3-dev] bugfixes object creation / metadata store
Message-ID: <20061218130115.01b0911e@huginn.asgard.yggdrasill>

Hi,

I've included a patch that fixes 2 bugs I've noticed.

The first bug occurs when the creation of a new S3Object is attempted; About.new
requires a hash as an argument. Fixed by adding the empty hash as the default
argument.

The second bug is also concerned with the creation of new objects; the metadata
of non-stored objects isn't saved because a new About object is created when the
object is stored. Fixed by introducing a new attribute that checks whether About
has been initialised.

Thanks,
Lars

diff -bru aws-s3-0.2.1/lib/aws/s3/object.rb aws-s3-0.2.1-mine/lib/aws/s3/object.rb
--- aws-s3-0.2.1/lib/aws/s3/object.rb 2006-12-04 07:03:00.000000000 +0100
+++ aws-s3-0.2.1-mine/lib/aws/s3/object.rb 2006-12-18 12:50:01.000000000 +0100
@@ -304,7 +304,7 @@
       end

       class About < Hash #:nodoc:
-        def initialize(headers)
+        def initialize(headers={})
           super()
           replace(headers)
           metadata
@@ -491,7 +491,12 @@
       #   # => 'audio/mpeg'
       #   some_object.store
       def about
-        stored? ? self.class.about(key, bucket.name) : About.new
+        if stored?
+          self.class.about(key, bucket.name)
+        else
+          attributes['about_initialized'] = true
+          About.new
+        end
       end
       memoized :about

@@ -561,7 +566,7 @@

       # Returns true if the current object has been stored on S3 yet.
       def stored?
-        !attributes['e_tag'].nil?
+        !attributes['e_tag'].nil? || attributes['about_initialized']
       end

       def ==(s3object) #:nodoc:

--
Old hackers never die: young ones do.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/amazon-s3-dev/attachments/20061218/88bc6671/attachment.bin

From marcel at vernix.org Mon Dec 18 16:46:35 2006
From: marcel at vernix.org (Marcel Molina Jr.)
Date: Mon, 18 Dec 2006 21:46:35 +0000
Subject: [s3-dev] AWS::S3::SignatureDoesNotMatch error when content-type can't be inferred
In-Reply-To: <596512B7-22D6-4007-9A85-2EF88E44AC03@gmail.com>
References: <8613E045-44D9-4805-BF00-CBCBDA8DD439@gmail.com>
    <20061214225925.GJ79397@comox.textdrive.com>
    <596512B7-22D6-4007-9A85-2EF88E44AC03@gmail.com>
Message-ID: <20061218214635.GV79397@comox.textdrive.com>

On Fri, Dec 15, 2006 at 11:15:48AM -0500, Stephen Caudill wrote:
> >Hey Stephen, thanks for reporting this.
> >
> >Someone brought this up a few weeks ago
> >(http://developer.amazonwebservices.com/connect/message.jspa?messageID=49153#49156)
> >and at the time I couldn't recreate
> >it. I still can't recreate it actually, which is weird:
> >
> >>>S3Object.store('s3sh', File.open('/opt/local/bin/s3sh'), 'marcel')
> > => #
>
> Hmm... that is weird. I wonder if it's an environment thing?
>
> I'm on a MacBook Pro.
> Here's my Ruby version, straight from the
> horse's mouth:
>
> [10:38:37][caudill at lazuli][caudill]$ ruby -v
> ruby 1.8.5 (2006-08-25) [i686-darwin8.8.1]
>
> I noted that you have a test that specifically checks to ensure that
> you can store a file with no extension and no content-type
> specified... which passes on my computer, only further adding to my
> befuddlement.
>
> Hmmm... I think I've got a box that's running ruby 1.8.4, let me give
> that a whirl.

Ok. I've got ruby 1.8.0, 1.8.2, 1.8.4 and 1.8.5 installed and can now run the
tests against all these versions. I'll be able to debug differences between
different versions of ruby now.

marcel
--
Marcel Molina Jr.

From marcel at vernix.org Tue Dec 19 19:03:56 2006
From: marcel at vernix.org (Marcel Molina Jr.)
Date: Wed, 20 Dec 2006 00:03:56 +0000
Subject: [s3-dev] bugfixes object creation / metadata store
In-Reply-To: <20061218130115.01b0911e@huginn.asgard.yggdrasill>
References: <20061218130115.01b0911e@huginn.asgard.yggdrasill>
Message-ID: <20061220000356.GD79397@comox.textdrive.com>

On Mon, Dec 18, 2006 at 01:01:15PM +0100, Metalhead wrote:
> I've included a patch that fixes 2 bugs I've noticed.
>
> The first bug occurs when the creation of a new S3Object is attempted; About.new
> requires a hash as an argument. Fixed by adding the empty hash as the default
> argument.
>
> The second bug is also concerned with the creation of new objects; the metadata
> of non-stored objects isn't saved because a new About object is created when the
> object is stored. Fixed by introducing a new attribute that checks whether About
> has been initialised.

Yeah, creating an object via some_bucket.new_object is a little shaky at the
moment. Your patch looks like it fixes the issues, but I can't just apply it
on faith without tests, so I'm working on some tests for this stuff.

Thanks for reporting these. Three cheers for making unstored objects more
robust :)

Thanks again,
marcel
--
Marcel Molina Jr.
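
[Editorial illustration] The first fix discussed in this thread, giving
About.new an empty hash as its default argument, can be sketched in isolation
as follows. This is only a minimal sketch: the class name AboutSketch is
hypothetical and is not the library's About class. It assumes nothing more
than a Hash subclass whose constructor copies in a headers hash.

```ruby
# Hypothetical stand-in for a Hash subclass like the About class in the patch.
class AboutSketch < Hash
  # Defaulting headers to {} lets AboutSketch.new be called with no arguments,
  # which is what an object that has not been stored yet needs.
  def initialize(headers = {})
    super()          # plain Hash initialisation, no default value
    replace(headers) # copy the given headers into this hash
  end
end

empty  = AboutSketch.new
filled = AboutSketch.new('content-type' => 'audio/mpeg')

puts empty.empty?           # prints true
puts filled['content-type'] # prints audio/mpeg
```

Without the default value, the no-argument call would raise an ArgumentError,
which matches the first bug described in the patch message.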
From metalhead at metalhead.ws Wed Dec 20 04:31:48 2006
From: metalhead at metalhead.ws (Metalhead)
Date: Wed, 20 Dec 2006 10:31:48 +0100
Subject: [s3-dev] bugfixes object creation / metadata store
In-Reply-To: <20061220000356.GD79397@comox.textdrive.com>
References: <20061218130115.01b0911e@huginn.asgard.yggdrasill>
    <20061220000356.GD79397@comox.textdrive.com>
Message-ID: <20061220103148.247f8ef9@huginn.asgard.yggdrasill>

> Your patch looks like it fixes the issues, but I can't just apply it
> on faith without tests, so I'm working on some tests for this stuff.

Ahh, yes, one day I'll install all the mock stuff that's needed for the tests
and actually test what I changed myself (and include the tests in the
patches) ;)

Lars

--
Direct a direct hit on your direct opponent, directing in the right
direction.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/amazon-s3-dev/attachments/20061220/e4faa3a4/attachment.bin

From metalhead at metalhead.ws Thu Dec 21 09:07:12 2006
From: metalhead at metalhead.ws (Metalhead)
Date: Thu, 21 Dec 2006 15:07:12 +0100
Subject: [s3-dev] patch to fix etag parsing with XmlSimple
Message-ID: <20061221150712.40ced913@huginn.asgard.yggdrasill>

Hi,

I've noticed a bug that occurs when XmlSimple is used. XmlSimple doesn't
automatically convert '&quot;' to '"', so parsing the etags will fail. I've
included a patch that fixes this problem.

Thanks,
Lars

diff -Bru aws-s3-0.2.1/lib/aws/s3/parsing.rb aws-s3-0.2.1-mine/lib/aws/s3/parsing.rb
--- aws-s3-0.2.1/lib/aws/s3/parsing.rb 2006-12-02 22:09:37.000000000 +0100
+++ aws-s3-0.2.1-mine/lib/aws/s3/parsing.rb 2006-12-21 15:00:48.000000000 +0100
@@ -55,7 +55,11 @@
           when /^\d+$/: Integer(self)
           when datetime_format: Time.parse(self)
           else
-            self
+            if AWS::S3::Parsing.parser == XmlSimple
+              REXML::Text.unnormalize(self)
+            else
+              self
+            end
         end
       end

--
Seepage?
Leaky pipes? Rising damp? Summon the plumber!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/amazon-s3-dev/attachments/20061221/dda08995/attachment.bin

From metalhead at metalhead.ws Thu Dec 21 10:10:17 2006
From: metalhead at metalhead.ws (Metalhead)
Date: Thu, 21 Dec 2006 16:10:17 +0100
Subject: [s3-dev] patch to add persistent connections
Message-ID: <20061221161017.4a639971@huginn.asgard.yggdrasill>

Hi,

it's patch day today :) I've included a patch to connection.rb to add the
option of making connections persistent. I've noticed speed gains of about
100% when doing batch listings of objects.

Thanks,
Lars

diff -bru aws-s3-0.2.1/lib/aws/s3/connection.rb aws-s3-0.2.1-mine/lib/aws/s3/connection.rb
--- aws-s3-0.2.1/lib/aws/s3/connection.rb 2006-12-04 03:59:04.000000000 +0100
+++ aws-s3-0.2.1-mine/lib/aws/s3/connection.rb 2006-12-21 16:05:03.000000000 +0100
@@ -19,18 +19,13 @@
       end

       def request(verb, path, headers = {}, body = nil, &block)
-        http.start do
-          request = request_method(verb).new(path, headers)
-          authenticate!(request)
-          if body
-            if body.respond_to?(:read)
-              request.body_stream = body
-              request.content_length = body.respond_to?(:lstat) ? body.lstat.size : body.size
+        if options[:persistent]
+          @http.start unless @http.started?
+          request_internal(verb, path, headers, body, &block)
             else
-              request.body = body
-            end
+          http.start do
+            request_internal(verb, path, headers, body, &block)
           end
-          http.request(request, &block)
         end
       end

@@ -50,6 +45,20 @@
       end

       private
+        def request_internal(verb, path, headers = {}, body = nil, &block)
+          request = request_method(verb).new(path, headers)
+          if body
+            if body.respond_to?(:read)
+              request.body_stream = body
+              request.content_length = body.respond_to?(:lstat) ? body.lstat.size : body.size
+            else
+              request.body = body
+            end
+          end
+          authenticate!(request)
+          http.request(request, &block)
+        end
+
         def extract_keys!
           missing_keys = []
           extract_key = Proc.new {|key| options[key] || (missing_keys.push(key); nil)}
@@ -62,6 +71,9 @@
           extract_keys!
           @http = Net::HTTP.new(options[:server], options[:port])
           @http.use_ssl = !options[:use_ssl].nil? || options[:port] == 443
+          if options[:persistent]
+            @http.start
+          end
           @http
         end

@@ -121,6 +133,8 @@
         #   argument is set.
         # * :use_ssl - Whether requests should be made over SSL. If set to true, the :port argument
         #   will be implicitly set to 443, unless specified otherwise. Defaults to false.
+        # * :persistent - Whether to use persistent connections to
+        #   the server (i.e. not open and close a connection for each request). Defaults to false.
         def establish_connection!(options = {})
           # After you've already established the default connection, just specify
           # the difference for subsequent connections
@@ -147,12 +161,17 @@

         # Removes the connection for the current class. If there is no connection for the current class, the default
         # connection will be removed.
-        def disconnect
-          connections.delete(connection_name) || connections.delete(default_connection)
+        def disconnect(connection=connection_name)
+          connection = connections[connection] ? connection_name : default_connection
+          if connections[connection].options[:persistent]
+            connections[connection].http.finish
+          end
+          connections.delete(connection)
         end

         # Clears *all* connections, from all classes, with prejudice.
         def disconnect!
+          connections.each_key { |k| disconnect(k) }
           connections.clear
         end

@@ -174,7 +193,7 @@
         class Options < Hash #:nodoc:
           class << self
             def valid_options
-              [:access_key_id, :secret_access_key, :server, :port, :use_ssl]
+              [:access_key_id, :secret_access_key, :server, :port, :use_ssl, :persistent]
             end
           end

--
Telepathy is just a trick: once you know how to do it, it's easy.
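
[Editorial illustration] The idea behind the persistent-connection patch
above, starting the HTTP session once and reusing it for later requests, can
be sketched outside the library roughly like this. It is a minimal sketch
under stated assumptions: PersistentSession is a hypothetical helper, not part
of aws/s3, and it relies only on the standard Net::HTTP methods start,
started?, request and finish. Request signing is left out entirely.

```ruby
require 'net/http'

# Hypothetical helper, not part of aws/s3: reuses one started Net::HTTP
# session across requests instead of opening a connection per request.
class PersistentSession
  def initialize(http)
    @http = http # e.g. Net::HTTP.new('s3.amazonaws.com', 80)
  end

  # Open the connection lazily on first use, then send the request over
  # the already-open connection.
  def get(path, headers = {})
    @http.start unless @http.started?
    @http.request(Net::HTTP::Get.new(path, headers))
  end

  # Close the underlying connection, but only if it was ever opened,
  # because finish raises on a session that has not been started.
  def close
    @http.finish if @http.started?
  end
end
```

A caller would build one PersistentSession, issue several get calls, and call
close when done. Keeping the started? guard in both methods means the helper
behaves the same whether or not a request has been made yet.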
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/amazon-s3-dev/attachments/20061221/eafe5918/attachment.bin

From marcel at vernix.org Thu Dec 21 12:46:23 2006
From: marcel at vernix.org (Marcel Molina Jr.)
Date: Thu, 21 Dec 2006 17:46:23 +0000
Subject: [s3-dev] patch to fix etag parsing with XmlSimple
In-Reply-To: <20061221150712.40ced913@huginn.asgard.yggdrasill>
References: <20061221150712.40ced913@huginn.asgard.yggdrasill>
Message-ID: <20061221174623.GH79397@comox.textdrive.com>

On Thu, Dec 21, 2006 at 03:07:12PM +0100, Metalhead wrote:
> I've noticed a bug that occurs when XmlSimple is used. XmlSimple
> doesn't automatically convert '&quot;' to '"', so parsing the etags
> will fail. I've included a patch that fixes this problem.

I used to account for this but if you update to the latest version of
XmlSimple this has been fixed.

marcel
--
Marcel Molina Jr.

From metalhead at metalhead.ws Thu Dec 21 13:03:28 2006
From: metalhead at metalhead.ws (Metalhead)
Date: Thu, 21 Dec 2006 19:03:28 +0100
Subject: [s3-dev] patch to fix etag parsing with XmlSimple
In-Reply-To: <20061221174623.GH79397@comox.textdrive.com>
References: <20061221150712.40ced913@huginn.asgard.yggdrasill>
    <20061221174623.GH79397@comox.textdrive.com>
Message-ID: <20061221190328.29eb9eb6@huginn.asgard.yggdrasill>

> I used to account for this but if you update to the latest version of
> XmlSimple this has been fixed.

Ahh, ok. Maybe still leave it in with an additional check for the version of
XmlSimple? Just so that it works for everybody.

Thanks,
Lars

--
They say that everyone wanted rec.games.hack to undergo a name change.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available Url : http://rubyforge.org/pipermail/amazon-s3-dev/attachments/20061221/983613f6/attachment.bin From marcel at vernix.org Thu Dec 21 13:26:20 2006 From: marcel at vernix.org (Marcel Molina Jr.) Date: Thu, 21 Dec 2006 18:26:20 +0000 Subject: [s3-dev] patch to fix etag parsing with XmlSimple In-Reply-To: <20061221190328.29eb9eb6@huginn.asgard.yggdrasill> References: <20061221150712.40ced913@huginn.asgard.yggdrasill> <20061221174623.GH79397@comox.textdrive.com> <20061221190328.29eb9eb6@huginn.asgard.yggdrasill> Message-ID: <20061221182620.GA69262@comox.textdrive.com> On Thu, Dec 21, 2006 at 07:03:28PM +0100, Metalhead wrote: > > I used to account for this but if you update to the latest version of > > XmlSimple this has been fixed. > > Ahh, ok. Maybe still leave it in with an additional check for the version of > XmlSimple? Just so that it works for everybody. Yeah, I considered that. Anyone who installs via gems since the day the XmlSimple update was released, will get the latest (fixed) XmlSimple. So I'm reluctant to add version checking cruft into the code for functionality that I believe most people don't use and few people are succeptible to. I think I'd rather add a comment to the etag method and *maybe* add a version constraint to the gem spec for the people who install aws/s3 but *already* have an older version of XmlSimple. What do you think? Good enough? marcel -- Marcel Molina Jr. From marcel at vernix.org Thu Dec 21 13:28:01 2006 From: marcel at vernix.org (Marcel Molina Jr.) 
Date: Thu, 21 Dec 2006 18:28:01 +0000
Subject: [s3-dev] patch to add persistent connections
In-Reply-To: <20061221161017.4a639971@huginn.asgard.yggdrasill>
References: <20061221161017.4a639971@huginn.asgard.yggdrasill>
Message-ID: <20061221182801.GB69262@comox.textdrive.com>

On Thu, Dec 21, 2006 at 04:10:17PM +0100, Metalhead wrote:
> I've included a patch to connection.rb to add the option of making connections
> persistent. I've noticed speed gains of about 100% when doing batch listings of
> objects.

This looks good.

Question: Why not always make it persistent? Why even introduce the option?
Why would someone not want a persistent connection?

marcel
--
Marcel Molina Jr.

From metalhead at metalhead.ws Thu Dec 21 13:33:18 2006
From: metalhead at metalhead.ws (Metalhead)
Date: Thu, 21 Dec 2006 19:33:18 +0100
Subject: [s3-dev] patch to fix etag parsing with XmlSimple
In-Reply-To: <20061221182620.GA69262@comox.textdrive.com>
References: <20061221150712.40ced913@huginn.asgard.yggdrasill>
	<20061221174623.GH79397@comox.textdrive.com>
	<20061221190328.29eb9eb6@huginn.asgard.yggdrasill>
	<20061221182620.GA69262@comox.textdrive.com>
Message-ID: <20061221193318.7fe0eb1c@huginn.asgard.yggdrasill>

> Yeah, I considered that. Anyone who installs via gems since the day
> the XmlSimple update was released, will get the latest (fixed) XmlSimple. So I'm
> reluctant to add version checking cruft into the code for functionality that
> I believe most people don't use and few people are susceptible to.
>
> I think I'd rather add a comment to the etag method and *maybe* add a version
> constraint to the gem spec for the people who install aws/s3 but *already*
> have an older version of XmlSimple.
>
> What do you think? Good enough?

The comment sounds good, I wouldn't add a constraint though. Problem is (at
least for me) that in addition to gems, my distribution uses its own package
management for some of the gems.
I prefer to use this if the gem is available there because this way it fits
in nicely with the rest of my packages. This means that I'll almost never
have the latest version of a package, so having a constraint would force me
to use gems over my package management (or both in parallel!).

It wasn't something I noticed when developing my application, I just
installed all the mock stuff and ran the tests. One of them failed because of
that.

Why on earth are Amazon returning XML-ised quotes in the etag anyway? That's
really quite filthy.

Lars

From metalhead at metalhead.ws Thu Dec 21 13:40:29 2006
From: metalhead at metalhead.ws (Metalhead)
Date: Thu, 21 Dec 2006 19:40:29 +0100
Subject: [s3-dev] patch to add persistent connections
In-Reply-To: <20061221182801.GB69262@comox.textdrive.com>
References: <20061221161017.4a639971@huginn.asgard.yggdrasill>
	<20061221182801.GB69262@comox.textdrive.com>
Message-ID: <20061221194029.6722765d@huginn.asgard.yggdrasill>

> Question: Why not always make it persistent? Why even introduce the option?
> Why would someone not want a persistent connection?

I'm still experimenting with it. It appears that there're more timeouts with
persistent connections, though that could just be coincidence. I guess when
you're doing batch requests but need a lot of time to process data in between
requests persistent connections aren't necessary.

Why would somebody not want it? Well, why would somebody not want SSL? ;) My
philosophy is that if you can make it an option easily (and provide a useful
default so people don't *have* to know), do it.
Us not being able to imagine why somebody would not want it doesn't mean that
everybody will always be satisfied with 640K, err, need to turn persistent
connections off ;)

Lars
--
They say that soldiers are always prepared and usually protected.

From marcel at vernix.org Thu Dec 21 21:19:51 2006
From: marcel at vernix.org (Marcel Molina Jr.)
Date: Fri, 22 Dec 2006 02:19:51 +0000
Subject: [s3-dev] patch to add persistent connections
In-Reply-To: <20061221161017.4a639971@huginn.asgard.yggdrasill>
References: <20061221161017.4a639971@huginn.asgard.yggdrasill>
Message-ID: <20061222021951.GG69262@comox.textdrive.com>

On Thu, Dec 21, 2006 at 04:10:17PM +0100, Metalhead wrote:
> I've included a patch to connection.rb to add the option of making connections
> persistent. I've noticed speed gains of about 100% when doing batch listings of
> objects.

I can confirm the speed ups. The remote tests ran in 51 seconds without a
persistent connection and in 30 seconds with a persistent connection.

Wrapping up some loose ends with this and I'll commit it.

marcel
--
Marcel Molina Jr.

From marcel at vernix.org Thu Dec 21 21:34:16 2006
From: marcel at vernix.org (Marcel Molina Jr.)
Date: Fri, 22 Dec 2006 02:34:16 +0000
Subject: [s3-dev] patch to add persistent connections
In-Reply-To: <20061221194029.6722765d@huginn.asgard.yggdrasill>
References: <20061221161017.4a639971@huginn.asgard.yggdrasill>
	<20061221182801.GB69262@comox.textdrive.com>
	<20061221194029.6722765d@huginn.asgard.yggdrasill>
Message-ID: <20061222023416.GH69262@comox.textdrive.com>

On Thu, Dec 21, 2006 at 07:40:29PM +0100, Metalhead wrote:
> > Question: Why not always make it persistent? Why even introduce the option?
> > Why would someone not want a persistent connection?
>
> I'm still experimenting with it. It appears that there're more timeouts with
> persistent connections, though that could just be coincidence. I guess when
> you're doing batch requests but need a lot of time to process data in between
> requests persistent connections aren't necessary.

About the timeouts: after talking with a sys admin about it, he said that
firewalls will frequently kill a long-lived HTTP connection because they
assume you are trying to tunnel something through that is faked HTTP to
bypass proxies or whatever. So having persistent on will perhaps cause broken
pipe errors or the like once in a while for long running processes.

Regardless, I think I'll make it on by default, with an explanation in the
docs that if one sees connection errors, they can turn off the :persistent
option.

marcel
--
Marcel Molina Jr.

From marcel at vernix.org Thu Dec 21 23:41:14 2006
From: marcel at vernix.org (Marcel Molina Jr.)
Date: Fri, 22 Dec 2006 04:41:14 +0000
Subject: [s3-dev] AWS::S3::SignatureDoesNotMatch error when content-type
	can't be inferred
In-Reply-To: <596512B7-22D6-4007-9A85-2EF88E44AC03@gmail.com>
References: <8613E045-44D9-4805-BF00-CBCBDA8DD439@gmail.com>
	<20061214225925.GJ79397@comox.textdrive.com>
	<596512B7-22D6-4007-9A85-2EF88E44AC03@gmail.com>
Message-ID: <20061222044113.GJ69262@comox.textdrive.com>

On Fri, Dec 15, 2006 at 11:15:48AM -0500, Stephen Caudill wrote:
> Works like a charm. So, from my end, it looks like a difference
> between ruby 1.8.4 and ruby 1.8.5. Also, given that the test that
> checks for nil content-type being properly inferred (in trunk it's:
> test/object_test.rb:47) is passing when the functionality fails,
> there's probably a bug there too.

Just a heads up that I've checked in a fix for this. All tests pass against
both 1.8.4 and 1.8.5. Thanks again for reporting it. I'll likely be pushing
out a bug fix release tomorrow.

marcel
--
Marcel Molina Jr.
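
The subject of this thread is a store call where no content type can be inferred from the key (odd or missing file extension). A minimal sketch of one way to guarantee that some content type is always chosen follows. It is not the fix that was checked in: KNOWN_TYPES, DEFAULT_TYPE and content_type_for are hypothetical names, and the lookup table is only a toy.

```ruby
# Hypothetical extension table, for illustration only.
KNOWN_TYPES = {
  '.txt'  => 'text/plain',
  '.html' => 'text/html',
  '.jpg'  => 'image/jpeg',
  '.png'  => 'image/png'
}.freeze

# Assumed fallback for keys whose type cannot be inferred.
DEFAULT_TYPE = 'application/octet-stream'

# Returns the explicit type if the caller gave one, otherwise a type inferred
# from the key's extension, otherwise the default. It never returns nil.
def content_type_for(key, explicit = nil)
  return explicit if explicit
  KNOWN_TYPES.fetch(File.extname(key).downcase, DEFAULT_TYPE)
end
```

A key such as 's3sh' has an empty extension, so it falls through to the default instead of leaving the header unset.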
From marcel at vernix.org Fri Dec 22 00:02:52 2006
From: marcel at vernix.org (Marcel Molina Jr.)
Date: Fri, 22 Dec 2006 05:02:52 +0000
Subject: [s3-dev] [ANN] 0.3.0 release
Message-ID: <20061222050252.GK69262@comox.textdrive.com>

I've released 0.3.0. It contains a couple small new features as well as
several enhancements and bug fixes thanks in large part to you guys.

Here are the changes:

- Ensure content type is eventually set to account for changes made to
  Net::HTTP in Ruby version 1.8.5. Reported by [David Hanson, Stephen
  Caudill, Tom Mornini]
- Add :persistent option to connections which keeps a persistent connection
  rather than creating a new one per request, defaulting to true. Based on a
  patch by [Metalhead]
- If we are retrying a request after rescuing one of the retry exceptions,
  rewind the body if it's an IO stream so it starts at the beginning.
  [Jamis Buck]
- Ensure that all paths being submitted to S3 are valid utf8. If they are
  not, we remove the extended characters. Ample help from [Jamis Buck]
- Wrap logs in Log objects which expose each line as a Log::Line that has
  accessors by name for each field.
- Various performance optimizations for the extensions code.
  [Roman LE NEGRATE]
- Make S3Object.copy more efficient by streaming in both directions in
  parallel.
- Open up Net::HTTPGenericRequest to make the chunk size 1 megabyte, up from
  1 kilobyte.
- Add S3Object.exists?

marcel
--
Marcel Molina Jr.

From metalhead at metalhead.ws Thu Dec 28 16:52:46 2006
From: metalhead at metalhead.ws (Metalhead)
Date: Thu, 28 Dec 2006 22:52:46 +0100
Subject: [s3-dev] object reload bug?
Message-ID: <20061228225246.2e37763e@huginn.asgard.yggdrasill>

Hi all,

I've come across a rather subtle bug. When I'm batch processing a large
number of files (> 1000), my application will get slower and slower, using
more and more memory and cpu time. This only occurs when putting those files,
not when simply testing for their existence and comparing the metadata.
I'm listing keys hierarchically (to mirror directories).

I've tracked down the bug to the reload! function in bucket.rb. Here's what
happens:
- when changing the directory, I call the objects method for the new one to
  get the files on the remote side
- this clears the object_cache if there are no files, i.e. I'm putting and
  not just comparing them
- I then index into the bucket array of objects to find out whether I need to
  update
- the [] method iterates over the objects, calling the objects method again,
  this time without any options (I always call it with at least the delimiter
  option set because I list keys hierarchically)
- as the object_cache is empty, it goes ahead and calls reload without any
  options, causing it to refresh the *entire* bucket
- this causes a lot of data traffic, XML parsing, memory and cpu load -- the
  symptoms I was experiencing

I'm not sure how to fix this, or, indeed, whether it's a bug at all. Any
advice/comments would be appreciated.

Thanks,
Lars
--
Orcs do not procreate in dark rooms.

From jhosteny at gmail.com Thu Dec 28 20:39:53 2006
From: jhosteny at gmail.com (beechflyer74)
Date: Thu, 28 Dec 2006 20:39:53 -0500
Subject: [s3-dev] A bit off topic
Message-ID: <2f0ee70f0612281739l5f2df0c6t46b0c3c7d27c7ca1@mail.gmail.com>

Marcel,

Thanks for the library. It's been very useful for me already.

I was wondering if you intend on releasing a library for EC2 as well?
If this is in the works, I'd be willing to contribute.

Regards,
Joe

From marcel at vernix.org Fri Dec 29 01:23:02 2006
From: marcel at vernix.org (Marcel Molina Jr.)
Date: Fri, 29 Dec 2006 06:23:02 +0000
Subject: [s3-dev] A bit off topic
In-Reply-To: <2f0ee70f0612281739l5f2df0c6t46b0c3c7d27c7ca1@mail.gmail.com>
References: <2f0ee70f0612281739l5f2df0c6t46b0c3c7d27c7ca1@mail.gmail.com>
Message-ID: <20061229062302.GI82301@comox.textdrive.com>

On Thu, Dec 28, 2006 at 08:39:53PM -0500, beechflyer74 wrote:
> Thanks for the library. It's been very useful for me already.

Hey thanks, my pleasure.

> I was wondering if you intend on releasing a library for EC2 as well?
> If this is in the works, I'd be willing to contribute.

Cool. Yeah, I've been planning on it, as are others on this list I've
gathered. One of the first steps as I see it is extracting reusable bits of
the S3 code into a set of classes that all AWS libraries could use. For
example, the authentication code, the request code and the connection
management code. This would then become a gem dependency for all AWS libs
that I work on (or anyone else for that matter if they are interested).

I'm a bit fatigued from working on the S3 library to jump right into EC2
immediately (after the "holidays" I think I might be all set) but the clock
is a ticking and there's code to be written. Several people have also
expressed interest. The 'amazon' project on rubyforge is intended to be an
AWS umbrella. I'm also planning on writing a lib for SQS (which is way
simpler than EC2).

I'll give a holler on this list if I get working on anything. I encourage you
to do the same.

marcel
--
Marcel Molina Jr.

From marcel at vernix.org Fri Dec 29 01:35:15 2006
From: marcel at vernix.org (Marcel Molina Jr.)
Date: Fri, 29 Dec 2006 06:35:15 +0000
Subject: [s3-dev] object reload bug?
In-Reply-To: <20061228225246.2e37763e@huginn.asgard.yggdrasill>
References: <20061228225246.2e37763e@huginn.asgard.yggdrasill>
Message-ID: <20061229063515.GJ82301@comox.textdrive.com>

On Thu, Dec 28, 2006 at 10:52:46PM +0100, Metalhead wrote:
> I've tracked down the bug to the reload! function in bucket.rb.
Here's what
> happens:
> - when changing the directory, I call the objects method for the new one to get
> the files on the remote side
> - this clears the object_cache if there are no files, i.e. I'm putting and not
> just comparing them
> - I then index into the bucket array of objects to find out whether I need to
> update
> - the [] method iterates over the objects, calling the objects method again,
> this time without any options (I always call it with at least the delimiter
> option set because I list keys hierarchically)
> - as the object_cache is empty, it goes ahead and calls reload without any
> options, causing it to refresh the *entire* bucket
> - this causes a lot of data traffic, XML parsing, memory and cpu load -- the
> symptoms I was experiencing

Sounds nasty. Thanks as always for the report. That caching scheme isn't very
satisfactory. There is far too much implicit magic happening. I managed to
confuse myself quite a few times while implementing it. I plan on rethinking
bucket list fetching altogether to make it more like a cursor. I'm currently
on the road but I'll check it out when I get settled.

marcel
--
Marcel Molina Jr.

From metalhead at metalhead.ws Fri Dec 29 08:17:31 2006
From: metalhead at metalhead.ws (Metalhead)
Date: Fri, 29 Dec 2006 14:17:31 +0100
Subject: [s3-dev] object reload bug?
In-Reply-To: <20061229063515.GJ82301@comox.textdrive.com>
References: <20061228225246.2e37763e@huginn.asgard.yggdrasill>
	<20061229063515.GJ82301@comox.textdrive.com>
Message-ID: <20061229141731.5a3c58c3@huginn.asgard.yggdrasill>

> Sounds nasty. Thanks as always for the report. That caching scheme isn't very
> satisfactory. There is far too much implicit magic happening. I managed to
> confuse myself quite a few times while implementing it. I plan on rethinking
> bucket list fetching altogether to make it more like a cursor. I'm
> currently on the road but I'll check it out when I get settled.
I think it'd be good to have an option to turn off caching altogether. If
you're only accessing objects once (e.g. backup) it only adds overhead and
increases the memory footprint. Could people please post whether they use
caching in their applications at all?

Regarding the reimplementation of bucket listing, this could be done with
lazy evaluation [1]. This would also blend in nicely with the caching and
eliminate the need to reload unless a particular object is changed.

Lars

[1] http://www-128.ibm.com/developerworks/linux/library/l-lazyprog.html
The example code is in Scheme, but implementing it in Ruby wouldn't be that
different.
--
They say that having polymorph control won't shock you.

From metalhead at metalhead.ws Fri Dec 29 09:49:22 2006
From: metalhead at metalhead.ws (Metalhead)
Date: Fri, 29 Dec 2006 15:49:22 +0100
Subject: [s3-dev] object reload bug?
In-Reply-To: <20061228225246.2e37763e@huginn.asgard.yggdrasill>
References: <20061228225246.2e37763e@huginn.asgard.yggdrasill>
Message-ID: <20061229154922.07d204cb@huginn.asgard.yggdrasill>

I've found a partial fix. I'm now only listing directories hierarchically,
not particular objects. The only time reload is called with an empty
object_cache now is when a new, empty directory has been listed and the first
object is added. This is because of

  when :stored then add object unless objects.include?(object)

in bucket.rb. I suggest changing that line to

  when :stored then object_cache.delete(object); add object

When an object is stored, the most recent information is available on the
local side. Therefore, there is no need to query S3 for it, which will almost
always return additional information about other objects that's not needed.
As only the object_cache is modified, we only need to make sure that an
object isn't put twice into the cache; hence the delete.

The speed gain when operating on about 2500 files with no global reload
compared to the unmodified code and attempting to list each object
individually before storing it is about 300%.

Lars
--
They say that the gods are happy when they drop objects at your feet.

From maccman at gmail.com Sun Dec 31 14:14:51 2006
From: maccman at gmail.com (Alex MacCaw)
Date: Sun, 31 Dec 2006 19:14:51 +0000
Subject: [s3-dev] adding metadata before upload
Message-ID: <14cc92570612311114g88e8468we941052e4cbf834e@mail.gmail.com>

If I create a new object, I can't seem to set metadata or content_type.
For example:

  aws = AWS::S3::Bucket.find(asset.bucket.s3_name).new_object
  aws.key = asset.s3_name
  aws.value = asset.file_data
  aws.content_type = asset.content_type # Doesn't work
  aws.metadata[:encrypted] = asset.encrypted # Neither does this
  aws.store

I get an argument error, "wrong number of arguments (0 for 1)".
I would recommend allowing people to add metadata in the store class method
(this is what I tried to do originally).

Marcel, keep up the good work and happy new year.

--
http://www.eribium.org | http://juggernaut.rubyforge.org
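
The cursor-like, lazily evaluated bucket listing suggested earlier in the "object reload bug?" thread could be sketched roughly as below. This is an assumption-laden illustration rather than anything from the library: each_key_lazily and fetch_page are hypothetical, fetch_page stands in for one list request, and the block form of Enumerator.new needs a Ruby newer than the 1.8 series discussed on this list.

```ruby
# fetch_page is a hypothetical callable: given a marker (nil for the first
# page) it returns [keys, next_marker], with next_marker nil on the last page.
def each_key_lazily(fetch_page)
  Enumerator.new do |yielder|
    marker = nil
    loop do
      keys, marker = fetch_page.call(marker)
      keys.each { |key| yielder << key }
      break if marker.nil? # no more pages to request
    end
  end
end

# Usage with a hypothetical list_page method that issues one listing request:
#   keys = each_key_lazily(lambda { |marker| list_page('some-bucket', marker) })
#   keys.first(10)
```

Nothing is fetched until the enumerator is consumed, and consuming only a prefix stops requesting further pages, which avoids refreshing the entire bucket when only part of a listing is needed.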