In file net/ssh/transport, @socket.send may fail to deliver whole package at once. Send returns numbers of bytes actually
sent, so it should be invoked in loop. If library fails to deliver whole package, it falls into unstable state, most
likely hanging in infinite loop, waiting for server response at /net-ssh-1.1.2/lib/net/ssh/transport/packet-stream.rb:201:in
`select'
In attachment patch is provided.
usage:
patch your_gems_dir/net-ssh-1.1.2/lib/net/ssh/transport/packet-stream.rb < packet-stream.patch
------
Full description of problem:
I've runned into this problem multiple times while attempting to send file using sftp.
When using net/sftp for transferring large files to server, sometimes it got stuck in internal loop at:
/net-ssh-1.1.2/lib/net/ssh/transport/packet-stream.rb:201:in `select'
It looks like net/sftp was waiting for packet which will never arrive.
[DEBUG] Wed Feb 06 20:24:40 +0100 2008 -- sftp.operations.write: received request id 1
[DEBUG] Wed Feb 06 20:24:40 +0100 2008 -- transport.session: waiting for packet from server...
[DEBUG] Wed Feb 06 20:24:40 +0100 2008 -- transport.incoming_packet_stream: reading 8 bytes from socket...
(At this point SFTP stucks.)
This problem occured mostly when either rails gem was loaded or sftp has been put in timeout block. This was strange,
as simple scripts were running fine.
Further investigation of system calls which are being made before/in infinite loop has revealed:
send(3, "\250\10\217VJ\347\332\225\210l\255\0\\%x\347\326)\202\244"..., 29120, 0) = 14480
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
...
select(5, [3], [], [], {0, 9978}) = 0 (Timeout)
gettimeofday({1202380855, 386344}, NULL) = 0
select(5, [3], [], [], {0, 0}) = 0 (Timeout)
gettimeofday({1202380855, 386391}, NULL) = 0
select(0, [], [], [], {0, 0}) = 0 (Timeout)
sigprocmask(SIG_BLOCK, NULL, []) = 0
gettimeofday({1202380855, 386461}, NULL) = 0
gettimeofday({1202380855, 386482}, NULL) = 0
select(5, [3], [], [], {0, 9979}) = 0 (Timeout)
gettimeofday({1202380855, 398357}, NULL) = 0
select(5, [3], [], [], {0, 0}) = 0 (Timeout)
gettimeofday({1202380855, 398404}, NULL) = 0
select(0, [], [], [], {0, 0}) = 0 (Timeout)
It is evident that send was interrupted by SIGVTALRM signal (used internally by ruby to manage threads), and didn't
manage to send whole package at once.
This situation is not properly handled in /net-ssh-1.1.2/lib/net/ssh/transport/packet-stream.rb:85:send method, leaving
rest of library believe, that whole packet was send.
|