Forums | Admin

Discussion Forums: open-discussion

Start New Thread Start New Thread

 

By: Rob Morris
RE: Diff::LCS bug - spurious remove action [ reply ]  
2007-06-24 06:20
OK, so the cause is the way traverse_sequences handles String (and ONLY string!) arguments. Seems the author wanted to have the elements returned be strings, rather than character fixnums, ie "a" vs 97.

As a result, the code sets a flag:

string = seq1.kind_of?(String)

If string is true, instead of using seq1[index] to access each element, it uses seq1[index,1].

This works, unless index = seq1.length, in which case you get an empty string "" instead of nil, which you might expect. The code, naturally, tests with .nil? on each element. And generates the incorrect remove action.

You can fix this one of two ways. If (like me) you don't want strings for your sub-elements, just set string = false on line 399 of lcs.rb. This should result in a major speedup as well, as it prevents the generation of a zillion one character strings that you'll most likely be concatenating anyway. (you can then remove all the conditionals on the 'string' switch too, of course...)

The other option is to go fix the .nil? calls to be .empty? instead, and (if you're not on Rails) monkey patch the Nil object to respond true to empty? calls. Fun stuff. :-)

Hope this helps someone out there...

By: Rob Morris
RE: Diff::LCS bug - spurious remove action [ reply ]  
2007-06-24 05:45
Follow-up: only seems to happen for String arguments. Arrays work correctly.

Ex:

LCS.diff('ab','ab') =>
[[#<Diff::LCS::Change:0x2a97ee7b88 @position=2, @action="-", @element="">]]

LCS.diff('ab'.to_a,'ab'.to_a) =>
[]

Oddness. Anyone have a clue what's going on here?

By: Rob Morris
Diff::LCS bug - spurious remove action [ reply ]  
2007-06-24 05:26
Hey Austin, saw a post from a while back re: 1.8.4 and LCS bugs, where you requested an example. Here's a simple example of broken behavior in Ruby v1.8.5 and LCS v1.1.2:

>> changes = Diff::LCS.diff('hello','hello')
=> [[#<Diff::LCS::Change:0x2a97f23d40 @position=5, @action="-", @element="">]]

The diff method is returning a remove action on the null at the end of the a input. Seems to happen regardless of what a and b are on the diff() call.