Discussion Forums: help

By: James Gray
RE: Custom quote string?
2007-02-12 17:59
Michael:

Changing the parsing Regexp is how I started when I tried to add this feature, so I'm pretty sure I know about what you have.

Let me see if I can explain where the problems come in with this approach.

My main reason for adding this feature was to support the semi-common CSV variant that uses \" for escaped quotes. To do that, though, you also have to add an escape for a literal \, which is generally written \\. So, to support this common variant you need to know three things: the quote character is ", an escaped quote is \", and an additional escape, \\, is used for \.
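To make those three rules concrete, here is a minimal sketch that rewrites a backslash-escaped line into the doubled-quote form before parsing. It is only an illustration: normalize_backslash_escapes is a hypothetical helper, not part of FasterCSV, and the sketch uses Ruby's standard csv library (the successor to FasterCSV). It assumes escapes only appear inside quoted fields.

```ruby
require "csv"

# Hypothetical helper (not part of FasterCSV): rewrite a line that uses
# \" and \\ escapes into the doubled-quote form a standard CSV parser expects.
def normalize_backslash_escapes(line)
  # One left-to-right pass, so the two backslashes in \\ are consumed as a
  # pair and are never mistaken for the start of a \" escape.
  line.gsub(/\\(.)/) { $1 == '"' ? '""' : $1 }
end

# A single-quoted heredoc keeps every backslash exactly as typed.
raw = <<~'LINE'.chomp
  "a \"quoted\" word","back\\slash","ends with \\"
LINE

normalized = normalize_backslash_escapes(raw)
fields = CSV.parse_line(normalized)
puts fields.inspect
```

The single pass matters: replacing \" first and \\ second would misread the last field, where \\ sits directly before the closing quote.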

I found a way to pass all that information into FasterCSV.

The problem then becomes that FasterCSV's Regexp parser chokes and dies on these options. I can write a Regexp to handle the new conditions, but I'm not clever enough to teach the code to do this whenever the rules change in some new way.

Given that, I chose not to support overriding the quote character. I could make it work for some instances, but not all, so adding the feature just seemed like it would cause confusion when it mysteriously didn't work for some set of rules. It seemed better to disallow it altogether.

Hope that makes sense.

James Edward Gray II

By: Michael Silver
RE: Custom quote string?
2007-02-12 16:22
James, I changed the regular expression used to extract the fields to take a custom string for the quote char instead of ", like it already does for the field separator. I also added an option for the quote character to be passed into the code the same as the other optional params are.
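For readers following along, a minimal sketch of what parameterizing the quote string in a Regexp-based field extractor can look like. This is not the patch described above and not FasterCSV's parser; naive_fields and its keyword arguments are hypothetical names, and the only escape it assumes is a doubled quote string inside a quoted field.

```ruby
require "strscan"

# Hypothetical helper for illustration only; not FasterCSV's parser.
# Assumes the only escape is a doubled quote string inside a quoted field.
def naive_fields(line, col_sep: ",", quote: '"')
  q = Regexp.escape(quote)
  s = Regexp.escape(col_sep)
  # Quoted field: quote, then any run of doubled quotes or non-quote text, then quote.
  quoted    = /#{q}((?:#{q}#{q}|(?!#{q}).)*)#{q}/m
  # Unquoted field: text containing neither the separator nor the quote.
  unquoted  = /(?:(?!#{s}|#{q}).)*/m
  separator = /#{s}/

  scanner = StringScanner.new(line)
  fields = []
  loop do
    if scanner.scan(quoted)
      fields << scanner[1].gsub(quote + quote, quote)
    else
      fields << scanner.scan(unquoted)
    end
    break if scanner.eos?
    unless scanner.scan(separator)
      raise ArgumentError, "malformed line at offset #{scanner.pos}"
    end
  end
  fields
end

tilde_fields = naive_fields('~say "hi"~^~a~~b~^c', col_sep: "^", quote: "~")
puts tilde_fields.inspect

# The same code rejects the backslash-escaped variant, because knowing the
# quote string says nothing about how quotes are escaped.
backslash_line = <<~'LINE'.chomp
  "a \"b\"",c
LINE
backslash_error =
  begin
    naive_fields(backslash_line)
    nil
  rescue ArgumentError => e
    e
  end
puts backslash_error.message
```

Swapping the quote string works for the ~ and ^ file, but the second input shows why a quote parameter alone does not cover other escaping rules.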

Your existing tests pass, but I can't write tests for my changes yet because I don't understand the test code, Ruby's testing tools, or your class well enough to test them properly.

I can give you my code, but I imagine I am missing something big, beyond it failing when an invalid quote separator is passed in, and you understandably aren't interested in code that may break yours. If you still want to look at the changes and critique them, I am more than interested.

Currently the modded class suits my needs since I need to read some CSV files with odd separators. Once I better understand how to write tests and the other aspects of your code, I will write the necessary unit tests.

...Michael...

By: James Gray
RE: Custom quote string?
2007-02-09 18:53
You are welcome to send me your changes and I will definitely consider them.

I must stress again though that I was unable to do this myself. It's a very complex issue.

Remember to handle escapes and write lots of tests to make sure you nailed the edge cases.

James Edward Gray II

By: Michael Silver
RE: Custom quote string?
2007-02-09 18:38
James, I changed some of the regex code in faster_csv.rb to allow passing in an alternative quote string for reading csv files, but I haven't changed the code to output with the alternative string. After reading your response I realize that I may be breaking functionality somewhere else though.

I will try to change the output code and provide some unit tests. Would you be willing to review my code and perhaps even include it if it is acceptable?

By the way, your code is well written and very easy to understand, which helped a lot since I am somewhat new to Ruby.

...Michael...

By: James Gray
RE: Custom quote string?
2007-02-09 17:48
Sadly, the quote characters cannot be overridden with FasterCSV. (Note that the Ruby Cookbook has a recipe for this they claim works, but it fails in several cases.)

I really did try to add this feature at one point, but the design of the FasterCSV parser makes it all but impractical to do right.

Given that, it's better to just preprocess the lines, changing the quotes to what FasterCSV will recognize. Depending on the rules of quoting and escaping you might be able to get away with line.tr("~", '"').parse_csv, but it can be more complicated at times.
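A minimal sketch of that preprocessing idea for the ~ and ^ file from the original question, where the data itself contains " characters. Instead of a one-way tr, it swaps ~ and " before parsing and swaps them back in each field afterwards, so literal quotes in the data survive. parse_tilde_quoted is a hypothetical helper, the sketch uses Ruby's standard csv library (the successor to FasterCSV), and it assumes a literal ~ inside a quoted field is written as ~~.

```ruby
require "csv"

# Hypothetical helper (not a FasterCSV API): parse a line that uses ~ as the
# quote character by temporarily swapping ~ and " around a standard parse.
def parse_tilde_quoted(line, col_sep: "^")
  # ~ becomes the real quote character; literal " becomes a harmless ~.
  swapped = line.tr('~"', '"~')
  parsed = CSV.parse_line(swapped, col_sep: col_sep) || []
  # Undo the swap inside each field; empty unquoted fields come back as nil.
  parsed.map { |field| field&.tr('~"', '"~') }
end

row = parse_tilde_quoted('~say "hi"~^~a~~b~^c')
puts row.inspect
```

Because the swap is symmetric, an escaped ~~ becomes "" for the parser and comes back out as a single ~ in the field.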

Good luck.

James Edward Gray II

By: Michael Silver
Custom quote string?
2007-02-09 06:15
I have an unusual need to read a CSV file that uses ~ instead of quotes (and ^ instead of commas) due to the use of quotes in the data.

Is there any way to customize the quote character in FasterCSV? I imagine it would be a fairly simple change since it appears to be using regex to determine the quoted strings.

Great library, and I look forward to seeing it in the core Ruby distribution.

...Thanks...
...Michael...