Date: 2011-03-06 00:04
Sender: Per Grottum
Pasted in correspondence with Jeremy Evans:
From: Per <p...@medisin.uio.no>
Setting:
win 7
sqlite3 database in utf-16le format
sqlite3.dll 3.7.2 compiled from source from www.sqlite.org
sqlite3-ruby 2.2.3 compiled from source from rubyforge
using the instructions in the readme-file
gem install sequel
Tested sqlite.dll in a C-program: works OK
Tested sqlite3-ruby with utf-8 encoded ruby source: works
OK
With sequel it does not work. This is the program:
# -*- coding: utf-8 -*-
require 'sequel'
mm = Sequel.sqlite('MM.DB')
tbl = mm.tables
tbl.each { |t| puts t }
I have included som debugging output at various levels down
to sqlite3.dll and there is a problem with encoding:
sequel sends US- ASCII code to sqlite3-ruby and it is not
properly converted to utf-8/utf-16le along the line and
ends up raising an SQL-error in sqlite3.dll.
Has anybody run across this problem and found a solution?
Thank you for your help.
Per
------------------------
From: Jeremy Evans <jeremyeva...@gmail.com>
Sequel does not modify encodings. If you are getting
strings in encodings you don't expect, or the driver is
raising an error because it isn't converting strings
properly, the problem is at the database driver level. I'm
guessing you have sqlite3-ruby 1.2.3, and you could
try upgrading to 1.3.1 (gem install sqlite3-ruby) and see
if that fixes it. If not, you should probably bring it up
with the sqlite3-ruby maintainer.
Thanks,
Jeremy
------------------
From: Per <p...@medisin.uio.no>
I compiled v. 1.3.1 of sqlite3-ruby from source, and that
is the only sqlite3-ruby code on the machine.
As you can see from the code snippet in my first mail, I
generate no sql myself, only call the built-in sequel
method "tables" to get a list of the tables. This method
expands internally to the sql-string "SELECT * FROM
`sqlite_master` WHERE (type = 'table' AND NOT name =
'sqlite_sequence')" which turns out to be in US-ASCII. The
documentation of the sqlite driver says "In the current
implementation of SQLite, the SQL parser only works with
UTF-8 text." (http://www.sqlite.org/version3.html), so it
seems that it is the responsibility of the calling
functions to encode the sql in utf-8.
The small program below omits sequel and uses sqlite3-ruby
directly.
It is encoded in utf-8, and the sql passes faultlessly
through the system and returns the list of tables.
------------------------------------
# -*- coding: utf-8 -*-
require 'sqlite3'
db = SQLite3::Database.new( "mm.db" )
db.execute("select name from sqlite_master where
type='table'") do |row|
p row
p row[0].encoding.name
end
db.close
------------------------------------
Thank you for looking into this.
Per
------------------
From: Jeremy Evans <jeremyeva...@gmail.com>
The SQL created by Sequel should be valid US-ASCII and
valid UTF-8.
You mentioned you had some debugging output, but as you
didn't link to it, it's hard for me to judge whether this
is a problem with Sequel or somewhere else. From what
you've described, I don't think this is a problem with
Sequel.
Jeremy
--------------------
From: Per <p...@medisin.uio.no>
Maybe this is a question of some encoding switch that I
have not set, but I do not understand how I can influence
the coding of the sql that the 'tables' method
generates. The sql is built internally in
sequel/adapters/shared/sqlite.rb and results in an US-ASCII
sql string that I have no direct control over.
The sql is subsequently executed by the _execute method in
sequel/adapters/sqlite.rb. I have added a couple of lines
to the beginning of this method that expressly changes to
coding to utf-8, and everything works nicely with this
hack:
def _execute(type, sql, opts, &block)
begin
STDERR.puts "in sequel/adapters/sqlite.rb,
_execute: " + sql.to_s + " " + sql.encoding.name
csql = sql
sql = csql.encode("utf-8")
STDERR.puts "in sequel/adapters/sqlite.rb,
_execute: " + sql.to_s + " " + sql.encoding.name
...
This is the debugging output:
in sequel/adapters/sqlite.rb, _execute: SELECT * FROM
`sqlite_master` WHERE (type = 'table' AND NOT name
= 'sqlite_sequence') US-ASCII
in sequel/adapters/sqlite.rb, _execute: SELECT * FROM
`sqlite_master` WHERE (type = 'table' AND NOT name
= 'sqlite_sequence') UTF-8
Again, thank you for your patience.
Per
-------------------
From: Jeremy Evans <jeremyeva...@gmail.com>
That's probably a bug in the database driver (sqlite3-
ruby), and possibly specific to Windows. The SQL string is
valid US-ASCII and valid UTF-8, so the .encode('utf-8')
shouldn't actually change the bytes in the string. I
suggest you report it to the sqlite3-ruby project. If they
determine that it's a problem with Sequel and not sqlite3-
ruby, I'll work with them to get it resolved.
Jeremy
|