[ruby-dbi-users] [dbd_pg] Prepared statement performance

KUBO Takehiro kubo at jiubao.org
Sat Sep 6 22:05:34 EDT 2008


I looked in manuals and source codes of libpq, ruby-pg, perl dbd::pg
and postgrsql jdbc driver. But I have not run it. I may misunderstand
the behavior.

Server side prepared statements are not always good for performance.
It depend on the database archtecture. For example. On Oracle, all
statement's execution plans are cached in server side shared memory
and reused by all other connections. This means that the folloing
steps are skipped if the SQL is on the cache.
 * parse the SQL statement.
 * make possible execution (access) plans.
 * choose one by comparing estimated costs

But postgresql's SQL statements are not cached automatically. Database
client application must manage it explicitly by a name of the
statement, and it is reused only when the name is reused explicitly.

IMO, almost dbi applications have a negative impact rather than
benefit.

Case 1:
   dbh.execute('update emp set sal = sal + 10 where empno = 1')

This dbi code executes the following ruby-pg code internally.
   conn.prepare('ruby-dbi:Pg:1234', 'update emp set sal = sal + 10
where empno = 1')
   conn.exec_prepared('ruby-dbi:Pg:1234')
   conn.exec('DEALLOCATE "ruby-dbi:Pg:1234"')

It needs three network round trips. If dbd_pg didn't use prepared
statements as follows, it needed only one round trip.
   conn.exec('update emp set sal = sal + 10 where empno = 1')

This is not a small impact if the application runs in DMZ and the
database server is behind the firewall.


Case 2:
   dbh.execute('update emp set sal = sal + 10 where empno = ?', 1)
   dbh.execute('update emp set sal = sal + 10 where empno = ?', 2)

This executes the following code internally.
   conn.prepare('ruby-dbi:Pg:1234', 'update emp set sal = sal + 10
where empno = $1')
   conn.exec_prepared('ruby-dbi:Pg:1234', [1])
   conn.exec('DEALLOCATE "ruby-dbi:Pg:1234"')
   conn.prepare('ruby-dbi:Pg:1235', 'update emp set sal = sal + 10
where empno = $1')
   conn.exec_prepared('ruby-dbi:Pg:1235', [2])
   conn.exec('DEALLOCATE "ruby-dbi:Pg:1235"')

It receive no benefit of prepared statements. The sql statements are
exactly same but the statement names are different. Even though the
name are accidentaly same (same object id), the first sql's execution
plan has been already deallocated.

It needs six network round trips. If dbd_pg didn't use prepared
statements as follows, it needed only two round trips.
   conn.exec('update emp set sal = sal + 10 where empno = $1', [1])
   conn.exec('update emp set sal = sal + 10 where empno = $1', [2])


Case 3:
   sth = dbh.prepare('update emp set sal = sal + 10 where empno = ?')
   1.upto 100 do |i|
     sth.execute(i)
   end
   sth.finish

This executes the following code internally.
   conn.prepare('ruby-dbi:Pg:1234', 'update emp set sal = sal + 10
where empno = $1')
   1.upto 100 do |i|
     conn.exec_prepared('ruby-dbi:Pg:1234', [i])
   end
   conn.exec('DEALLOCATE "ruby-dbi:Pg:1234"')

This is a good case. The sql statement is parsed and the execution
plan is made only once. The plan is reused 100 times. If ruby-dbi
didn't use a prepared statement, the posgres server would parse 100
sql statements and make 100 execution plans.


In the three cases, the last one is the only case which has a benefit
of prepared statement. But case 3 is rare when the application is for
transaction system. Majority code will issue queries such as case 1
or case 2.

IMO, dbd_pg should not use server side prepared statements by default.
They should be used only when (1) it is requested or (2) dbd_pg
consider that now is the time to use them.

(1) add an attribute "pg_server_prepare" as Perl DBD::Pg does.
    (pg_server_prepare is enabled by default on perl dbd...)

    http://search.cpan.org/~turnstep/DBD-Pg/Pg.pm#prepare

  For example:

    DBI.connect("dbi:Pg:database=#{dbname};pg_serer_prepare=true", ...)
      => set default value of the connection.

    dbh.prepare(sql_statement, {:pg_server_prepare => true})
      => use server prepare.

    dbh.prepare(sql_statement, {:pg_server_prepare => false})
      => don't use server prepare.

    dbh.prepare(sql_statement)
      => depend on default value of the connection.

(2) enable server side prepared statemetn when the sth is executed 5
    (or user-specified number) times as postgresql jdbc driver does.

      http://jdbc.postgresql.org/documentation/81/server-prepare.html

  For example:

    sth = dbh.prepare('update emp set sal = sal + 10 where empno = ?')
    sth.execute(1)
    # => conn.exec('update emp set sal = sal + 10 where empno = $1', [1])

    sth.execute(2)
    # => conn.exec('update emp set sal = sal + 10 where empno = $1', [2])

    sth.execute(3)
    # => conn.exec('update emp set sal = sal + 10 where empno = $1', [3])

    sth.execute(4)
    # => conn.exec('update emp set sal = sal + 10 where empno = $1', [4])

    sth.execute(5)
    # => conn.exec('update emp set sal = sal + 10 where empno = $1', [5])

    sth.execute(6)
    # reach the threshold. enable server side prepare.
    # => conn.prepare('ruby-dbi:Pg:1234', 'update emp set sal = sal +
10 where empno = $1')
    #    conn.exec_prepared('ruby-dbi:Pg:1234', [6])

    sth.execute(7)
    # => conn.exec_prepared('ruby-dbi:Pg:1234', [7])

      .....

    sth.finish
    # => conn.exec('DEALLOCATE "ruby-dbi:Pg:1234"')

I prefer the latter because ruby-dbi users have no need to care whether
server side prepared statements should be useed or not.

Well, I'll add this issue to 'Feature Requests' also.


More information about the ruby-dbi-users mailing list