niXforums Forum Index
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   PreferencesPreferences   Log in to check your private messagesLog in to check your private messages   Log inLog in 
·  nixdoc.net ·  man pages ·  Linux HOWTOs ·  FreeBSD Tips ·  Forums
navigation Forum index » Databases » IBM DB2
High Performance DPF INSERT's
Post new topic   Reply to topic Page 1 of 1 [3 Posts] View previous topic :: View next topic
Author Message
wombat53
*nix forums beginner


Joined: 11 Jul 2006
Posts: 3

PostPosted: Wed Jul 12, 2006 1:28 am    Post subject: High Performance DPF INSERT's Reply with quote

Hi Group
Are there any DB2 UDB ESE DPF V8.2 users exploiting "buffered inserts"
(BIND parm INSERT BUF) *and* "multi-row INSERTS" (many rows associated
with the VALUES clause of the INSERT to minimize number of calls to the
RDMBS engine) in an SQL PL Stored Procedure?
I ask, as the latter would imply embedded DYNAMIC SQL due to
potentially varying number of rows to be inserted (assume it's
varying), and so, the SQL INSERT would have be built dynamically; it's
not clear that DYNAMIC SQL and Buffered Inserts are compatible. i.e.
that one reaps either or both the benefits of these two
Hi-Performance-oriented features. It's quite possible they are
incomaptible, and I would have to abandon one or the other (multi-row
probably...)
Thanks
wombat53
Back to top
Serge Rielau
*nix forums Guru


Joined: 29 Apr 2005
Posts: 1583

PostPosted: Thu Jul 13, 2006 11:07 am    Post subject: Re: High Performance DPF INSERT's Reply with quote

wombat53 wrote:
Quote:
Hi Group
Are there any DB2 UDB ESE DPF V8.2 users exploiting "buffered inserts"
(BIND parm INSERT BUF) *and* "multi-row INSERTS" (many rows associated
with the VALUES clause of the INSERT to minimize number of calls to the
RDMBS engine) in an SQL PL Stored Procedure?
I ask, as the latter would imply embedded DYNAMIC SQL due to
potentially varying number of rows to be inserted (assume it's
varying), and so, the SQL INSERT would have be built dynamically; it's
not clear that DYNAMIC SQL and Buffered Inserts are compatible. i.e.
that one reaps either or both the benefits of these two
Hi-Performance-oriented features. It's quite possible they are
incomaptible, and I would have to abandon one or the other (multi-row
probably...)
Buffered insert works only if the insert follow each other immediately

_without_any_SQL_in_between_.
In a loop in an SQL Procedure it's rather unlikely that that is the
case. The loop control will likely cause SQL execution unless it's trivial.

If you want to speed up mass inserts in a DPF system keep in mind that
you are bottlenecking on the SQL Procedure logic itself (running on the
coordinator).
You may get big benefits from parallelizing the procedure. That is CALL
the proc on each data-node (or at least each available coordinator if
you have multiple) and have it process a subset of the data (preferably
local data).
I have seen for a computational heavy batch process reding from a
staging table linear scalability for 2 concurrent procedure calls per
data node.


Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ondemandbusiness/conf2006/
Back to top
wombat53
*nix forums beginner


Joined: 11 Jul 2006
Posts: 3

PostPosted: Wed Jul 19, 2006 2:56 pm    Post subject: Re: High Performance DPF INSERT's Reply with quote

Serge Rielau wrote:
Quote:
wombat53 wrote:
Hi Group
Are there any DB2 UDB ESE DPF V8.2 users exploiting "buffered inserts"
(BIND parm INSERT BUF) *and* "multi-row INSERTS" (many rows associated
with the VALUES clause of the INSERT to minimize number of calls to the
RDMBS engine) in an SQL PL Stored Procedure?
I ask, as the latter would imply embedded DYNAMIC SQL due to
potentially varying number of rows to be inserted (assume it's
varying), and so, the SQL INSERT would have be built dynamically; it's
not clear that DYNAMIC SQL and Buffered Inserts are compatible. i.e.
that one reaps either or both the benefits of these two
Hi-Performance-oriented features. It's quite possible they are
incomaptible, and I would have to abandon one or the other (multi-row
probably...)
Buffered insert works only if the insert follow each other immediately
_without_any_SQL_in_between_.
In a loop in an SQL Procedure it's rather unlikely that that is the
case. The loop control will likely cause SQL execution unless it's trivial.

If you want to speed up mass inserts in a DPF system keep in mind that
you are bottlenecking on the SQL Procedure logic itself (running on the
coordinator).
You may get big benefits from parallelizing the procedure. That is CALL
the proc on each data-node (or at least each available coordinator if
you have multiple) and have it process a subset of the data (preferably
local data).
I have seen for a computational heavy batch process reding from a
staging table linear scalability for 2 concurrent procedure calls per
data node.


Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ondemandbusiness/conf2006/
Thanks Serge.

We have decided to bypass the problems of commingling DYNAMIC SQL and
Buffered Inserts, and replace the SQL with STATIC. We will use CASE
expression to "jump" to the appropriate statically bound INSERT, for
however many rows we have available to us from out Message Queue (that
will be a parm, and taken care of within the CASE expr. within tyhe
STATIC INSERT). We are also mindful of the Special Consideratrions for
using Buffered Inserts abot as you have noted and as documented in ch.
31 of V8.2 "Prog. Client Applications".
I am following up on your comment "That is CALL
Quote:
the proc on each data-node (or at least each available coordinator if
you have multiple) and have it process a subset of the data (preferably
local data)."
Is this something along the lines of intelligently identifying the

optimal co-ordinator node for SQL (INSERT) throught the two api's of
sqlugrpn - Get Row Partitioning Number, and
sqlugtpi - Get Table Partitioning Information (called only once) such
that co-ordinatoer node and data node are one, with co-ordinator
distributed across servers, and at the same thereby minimizing data
movement; or are you getting at something different, when you say
parallelizing the procedure? Perhaps sorting the messaged input in some
way, multi-threading the INSERTS, or simply having multiple
co-ordinator nodes, along then lines of the BCU/BPU methodology?etc..
We expect to be running ESE(DPF)/LINUX V9, most recent Beta, or GA of
July 28. We are looking at many million if SQL INSERT's/day, a
potential choke-point.
Thanks
wombat53
Back to top
Google

Back to top
Display posts from previous:   
Post new topic   Reply to topic Page 1 of 1 [3 Posts] View previous topic :: View next topic
The time now is Thu Nov 20, 2008 10:36 pm | All times are GMT
navigation Forum index » Databases » IBM DB2
Jump to:  

Similar Topics
Topic Author Forum Replies Last Post
No new posts Performance and Consistency ?? likun.navipal@gmail.com Berkeley DB 4 Fri Jul 21, 2006 4:24 am
No new posts AIX performance tuning jpzhai@gmail.com AIX 5 Fri Jul 21, 2006 2:27 am
No new posts High paging usage on database process mychrislo@gmail.com AIX 0 Wed Jul 19, 2006 9:59 am
No new posts Performance problem News AIX 1 Wed Jul 19, 2006 9:55 am
No new posts Antw: Performance problem with query Christian Rengstl PostgreSQL 10 Tue Jul 18, 2006 6:24 pm

Loans | Books | Myspace Layouts | Free Text Messages | Starting Over
Copyright © 2004-2005 DeniX Solutions SRL
 
Other DeniX Solutions sites: Unix/Linux blog |  electronics forum |  medicine forum |  science forum | 
Privacy Policy


Powered by phpBB © 2001, 2005 phpBB Group
[ Time: 0.1922s ][ Queries: 16 (0.1125s) ][ GZIP on - Debug on ]