niXforums Forum Index
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   PreferencesPreferences   Log in to check your private messagesLog in to check your private messages   Log inLog in 
·  nixdoc.net ·  man pages ·  Linux HOWTOs ·  FreeBSD Tips ·  Forums
navigation Forum index » Databases » PostgreSQL
customizing pg_dump together with copy.c's DoCopy function
Post new topic   Reply to topic Page 1 of 1 [6 Posts] View previous topic :: View next topic
Author Message
lynnsettle@yahoo.com
*nix forums beginner


Joined: 10 Jul 2006
Posts: 4

PostPosted: Mon Jul 10, 2006 10:13 pm    Post subject: customizing pg_dump together with copy.c's DoCopy function Reply with quote

My first ever newsgroup PostgreSQL question... I want to move data
between some very large databases (100+ gb) of different schema at our
customer sites. I cannot expect there to be much working partition
space, so the databases cannot exist simultaneously. I am also
restricted to hours, not days, to move the data.

I felt that pg_dump/pg_restore with the compressed format would do the
job for me. I was able to create a modified pg_dump program without any
difficulty. But I need to customize the speedy COPY FROM and COPY TO
commands to perform my necessary schema and data content changes. I
tried copying /src/backend/copy.c/h to my customized pg_dump project,
renamed them and their DoCopy function, and added it to my makefile.
This created conflicts between libpq-fe.h and libpq.h. For example:

postgresql-7.4.13/src/interfaces/libpq/libpq-fe.h:191: error:
conflicting types for `PQArgBlock'
postgresql-7.4.13/src/include/libpq/libpq.h:39: error: previous
declaration of `PQArgBlock'

Is it possible to compile-link together frontend pg_dump code with
backend code from copy.c? If not, how can I go about customizing
pg_dump to have low-level control over the speedy but inflexible COPY
TO/FROM commands? I already tried all this with regular sql calls and
it was unnacceptably far too slow.

thanks -Lynn


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq
Back to top
lynnsettle@yahoo.com
*nix forums beginner


Joined: 10 Jul 2006
Posts: 4

PostPosted: Tue Jul 11, 2006 12:39 am    Post subject: Re: customizing pg_dump together with copy.c's DoCopy function Reply with quote

After further reading, I'm wondering if I should instead try to use
libpq calls like PQgetCopyData, PQputCopyData, and PQputCopyEnd.

Would they be a viable alternative to provide both the speed and
flexibility I'm looking for?

-Lynn


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
Back to top
Tom Lane
*nix forums Guru


Joined: 24 Mar 2005
Posts: 2070

PostPosted: Tue Jul 11, 2006 2:17 pm    Post subject: Re: customizing pg_dump together with copy.c's DoCopy function Reply with quote

"lynnsettle@yahoo.com" <lynnsettle@yahoo.com> writes:
Quote:
Is it possible to compile-link together frontend pg_dump code with
backend code from copy.c?

No. Why do you think you need to modify pg_dump at all?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
Back to top
lynnsettle@yahoo.com
*nix forums beginner


Joined: 10 Jul 2006
Posts: 4

PostPosted: Tue Jul 11, 2006 3:43 pm    Post subject: Re: customizing pg_dump together with copy.c's DoCopy function Reply with quote

Quote:
Is it possible to compile-link together frontend pg_dump code with
backend code from copy.c?

No. Why do you think you need to modify pg_dump at all?


pg_dump and pg_restore provide important advantages for upgrading a
customer's database on site:

They are fast. I want to minimize downtime.
They allow compression. I often will have relatively little free disk
space to work with.
My concept is "customized dump", drop database, create new schema
database, "customized restore".

My upgrade requires many schema and data content changes. I've tried
using standard SQL statements in perl scripts to do all of it, but even
with no indexes on inserts, later creating indexes for the lookup work,
and every other optimization I know of, a 100gb database requires
several days to turn our old database into a new one. I was hoping that
I could modify the speedy pg_dump/pg_restore utilities to make these
changes "on the fly". It gets tricky because I have to restore some of
the data to different tables having varying schema and also change the
table linking. But this is all doable as long as I can "massage" the
SQL statements and data both when it goes into the dump file and when
it is getting restored back out.

Or am I trying to do the impossible?
-Lynn


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
Back to top
Brian Mathis
*nix forums beginner


Joined: 12 Oct 2005
Posts: 7

PostPosted: Sun Jul 16, 2006 8:17 pm    Post subject: Re: customizing pg_dump together with copy.c's DoCopy function Reply with quote

pg_dump by default dumps to STDOUT, which you should use in a pipeline to
perform any modifications. To me this seems pretty tricky, but should be
doable. Modifying pg_dump really strikes me as the wrong way to go about
it. Pipelines operate in memory, and should be very fast, depending on how
you write the filtering program. You would need to dump the data without
compression, then compress it coming out the other end (maybe split it up
too). Something like this:
pg_dump | myfilter | gzip | split --bytes=2000M - mydump.

Also, you can't expect to have speed if you have no disk space.
Reading/writing to the same disk will kill you. If you could set up some
temp space over NFS on the local network, that should gain you some speed.


On 11 Jul 2006 08:43:17 -0700, lynnsettle@yahoo.com <lynnsettle@yahoo.com>
wrote:
Quote:

Is it possible to compile-link together frontend pg_dump code with
backend code from copy.c?

No. Why do you think you need to modify pg_dump at all?


pg_dump and pg_restore provide important advantages for upgrading a
customer's database on site:

They are fast. I want to minimize downtime.
They allow compression. I often will have relatively little free disk
space to work with.
My concept is "customized dump", drop database, create new schema
database, "customized restore".

My upgrade requires many schema and data content changes. I've tried
using standard SQL statements in perl scripts to do all of it, but even
with no indexes on inserts, later creating indexes for the lookup work,
and every other optimization I know of, a 100gb database requires
several days to turn our old database into a new one. I was hoping that
I could modify the speedy pg_dump/pg_restore utilities to make these
changes "on the fly". It gets tricky because I have to restore some of
the data to different tables having varying schema and also change the
table linking. But this is all doable as long as I can "massage" the
SQL statements and data both when it goes into the dump file and when
it is getting restored back out.

Or am I trying to do the impossible?
-Lynn

Back to top
lynnsettle@yahoo.com
*nix forums beginner


Joined: 10 Jul 2006
Posts: 4

PostPosted: Thu Jul 20, 2006 1:17 am    Post subject: Re: customizing pg_dump together with copy.c's DoCopy function Reply with quote

Brian,

Those are very interesting ideas. Thanks. I've been playing around with
pg_dump. Modifying it to selectively dump/restore tables and columns is
pretty easy. But as you say, changing the data content within the data
buffers to reflect varying column values, changed column types, and new
columns seems tricky. I wonder if anyone knows of any example code
around somewhere...
-Lynn

"Brian Mathis" wrote:
Quote:
pg_dump by default dumps to STDOUT, which you should use in a pipeline to
perform any modifications. To me this seems pretty tricky, but should be
doable. Modifying pg_dump really strikes me as the wrong way to go about
it. Pipelines operate in memory, and should be very fast, depending on how
you write the filtering program. You would need to dump the data without
compression, then compress it coming out the other end (maybe split it up
too). Something like this:
pg_dump | myfilter | gzip | split --bytes=2000M - mydump.

Also, you can't expect to have speed if you have no disk space.
Reading/writing to the same disk will kill you. If you could set up some
temp space over NFS on the local network, that should gain you some speed.




---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings
Back to top
Google

Back to top
Display posts from previous:   
Post new topic   Reply to topic Page 1 of 1 [6 Posts] View previous topic :: View next topic
The time now is Mon Dec 01, 2008 6:51 pm | All times are GMT
navigation Forum index » Databases » PostgreSQL
Jump to:  

Similar Topics
Topic Author Forum Replies Last Post
No new posts Function Pointer Sikandar C 3 Fri Jul 21, 2006 1:23 pm
No new posts Arbitrary function with parameter darknails@gmail.com C++ 2 Fri Jul 21, 2006 9:58 am
No new posts Is there C/C++ corresponding function in Linux for Java's... xiebopublic@gmail.com apps 4 Fri Jul 21, 2006 3:22 am
No new posts Is there C/C++ corresponding function in Linux for Java's... xiebopublic@gmail.com C++ 1 Fri Jul 21, 2006 2:44 am
No new posts can I call a internal function directly? minrobin@gmail.com shell 2 Fri Jul 21, 2006 2:17 am

Mobile Phones | Remortgaging | iKobo | Mortgage Calculator | Loans
Copyright © 2004-2005 DeniX Solutions SRL
 
Other DeniX Solutions sites: Unix/Linux blog |  electronics forum |  medicine forum |  science forum | 
Privacy Policy


Powered by phpBB © 2001, 2005 phpBB Group
[ Time: 0.1887s ][ Queries: 16 (0.0991s) ][ GZIP on - Debug on ]