The fastest way to build a large hash DB
Sakagami Hiroki
*nix forums beginner


Joined: 14 Feb 2005
Posts: 7

Posted: Sun Mar 13, 2005 12:56 pm    Post subject: Re: The fastest way to build a large hash DB

Michael Cahill wrote:
Quote:
I have tested a simple {open->put,put,...,put->close} program and
it took 6 minutes to put 1,000,000 key-value pairs. When I called
set_cachesize(>DBSIZE), it improved to 2 minutes. Is there any way
to improve performance further?

The optimal way to build a hash database, if you know the number of
elements in advance and the average number that fit on a page, is to
preallocate the database using DB->set_h_nelem and DB->set_h_ffactor.
For example, if 100 of your key-value pairs fit on a page, you will
end up with 10,000 hash buckets.

When I preset nelem to 1,000,000 and ffactor to 50, the build time
improved to 20 seconds, which is acceptable for me.

Thanks for your advice.
Michael Cahill
*nix forums Guru Wannabe


Joined: 26 May 2005
Posts: 219

Posted: Thu Mar 10, 2005 3:49 am    Post subject: Re: The fastest way to build a large hash DB

Quote:
I have tested a simple {open->put,put,...,put->close} program and
it took 6 minutes to put 1,000,000 key-value pairs. When I called
set_cachesize(>DBSIZE), it improved to 2 minutes. Is there any way
to improve performance further?

The optimal way to build a hash database, if you know the number of
elements in advance and the average number that fit on a page, is to
preallocate the database using DB->set_h_nelem and DB->set_h_ffactor.
For example, if 100 of your key-value pairs fit on a page, your
1,000,000 elements will end up in 10,000 hash buckets (1,000,000 / 100).
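
As a rough illustration of that arithmetic, here is a minimal C sketch.
The helper name bucket_count is hypothetical and is not part of the
Berkeley DB API; it only shows how an element count (the value you would
pass to DB->set_h_nelem) and a per-page fill factor (the value for
DB->set_h_ffactor) relate to the bucket count mentioned above. Berkeley DB
may size the table differently internally, so treat the result as an
estimate.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a Berkeley DB function): estimate how many
 * hash buckets are implied by an expected element count and the number
 * of key-value pairs that fit on one page. Rounds up so that a partly
 * filled page still counts as a bucket.
 */
uint32_t bucket_count(uint32_t nelem, uint32_t ffactor)
{
    assert(ffactor > 0); /* a fill factor of zero makes no sense here */
    return nelem / ffactor + (nelem % ffactor != 0);
}
```

With the numbers from this thread, bucket_count(1000000, 100) gives
10,000 and bucket_count(1000000, 50) gives 20,000.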

Then insert the records in hash-bucket order. The bucket is calculated
by calling __ham_func5 on each key and taking the result modulo the
number of hash buckets. If you pre-sort your records in this way, you
will see optimal insert behavior.
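
The pre-sort step could look roughly like the C sketch below. Everything
named here (struct record, example_hash, sort_by_bucket) is hypothetical.
In particular, example_hash is a stand-in 32-bit FNV-1a hash used only to
keep the sketch self-contained; it is an assumption, not Berkeley DB's
internal __ham_func5, so a real build would need the library's own hash
to get matching bucket numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One key-value pair to insert, plus the bucket computed for its key. */
struct record {
    const char *key;
    const char *value;
    uint32_t bucket;
};

/*
 * Stand-in hash (32-bit FNV-1a). ASSUMPTION: this is NOT Berkeley DB's
 * __ham_func5; substitute the library's hash for real use.
 */
uint32_t example_hash(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* qsort comparator: order records by ascending bucket number. */
static int compare_by_bucket(const void *a, const void *b)
{
    const struct record *ra = a;
    const struct record *rb = b;

    return (ra->bucket > rb->bucket) - (ra->bucket < rb->bucket);
}

/*
 * Compute hash(key) modulo nbuckets for every record, then sort the
 * array so that records for the same bucket are adjacent.
 */
void sort_by_bucket(struct record *recs, size_t n, uint32_t nbuckets)
{
    assert(nbuckets > 0);
    for (size_t i = 0; i < n; i++)
        recs[i].bucket =
            example_hash(recs[i].key, strlen(recs[i].key)) % nbuckets;
    qsort(recs, n, sizeof recs[0], compare_by_bucket);
}
```

After sort_by_bucket returns, the application would walk the array in
order and call DB->put for each record, so that consecutive inserts tend
to touch the same page.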

To get an idea of what's possible, do the calculation of records per
page and call DB->set_h_nelem and DB->set_h_ffactor as described above,
then run db_dump on your database, followed by db_load. If running
db_load is significantly faster than your application, then this
technique should improve the insert performance.

Regards,
Michael.
Sakagami Hiroki
*nix forums beginner


Joined: 14 Feb 2005
Posts: 7

Posted: Thu Mar 10, 2005 12:57 am    Post subject: Re: The fastest way to build a large hash DB

Ron wrote:
Quote:
What is the fastest way to build a large hash DB?

I have tested a simple {open->put,put,...,put->close} program and it
took 6 minutes to put 1,000,000 key-value pairs. When I called
set_cachesize(>DBSIZE), it improved to 2 minutes. Is there any way to
improve performance further?

Thanks in advance.

By your comment, do you mean you have already tried a cache size
greater than the final database size? I was going to suggest trying
different very large cache sizes.

Yes. The final database size is ~42 MB, and I tried a 150 MB cache.

Quote:
I assume you are not running transactions, correct?

I believe I am not using transactions, because the `dbenv' argument to
db_create() and the `txnid' argument to the other functions are both NULL.

Thanks,
Ron
*nix forums Guru Wannabe


Joined: 01 Apr 2005
Posts: 157

Posted: Wed Mar 09, 2005 6:57 pm    Post subject: Re: The fastest way to build a large hash DB

Sakagami Hiroki wrote:
Quote:
What is the fastest way to build a large hash DB?

I have tested a simple {open->put,put,...,put->close} program and it
took 6 minutes to put 1,000,000 key-value pairs. When I called
set_cachesize(>DBSIZE), it improved to 2 minutes. Is there any way to
improve performance further?

Thanks in advance.

By your comment, do you mean you have already tried a cache size
greater than the final database size? I was going to suggest trying
different very large cache sizes.

I assume you are not running transactions, correct?
Sakagami Hiroki
*nix forums beginner


Joined: 14 Feb 2005
Posts: 7

Posted: Wed Mar 09, 2005 2:00 pm    Post subject: The fastest way to build a large hash DB

What is the fastest way to build a large hash DB?

I have tested a simple {open->put,put,...,put->close} program and it
took 6 minutes to put 1,000,000 key-value pairs. When I called
set_cachesize(>DBSIZE), it improved to 2 minutes. Is there any way to
improve performance further?

Thanks in advance.