| Author |
Message |
Sakagami Hiroki *nix forums beginner
Joined: 14 Feb 2005
Posts: 7
|
Posted: Sun Mar 13, 2005 12:56 pm Post subject:
Re: The fastest way to build a large hash DB
|
|
|
Michael Cahill wrote:
| Quote: | I have tested a simple {open->put,put,...,put->close} program and
it took 6 minutes to put 1,000,000 key-value pairs. When I called
set_cachesize(>DBSIZE), it improved to 2 minutes. Is there any way
to improve performance further?
|
The optimal way to build a hash database, if you know the number of
elements in advance and the average number that fit on a page, is to
preallocate the database using the DB->set_h_nelem and
DB->set_h_ffactor methods. For example, if 100 of your key-value pairs
fit on a page, you will end up with 10,000 hash buckets.
|
When I preset nelem to 1,000,000 and ffactor to 50, the time improved to
20 seconds, which is acceptable for me.
Thanks for your advice. |
|
Michael Cahill *nix forums Guru Wannabe
Joined: 26 May 2005
Posts: 219
|
Posted: Thu Mar 10, 2005 3:49 am Post subject:
Re: The fastest way to build a large hash DB
|
|
|
| Quote: | I have tested a simple {open->put,put,...,put->close} program and
it took 6 minutes to put 1,000,000 key-value pairs. When I called
set_cachesize(>DBSIZE), it improved to 2 minutes. Is there any way
to improve performance further?
|
The optimal way to build a hash database, if you know the number of
elements in advance and the average number that fit on a page, is to
preallocate the database using the DB->set_h_nelem and
DB->set_h_ffactor methods. For example, with 1,000,000 elements, if 100
of your key-value pairs fit on a page, you will end up with 10,000 hash
buckets.
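As a rough illustration of the arithmetic above, the sketch below estimates a fill factor and the resulting bucket count. The helper names estimate_ffactor and estimate_buckets are made up for this example. The fill-factor formula is the rule of thumb from the Berkeley DB documentation for DB->set_h_ffactor as best I recall it, so treat it as an assumption and check it against the documentation for your release.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: estimate how many key/data pairs fit on one page.
 * ASSUMPTION: uses the rule of thumb
 *   (pagesize - 32) / (average_key_size + average_data_size + 8)
 * recalled from the Berkeley DB documentation; verify before relying on it.
 */
unsigned int estimate_ffactor(size_t pagesize, size_t avg_key, size_t avg_data)
{
    assert(pagesize > 32);
    return (unsigned int)((pagesize - 32) / (avg_key + avg_data + 8));
}

/*
 * Hypothetical helper: number of hash buckets needed to hold nelem
 * elements at ffactor elements per bucket, rounded up.
 */
unsigned int estimate_buckets(unsigned int nelem, unsigned int ffactor)
{
    assert(ffactor > 0);
    return nelem / ffactor + (nelem % ffactor != 0 ? 1u : 0u);
}
```

With 1,000,000 elements and 100 pairs per page, estimate_buckets returns 10,000, the figure given above. The two values would then be passed to DB->set_h_ffactor and DB->set_h_nelem, both of which have to be called before DB->open.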
Then insert in order of the hash bucket, which is calculated by calling
__ham_func5 on each key, then taking the result modulo the number of
hash buckets. If you pre-sort your records in this way, you will see
optimal insert behavior.
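A minimal sketch of the pre-sorting step, under some loud assumptions: standin_hash is a simple multiply/xor hash written for this example and is not guaranteed to produce the same values as __ham_func5 in your Berkeley DB release, and struct record and sort_by_bucket are hypothetical names. The bucket is taken as hash modulo the bucket count, as described above. For real use you would substitute the library's actual hash function, or supply your own hash when configuring the database, so that the sort order matches the order of the buckets on disk.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* ASSUMPTION: stand-in hash for illustration only, not __ham_func5. */
uint32_t standin_hash(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++) {
        h *= 16777619u;   /* multiply by a fixed odd constant */
        h ^= p[i];        /* then mix in the next key byte */
    }
    return h;
}

/* Hypothetical record type: a NUL-terminated key, its value, and the
 * bucket number computed for the key. */
struct record {
    const char *key;
    const char *value;
    uint32_t bucket;
};

/* qsort comparator: ascending bucket number. */
static int cmp_bucket(const void *a, const void *b)
{
    const struct record *ra = a, *rb = b;
    return (ra->bucket > rb->bucket) - (ra->bucket < rb->bucket);
}

/* Compute each record's bucket (hash modulo nbuckets), then sort the
 * array so records can be inserted in bucket order. */
void sort_by_bucket(struct record *recs, size_t n, uint32_t nbuckets)
{
    assert(nbuckets > 0);
    for (size_t i = 0; i < n; i++)
        recs[i].bucket = standin_hash(recs[i].key, strlen(recs[i].key)) % nbuckets;
    qsort(recs, n, sizeof recs[0], cmp_bucket);
}
```

After sort_by_bucket returns, the application would loop over the array and call DB->put on each record in that order.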
To get an idea of what's possible, do the calculation of records per
page and call DB->set_h_nelem and DB->set_h_ffactor as described above,
then run db_dump on your database, followed by db_load. If running
db_load is significantly faster than your application, then this
technique should improve the insert performance.
Regards,
Michael. |
|
Sakagami Hiroki *nix forums beginner
Joined: 14 Feb 2005
Posts: 7
|
Posted: Thu Mar 10, 2005 12:57 am Post subject:
Re: The fastest way to build a large hash DB
|
|
|
Ron wrote:
| Quote: | What is the fastest way to build a large hash DB?
I have tested a simple {open->put,put,...,put->close} program and it
took 6 minutes to put 1,000,000 key-value pairs. When I called
set_cachesize(>DBSIZE), it improved to 2 minutes. Is there any way to
improve performance further?
Thanks in advance.
|
By your comment, do you mean you have already tried a cache size
greater than the final database size? I was going to suggest trying
different very large cache sizes.
|
Yes. The final database size is ~42 MB. I tried a 150 MB cache.
| Quote: | I assume you are not running transactions, correct?
|
I believe I am not using transactions, because the `dbenv' argument of
db_create() and the `txnid' argument of the other functions are NULL.
Thanks, |
|
Ron *nix forums Guru Wannabe
Joined: 01 Apr 2005
Posts: 157
|
Posted: Wed Mar 09, 2005 6:57 pm Post subject:
Re: The fastest way to build a large hash DB
|
|
|
Sakagami Hiroki wrote:
| Quote: | What is the fastest way to build a large hash DB?
I have tested a simple {open->put,put,...,put->close} program and it
took 6 minutes to put 1,000,000 key-value pairs. When I called
set_cachesize(>DBSIZE), it improved to 2 minutes. Is there any way to
improve performance further?
Thanks in advance.
|
By your comment, do you mean you have already tried a cache size
greater than the final database size? I was going to suggest trying
different very large cache sizes.
I assume you are not running transactions, correct? |
|
Sakagami Hiroki *nix forums beginner
Joined: 14 Feb 2005
Posts: 7
|
Posted: Wed Mar 09, 2005 2:00 pm Post subject:
The fastest way to build a large hash DB
|
|
|
What is the fastest way to build a large hash DB?
I have tested a simple {open->put,put,...,put->close} program and it
took 6 minutes to put 1,000,000 key-value pairs. When I called
set_cachesize(>DBSIZE), it improved to 2 minutes. Is there any way to
improve performance further?
Thanks in advance. |
|
Copyright © 2004-2005 DeniX Solutions SRL