Attilio Rao *nix forums beginner
Joined: 26 Feb 2006
Posts: 17
Posted: Tue Jul 25, 2006 3:13 pm Post subject:
[PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs
Hi,
Intel documentation points out that keeping a synchronization primitive
aligned to 128 bytes (so that it fits in a cache line) minimizes cache
bus traffic, so this patch implements that alignment for turnstiles on
i386.
Any comments or feedback?
Attilio
PS: Using __aligned in MI (machine-independent) code is usually bad
practice, but please note that the !__i386__ case is not affected (as
you can see in the patch).
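The diff itself is not reproduced in this thread, so the following is only a minimal user-space sketch of the idea described above, not the actual patch. The macro SKETCH_ALIGNED is a stand-in for the kernel's __aligned(), and the struct layouts and field names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's __aligned() macro (GCC/Clang attribute syntax). */
#define SKETCH_ALIGNED(n) __attribute__((__aligned__(n)))

/* Only i386 gets the 128-byte alignment; every other platform is untouched. */
#if defined(__i386__)
#define SKETCH_TS_ALIGN SKETCH_ALIGNED(128)
#else
#define SKETCH_TS_ALIGN
#endif

/* Hypothetical, heavily simplified turnstile. */
struct sketch_turnstile {
	void *ts_lockobj;
	int ts_waiters;
} SKETCH_TS_ALIGN;

/* Same layout with the alignment forced everywhere, for comparison. */
struct sketch_turnstile_forced {
	void *ts_lockobj;
	int ts_waiters;
} SKETCH_ALIGNED(128);

/* The attribute raises both the alignment and the padded size of the type. */
static_assert(_Alignof(struct sketch_turnstile_forced) == 128, "forced alignment");
static_assert(sizeof(struct sketch_turnstile_forced) == 128, "size padded to alignment");

/* Alignment the compiler actually chose for the conditional variant. */
static inline size_t sketch_turnstile_alignment(void)
{
	return _Alignof(struct sketch_turnstile);
}
```

Note that an attribute on the type only governs objects the compiler places itself (static or on the stack). Dynamically allocated objects are aligned only if the allocator honors the same alignment.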
--
Peace can only be achieved by understanding - A. Einstein
_______________________________________________
freebsd-arch@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
Attilio Rao *nix forums beginner
Joined: 26 Feb 2006
Posts: 17
Posted: Tue Jul 25, 2006 3:14 pm Post subject:
Re: [PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs
Oh, sorry, I forgot the diff.
Attilio
John Baldwin *nix forums Guru Wannabe
Joined: 27 Mar 2002
Posts: 278
Posted: Tue Jul 25, 2006 4:32 pm Post subject:
Re: [PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs
I think a better approach would be to stick turnstiles (and sleepqueues) in a
UMA zone and specify cache-line-size alignment for the zone. However, turnstiles
aren't really synchronization primitives in that you don't spin on a variable
inside the structure, and I think it's the spinning and avoiding bouncing
cache lines around that Intel's documentation is really about. In that case,
the things you want aligned are things like mutexes, rwlocks, etc.
--
John Baldwin
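As an illustration of the point about spinning, here is a minimal user-space sketch (not kernel code) of a lock word that is padded and aligned so that two adjacent locks never share a 128-byte block. The 128-byte figure is taken from this thread; all sketch_* names are hypothetical, and C11 atomics stand in for the kernel's own primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption taken from the thread: give each lock its own 128-byte block. */
#define SKETCH_CACHE_LINE 128

/* Hypothetical spin lock; aligning the member also pads sizeof() to 128. */
struct sketch_spinlock {
	_Alignas(SKETCH_CACHE_LINE) atomic_flag locked;
};

static_assert(sizeof(struct sketch_spinlock) == SKETCH_CACHE_LINE,
    "one lock per 128-byte block");

/* Put the lock into a known, unlocked state. */
static inline void sketch_lock_init(struct sketch_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Returns true if the lock was acquired. */
static inline bool sketch_trylock(struct sketch_spinlock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Spins on the lock word; this is the variable whose cache line bounces. */
static inline void sketch_lock(struct sketch_spinlock *l)
{
	while (!sketch_trylock(l))
		;
}

static inline void sketch_unlock(struct sketch_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

With this layout, a CPU spinning on one element of an array of such locks does not invalidate the block that holds its neighbor.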
Attilio Rao *nix forums beginner
Joined: 26 Feb 2006
Posts: 17
Posted: Tue Jul 25, 2006 5:04 pm Post subject:
Re: [PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs
Well, I think that the documentation refers in particular to the latter
issue you mentioned (bouncing cache lines around).
Spinning is not really about cache bus traffic (it is more about
datapath latency, in particular).
From this point of view, turnstiles (like sleepqueues) are passed around
between CPUs more than a mutex/rwlock (or a cv), I guess, so I was
thinking that it is better to optimize the turnstile than the real
synchronization primitive itself.
Attilio
Attilio Rao *nix forums beginner
Joined: 26 Feb 2006
Posts: 17
Posted: Wed Jul 26, 2006 6:27 pm Post subject:
Re: [PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs
This is a patch which makes turnstiles/sleepqueues use a UMA zone.
I've tried it on my 6.1R branch and it works quite well, so this HEAD
version might be all right (I haven't tried it yet, so please test):
http://users.gufi.org/~rookie/works/patches/uma_sync.diff
It, obviously, sets the default alignment for i386 to 128 bytes.
Any comments, feedback, or ideas are welcome.
Attilio
PS: I know that I could simplify the *_alloc() and *_free() routines by
implementing init/fini, but it is simpler and more optimized to have
things this way.
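The linked diff is not reproduced here. As a rough user-space analogy only, the sketch below shows what "a zone with an alignment" means: every item handed out has a fixed size and starts on a 128-byte boundary. The sketch_zone_* names are hypothetical and C11 aligned_alloc() stands in for the kernel allocator. In the kernel, the alignment would instead be requested when the zone is created with uma_zcreate(), which, as far as I recall, takes it as a mask (alignment minus one).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for a zone: fixed item size and alignment. */
struct sketch_zone {
	size_t item_size;	/* rounded up to a multiple of align */
	size_t align;		/* power of two, e.g. 128 */
};

/* Describe a zone; the item size is rounded up so items never share a block. */
static inline struct sketch_zone sketch_zone_create(size_t size, size_t align)
{
	struct sketch_zone z;

	assert(align != 0 && (align & (align - 1)) == 0);
	z.align = align;
	z.item_size = (size + align - 1) & ~(align - 1);
	return z;
}

/* Every item returned starts on a z->align boundary (or is NULL on failure). */
static inline void *sketch_zone_alloc(const struct sketch_zone *z)
{
	return aligned_alloc(z->align, z->item_size);
}

static inline void sketch_zone_free(void *item)
{
	free(item);
}
```

Unlike a real zone, this sketch does no caching or per-CPU buckets; it only illustrates the alignment guarantee.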
Attilio Rao *nix forums beginner
Joined: 26 Feb 2006
Posts: 17
Posted: Fri Jul 28, 2006 5:04 pm Post subject:
Re: [PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs
After some thinking, I think it's better to use the init/fini methods
(since they hide the sizeof(struct turnstile) behind the size parameter).
Feedback and comments are welcome:
http://users.gufi.org/~rookie/works/patches/uma_sync_init.diff
Thanks,
Attilio
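To illustrate the remark about the size parameter, here is a hedged user-space sketch in which the allocator passes the item size to init/fini callbacks, so the callbacks never need sizeof(struct ...) themselves. It is loosely modeled on the idea only: all sketch_* names are hypothetical, and for simplicity the callbacks run on every alloc/free, which is not how a real zone schedules its init/fini hooks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback types: the allocator supplies the item size. */
typedef int (*sketch_init_fn)(void *mem, size_t size);
typedef void (*sketch_fini_fn)(void *mem, size_t size);

struct sketch_izone {
	size_t item_size;	/* must be a multiple of align */
	size_t align;		/* power of two */
	sketch_init_fn init;
	sketch_fini_fn fini;
};

/* Hypothetical, heavily simplified turnstile. */
struct sketch_ts {
	void *ts_lockobj;
	int ts_initialized;
};

/* Number of items that have been initialized and not yet finalized. */
static int sketch_live_items;

/* Uses the size passed in instead of sizeof(struct sketch_ts). */
static int sketch_ts_init(void *mem, size_t size)
{
	struct sketch_ts *ts = mem;

	memset(mem, 0, size);
	ts->ts_initialized = 1;
	sketch_live_items++;
	return 0;
}

static void sketch_ts_fini(void *mem, size_t size)
{
	memset(mem, 0, size);
	sketch_live_items--;
}

/* Allocate one aligned item and run init on it; NULL on any failure. */
static void *sketch_izone_alloc(const struct sketch_izone *z)
{
	void *mem;

	assert(z->item_size % z->align == 0);
	mem = aligned_alloc(z->align, z->item_size);
	if (mem != NULL && z->init != NULL && z->init(mem, z->item_size) != 0) {
		free(mem);
		return NULL;
	}
	return mem;
}

/* Run fini on the item, then release it. */
static void sketch_izone_free(const struct sketch_izone *z, void *mem)
{
	if (mem == NULL)
		return;
	if (z->fini != NULL)
		z->fini(mem, z->item_size);
	free(mem);
}
```

A zone for the sketch turnstile could then be described as { 128, 128, sketch_ts_init, sketch_ts_fini }, with the structure size appearing nowhere in the callbacks.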