T2000 performance Vs V240
barts@smaalders.net
*nix forums addict


Joined: 03 Feb 2005
Posts: 74

Posted: Wed Jan 11, 2006 6:06 am    Post subject: Re: T2000 performance Vs V240

Mark wrote:
Quote:
Hi all,

I was wondering if anyone out there has managed to take the T2000
"CoolThreads" servers for a spin yet ? I have requested the free
evaluation that was touted at the end of last year [1], but haven't yet
heard back from a Sun sales bod.

We (me and my co-workers in the kernel group at Sun) have...


Quote:
This year, the company I work for is going to upgrade and expand our
web infrastructure - currently based around V240s as our "workhorse"
web servers. Great boxes, never had a problem with them - but the new
T2000 boxes look very tempting, and compare price-wise : The 6-core
T2000 with 8GB of RAM is around the same as a dual-proc V240 with 8GB,
as an example [2].

Our current V240s are running Apache 1.3.x, and PHP - from what I
gather, it's not just multi-threaded applications that would benefit
from the UltraSPARC T1s, as they present themselves as multiple
physical processors [3]. So a traditional forking webserver such as
Apache 1.3.x would be able to take advantage of this. We also run MySQL
4.x as our database backend - given that this is a multithreaded
application, and already scales very well on our existing 2- and 4-way
boxes, this should also benefit from the new T1 systems.

Has anyone out there had any experience (subjective, or otherwise) with
the new boxes, and how they compare to a V240, running the above
mentioned workloads ?


They scale very well, with lots of memory bandwidth. Individual
CPUs aren't super fast, but there sure are a lot of them,
and memory isn't very far away (~100 ns). Cache-to-cache
transfers are very fast, so context switching, passing data from
thread to thread, etc., is fast. Applications that mix memory
access with some computation just fly; very simple codes that
need lots more integer ops than memory refs aren't as impressive.
Many commercial benchmarks are very fast on this box; it runs
most DBs very well indeed.

Downsides - many applications don't scale well to 24 or
32 cpus; we've used multiple zones to run multiple instances
of an application that didn't otherwise support it to get sufficient
scalability. Don't bother running FP-heavy code; there's a single
FPU on this processor.

Quote:
[3] = side note : I am unsure as to whether this is true for each
individual core, or for each potential "multiple thread". So, would a
6-core system which can run 4 threads per core be viewed as a
6-processor system, or a 24-processor box ?

24-processor. Very nice for doing large parallel builds.

- Bart
Thomas Nau
*nix forums beginner


Joined: 11 Jan 2006
Posts: 3

Posted: Wed Jan 11, 2006 11:46 am    Post subject: Re: T2000 performance Vs V240

Mark <mark.round@gmail.com> wrote:
| Hi all,
|
| I was wondering if anyone out there has managed to take the T2000
| "CoolThreads" servers for a spin yet ? I have requested the free
| evaluation that was touted at the end of last year [1], but haven't yet
| heard back from a Sun sales bod.

We did actually get three of the boxes (8-core version).

| Has anyone out there had any experience (subjective, or otherwise) with
| the new boxes, and how they compare to a V240, running the above
| mentioned workloads ?

First, we noticed that single-thread performance of non-memory-bound
integer code such as encryption is pretty slow. In a test scenario
we used the "john" password guesser. We estimated that a 1GHz T2000
would deliver performance comparable to a 1GHz V240, but actually
we didn't even get as much as a 500MHz V100. It seems that the
1GHz clock is somehow multiplexed between the 4 threads of a core.
Running the same thing 8 times got us perfect scaling.

Result 1: forget single thread performance, think about throughput!


Second thing: you really want to use trapstat to check how the TLBs are
doing. UltraSPARC-IIIi CPUs have much, much larger TLBs, I think
roughly 1000 entries, than the new CPU, which offers about 60 per core
(I guess). The latest OS patches add a number of improvements, like
picking larger page sizes to map text and shared library segments,
but this might not be enough for your app. Use Intimate Shared
Memory for shared memory mapped segments, as it comes with 4 MB pages
by default. I don't know if you could use even larger pages here.
Otherwise, read mpss.so.1(1) and think about putting it into your app's
startup script to make use of 4 MB pages.
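
To put rough numbers on the TLB point, here is a minimal Python sketch of "TLB reach" (entries times page size). The entry counts are the guesses quoted above, not datasheet values, and the 8 KB base page size for Solaris on SPARC is an assumption added for the comparison.

```python
# TLB reach = number of TLB entries x page size: roughly how much memory
# can be touched before TLB misses start to dominate.
# Entry counts are the rough guesses from the post, not datasheet values.
KIB = 1024
MIB = 1024 * KIB


def tlb_reach(entries, page_size):
    """Bytes of address space covered by a fully populated TLB."""
    return entries * page_size


GUESSED_ENTRIES = {"UltraSPARC-IIIi": 1000, "UltraSPARC T1 (per core)": 60}
PAGE_SIZES = {"8 KB": 8 * KIB, "4 MB": 4 * MIB}  # 8 KB base page is assumed

for cpu, entries in GUESSED_ENTRIES.items():
    for label, size in PAGE_SIZES.items():
        reach_mb = tlb_reach(entries, size) / MIB
        print(f"{cpu}: {entries} entries x {label} pages = {reach_mb:.1f} MB")
```

With these guessed figures, 60 entries of 8 KB pages cover less than half a megabyte, while the same 60 entries of 4 MB pages cover 240 MB, which is why the larger pages matter so much more on this CPU.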

| [3] = side note : I am unsure as to whether this is true for each
| individual core, or for each potential "multiple thread". So, would a
| 6-core system which can run 4 threads per core be viewed as a
| 6-processor system, or a 24-processor box ?

It shows up as a 24 CPU system.
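
The arithmetic behind that, as a small Python sketch: the OS counts one CPU per hardware thread, so the 6-core and 8-core boxes show up as 24 and 32 CPUs. The function name is made up for illustration.

```python
import os


def logical_cpus(cores, threads_per_core=4):
    """CPUs the OS reports when every hardware thread counts as one CPU."""
    return cores * threads_per_core


print("6-core T2000:", logical_cpus(6), "CPUs")
print("8-core T2000:", logical_cpus(8), "CPUs")

# On any machine, this is the count the OS itself reports to Python.
print("this machine:", os.cpu_count(), "CPUs")
```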


If your code fits into the scheme, as it mostly does for us,
the boxes are a great choice, but don't let marketing fool you into
believing you have 24/32 1GHz CPUs in a single box.

Hope this helps,
Thomas

-----------------------------------------------------------------
GPG fingerprint: B1 EE D2 39 2C 82 26 DA A5 4D E0 50 35 75 9E ED
Casper H.S. Dik
*nix forums Guru


Joined: 20 Feb 2005
Posts: 1634

Posted: Wed Jan 11, 2006 12:02 pm    Post subject: Re: T2000 performance Vs V240

Thomas Nau <Thomas.Nau@kiz.uni-ulm.de> writes:

Quote:
If your code fits into the scheme, as it mostly does for us,
the boxes are a great choice, but don't let marketing fool you into
believing you have 24/32 1GHz CPUs in a single box.

Not even marketing claims that.

Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
Jim Prescott
*nix forums beginner


Joined: 26 May 2005
Posts: 28

Posted: Wed Jan 11, 2006 8:11 pm    Post subject: Re: T2000 performance Vs V240

In article <1136901602.522552.310770@g43g2000cwa.googlegroups.com>,
Mark <mark.round@gmail.com> wrote:
Quote:
Has anyone out there had any experience (subjective, or otherwise) with
the new boxes, and how they compare to a V240, running the above
mentioned workloads ?

Does anyone know if Sun will be releasing SPEC CPU2000 results for
the T2000/T1000? In particular, SPECint_rate2000 would seem to be
very interesting to those wondering how it compares against Sun's
other MP systems.

Even the single-thread numbers would be helpful. Just because they
don't excel at a particular task doesn't mean customers don't want
to try and calculate price/performance info for that task.
--
Jim Prescott - Computing and Networking Group jgp@seas.rochester.edu
School of Engineering and Applied Sciences, University of Rochester, NY
Darren Dunham
*nix forums Guru


Joined: 22 Feb 2005
Posts: 1120

Posted: Wed Jan 11, 2006 8:31 pm    Post subject: Re: T2000 performance Vs V240

Mark <mark.round@gmail.com> wrote:
Quote:
Hi all,

I was wondering if anyone out there has managed to take the T2000
"CoolThreads" servers for a spin yet ? I have requested the free
evaluation that was touted at the end of last year [1], but haven't yet
heard back from a Sun sales bod.

There are a few benchmarks in this blog with a T2000 and a few other servers...
http://blogs.sun.com/roller/page/mrbenchmark/

--
Darren Dunham ddunham@taos.com
Senior Technical Consultant TAOS http://www.taos.com/
Got some Dr Pepper? San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
Mark Round
*nix forums addict


Joined: 24 Feb 2005
Posts: 86

Posted: Thu Jan 12, 2006 2:08 pm    Post subject: Re: T2000 performance Vs V240

Interesting. We're also looking at the T1000 servers - the disk is
obviously a limiting factor in those boxes, but we are considering
booting them from a SAN anyway, which would solve that issue.

Aside from disks and expansion capabilities, are there any other
differences between the T1000 and T2000 (memory architecture etc) ?

-Mark
Logan Shaw
*nix forums Guru


Joined: 21 Feb 2005
Posts: 474

Posted: Fri Jan 13, 2006 1:23 am    Post subject: Re: T2000 performance Vs V240

Thomas Nau wrote:
Quote:
First, we noticed that single-thread performance of non-memory-bound
integer code such as encryption is pretty slow. In a test scenario
we used the "john" password guesser. We estimated that a 1GHz T2000
would deliver performance comparable to a 1GHz V240, but actually
we didn't even get as much as a 500MHz V100. It seems that the
1GHz clock is somehow multiplexed between the 4 threads of a core.
Running the same thing 8 times got us perfect scaling.

What I understand from the whitepaper (or whatever it was I read)
is that each core has four hardware thread contexts, and it flips
through them (1,2,3,4,1,2,3,4,1,2,3,4) on each cycle so that the
pipeline has a mix of instructions from various threads on it all
simultaneously.

But, there is a bit for each context, so that if the context doesn't
have a runnable thread on it, it is skipped. Therefore, you could
have 1,2,4,1,2,4,1,2,4,1,2,4 or even 1,1,1,1,1,1 as well.
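
The selection rule described above can be sketched in a few lines of Python. This only models the description in this post (round-robin over four contexts, skipping any that has no runnable thread); it is not taken from Sun's documentation, and the function name is invented for illustration.

```python
def issue_order(runnable, cycles):
    """Context picked on each cycle: round-robin, skipping idle contexts.

    runnable -- list of booleans, one per hardware thread context
    cycles   -- number of cycles to simulate
    Contexts are numbered from 1, as in the post. None means an idle cycle.
    """
    n = len(runnable)
    order = []
    last = -1  # index of the context that issued most recently
    for _ in range(cycles):
        for step in range(1, n + 1):
            candidate = (last + step) % n
            if runnable[candidate]:
                order.append(candidate + 1)
                last = candidate
                break
        else:
            order.append(None)  # nothing runnable on this core
    return order


print(issue_order([True, True, True, True], 8))     # [1, 2, 3, 4, 1, 2, 3, 4]
print(issue_order([True, True, False, True], 6))    # [1, 2, 4, 1, 2, 4]
print(issue_order([True, False, False, False], 4))  # [1, 1, 1, 1]
```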

Just how well the core is set up to handle the case of 1,1,1,1,1,1
I don't know. One of the things about any pipeline is that it can
have hazards which cause delays. For example, if you add multiple
numbers and store the result in a register, and then the next
instruction tests the result of that (say, to see if it's zero),
the second instruction has to wait for the first instruction to
finish computing the result. In contrast, there is no hazard if
the two instructions are computing things which aren't related to
each other.

This is where I'm starting to get out of my depth, but I think
there may be some leeway that you have when designing a processor
to put in extra circuitry that gets around hazard problems and
reduces delays due to them. But if Sun is going on the assumption
that there will be lots of runnable threads, it may not have been
worth it to try to optimize for this: if I understand right, a
big part of the point of the continuous flipping of threads thing
is that different threads are by definition different contexts,
and an instruction from thread 1 obviously doesn't care anything
about what's in some register in thread 2's context. Which means
that hazards are a non-issue when continuously flipping through
multiple threads, or at least a much smaller issue than they would
be otherwise.
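
A toy Python model of that reasoning, under loud assumptions: every instruction depends on the previous one in its own thread, a result takes a fixed number of cycles to become available (the value 4 is made up), and the core issues at most one instruction per cycle. It illustrates the argument only and is not a description of the real Niagara pipeline.

```python
def instructions_issued(threads, result_latency, cycles):
    """Count instructions issued by one single-issue core.

    Each thread runs a chain of dependent instructions, so a thread
    cannot issue again until result_latency cycles after its last issue.
    Ready threads are picked round-robin.
    """
    ready_at = [0] * threads  # earliest cycle each thread may issue again
    issued = 0
    last = -1
    for cycle in range(cycles):
        for step in range(1, threads + 1):
            t = (last + step) % threads
            if ready_at[t] <= cycle:
                ready_at[t] = cycle + result_latency
                issued += 1
                last = t
                break
    return issued


LATENCY = 4  # hypothetical cycles before a dependent instruction can issue
for n in (1, 2, 4):
    print(n, "thread(s):", instructions_issued(n, LATENCY, 16), "issued in 16 cycles")
```

In this model one thread leaves three of every four cycles empty, while four threads fill every cycle: the hazard disappears simply because the next instruction always comes from a different thread.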

Anyway, in case I'm actually out in left field here, I'll stop and
just say that I wouldn't be surprised at all to hear that Sun has
designed each core to have the best throughput when it has several
runnable threads on the one core, and I wouldn't be at all surprised
to hear that they compromised a core's throughput in the case of only
one runnable thread in order to achieve this. Which actually makes
perfect sense if you are trying to achieve throughput on certain
kinds of workloads.

- Logan
Logan Shaw
*nix forums Guru


Joined: 21 Feb 2005
Posts: 474

Posted: Fri Jan 13, 2006 1:28 am    Post subject: Re: T2000 performance Vs V240

Logan Shaw wrote:
Quote:
Just how well the core is set up to handle the case of 1,1,1,1,1,1
I don't know. One of the things about any pipeline is that it can
have hazards which cause delays. For example, if you add multiple
numbers and store the result in a register, and then the next
instruction tests the result of that (say, to see if it's zero),
the second instruction has to wait for the first instruction to
finish computing the result. In contrast, there is no hazard if
the two instructions are computing things which aren't related to
each other.

Oh, I posted before I remembered that I was going to ask: if I am
on the right track here, is it possible that a compiler that targets
Niagara specifically could schedule instructions for its pipeline
and get a significant speedup by doing that? It seems like the
scheduling that's best for Niagara might actually be a bit different
than the scheduling that's best for most SPARC processors made in
the last 5 or even 10 years, if Niagara is a leaner, simpler core
that isn't so focused on doing fancy stuff to optimize a single
instruction stream.

- Logan
barts@smaalders.net
*nix forums addict


Joined: 03 Feb 2005
Posts: 74

Posted: Fri Jan 13, 2006 6:24 am    Post subject: Re: T2000 performance Vs V240

Actually, the Niagara core is relatively simple. Fancy instruction
scheduling is not needed or helpful. If a core has 3 memory-bound
threads and 1 thread trying to count to infinity, the latter will
get almost all of the execution time on the CPU. This makes
capacity planning a little more challenging than normal,
since a single CPU-bound thread can soak up all the available
cycles. However, this really doesn't reflect most normal workloads,
so most server apps scale pretty linearly.
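
A rough Python sketch of that 3+1 case, with made-up numbers: a memory-bound thread is assumed to stall for 100 cycles after each instruction (about 100 ns at 1GHz), and the counting thread is ready again on the very next cycle. The core picks among ready threads round-robin, one instruction per cycle.

```python
def issue_counts(stall_cycles, cycles):
    """Instructions issued per thread on one single-issue core.

    stall_cycles[t] is how long thread t is unavailable after it issues.
    Ready threads are served round-robin, one instruction per cycle.
    """
    n = len(stall_cycles)
    ready_at = [0] * n
    counts = [0] * n
    last = -1
    for cycle in range(cycles):
        for step in range(1, n + 1):
            t = (last + step) % n
            if ready_at[t] <= cycle:
                ready_at[t] = cycle + stall_cycles[t]
                counts[t] += 1
                last = t
                break
    return counts


# Three memory-bound threads (assumed 100-cycle stalls) and one CPU-bound thread.
counts = issue_counts([100, 100, 100, 1], 1000)
share = counts[3] / sum(counts)
print(counts, f"CPU-bound share: {share:.0%}")
```

Under these assumptions the counting thread ends up with about 97% of the issue slots, which is the capacity-planning wrinkle described above.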

- Bart
Thomas Nau
*nix forums beginner


Joined: 11 Jan 2006
Posts: 3

Posted: Fri Jan 13, 2006 10:21 am    Post subject: Re: T2000 performance Vs V240

Logan Shaw <lshaw-usenet@austin.rr.com> wrote:
| Logan Shaw wrote:
| > Just how well the core is set up to handle the case of 1,1,1,1,1,1
| > I don't know. One of the things about any pipeline is that it can
| > have hazards which cause delays. For example, if you add multiple
| > numbers and store the result in a register, and then the next
| > instruction tests the result of that (say, to see if it's zero),
| > the second instruction has to wait for the first instruction to
| > finish computing the result. In contrast, there is no hazard if
| > the two instructions are computing things which aren't related to
| > each other.
|
| Oh, I posted before I remembered that I was going to ask: if I am
| on the right track here, is it possible that a compiler that targets
| Niagara specifically could schedule instructions for its pipeline
| and get a significant speedup by doing that? It seems like the
| scheduling that's best for Niagara might actually be a bit different
| than the scheduling that's best for most SPARC processors made in
| the last 5 or even 10 years, if Niagara is a leaner, simpler core
| that isn't so focused on doing fancy stuff to optimize a single
| instruction stream.

Thanks for the in-depth information you shared with us. It makes
perfect sense in my (stupid) test scenario. Nevertheless, using
the latest Studio 11 compilers with the appropriate target setup
for the UltraSPARC T1 processor didn't change any of the numbers
we got.

Thomas

-----------------------------------------------------------------
GPG fingerprint: B1 EE D2 39 2C 82 26 DA A5 4D E0 50 35 75 9E ED
Seongbae Park
*nix forums Guru Wannabe


Joined: 28 Feb 2005
Posts: 102

Posted: Fri Jan 13, 2006 7:42 pm    Post subject: Re: T2000 performance Vs V240

Logan Shaw <lshaw-usenet@austin.rr.com> wrote:
...
Quote:
Oh, I posted before I remembered that I was going to ask: if I am
on the right track here, is it possible that a compiler that targets
Niagara specifically could schedule instructions for its pipeline
and get a significant speedup by doing that? It seems like the

No. US T1 is an in-order pipeline with all long-latency instructions
blocking the issue of the strand (the pipeline switches to a different strand;
hence, from the switched-out strand's point of view, it's effectively blocking).
So there's nothing the compiler can do in terms of instruction scheduling.

There are a couple of things Studio 11 does for US T1,
but those are mostly about instruction selection (which instructions to use)
and tuning down aggressive optimizations
(e.g. code size is considered much more important on US T1,
so almost no pipelining or unrolling,
except when they help reduce the code size).

Thomas Nau <Thomas.Nau@kiz.uni-ulm.de> wrote:
....
Quote:
perfect sense in my (stupid) test scenario. Nevertheless, using
the latest Studio 11 compilers with the appropriate target setup
for the UltraSPARC T processor didn't change any of the numbers
we got

Even though you may not see any gain with -xchip=ultraT1,
you may still want to use the flag if you intend to run your program
*mostly* on US T1 because:
1) You may not know when you'll be hit by the instruction selection issue later.
e.g. normally the Studio compiler uses the sqrt instruction instead of calling a library routine,
but on US T1 it doesn't do that because the sqrt instruction is slower
than a library call (which uses only integer operations on US T1).
2) It does some extra stuff that will help on Niagara2
(which doesn't make any difference on US T1).
The difference won't be huge, but it will be noticeable.
--
#pragma ident "Seongbae Park, compiler, http://blogs.sun.com/seongbae/"
Jaime Cardoso
*nix forums beginner


Joined: 09 Aug 2005
Posts: 33

Posted: Sun Jan 15, 2006 12:25 am    Post subject: Re: T2000 performance Vs V240

Mark wrote:
Quote:
Hi all,

I was wondering if anyone out there has managed to take the T2000
"CoolThreads" servers for a spin yet ? I have requested the free
evaluation that was touted at the end of last year [1], but haven't yet
heard back from a Sun sales bod.

This year, the company I work for is going to upgrade and expand our
web infrastructure - currently based around V240s as our "workhorse"
web servers. Great boxes, never had a problem with them - but the new
T2000 boxes look very tempting, and compare price-wise : The 6-core
T2000 with 8GB of RAM is around the same as a dual-proc V240 with 8GB,
as an example [2].

Our current V240s are running Apache 1.3.x, and PHP - from what I
gather, it's not just multi-threaded applications that would benefit
from the UltraSPARC T1s, as they present themselves as multiple
physical processors [3]. So a traditional forking webserver such as
Apache 1.3.x would be able to take advantage of this. We also run MySQL
4.x as our database backend - given that this is a multithreaded
application, and already scales very well on our existing 2- and 4-way
boxes, this should also benefit from the new T1 systems.

Has anyone out there had any experience (subjective, or otherwise) with
the new boxes, and how they compare to a V240, running the above
mentioned workloads ?

Thanks,

-Mark

[1] = http://blogs.sun.com/roller/page/jonathan/20051218

[2] = Well, they did anyway. I can't currently get a quote on the V240s
now from Sun's UK catalogue - are they being replaced, or phased out ?!

[3] = side note : I am unsure as to whether this is true for each
individual core, or for each potential "multiple thread". So, would a
6-core system which can run 4 threads per core be viewed as a
6-processor system, or a 24-processor box ?



Well, the discussion went down another path and I think most of your
questions were left unanswered.

- The V240 is still being sold (and it will keep being sold for some
time): first, because not all customers can or want to move to Solaris 10 and,
second, because there are some tasks that perform better on a V240
than on a T1 box.

- The tests I did with the T2000 were very quick (and with Java Web
Server), so all I did was increase the number of listeners, but I got
over 4 times the performance of a V240.
I would expect you to easily double the load you can manage, but be
careful because these machines usually have higher RAM demands to
"feed" all the CPU strands.

If memory serves me, Apache (by default) starts with 5 processes. I
would change that value to 10 (or 2 times what you have right now) and
keep adding processes in increments of 5 until an optimum number is
reached.
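
The "add 5 at a time until it stops helping" procedure could be sketched like this in Python. The measure callback is hypothetical: in practice it would run a load test against the server at the given process count and return requests per second. The synthetic curve at the bottom exists only to make the sketch runnable.

```python
def tune_process_count(measure, start=10, step=5, limit=200):
    """Raise the process count by `step` while throughput keeps improving.

    measure -- callable taking a process count and returning throughput
               (hypothetical; e.g. requests/second from a load test)
    Returns the last count that still improved on the previous best.
    """
    best_count = start
    best_throughput = measure(start)
    count = start + step
    while count <= limit:
        throughput = measure(count)
        if throughput <= best_throughput:
            break  # no further gain, keep the previous setting
        best_count, best_throughput = count, throughput
        count += step
    return best_count


def fake_load_test(processes):
    """Synthetic throughput curve peaking at 35 processes (made up)."""
    return 1000 - (processes - 35) ** 2


print("suggested process count:", tune_process_count(fake_load_test))
```

On Apache 1.3 the settings to adjust this way would be StartServers and MaxClients. Watch memory use at each step as well, given the RAM caveat above.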
Robert Milkowski
*nix forums addict


Joined: 24 Feb 2005
Posts: 96

Posted: Tue Feb 07, 2006 7:46 pm    Post subject: Re: T2000 performance Vs V240

Mark <mark.round@gmail.com> wrote:
Quote:
Hi all,

Has anyone out there had any experience (subjective, or otherwise) with
the new boxes, and how they compare to a V240, running the above
mentioned workloads ?

Different workload here, but I find it interesting.

We compared a T2000 (8x, 1GHz) to a V240 (2x 1.5GHz) with LDAP
on a quite large database in production.

Right now the T2000 handles 2x more requests (in production) than the V240,
and it looks like there's a lot of spare CPU still available!


I would bet that the T2000 would be much faster at web serving than the V240.

--
Robert Milkowski
rmilkowskiASFSDF@wp-sa.pl
http://milek.blogspot.com
Robert Milkowski
*nix forums addict


Joined: 24 Feb 2005
Posts: 96

Posted: Mon Feb 20, 2006 5:51 pm    Post subject: Re: T2000 performance Vs V240

Mark <mark.round@gmail.com> wrote:
Quote:
Has anyone out there had any experience (subjective, or otherwise) with
the new boxes, and how they compare to a V240, running the above
mentioned workloads ?

http://milek.blogspot.com/2006/02/t2000-real-web-performance.html

I would say that the V240 wouldn't stand a chance.

--
Robert Milkowski
rmilkowskiCASCA@wp-sa.pl
http://milek.blogspot.com
Glenn
*nix forums beginner


Joined: 20 Apr 2006
Posts: 2

Posted: Thu Apr 20, 2006 4:53 am    Post subject: Re: T2000 performance Vs V240

Hi,

the company I work for is about to replace a series of V240s with T2000s;
this will all be done by June 30.


Currently we are in the dev and testing phases and all looks good; they seem
to be all they are hyped up to be.

Regards,
glenn

Mark wrote:
Quote:
Hi all,

I was wondering if anyone out there has managed to take the T2000
"CoolThreads" servers for a spin yet ? I have requested the free
evaluation that was touted at the end of last year [1], but haven't yet
heard back from a Sun sales bod.

This year, the company I work for is going to upgrade and expand our
web infrastructure - currently based around V240s as our "workhorse"
web servers. Great boxes, never had a problem with them - but the new
T2000 boxes look very tempting, and compare price-wise : The 6-core
T2000 with 8GB of RAM is around the same as a dual-proc V240 with 8GB,
as an example [2].

Our current V240s are running Apache 1.3.x, and PHP - from what I
gather, it's not just multi-threaded applications that would benefit
from the UltraSPARC T1s, as they present themselves as multiple
physical processors [3]. So a traditional forking webserver such as
Apache 1.3.x would be able to take advantage of this. We also run MySQL
4.x as our database backend - given that this is a multithreaded
application, and already scales very well on our existing 2- and 4-way
boxes, this should also benefit from the new T1 systems.

Has anyone out there had any experience (subjective, or otherwise) with
the new boxes, and how they compare to a V240, running the above
mentioned workloads ?

Thanks,

-Mark

[1] = http://blogs.sun.com/roller/page/jonathan/20051218

[2] = Well, they did anyway. I can't currently get a quote on the V240s
now from Sun's UK catalogue - are they being replaced, or phased out ?!

[3] = side note : I am unsure as to whether this is true for each
individual core, or for each potential "multiple thread". So, would a
6-core system which can run 4 threads per core be viewed as a
6-processor system, or a 24-processor box ?