 K8L(K10) details :)

TSgOJDO
post Feb 12 2007, 01:12 AM, updated 19y ago

New Member
*
Junior Member
13 posts

Joined: Feb 2007


user posted image

K8L(K10) Rev. B (approx 300mm^2) LARGE DIE SHOT

Quad-core
- Native quad-core design
- Redesigned and improved crossbar(northbridge)
- Improved power management
- New level of cache added, L3 VICTIM
Power management - DICE(Dynamic Independent Core Engagement)
- Supports separate CPU core and memory controller power planes to allow CPU to lower its power state while the memory controller is running full bore
- Enhanced AMD's PowerNow - allows individual core frequencies to lower while other cores may be running full bore
- Power management state invariant time stamp counter (TSC)
Virtualization improvements
- Nested Paging(NP):
* Guest and Host page tables both exist in memory.(The CPU walks both page tables)
* Nested walk can have up to 24 memory accesses! (Hardware caching accelerates the walk)
* "Wire-to-wire" translations are cached in TLBs
* NP eliminates Hypervisor cycles spent managing shadow pages(As much as 75% Hypervisor time)
- Reduced world-switch time by 25%:
* World-switch time: round-trip to Hypervisor and back
Dedicated L1 cache
- 256bit 128kB (64kB instruction/64kB data), 2-way associative
- 2 x 128bit loads/cycle
- lowest latency
Dedicated L2 cache
- 128bit 512kB, 16-way associative
- 128bit bus to northbridge
- reduced latency
- eliminates conflicts common in shared caches - better for virtualization
Shared L3 cache
- 128bit 2MB
- Victim-cache architecture maximizes efficiency of cache hierarchy
- Fills from L3 leave likely shared lines in the L3
- Sharing-aware replacement policy
- Expandable
Independent DRAM controllers
- Concurrency
- More DRAM banks reduces page conflicts
- Longer burst length improves command efficiency
- Dual channel unbuffered 1066 support(applies to socket AM2+ and s1207+ QFX only)
- Channel Interleaving
Optimized DRAM paging
- Increase page hits
- Decrease page conflicts
Re-architect northbridge for higher bandwidth
- Increase buffer sizes
- Optimize schedulers
- Ready to support future DRAM technologies
Write bursting
- Minimize Rd/Wr Turnaround
DRAM prefetcher
- Track positive and negative, unit and non-unit strides
- Dedicated buffer for prefetched data
- Aggressively fill idle DRAM cycles
Core prefetchers
- DC Prefetcher fills directly to L1 Cache
- IC Prefetcher more flexible
* 2 outstanding requests to any address
HyperTransport 3
- Up to three 16bit cHT links
- Up to 5200MT/s per link
- Un-ganging mode: each 16bit HT link can be divided into two 8bit virtual links
- Can dynamically adjust frequency and bit width to save power
- AC mode (higher latency mode) to allow longer communications distances
- Hot pluggable
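
Two of the figures above can be sanity-checked with back-of-envelope arithmetic: the 24 memory accesses for a nested walk and the peak bandwidth of a 5200MT/s link. The sketch below assumes 4-level guest and host page tables (ordinary x86-64 long mode with 4K pages, which is an assumption and not something stated on the slides) and that each transfer moves the full link width. The function names are made up for illustration.

```python
def nested_walk_accesses(guest_levels=4, host_levels=4):
    """Worst-case memory accesses for a nested (2D) page walk with no caching.

    Assumption: every guest table pointer is a guest-physical address, so each
    guest level costs one full host walk plus the read of the guest entry.
    The final guest-physical address of the data needs one more host walk.
    """
    per_guest_level = host_levels + 1
    return guest_levels * per_guest_level + host_levels


def ht_link_gbytes_per_s(mega_transfers_per_s, width_bits):
    """Peak one-direction link bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = width_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


print(nested_walk_accesses())           # 24, matching the "up to 24" figure
print(ht_link_gbytes_per_s(5200, 16))   # full 16bit link, one direction
print(ht_link_gbytes_per_s(5200, 8))    # one un-ganged 8bit virtual link
```

With those assumptions a 16bit link peaks at roughly 10.4 GB/s per direction, and an un-ganged 8bit link at half of that.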

K8L(K10) pipeline: user posted image

CPU Core IPC Enhancements:
Advanced branch prediction
- Dedicated 512-entry Indirect Predictor
- Double return stack size
- More branch history bits and improved branch hashing
History-based pattern predictor
32B instruction fetch
- Benefits integer code too
- Reduced split-fetch instruction cases
Sideband Stack Optimizer
- Perform stack adjustments for PUSH/POP operations "on the side"
- Stack adjustments don't occupy functional unit bandwidth
- Breaks serial dependence chains for consecutive PUSH/POPs
Out-of-order load execution
- New technology allows load instructions to bypass:
* Other loads
* Other stores which are known not to alias with the load
- Significantly mitigates L2 cache latency
TLB Optimisations
- Support for 1G pages
- 48bit physical address (256TB)
- Larger TLBs key for:
* Virtualized workloads
* Large-footprint databases and
* transaction processing
- DTLB:
* Fully-associative 48-way TLB (4K, 2M, 1G)
* Backed by L2 TLBs: 512 x 4K, 128 x 2M
- ITLB:
* 16 x 2M entries
Data-dependent divide latency
Additional fastpath instructions
- CALL and RET-Imm instructions
- Data movement between FP & INT
Bit Manipulation extensions
- LZCNT/POPCNT
SSE extensions
- EXTRQ/INSERTQ (SSE4A)
- MOVNTSD/MOVNTSS (SSE4A)
- MWAIT/MONITOR (SSE3)
Comprehensive Upgrades for SSE
- Dual 128-bit SSE dataflow
- Up to 4 double-precision FP OPS/cycle
- Dual 128-bit loads per cycle
- New vector code, SSE128
- Can perform SSE MOVs in the FP "store" pipe
- Execute two generic SSE ops + SSE MOV each cycle (+ two 128-bit SSE loads)
- FP Scheduler can hold 36 Dedicated x 128-bit ops
- SSE Unaligned Load-Execute mode:
* Remove alignment requirements for SSE ld-op instructions
* Eliminate awkward pairs of separate load and compute instructions
* To improve instruction packing and decoding efficiency
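
A few of the numbers in this list follow directly from the others, and a short sketch makes the relationships explicit: the 256TB figure from 48 address bits, how much memory each TLB level can map (its "reach"), and the peak FLOPS implied by 4 double-precision ops per cycle. The helper names are hypothetical, and the 2.5GHz clock is only an example value, not a confirmed spec.

```python
KIB, MIB, GIB, TIB = 2**10, 2**20, 2**30, 2**40


def addressable_tib(address_bits):
    """Size of the address space in TiB for a given address width."""
    return 2**address_bits // TIB


def tlb_reach_bytes(entries, page_bytes):
    """Memory covered when every TLB entry maps a page of the given size."""
    return entries * page_bytes


def peak_gflops(flops_per_cycle, clock_ghz, cores=1):
    """Theoretical peak only; real code rarely sustains this."""
    return flops_per_cycle * clock_ghz * cores


print(addressable_tib(48))                      # 256 (the "256TB" in the list)
print(tlb_reach_bytes(48, 4 * KIB) // KIB)      # L1 DTLB with 4K pages, in KiB
print(tlb_reach_bytes(512, 4 * KIB) // MIB)     # L2 TLB, 4K entries, in MiB
print(tlb_reach_bytes(128, 2 * MIB) // MIB)     # L2 TLB, 2M entries, in MiB
print(peak_gflops(4, 2.5, cores=4))             # quad-core at an assumed 2.5GHz
```

The reach numbers show why large pages matter for the workloads listed: 48 entries cover only 192 KiB with 4K pages, but 48 GiB with 1G pages.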

This post has been edited by gOJDO: Feb 12 2007, 01:17 AM
ikanayam
post Feb 12 2007, 01:31 AM

there are no pacts between fish and men
********
Senior Member
10,544 posts

Joined: Jan 2003
From: GMT +8:00

Hurray for reposts and copypastas. Sauce pls!
mdzaboy
post Feb 12 2007, 01:32 AM

CuChee MunK KuK
*******
Senior Member
2,061 posts

Joined: Jan 2003
From: Jabaronie to Astaka Status: のあ..?



warghh... plagiarism...
sniper69
post Feb 12 2007, 01:34 AM

.: One Shot One Kill :. .+|Level 9 Type Shit|+.
*******
Senior Member
7,173 posts

Joined: Jan 2003
From: PCH


Putting aside the plagiarism/copy-paste thing...

Whoa... what a nice K8L... Anyway, TS should post the source... otherwise, it's just a rumour.
TSgOJDO
post Feb 12 2007, 01:54 AM

New Member
*
Junior Member
13 posts

Joined: Feb 2007


@ikanayam
Thank you. I made the list, and most (95%) of the info is from AMD presentations (most of it from Ben Sander's 10/10/2006 AMD FPF presentation).

ikanayam
post Feb 12 2007, 02:08 AM

there are no pacts between fish and men
********
Senior Member
10,544 posts

Joined: Jan 2003
From: GMT +8:00

Yes, and 95% of the info is not new, although it is nicely compiled here.

Link to the presentation slides?
TSgOJDO
post Feb 12 2007, 02:13 AM

New Member
*
Junior Member
13 posts

Joined: Feb 2007


You can't find the whole presentation online, but here are the slides I have:
[10 user posted images: presentation slides]

ikanayam
post Feb 12 2007, 02:14 AM

there are no pacts between fish and men
********
Senior Member
10,544 posts

Joined: Jan 2003
From: GMT +8:00

Thank you sir
TSgOJDO
post Feb 12 2007, 02:14 AM

New Member
*
Junior Member
13 posts

Joined: Feb 2007


[3 user posted images]
jcliew
post Feb 12 2007, 07:55 AM

Retired Enthusiast
*****
Senior Member
889 posts

Joined: Oct 2006
From: Johor Bahru


Oh my goodness...
HyperTransport 3 technology? Native quad-core design?
Sounds good! Can't wait to see real hardware being tested...
komag
post Feb 12 2007, 08:20 AM

Casual
***
Junior Member
300 posts

Joined: Mar 2005


What's the processor's clock speed in GHz?
charge-n-go
post Feb 12 2007, 09:42 AM

Look at all my stars!!
*******
Senior Member
4,060 posts

Joined: Jan 2003
From: Penang / PJ

On average, the clock speed should be slightly higher than K8's, according to AMD. Of course, the first silicon on the market will be slightly slower than the fastest Brisbane on the market.
Anyway, who cares about clock speed nowadays? As long as the CPU is faster than its older generation, clock doesn't matter.
elico
post Feb 12 2007, 10:01 AM

Enthusiast
*****
Senior Member
749 posts

Joined: Oct 2006
Agree with charge-n-go... a faster clock speed sometimes doesn't mean better performance...

Intel is coming up with 45nm, while AMD is producing its true quad-core on a single die... who will reign supreme in the next match? Can't wait for the showdown...
arjuna_mfna
post Feb 12 2007, 02:21 PM

**Towards Justice World**
******
Senior Member
1,496 posts

Joined: Jan 2006
From: Baling, Kedah



45nm from Intel... this proc should be able to OC higher than the current C2D based on 65nm... Intel has already pwned AMD, and with the new 45nm parts the C2D/C2Q will clock higher and perform better... but we still don't know whether AMD's new proc will be able to beat the C2D architecture.
edwin3210
post Feb 12 2007, 04:06 PM

lll
*****
Senior Member
808 posts

Joined: Jan 2007
QUOTE(charge-n-go @ Feb 12 2007, 09:42 AM)
On average, the clock speed should be slightly higher than K8's, according to AMD. Of course, the first silicon on the market will be slightly slower than the fastest Brisbane on the market.
Anyway, who cares about clock speed nowadays? As long as the CPU is faster than its older generation, clock doesn't matter.
*
Can't really agree. A 2.6GHz Barcelona vs a 3.6GHz C2Q (Penryn) in the same price range (if that happens), which one will you get? With speed it's always "the more the merrier". You see, there is still a lot of room for Intel to push its clock speed, and Intel has already announced that it will release faster CPUs later this year, especially when Penryn arrives. That is why "how much performance Barcelona gains over the Core architecture, clock for clock" is very important for AMD's market sales.


Added on February 12, 2007, 4:08 pm
By the way, thanks to gOJDO for this nicely compiled information. I wonder, are you the gOJDO from Tom's Hardware? You two seem alike, and the nickname is identical too.

This post has been edited by edwin3210: Feb 12 2007, 04:08 PM
AceCombat
post Feb 12 2007, 04:25 PM


Group Icon
Elite
5,434 posts

Joined: Dec 2006


Is this the 4x4 system that was reviewed by CustomPC in this February's issue?
edwin3210
post Feb 12 2007, 04:57 PM

lll
*****
Senior Member
808 posts

Joined: Jan 2007
QUOTE(AceCombat @ Feb 12 2007, 04:25 PM)
Is this the 4x4 system that was reviewed by CustomPC in this February's issue?
*
4x4 has nothing to do with this. This is a new CPU from AMD.
charge-n-go
post Feb 12 2007, 05:30 PM

Look at all my stars!!
*******
Senior Member
4,060 posts

Joined: Jan 2003
From: Penang / PJ

QUOTE(edwin3210 @ Feb 12 2007, 04:06 PM)
Can't really agree. A 2.6GHz Barcelona vs a 3.6GHz C2Q (Penryn) in the same price range (if that happens), which one will you get? With speed it's always "the more the merrier". You see, there is still a lot of room for Intel to push its clock speed, and Intel has already announced that it will release faster CPUs later this year, especially when Penryn arrives. That is why "how much performance Barcelona gains over the Core architecture, clock for clock" is very important for AMD's market sales.
*
No, I wasn't referring to any Intel CPU here. I was just comparing K8L and K8 in my previous post. Besides, I mentioned that Barcelona will be running slightly faster than Brisbane, but the initial release might be slower than the fastest Brisbane. <-- Hopefully you get my meaning here.

For the bolded part, if a lower clock can outperform a higher clock, is clock speed really that important? Friend, processor speed is affected by many factors; clock is just one of them.
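
To put rough numbers on that point: if throughput is modelled as IPC times clock (a deliberate simplification that ignores memory, cache, and workload effects), the 2.6GHz vs 3.6GHz example works out as below. The IPC values are placeholders for illustration, not measurements of any real chip.

```python
def relative_performance(ipc, clock_ghz):
    """Toy model: work done per nanosecond = instructions/cycle * cycles/ns."""
    return ipc * clock_ghz


def break_even_ipc_ratio(slow_clock_ghz, fast_clock_ghz):
    """How much higher the slower chip's IPC must be just to tie."""
    return fast_clock_ghz / slow_clock_ghz


ratio = break_even_ipc_ratio(2.6, 3.6)
print(f"{ratio:.2f}")  # 1.38

# Hypothetical IPC figures, normalised so the 3.6GHz chip has IPC 1.0.
print(relative_performance(1.3, 2.6) < relative_performance(1.0, 3.6))  # True
print(relative_performance(1.4, 2.6) > relative_performance(1.0, 3.6))  # True
```

So in this toy model a 2.6GHz part needs roughly 38% more work per clock to match a 3.6GHz part, which is why the clock-for-clock comparison matters as much as the headline GHz.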
TSgOJDO
post Feb 12 2007, 06:19 PM

New Member
*
Junior Member
13 posts

Joined: Feb 2007


Barcelona is for DP/MP servers, Brisbane is for desktop, and the K8L variants for desktop/workstation wouldn't be available this year. The sAM2 variants, code-named Agena, will be clocked at 2.4GHz and 2.5GHz. The dual-core desktop part, Kuma, will be clocked at 2.1GHz to 2.9GHz. There will be a QuadFX variant (2 quad-cores), known as AgenaFX, with frequencies between 2.7 and 2.9GHz.
So, K8L CPUs will be clocked higher than Brisbane, which at that time will be used for the next-generation Sempron successor, renamed as Rana.

Yorkfield 3.5GHz vs Agena 2.5GHz, will be a clear win for Yorkfield in every application known to mankind.

@edwin Yes, I am the same guy from THG.
Kagaya
post Feb 12 2007, 08:54 PM

Bad-Badtz Maru FREAK !!!
Group Icon
Elite
2,396 posts

Joined: Jan 2003
From: Pandan Perdana, Cheras, KL



Clockspeed matters but not as the sole determining factor of the overall performance benchmark.

The example of comparing a 2.6GHz Barcelona vs a 3.6GHz C2Q (Penryn) is itself a concrete case where he debunks his own statement. It's the Core2 microarchitecture that pwned AMD big time, and not even Barcelona can turn the tables just yet. The speed, well, it's a BONUS. Try downclocking that C2Q to 2.6GHz, and you might still see Core2 having an edge over Barcelona.
