Full Version: :: Hyper Transport Technology ::
charge-n-go
Well, I'm not sure if my understanding of HTT is correct. From what I know, not all the review sites describe HTT on Athlon64 systems the same way. Some say HTT has something to do with memory and some say it doesn't. Even DFI-Street says: "Basically HTT is your memory controller's communication speed". However, I found in the AMD datasheet that HTT is not related to the memory controller at all, as they are 2 independent units.

So I would really appreciate your opinions and thoughts regarding HTT.


Some simple rules for the discussion:

1. When you post any facts, state the source (links or screenshots).
2. NO bashing or flaming.
3. You can compare HTT with FSB.
4. No discussion unrelated to the topic.
5. Please stick to HTT on K8 systems only.
6. Technical discussion is welcome.
charge-n-go
EXPERIMENT DONE, THANKS TO ALL WHO CONTRIBUTED

Now, it's experiment time. We'll test whether the HTT clock has any effect on performance when the CPU and RAM are kept at stock speed.

Please contribute screenshots with the following settings :

TEST 1

1. CPU and RAM are set to 1800MHz and DDR400 respectively, with different HTT values.

2. The settings are :
--> 9.0 x 200 (CPU: 1800MHz, RAM 1:1 @ DDR400)
--> 7.5 x 240 (CPU: 1800MHz, RAM 6:5 @ DDR400)
--> 6.0 x 300 (CPU: 1800MHz, RAM 3:2 @ DDR400)
--> 4.5 x 400 (CPU: 1800MHz, RAM 2:1 @ DDR400)
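A quick sanity check of the Test 1 settings, as a minimal Python sketch. It assumes each ratio is written as base clock : RAM clock; the function and variable names are made up for illustration only.

```python
# Test 1 settings: (CPU multiplier, HTT base clock in MHz, ratio as base:RAM)
SETTINGS = [
    (9.0, 200, (1, 1)),
    (7.5, 240, (6, 5)),
    (6.0, 300, (3, 2)),
    (4.5, 400, (2, 1)),
]

def cpu_clock_mhz(multiplier, base_mhz):
    """CPU core clock = multiplier x base clock."""
    return multiplier * base_mhz

def ram_clock_mhz(base_mhz, ratio):
    """RAM clock from the base clock and a base:RAM ratio (assumed convention)."""
    base_part, ram_part = ratio
    return base_mhz * ram_part / base_part

for mult, base, ratio in SETTINGS:
    cpu = cpu_clock_mhz(mult, base)
    ram = ram_clock_mhz(base, ratio)
    # DDR rating is twice the RAM clock
    print(f"{mult} x {base}: CPU {cpu:.0f}MHz, RAM {ram:.0f}MHz (DDR{2 * ram:.0f})")
```

If the ratio convention above is right, all four lines should report the same CPU and RAM speed, so only the base clock differs between runs.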




TEST 2 (ikanayam's suggestion)

1. CPU, RAM and base HTT speed are set to 1800MHz, DDR400 and 200MHz.

2. The settings are :
--> 9.0 x 200 (HTT multiplier = 3.0x)
--> 9.0 x 200 (HTT multiplier = 4.0x)
--> 9.0 x 200 (HTT multiplier = 5.0x)
--> 9.0 x 200 (HTT multiplier = 6.0x)
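For reference, the HTT link clock each Test 2 setting should give, as a small Python sketch. It assumes link clock = base clock x HTT multiplier; the names are hypothetical.

```python
BASE_MHZ = 200
HTT_MULTIPLIERS = [3.0, 4.0, 5.0, 6.0]

def htt_link_clock_mhz(base_mhz, htt_multiplier):
    """HTT link clock, assuming it is simply base clock x HTT multiplier."""
    return base_mhz * htt_multiplier

for m in HTT_MULTIPLIERS:
    print(f"9.0 x {BASE_MHZ}, HTT {m}x -> link clock {htt_link_clock_mhz(BASE_MHZ, m):.0f}MHz")
```

Here the CPU and RAM stay the same in every run and only the link clock moves, which is the opposite of Test 1.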


*Sandra CPU, Sandra Memory and SuperPI results are compulsory

*CPU-Z (CPU and RAM) readings are compulsory.

* A 400MHz HTT base is not necessary if your board can't do it


fx_53_xt
Well, from what I know, HTT is a bus that connects the processor to the other hardware on the motherboard. HTT is the FSB replacement.

I think the memory unit is a separate thing from HTT, because from what I read on the AMD website, HTT has 8.0GB/s of effective bandwidth while processor-to-memory has an independent 6.4GB/s, which makes a total of 14.4GB/s effective.

HTT is a general high-speed bus developed by the HyperTransport Consortium, so you might find more info about it on the HyperTransport Consortium website.

HTT is not owned by AMD alone; it's developed by the HT consortium SIG, which includes quite a number of manufacturers.
charge-n-go
QUOTE(fx_53_xt @ Sep 20 2005, 04:10 AM)
I think the memory unit is a separate thing from HTT, because from what I read on the AMD website, HTT has 8.0GB/s of effective bandwidth while processor-to-memory has an independent 6.4GB/s, which makes a total of 14.4GB/s effective.

HTT is not owned by AMD alone; it's developed by the HT consortium SIG, which includes quite a number of manufacturers.
*



The memory unit is confirmed to be a separate thing from HTT. See the screenshot in post #2, which is from the AMD Athlon64 datasheet. Please give the link for the facts you claimed. Thx

The current HTT discussion is only about AMD systems, because that will benefit most of the forumers, who are PC users.


QUOTE
Well, from what I know, HTT is a bus that connects the processor to the other hardware on the motherboard. HTT is the FSB replacement.

Yes, that's true. One thing to add: HTT is not an enhancement of the FSB system. From the official HTT website, I found some facts:

HyperTransport chip-to-chip interconnect technology is a highly optimized, high performance and low latency board-level architecture for embedded and open- architecture systems. It provides up to 22.4 Gigabyte/second aggregate CPU to I/O or CPU to CPU bandwidth in a highly efficient chip-to-chip technology that replaces existing complex multi-level buses.

* The Northbridge and Southbridge system that uses FSB is a complex multi-level bus system.

Source
almostthere
Temporarily moved to OC United to gain more feedback. I'll move it back to H/w once you get enough info and opinions
ikanayam
I think this is more of a hardware discussion than it is an OC discussion.

Anyway, an easy way to test whether memory bandwidth is directly tied to HTT bandwidth is to decrease the HTT LDT multiplier to 1x and see if the dual channel memory bandwidth decreases accordingly.

From everything gathered so far I think it's quite clear that HTT is only for I/O other than RAM access. That's what the onboard RAM controller is for in the first place.
antonio
From the way ikanayam describes it, it seems that HTT is the 'other route', the connection to I/O and not to the memory unit itself? My question is: when the HTT is raised, how does it affect the memory controller to give it more grunt? If the memory controller doesn't react to the HTT, then it seems it is on its own no matter how high we clock the HTT... am I right or wrong here? The only thing that matters then is the multiplier, which determines how fast the processor runs and hence the processing speed.

If HTT is in no way connected to the CPU (FSB theory), then raising the HTT will increase the transfer rate between I/O and the CPU and not the RAM? Then how does the RAM connect so that data can be retrieved and processed at the same time?
charge-n-go
user posted image

This is what I got from the AMD spec sheet. I cut and pasted the PLL table from another page to merge it with the block diagram.

The HTT clock_out pin isn't connected to the MEM_clock_in pin at all. The PLL is the clock generator inside the CPU, and it provides the reference clock for the other parts of the CPU. The CPU core clock, HTT clock and memory controller (not RAM) clock are derived from the PLL clock with multipliers. Let's say an Athlon64 3200+ is running at 2.0GHz core and 2.0GHz HTT.

1. The base clock for the PLL is 200MHz, the CPU has a multiplier of 10x and HTT has a multiplier of 5x. That gives a 1000MHz HTT link clock, which is quoted as 2000MHz effective for the single 16-bit link because data is transferred on both clock edges.

2. The memory controller runs at the same clock as the CPU (2000MHz). The RAM is clocked in reference to the memory controller. That's why CPU-Z shows the divider as CPU/10 for DDR400 RAM if you have an Athlon64 3200+.

3. The so-called 'HTT' overclocking on the motherboard is actually raising the PLL clock. Hence increasing the PLL clock affects the HTT, CPU and RAM speeds.

4. Since the HTT bus can't take a very high clock, when you set the PLL to 250MHz it's better to decrease the HTT multiplier to 4x so that it remains at 1000MHz per HTT link (effectively 2000MHz for 16-bit).

5. Basically, keeping the RAM at DDR400 and increasing the PLL (or HTT in the mobo BIOS) does nothing to increase the bandwidth between RAM and CPU. As I've said before, overclocking the PLL (HTT) while all the other parts like the CPU, HTT link and RAM remain at stock speed doesn't bring any benefit, and isn't really overclocking. For example, if you are able to push the PLL up to 400MHz but the HTT multiplier is set to 2.5x and the CPU multiplier to 5x, everything is running at stock speed and it's useless because there is no performance boost.

6. The memory data bus is indicated as MEM_DATA [63:0] in the attached screenshot. It's a 64-bit bus, which means a single channel memory interface. Dual channel would have MEM_DATA [127:0] pins. This clearly shows that data from memory doesn't go through HTT, and the RAM is directly connected to the CPU's memory controller by another dedicated bus.
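To put numbers on the points above, here is a rough Python sketch. It assumes one reference (PLL) clock feeds the CPU multiplier, the HTT multiplier and the memory divider as described; the function name is made up, the x2 on the HTT line assumes data moves on both clock edges, and the 8x CPU multiplier in the 250MHz case is my own pick to hold the CPU at 2000MHz.

```python
def derived_clocks(ref_mhz, cpu_mult, htt_mult, mem_divider):
    """Derive the clocks that hang off the reference (PLL) clock."""
    cpu = ref_mhz * cpu_mult
    htt = ref_mhz * htt_mult
    return {
        "cpu_mhz": cpu,
        "htt_link_mhz": htt,
        "htt_effective_mhz": htt * 2,   # assumed: double data rate link
        "ram_mhz": cpu / mem_divider,   # RAM is clocked off the CPU clock
    }

stock = derived_clocks(200, 10, 5, 10)      # point 1 and 2
raised = derived_clocks(250, 8, 4, 10)      # point 4 (8x CPU multiplier is an assumption)
extreme = derived_clocks(400, 5, 2.5, 10)   # point 5

for name, clocks in (("stock", stock), ("250MHz", raised), ("400MHz", extreme)):
    print(name, clocks)
```

All three settings come out with the same CPU, HTT and RAM clocks, which is the argument in point 5: raising the reference clock alone changes nothing if the multipliers are lowered to compensate.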



more on memory controller divider can be found in Anandtech

more on HTT used in Athlon64 platform can be found in Xbitlabs

Quote from Xbitlabs (Opteron which has three 16-bit HTT links):


QUOTE
Connection between chipset components and the processor or between CPUs in multiprocessor systems is implemented by means of up to three integrated HyperTransport bus controllers (16bit wide with 3.2GB/sec bandwidth each way).

AMD-8151 graphics AGP tunnel is an AGP bus controller, supporting AGP 4x and AGP 8x graphics cards. This chip also features two HyperTransport bus controllers: 16bit input one and 8bit output one. Thanks to that, the AGP tunnel can receive data at the speed of 3.2GB/sec and transfer it further (in AMD-8000 - to the South Bridge) at 0.8GB/sec. The remaining 2.4GB/sec are used for "controller's own purposes". Quite enough for AGP 8x with 2.1GB/sec bandwidth, isn't it?

AMD-8131 PCI-X tunnel, as well as AMD-8151, has two HyperTransport controllers with 16 and 8bit bus widths each way. The bandwidth of the buses is 3.2 and 1.6GB/sec each way respectively. But the "filling" of the chip is different as it has two PCI-X bridges.

AMD-8111 input/output tunnel, unlike AMD-8151 and AMD-8131, only has one 8bit 400MB/sec HyperTransport bus controller. It's supposed to be always at the end of the HyperTransport chain. AMD-8111 supports ordinary 33MHz 32bit PCI 2.2 bus, AC'97 and 10/100 Ethernet interfaces, two USB 2.0 hubs and ATA/133 IDE-controller.
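The per-direction figures in the quote can be reproduced with a simple formula, sketched here in Python. It assumes the link moves one bit per data line on both clock edges; the link clocks used below (800MHz and 200MHz) are assumptions picked so that the results line up with the quoted numbers, they are not stated in the article.

```python
def ht_bandwidth_mb_s(link_clock_mhz, width_bits):
    """Peak bandwidth of one direction of a HyperTransport link in MB/s."""
    transfers_per_us = link_clock_mhz * 2      # double data rate
    return transfers_per_us * width_bits / 8   # bits -> bytes

print(ht_bandwidth_mb_s(800, 16))    # 16-bit link, assumed 800MHz clock
print(ht_bandwidth_mb_s(800, 8))     # 8-bit link, assumed 800MHz clock
print(ht_bandwidth_mb_s(200, 8))     # 8-bit link, assumed 200MHz clock
print(ht_bandwidth_mb_s(1000, 16))   # Athlon64 case: 1000MHz, 16-bit
```

The last line is the 1000MHz 16-bit Athlon64 link discussed earlier, which works out to 4GB/s per direction.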
charge-n-go
QUOTE(almostthere @ Sep 20 2005, 05:05 AM)
Temporarily moved to OC United to gain more feedback. I'll move it back to H/w once you get enough info and opinions
*



Aiks... I need some hardware experts in the hardware section to comment on it technically.


It's because I'm kind of confused by the mobo manufacturers, as they use the term HTT for the PLL clock, so I want to double confirm how HTT actually works. From the info gathered so far, I don't see any relation between the two, and this might prove 2 things:

1. The mobo manufacturers have a different concept of HTT.

2. The mobo manufacturers don't want to confuse overclockers, since we are so used to FSB. Making HTT look like an FSB makes it easier for overclockers to play with the settings.

Well, to overclock a system we don't need to understand how HTT operates. As long as we see a performance boost, it's good enough. However, a freak like me wants to know how HTT works so that I know which settings can be optimized during overclocking
almostthere
My goal here is to increase the understanding of the true nature of HTT technically, as well as the effects that it has on OC'ed systems. However, this is just a trial to see what we can gain by sharing this between the OC guys and the techy guys. It's not a big deal if I move it back to H/w, but I was thinking that maybe it would dispel more myths about HTT, since someone did say the terminology over at DFI-Street was flawed and should be corrected
charge-n-go
Can you please make a link in the hardware section as well? Some techies might not enter the O/C United forum. I need them to correct me if anything is wrong, because I'm not 100% sure about my assumptions. Of course I don't want people here to be misinformed by my assumptions.

Thx.


* edit : I've put a link in AMD vs Intel to this thread.
Evogenix
QUOTE(http://www.devx.com/amd/Article/17437)
The HyperTransport bus has benefits that go beyond the obvious. It uses a "packetized" design, which means that addresses, data, and commands are sent along the same wires, allowing for a much narrower link.

Another problem that the Opteron processor addresses is the relatively slow connection between the processor and supporting circuitry. This connection, commonly called the "front-side bus," is the transport mechanism for all data traveling between the processor and main memory, graphics card, and all types of I/O devices. The front-side bus transfer rate on prior generation AMD processors is on the order of 2.1GB/s—fast, but still capable of being saturated by the demands of a server configured with multi-processors, high-speed network cards, and fast storage devices. So the Opteron processor replaces the front-side bus with a HyperTransport connection that dramatically extends communication bandwidth up to 6.4GB/s.

Application performance is further enhanced by fact that the Opteron processor has a direct connection to main memory—no bus needed. The integration of a memory controller into the processor core significantly reduces memory latency because it alleviates the need for memory transactions to traverse the traditional memory access path through the "Northbridge" chip. The effect of the reduction in memory latency, coupled with the additional increase in memory bandwidth available directly to the processor, cannot be overstated, as it tremendously benefits system performance across all application segments.


user posted image
Figure 1: HyperTransport technology introduces a new bus architecture that offers higher performance and greater flexibility; its packetized approach also simplifies system hardware design. Source: AMD By: devx.com



user posted image
Figure 2: HyperTransport technology can provide a streamlined interconnect for SMP systems, reducing bottlenecks between processors, memory hubs, and memory DRAMs. Source: AMD By: devx.com


main source : http://www.devx.com/amd/Article/17437

Evogenix
winc87
QUOTE(charge-n-go @ Sep 20 2005, 10:46 AM)
2. The memory controller runs at the same clock as the CPU (2000MHz). The RAM is clocked in reference to the memory controller. That's why CPU-Z shows the divider as CPU/10 for DDR400 RAM if you have an Athlon64 3200+.

3. The so-called 'HTT' overclocking on the motherboard is actually raising the PLL clock. Hence increasing the PLL clock affects the HTT, CPU and RAM speeds.

I agree. Someone with a higher CPU clock speed but the same RAM speed as others gets higher memory bandwidth. The CPU clock speed plays an important role in overclocking an AMD64 system.

Please refer to dinster's Sandra Memory Bandwidth thread.
http://forum.lowyat.net/index.php?showtopic=154304&st=0

Here is an article about HTT from wikipedia.
http://en.wikipedia.org/wiki/Hypertransport

QUOTE
HyperTransport is packet-based, with each packet always consisting of a set of 32-bit words, regardless of the physical width of the bus interconnect. The first word in a packet is always a command word. If a packet contains an address, the last 8 bits of the command word are chained with the next 32-bit word to make a 40-bit address. The remaining 32-bit words in a packet are the data payload. Transfers are always padded to a multiple of 32 bits, regardless of their actual length. HyperTransport revision 1.05 contains an option allowing an additional 32-bit control packet to be prepended when 64 bit addressing is required.

Hypertransport packets come out onto the bus in segments known as bit times. How many bit times it takes depends on the width of the bus. HT can be used for generating system management messages, signaling interrupts, issuing probes to adjacent devices or processors, and general I/O and data transactions. There are usually two different kinds of write commands that can be used, posted and non-posted. Posted writes are ones that do not require a response from the target. This is usually used for high bandwidth devices such as UMA traffic or DMA transfers. Non-posted writes require a response from the receiver in the form of a target done. Reads also cause the receiver to generate a read response.
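Going by the quoted description, the number of bit times a packet takes depends only on its size and the link width. A minimal Python sketch, assuming each bit time moves one bit per data line; the function name is hypothetical.

```python
def bit_times(num_words, link_width_bits):
    """Bit times needed to send a packet made of 32-bit words over a link."""
    total_bits = num_words * 32
    # Ceiling division, in case the width does not divide the packet evenly
    return -(-total_bits // link_width_bits)

# Example packet: command word + address word + one data word = 3 words
for width in (2, 8, 16):
    print(f"{width}-bit link: {bit_times(3, width)} bit times")
```

So the same packet takes half as many bit times on a 16-bit link as on an 8-bit one, which is why the link width matters as much as the clock.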

charge-n-go
QUOTE(winc87 @ Sep 20 2005, 09:01 PM)
Someone with a higher CPU clock speed but the same RAM speed as others gets higher memory bandwidth. The CPU clock speed plays an important role in overclocking an AMD64 system.

Actually my AthlonXP system scores better in Sandra too if the CPU clock speed is increased. At 2.0GHz with DDR510, the score is around 150MB/s lower than at 2.4GHz. I guess it's due to faster memory requests from CPU to RAM (in the AXP case).

On Athlon64, memory bandwidth won't increase as the CPU clock increases (if the divider is set correctly to limit the RAM to DDR400). However, access to the memory controller should be faster, because latency = 1 / clock speed.
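As a rough illustration of the latency = 1 / clock speed remark, here is a tiny Python sketch. It only works out the length of one clock cycle of the memory controller (which runs at the CPU clock); it is not a model of full memory latency, and the function name is made up.

```python
def cycle_time_ns(clock_mhz):
    """Length of one clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz

for mhz in (2000, 2400):
    print(f"{mhz}MHz -> {cycle_time_ns(mhz):.3f}ns per cycle")
```

Each controller cycle gets shorter as the CPU clock goes up, even though the RAM itself stays at DDR400.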

FYI, Athlon64's memory divider is a bit weird. Let's say you have an odd CPU clock (e.g. 2222MHz); the RAM clock might not reach DDR400 (200MHz) but only around 193MHz. Here's the scenario:

At 2200MHz, the RAM works at 200MHz (DDR400) when the divider is set to 11x.
At 2222MHz, the RAM would work at 202MHz (DDR404) with the same divider as at 2200MHz, but AMD doesn't allow the RAM to run higher than specified because it might cause instability. Hence, it will lower the RAM speed with a higher divider, maybe 11.5x, so that the RAM works at 2222MHz / 11.5 = 193.22MHz.
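The scenario above can be sketched in Python like this. The rule "pick the smallest divider that keeps the RAM at or below its limit" is my reading of the behaviour described, and the 0.5 step is only the guess from the post (the "maybe 11.5x"); whether the real dividers come in half steps or whole numbers is not confirmed here, so the step is left as a parameter. All names are hypothetical.

```python
import math

def pick_divider(cpu_mhz, ram_limit_mhz=200.0, step=1.0):
    """Smallest divider (a multiple of `step`) that keeps RAM <= the limit."""
    return math.ceil(cpu_mhz / ram_limit_mhz / step) * step

def ram_clock_mhz(cpu_mhz, divider):
    """RAM clock derived from the CPU clock."""
    return cpu_mhz / divider

for cpu in (2200, 2222):
    for step in (0.5, 1.0):
        d = pick_divider(cpu, step=step)
        print(f"CPU {cpu}MHz, step {step}: divider {d} -> RAM {ram_clock_mhz(cpu, d):.2f}MHz")
```

With half steps the 2222MHz case lands on 11.5 as in the post; with whole-number steps it would fall to 12 and the RAM would run even slower.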

Well, I'm not sure how many dividers are supported. Hope somebody can find out and post here.

* If I'm not mistaken, the memory divider settings article is from Aceshardware. I can't find it though; I need to search again when I have time.
winc87
QUOTE
On Athlon64, memory bandwidth won't increase as the CPU clock increases (if the divider is set correctly to limit the RAM to DDR400). However, access to the memory controller should be faster, because latency = 1 / clock speed.

It does increase in the Sandra memory bandwidth test. Read dinster's thread in the AMD64 section.
QUOTE
11. tampoi - 7404 PTS - Kingston VR BH6 @ 5-2-2-2 @ 1T @ 260 MHZ @ 3.6 VDIMM
18. winc87 - 6905 PTS - OCZ VX Value CH5 UTT @ 6-2-2-2 @ 1T @ 260 MHZ @ 3.6 VDIMM

tampoi scored higher than me because his clock speed is higher than mine. Can you explain this?
charge-n-go
QUOTE(winc87 @ Sep 20 2005, 11:05 PM)
It does increase in the Sandra memory bandwidth test. Read dinster's thread in the AMD64 section.

tampoi scored higher than me because his clock speed is higher than mine. Can you explain this?
*


lol, I also don't know the exact cause.

Memory bandwidth = memory speed x data width. E.g. dual channel DDR400 has 400MHz x 128-bit = 51200Mbit/s = 6.4GB/s. If your RAM remains at DDR400 while you scale up the CPU, the memory bandwidth is fixed.
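The same arithmetic as a short Python sketch, with a made-up function name, in case anyone wants to try other memory speeds or a single channel setup:

```python
def memory_bandwidth_gb_s(data_rate_mhz, bus_width_bits):
    """Peak memory bandwidth = effective data rate x bus width."""
    mbit_per_s = data_rate_mhz * bus_width_bits
    return mbit_per_s / 8 / 1000   # Mbit/s -> MB/s -> GB/s

print(f"DDR400 dual channel:   {memory_bandwidth_gb_s(400, 128):.1f}GB/s")
print(f"DDR400 single channel: {memory_bandwidth_gb_s(400, 64):.1f}GB/s")
```

Note that the CPU clock does not appear anywhere in the formula, which is why the theoretical figure stays fixed while the CPU is scaled up.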

Yeah, the score is higher, no doubt about it. Even my AthlonXP does it. I got 3757 @ 8.0 x 255MHz and 3900 @ 9.5 x 255MHz. As mentioned before, I suspect it's due to faster memory requests from CPU to RAM. I might be wrong though. Anyone want to correct me?
empire23
QUOTE(winc87 @ Sep 20 2005, 10:05 PM)
It does increase in the Sandra memory bandwidth test. Read dinster's thread in the AMD64 section.

tampoi scored higher than me because his clock speed is higher than mine. Can you explain this?
*



Not really, it's more likely because his timings were superior.
skyther
Eh? Do I have anything to do with the starting of this thread?

user posted image

You're probably right in the sense that HTT only concerns I/O.

QUOTE
HyperTransport operates as a fully integrated front-side bus that relieves system designers from the requirements of a NorthBridge function.

In other instances, such as in Apple's G5-class systems, HyperTransport is used as an integrated, high performance I/O bus that pipes PCI, PCI-X, USB, Firewire and audio/video links through the system. In all cases, HyperTransport replaces the overlapping processor and local I/O buses of earlier generation systems with a unified, high bandwidth, low latency, and low-cost architecture that is scalable and extensible to future product generations.



BUT...

QUOTE
The move in PCs is to faster processors with higher clock speeds, blazing fast dual data rate (DDR) memories, a requirement to support legacy I/O devices and, of course, the need to respond the market's move to lower costs and system prices. As CPUs blaze past previous untouchable GHz milestones and new memories make possible mega-data transfers, it is clear that system design must approach old problems in a new way - especially if cost and space constraints are to be met.


So.. this is the technical verdict lar. Everyone's interpretation of HTT can be different I guess.
charge-n-go
yo skyther, not because of you la
I've actually been receiving messages about HTT, and many seem confused about it, even myself

It's good to have an open discussion. But this time we'll limit it to A64 systems only, because most of us here are PC users.

btw, can you give us the links to your quotes? Thanks
skyther
^ thumbup.gif

All from hypertransport.org lar...
http://www.hypertransport.org/applications/apps_PCs.cfm
http://www.hypertransport.org/consortium/cons_faqs.cfm << this one's pretty good

QUOTE
5. Who are the Founding Members of the Consortium?
Advanced Micro Devices (thumbup.gif), Alliance Semiconductor, Apple Computer, Broadcom Corporation, Cisco Systems, NVIDIA (thumbup.gif), PMC-Sierra, Sun Microsystems, and Transmeta.


Come to think of it, Intel seems to be ignoring the whole thing. I wonder if they're going to come up with their own version of a bus revamp for their next gen processors.

winc87
QUOTE
As processor micro architecture capabilities have advanced, one of the greatest performance limitations has become the system architecture's ability to provide sufficient low-latency memory bandwidth to the processor core. The AMD Athlon(tm) 64 processor and the AMD Opteron(tm) processor directly addresses this bottleneck by integrating a DDR memory controller into the processor, revolutionizing the way x86-based processors access main memory. By running at the processor's core frequency, an integrated memory controller greatly increases bandwidth directly available to the processor at significantly reduced latencies. The performance-enhancing effect is even more dramatic within an AMD Opteron(tm) multiprocessing environment, because each additional processor has its own memory controller thereby scaling over all memory bandwidth.

Well, I've got the answer. As the CPU clock speed increases, the internal memory bandwidth increases, but not the RAM speed.

Source : http://www.cmpe.boun.edu.tr/courses/cmpe51...4%20FX%2055.doc
empire23
Here's empire23's not so superior guide to HTB or hyper transport bus/lorry.

Might be wrong, but this is what i can derive from the pinouts and chats with Charge-n-go, enjoy.

user posted image

user posted image
silkworm
HT aka LDT uses differential signalling, and each "channel" is made of two unidirectional (in and out) links. With 16bits each direction, and two lines for differential signals, that makes 2*2*16 = 64 pins for data/address/commands. The LDT clock and control lines are also differential, and there are one for each direction, so that's another 8 lines. That makes a total of 72 lines for HyperTransport/LDT.

Differential signalling improves signal integrity by immunizing against skew and common mode noise, so you can pump data at ridiculously high clocks. This buys you bandwidth even with a reduced pin-count. Encapsulating information into packets saves the number of pins because you can have commands,addresses and data on the same wire. However, encoding and decoding packets increases latency.

The Intel FSB uses GTL+ single ended signalling, and is a bi-directional bus. Data comes in 64 bits at a time, so there are 64 pins. P4 uses split data and address busses, so there's another 32 pins for addressing. On the control side, there are 5 pins for transaction types, 2 pins for address strobing and 8 pins for data strobing. That gives at least 111 pins related to the FSB. I probably missed some pins along the way
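The two pin counts above can be tallied with a few lines of Python. This is only a sketch of the arithmetic in this post, with made-up function names; it is not taken from any datasheet.

```python
def ht_link_lines(width_bits=16, clock_and_control_lines=8):
    """Lines for one HyperTransport/LDT channel as counted above."""
    directions = 2          # separate in and out links
    wires_per_signal = 2    # differential pair
    return directions * wires_per_signal * width_bits + clock_and_control_lines

def fsb_pins(data=64, address=32, transaction=5, addr_strobe=2, data_strobe=8):
    """FSB-related pins as counted above (a lower bound)."""
    return data + address + transaction + addr_strobe + data_strobe

print("HT/LDT lines:", ht_link_lines())
print("FSB pins (at least):", fsb_pins())
```

Changing `width_bits` to 8 shows how much narrower an 8-bit HT link is compared with the FSB count.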

All of the above have very little to do with overclocking. When you "bus overclock" you're changing the "base" clock which goes into the CPU. Because the CPU's I/O is synchronous, the FSB/HT/DDR clocks increase in proportion to the base clock. The I/O has to communicate with "something". Even if your CPU can handle the increased I/O rate, the device on the other end (ie. the northbridge or DRAM) may not. That's when you mess with the multipliers/dividers to bring the I/O or memory bus clocks down to levels that are in-spec.

AMD64 doesn't have much granularity in the HT multiplier, only 0.5x, 1x, 2x, 3x, 4x or 5x are selectable. The specific control register is covered in the AMD64 "BIOS and Kernel Developer's Guide", document no #26094, pg 49. Likewise for memory controller, there are only 1/2, 2/3, 5/6, and 1/1 dividers selectable (pg 88,89).

Bus clocking is constrained by capability of the devices on the bus and signal integrity. Signal integrity is influenced by a lot of things, including the type of IC packaging, the electrical signalling scheme, the materials used in the PCB, and the routing of the circuit.
charge-n-go
I've also come up with my own drawings now.

This is the difference between the FSB and HTT systems.

user posted image

The FSB transfers data from RAM, AGP and I/O to the CPU on the same bus. The total transfer rate is only 6.4GB/s with dual channel DDR400. The control and address signals use separate lines.

On HTT, I/O and PCI-e data/address/commands are packetized and communicate with the CPU over a 4.0GB/s HTT bus (2000MHz 16-bit) in either direction. RAM communicates with the integrated memory controller on another bus running at 6.4GB/s (DDR400 dual channel). So we have a total of 10.4GB/s on the HTT system.
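The totals above, worked out in a small Python sketch. The helper name is made up, and the FSB figure is simply taken from this post rather than computed from any particular FSB spec.

```python
def bus_gb_s(rate_mhz_effective, width_bits):
    """Peak transfer rate in GB/s for a bus moving width_bits per transfer."""
    return rate_mhz_effective * width_bits / 8 / 1000

htt_per_direction = bus_gb_s(2000, 16)   # 16-bit HTT at 2000MHz effective
ram_dual_channel = bus_gb_s(400, 128)    # DDR400, 128-bit dual channel
fsb_shared = 6.4                         # figure from the post: one bus shared by everything

htt_system_total = htt_per_direction + ram_dual_channel
print(f"FSB system: {fsb_shared:.1f}GB/s shared by RAM, AGP and I/O")
print(f"HTT system: {htt_per_direction:.1f} + {ram_dual_channel:.1f} = {htt_system_total:.1f}GB/s")
```

The HTT figure counts one direction only; counting both directions of the link would push the total higher.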
charge-n-go
QUOTE(silkworm @ Sep 21 2005, 12:25 PM)
HT aka LDT uses differential signalling, and each "channel" is made of two unidirectional (in and out) links. With 16bits each direction, and two lines for differential signals, that makes 2*2*16 = 64 pins for data/address/commands. The LDT clock and control lines are also differential, and there are one for each direction, so that's another 8 lines. That makes a total of 72 lines for HyperTransport/LDT.

*



Silky, you are finally here huh?

I was about to write something on the quote above, but you were faster than me

All I can do is attach an image, haha.

user posted image

Direction into the CPU : L0_CADIN_H/L[15:0] <--- 16-pins / 16-bit
Direction out from CPU : L0_CADOUT_H/L[15:0] <--- 16-pins / 16-bit

The other pins are for clock and control
skyther
Eh, btw... the SB -> NB link is 1Gb/s per direction.
ikanayam
QUOTE(skyther @ Sep 20 2005, 11:07 PM)
Eh, btw... the SB -> NB link is 1Gb/s per direction.
*


That depends on the chipsets involved. It's not a set rule.
charge-n-go
QUOTE(skyther @ Sep 21 2005, 01:07 PM)
Eh, btw... the SB -> NB link is 1Gb/s per direction.
*


haha, it should be both directions, but I missed the arrows at the other end. Actually all the buses drawn in the diagram are bi-directional.
skyther
I can do 8x and 4x for you

No 3DMark though.
skyther
A64 3000+ @ 1.6GHz, x8
HTT: 200, x4, 1.2V
RAM: 200, 2.5-2-2-0 1T 1:1

SuperPI 1M:
user posted image

Sandra Multimedia:
user posted image

Sandra Memory:
user posted image

3DMark2001 SE Build 330:
user posted image

Pathetic benchmarks, I think it's my refresh rate
ikanayam
QUOTE(charge-n-go @ Sep 21 2005, 10:02 PM)
Anybody would like to share some screenshot with me? I don't have an Athlon64 system here, so cant test it on my own.

May I have some benchie to compare the effect of yr HTT o/c? Let's say one at 2.0GHz, DDR400 and stock 200MHz HTT(CPU : 10x200MHz, FSB:RAM = 1:1). Another one is 2.0GHz, DDR400, and 400MHz HTT (CPU : 5x 400MHz, FSB:RAM = 2:1). Of course you can have other settings too, such as 8x250, 7x285 and etc.

The benchie can be Super PI, Sandra CPU, 3D mark and etc.

Thx !

*


I don't think many mainboards can do 400MHz HTT easily. Anyway that won't help much, if at all. All you have to do is bump up the LDT multiplier and it will do more per clock anyway.

What would be interesting to test is how memory bandwidth scales with the LDT multiplier from 1x to 5x while keeping everything else the same. This will confirm or deny the theory that the memory controller is not tied to the HTT bus.
Evogenix
on request :


TestBed :
AMD 3200+ Venice 0518
DFI Ultra-D BIOS 7.04bta
TwinMOS UTT [ch-5] @ 2-2-2-5


Here are my steps:
1. set up the BIOS
2. save & reboot, load into Windows XP Pro SP2
3. terminate all unused background applications
4. double check the settings through cpu-z v1.30
5. close cpu-z, run SPImod1.4
6. benchmark from 16k to 1M, 3 times each to get the best score
7. close SPImod1.4, run SandraLite2005
8. run the CPU Arithmetic Benchmark, CPU Multi-Media Benchmark, Memory Bandwidth Benchmark, and Cache & Memory Benchmark
9. after benchmarking has finished, load SPImod1.4 back and take screenshots

---=== FINISH ===---


Here are the results that I collected:

Benchmark & screenshot #1
200MHz[htt] x 10[multiplier] @ 2.0GHz, LDT frequency 4x :

user posted image

Benchmark & screenshot #2
400MHz[htt] x 5[multiplier] @ 2.0GHz, LDT frequency 3x :

user posted image




-----=====-----=====-----=====-----=====-----=====-----=====-----=====-----=====-----=====-----=====-----=====-----=====

If you look carefully, you will find that I used a different LDT frequency for the two tests
My mistake, so for a more accurate result I benchmarked again @ 200MHz[htt] x 10[multiplier] with LDT frequency 3x

Benchmark & screenshot #3
200MHz[htt] x 10[multiplier] @ 2.0GHz, LDT frequency 3x :

user posted image


End of test:
From the benchmark results that I collected, we can obviously see that the scores for
200MHz[htt] x 10[multiplier] > 400MHz[htt] x 5[multiplier]

So I conclude that increasing the htt frequency won't gain any system bandwidth or performance at all.


regards,
Evogenix
skyther
Whoaaa got LDT x 10 meh?

I tried using 200 x 4 and 400 x 2, but 400 couldn't POST. My mobo tops out at 360ish HTT AFAIK.
Evogenix
QUOTE(skyther @ Sep 22 2005, 03:49 PM)
Whoaaa got LDT x 10 meh?

I tried using 200 x 4 and 400 x 2, but 400 couldn't POST. My mobo tops out at 360ish HTT AFAIK.
*



x10 is multiplier
3x is LDT frequency

failing to boot into the BIOS at a high htt frequency might be caused by
1. mainboard limitation
2. the proc's memory controller limitation

Evogenix
skyther
Nah I tried upping voltages but it didn't help.
winc87
It's the limitation of the mobo. It isn't related to the proc's memory controller.
antonio
user posted image

This is the highest HTT I've ever achieved with my board and the new E6 Venice... before this, my Winnie and Venice a few months back managed to get 370, and that was the most I got out of them....

390 is the best for me.... no benchies however.... sorry...
charge-n-go
QUOTE(ikanayam @ Sep 22 2005, 04:22 PM)
I don't think many mainboards can do 400MHz HTT easily. Anyway that won't help much, if at all. All you have to do is bump up the LDT multiplier and it will do more per clock anyway.

What would be interesting to test is how memory bandwidth scales with the LDT multiplier from 1x to 5x while keeping everything else the same. This will confirm or deny the theory that the memory controller is not tied to the HTT bus.
*


Good suggestion dude.

I'm doing this because, from what I've noticed, some people just overclock their HTT base clock and tune the multiplier to stay within the safe region of 800MHz to 1200MHz while leaving their CPU and RAM at stock speed, and are then overjoyed that the mobo can overclock so high and expect a performance gain. AFAIK, this works with FSB only, because higher FSB bandwidth decreases the latency.

I'll come up with new methods to prove our earlier assumptions and theory


Thanks skyther, Evogenix, antonio_zth for the screenshots
charge-n-go
Experiment procedures updated

Please refer to post #2

Thanks for joining, people


* If anybody can come up with a better procedure, please suggest it here
Evogenix
QUOTE(charge-n-go @ Sep 22 2005, 06:07 PM)
Experiment procedures updated

Please refer to post #2

Thanks for joining, people
* If anybody can come up with a better procedure, please suggest it here
*


I think test #2 is already shown in my screenshots
benchmark #1 and benchmark #3 actually use the same settings.
the only difference is the LDT frequency: 4x [benchmark #1] and 3x [benchmark #3]

there is a slight decrease in the scores shown in the screenshots going from 4x to 3x

Evogenix
charge-n-go
QUOTE(Evogenix @ Sep 22 2005, 07:14 PM)
I think test #2 is already shown in my screenshots
benchmark #1 and benchmark #3 actually use the same settings.
the only difference is the LDT frequency: 4x [benchmark #1] and 3x [benchmark #3]

there is a slight decrease in the scores shown in the screenshots going from 4x to 3x

Evogenix
*


icic

Anybody want to post screenshots? I need more to prove the theory. Thanks

* antonio_zth, I'm still waiting for your new screenshots
maianeh
QUOTE(charge-n-go @ Sep 21 2005, 11:38 AM)
user posted image
*


A question...

Referring to this diagram, on the HT bus system, where would the potential bottleneck be, given all the parts are the highest-end ones currently available in the market?
Evogenix
QUOTE(maianeh @ Sep 26 2005, 06:19 PM)
A question...

Referring to this diagram, on the HT bus system, where would the potential bottleneck be, given all the parts are the highest-end ones currently available in the market?
*


If all the highest-end devices available in the market are put into one system, I think the processor would be the bottleneck. The proc can't really make full use of the HyperTransport speed yet.

Evogenix
soulfly
I'll try to contribute some HTT comparison screenshots for nF3 s-754. My board at 338MHz HT can run with a 2.5x or 3x LDT.

But what benchmark should I run? Is 3DMark01 OK? I'm shy to run 3DMark03 because I only have an FX5200
charge-n-go
//maianeh

imo, the limitation would be the latest components themselves, as 10 Raptors running at peak transfer rate can't even saturate the 16-bit HTT bus. Besides, RAM is another limitation too, because the on-die memory controller actually works a lot faster than the RAM.

HTT is able to transfer up to 4GB/s, and 10 Raptors might give you around 2GB/s in a short burst. Other I/O devices such as LAN, sound etc. don't require much bandwidth either. The data transferred to the GPU over PCI-e is not very large either, as modern graphics cards have plenty of VRAM. Most of the textures rendered are stored in VRAM, and the processor mostly sends and receives control signals.



//soulfly

Thanks dude, you can read post #2 for the benchies, but I'll post them here too

Benchies: Sandra CPU, RAM and SuperPI. The CPU-Z readings for 'CPU' and 'RAM' need to be in the screenshot too.

Waiting for you
soulfly
Test bed:
Sempron64 2800+ @ 2704mhz, 338MHz(HTT) x 8
DFI nF3 250Gb
TCC5 @DDR600 CL2.5-4-3-7-1T
WinXP SP1

*all tests used the same settings, except for the LDT multiplier
*each test was run one by one
*tests for LDT 3x were done first, then I rebooted before proceeding with the tests for LDT 2.5x
*A64Tweaker is shown just as a guide to the memory timings

LDT 3x
user posted image
SiSandra2005 CPU Multimedia benchmark

user posted image
SiSandra2005 Memory Bandwidth benchmark

user posted image
SuperPi 1M

LDT 2.5x
user posted image
SiSandra2005 CPU Multimedia benchmark

user posted image
SiSandra2005 Memory Bandwidth benchmark

user posted image
SuperPi 1M
almostthere
Moved back to Hardware
ShinAsuka
Yesterday I saw a forum post saying HTT * LDT multiplier cannot be more than 1k
Is it true?
charge-n-go
QUOTE(ShinAsuka @ Oct 19 2005, 12:33 PM)
Yesterday I saw a forum post saying HTT * LDT multiplier cannot be more than 1k
Is it true?
*


Yup, it is suggested to keep it below 1000MHz (not 1k), so that the system is stable enough.
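A small Python sketch of how that rule of thumb could be applied when picking an LDT multiplier. It treats exactly 1000MHz (the stock 200 x 5) as still acceptable, which is my assumption, and the list of multipliers is only an example made up from values mentioned in this thread; what a given board actually offers will differ.

```python
def max_safe_ldt_multiplier(htt_base_mhz, multipliers, limit_mhz=1000):
    """Highest multiplier that keeps HTT base x LDT at or under the limit."""
    safe = [m for m in multipliers if htt_base_mhz * m <= limit_mhz]
    return max(safe) if safe else None

# Example multiplier list (hypothetical, not taken from any BIOS)
LDT_OPTIONS = (1, 2, 2.5, 3, 4, 5)

for base in (200, 250, 338, 400):
    m = max_safe_ldt_multiplier(base, LDT_OPTIONS)
    print(f"HTT base {base}MHz -> LDT {m}x = {base * m:.0f}MHz")
```

The function returns `None` if even the lowest multiplier would exceed the limit.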
blwong
Refer to the link for info on HyperTransport; it's one of the chapters in the Athlon 64 bible. Hope this info is useful. Thanks

http://66.218.71.231/language/translation/...8428%26page%3d4