osknockout

X86 Clocking


EA applies to all other processors. However, for the 286 the EA calculation takes only 1 clock cycle no matter what, and the number decreases from there. So the 286 is listed at three because it is the two of the 808x plus the one from the EA. The 808x is listed at 2 plus EA because the total is a minimum of 3, but it can be more. You might be right about the Pentiums, but they continually get faster, and the minimum for an instruction without hyperthreading is 1 clock cycle. Actually it is 1 with hyperthreading as well, but hyperthreading allows more than one instruction to be executed at the same time, giving the illusion of one. The method I gave you for calculation was only a rough estimate that doesn't apply as well to Pentium chips as it does to the 486 and below, because the Pentium totally changed the architecture. The other 8086-type chips simply added more instructions, more processing capability, and more temporary storage to allow for faster clock cycles. Also, the built-in math coprocessor that comes with most 486 chips helps decrease the clock cycle counts.
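The rough estimate described above (base cycles plus EA cycles) can be sketched in a few lines. Note that the cycle numbers in the tables below are illustrative placeholders for the idea, not datasheet timings:

```python
# Rough instruction-timing estimate in the spirit of the discussion above:
# total cycles ~= base cycles + effective-address (EA) cycles.
# The numbers below are illustrative placeholders, NOT real datasheet values.

BASE_CYCLES = {"8086": 2, "80286": 2}   # hypothetical base cost for one instruction

# On the 8086-class chips the EA cost varies by addressing mode;
# on the 286 it is a flat 1 cycle no matter what.
EA_CYCLES_8086 = {"direct": 6, "base": 5, "base+index": 7}

def estimate_cycles(cpu, mode):
    """Return a rough clock-cycle estimate for one memory-operand instruction."""
    base = BASE_CYCLES[cpu]
    ea = 1 if cpu == "80286" else EA_CYCLES_8086[mode]
    return base + ea

print(estimate_cycles("80286", "direct"))   # 3 -- the "two of the 808x plus the one from EA"
print(estimate_cycles("8086", "direct"))    # 8 -- 2 plus a mode-dependent EA cost
```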


Wow... that's going to take me some time to digest... thanks.

Actually it is 1 with hyperthreading as well, but hyperthreading allows more than one to be executed at the same time giving the illusion of one.

Alright, I'm interested. How does it actually do that?


Okay, clock cycles only count what the true processor does. This is the 86 part of an 808x+ processor. Going back as far as the 8086, a companion math coprocessor became available; it is labeled xxx87 instead of xxx86. The processor sends data and an instruction to the math coprocessor for many math instructions, such as MUL perhaps. Then the processor continues processing other things as long as possible while the coprocessor takes care of the math instruction. If the processor can't go any further and the math instruction has multiple independent parts, it may help out on the processing of the math instruction. When the math instruction is finished, the math coprocessor sends the data back to the processor to deal with. Once again, this is only a rough sketch of how the system works. I don't know all of the details yet, but now I feel like researching them. And thanks for the reputation point.
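The overlap being described — hand a slow operation off, keep working, then synchronize when the result is needed — can be illustrated with a thread-based analogy. This is only an analogy for the CPU/coprocessor handoff, not how the hardware actually works:

```python
# Analogy (not real FPU mechanics): the "CPU" hands a slow math job to a
# "coprocessor" and keeps doing other work until the result is needed.
from concurrent.futures import ThreadPoolExecutor

def slow_multiply(a, b):
    # stands in for an offloaded math instruction such as MUL
    return a * b

with ThreadPoolExecutor(max_workers=1) as fpu:
    pending = fpu.submit(slow_multiply, 6, 7)   # send operands + instruction
    other_work = sum(range(10))                 # CPU keeps executing meanwhile
    result = pending.result()                   # synchronize when the value is needed

print(result, other_work)   # 42 45
```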


Jeez, I tried to reply to this twice and my power failed both times. Thanks for taking the time to research, and you're welcome for the rep point, you deserved it. So, besides FPU operations, can you directly control the *87 coprocessor, or is it just the FPU? I heard on comp.lang.x86.asm (might be jumbled up, typing from memory) that it was called by several different names and seemed like it could do more than just floating point operations. By the way, are the clock cycles for the *87 the same as what the sites you've suggested report? I'm just full of questions ;) , and my Perl learning is taking time away from assembly research.


That's perfectly okay, I enjoy the research because this is stuff I need to know as well, and Perl is an awesome language. I'm having some problems with networking in it, but that should go in the Perl forum. It looks like the only control you have is over the FPU, but in the 486+ line of Intel chips, that is really what the x87 is built for. As a matter of fact, the math coprocessor has been renamed by Intel to the x87 FPU. In case you didn't know, FPU is the floating point equivalent of CPU. The processor on those chips is capable of handling most basic math instructions just as fast as with a coprocessor, so the x87 is really only needed for higher math functions. You are right that the x87 does more than basic floating point; it also handles sine, cosine, partial tangent, partial arctangent, 2^x - 1, y * log base 2 of x, and other functions. However, these instructions return floating point numbers, and probably also take floating point numbers. All of this information came out of the Intel Architecture Software Developer's Manual.
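For a feel of what those transcendental instructions compute, here are their mathematical equivalents in Python. The instruction names come from the Intel manual as cited above; the library calls are stand-ins for the math, not the hardware:

```python
# What some x87 transcendental instructions compute, expressed with
# Python's math library (equivalents of the math, not the real hardware).
import math

x, y = 0.5, 3.0
f2xm1 = 2.0 ** x - 1.0      # F2XM1:  2^x - 1
fyl2x = y * math.log2(x)    # FYL2X:  y * log2(x)
fsin  = math.sin(x)         # FSIN:   sine
fptan = math.tan(x)         # FPTAN computes the tangent, then pushes 1.0

print(round(f2xm1, 6))   # 0.414214  (i.e. sqrt(2) - 1)
print(fyl2x)             # -3.0      (log2(0.5) = -1, times 3)
```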


Alright, so they just renamed it the FPU. What's interesting is that FPU operations take longer than processor equivalents on older processors, but the reverse is true for newer processors. Is there an assembly instruction which converts floats to ints, or should I just do it the old guru way? Thanks for the links.


There is an instruction to round a floating point value to an integer: FRNDINT (and FIST/FISTP will store the result to memory as an actual integer). I'm not 100% sure of the format because I don't use FPU instructions yet. However, if you look at the Intel documents I told you about, it is covered in part A of volume 2.
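FRNDINT's default rounding mode is round-to-nearest-even, which happens to be the same rule Python 3's built-in round() uses, so it makes a convenient illustration (an analogy for the rounding rule, not the instruction itself):

```python
# FRNDINT's default mode is round-to-nearest-even: ties like 2.5 and 3.5
# both round toward the even integer.  Python 3's round() follows the same
# rule, so it illustrates the behavior (analogy only, not the instruction).
values = [2.5, 3.5, -2.5, 1.4, 1.6]
rounded = [round(v) for v in values]
print(rounded)   # [2, 4, -2, 1, 2]
```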


Thanks, I haven't bothered to use FPU yet, but I'll be using it soon enough. By the way, do you have any idea what a PARTIAL tangent is? I don't.


I have an idea, but I'm not sure. If you look at the graph of tangent, it has vertical asymptotes at odd multiples of pi/2. I think the partial tangent limits the domain to between 0 and pi. I'm not sure though, and I'm trying to find out because I am curious as well.


Really? Because that would just produce the same results as any other tan function, since the period is pi. Were the FPU developers THAT lazy that they wouldn't add a few cycles to subtract pi from the darn thing if it's greater than pi? Hmm... FRNDINT... it seems that there's no ceil float-to-int instruction.


It seems like partial tangent is the full blown tangent function. Partial tangent takes the tangent of the value in ST0, with a valid operand range of -2^63 to 2^63. It also pushes 1 onto the FPU stack. I see no reason to call it a partial tangent, but whatever floats their boats. And the guru's method of dealing with the rounding is easy to do. Simply check the #P (precision) exception flag, because it is set whenever the result had to be rounded. There is also something known as a rounding mode. This is set by the RC field in the FPU control word. It is two bits: setting them to 00 rounds to the nearest (ties go to the even value), 01 rounds down toward negative infinity (the floor function), 10 rounds up toward positive infinity (the ceiling function), and 11 rounds toward zero (truncation).
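The behaviors of the four RC settings can be mirrored with standard math functions. A small sketch of what each setting does to a value, not the actual control-word mechanism:

```python
# The four x87 RC rounding modes, emulated with Python math functions
# (an illustration of the behaviors, not the real FPU control word).
import math

def x87_round(value, rc):
    if rc == 0b00:              # round to nearest, ties to even
        return round(value)
    if rc == 0b01:              # round down, toward -infinity (floor)
        return math.floor(value)
    if rc == 0b10:              # round up, toward +infinity (ceiling)
        return math.ceil(value)
    if rc == 0b11:              # round toward zero (truncate)
        return math.trunc(value)
    raise ValueError("RC is a 2-bit field")

# -1.5 separates the modes nicely: floor and trunc differ for negatives.
print([x87_round(-1.5, rc) for rc in (0b00, 0b01, 0b10, 0b11)])   # [-2, -2, -1, -1]
```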


I'll reply to both posts in this topic. To the first one, I don't know of any programs that do it, but I know many sites that list the CPU clocks each instruction takes on various chips, ranging from the 286 or 386 to the Pentium and sometimes beyond. This is one of the better ones: https://packetstormsecurity.com/programbly/opcode.html. You can pair this with the Intel instruction set opcode reference found here, http://www.intel.com/content/www/us/en/design/resource-design-center.html, to figure out how long each instruction will take. I can also try to write one for you, but my best guess as to when it would be done is at least a year, if my friend will help me. I know I once had a program that recorded the number of clock ticks a program took, but I can't for the life of me remember what it was, sorry.

 

For the second post, x86 asm (properly known as x86 assembly) is the most basic mnemonic set for machine code instructions (the ones and zeros a computer actually uses). The x86 implies that it runs on chips that support the Intel 86 architecture, which means any of their chips whose real names end in 86. A Pentium, for example, is really an 80586, while the first of the set is the 8086. AMD chips also support this architecture, while Apple computers do not. Each instruction requires a certain number of clock ticks to be carried out. A clock tick is like the second hand of a clock jumping from one second to two, then three, and so on. Every time it jumps, it makes a clock tick. But on computers clock ticks occur many times a second; a 1 MHz computer theoretically undergoes 1048576 clock ticks per second. It has nothing to do with the operating system the computer is running.


Let's get one thing straight here. Mega = 10 to the power 6. That's 1,000,000, because that is how we count in decimal.

Which implies that 1 MHz = 1,000,000 clock cycles per second (Hz).

Now 1 MB (that's one megabyte) = 1048576 bytes.

Why so? Ok, here it is.

We humans calculate in decimal, that is, a number gains a digit (length-wise) for every factor-of-10 increase (counting-wise). That's decimal. For computers there are only 2 digits, 0 and 1, so a digit is added for every factor of two, like this:

0

1

10

11

100 see?

So the 10 to the power 6 of decimal doesn't work for computers, because computers use binary.

For computers, one megabyte is 1 kilo x 1 kilo. But one kilo is 2 to the power 10 = 1024.

That's why 1 MB = 1024 KB = 1048576 bytes.
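The binary counting and the decimal-versus-binary "mega" laid out above, in runnable form:

```python
# Binary counting, and the two meanings of "mega" described above:
# networking/clock "mega" is decimal 10^6, memory "mega" is binary 2^20.
binary_counting = [bin(n)[2:] for n in range(5)]   # counting 0..4 in binary

DECIMAL_MEGA = 10 ** 6    # the "mega" used for clock rates: 1 MHz
BINARY_MEGA = 2 ** 20     # the "mega" used for memory: 1 MB = 1024 * 1024 bytes

print(binary_counting)              # ['0', '1', '10', '11', '100']
print(BINARY_MEGA)                  # 1048576
print(BINARY_MEGA == 1024 * 1024)   # True
```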

That's half of what I wanted to tell you. The other half:

When we talk about a data rate, say a 10 Mbps LAN, we mean 10 million bits cross a reference point in one second. But this is what we count, not what the computer counts, so 10 Mbps = 10,000,000 bits per second. When we actually measure the data transferred in one second, it is 1.192 MB/s. Look closely and you will see the difference in units: Mbps is not equal to MB/s.

You will see that 10 Mbps = 1.192 MB/s. How do you get this? Simple:

10,000,000 bits = 10,000,000 / 8 bytes = 1,250,000 bytes, but not 1.25 MB.

Why you ask? Because

1,250,000 bytes = 1,250,000 / 1024 KB = 1220.7 KB

= 1220.7 / 1024 MB = 1.192 MB

So a data rate of 10 Mbps (in bits) comes to 1.192 MB/s (in bytes).
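The same arithmetic as a quick check:

```python
# Converting a 10 Mbps line rate into MB/s, following the steps above.
bits_per_second = 10 * 10 ** 6           # "mega" in data rates is decimal
bytes_per_second = bits_per_second / 8   # 1,250,000 bytes per second
mb_per_second = bytes_per_second / 1024 / 1024   # binary megabytes

print(bytes_per_second)          # 1250000.0
print(round(mb_per_second, 3))   # 1.192
```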

Got it? :D

Anything else? Just post it. I'm watching.

By the way, there is no use wasting your time on clock cycles unless you work at the device level, I mean, unless you are doing device programming and assembly. Otherwise it's a total waste of time. I gave up because I got fed up with every processor having different clocks and cycle counts. It's not of much use to me. If you are interested, you're welcome to it. :D Carry on.

Actually I also thought that this had something to do with overclocking. :D


Okay, you're right that a 1 MHz computer runs at 10^6 clock cycles per second, and I knew that, so I don't know why I posted the megabyte number. Second of all, I don't know why you brought up the Mbps piece of information, but thank you for reminding us of it. Thirdly, this is the forum for assembly and Delphi programming, so the people in here are interested in assembly programming for their various reasons, and I personally feel that no one language is ever superior to any other. So when a person in this forum asks about clock cycles, they are probably working in assembly and would have to worry about device-level information. If no one worried about device-level information, none of the higher-level programming would work either. But thank you for obviously reading this thread (which is apparent because you quoted me) and for attempting to fix my mistakes; keep up the good work.


It seems like partial tangent is the full blown tangent function. Partial tangent takes the tangent of the value in ST0, with a valid operand range of -2^63 to 2^63. It also pushes 1 onto the FPU stack. I see no reason to call it a partial tangent, but whatever floats their boats.

That's quite odd. The FPU stack registers are 80 bits, right? Alright, so rounding standards are easier to change than I believed... I thought I'd need to make my own process. I don't get why it pushes 1 onto the FPU stack, though. By the way, if you search Google for 'x86 clocking', this thread is #1.
Um... thanks canute24, that was a nice reminder of what I learned back in my printf("Hello World!\n"); days. However, I'm trying to create the finest operating system that I can, so every cycle matters.

Hey, I'm trying to optimize reading and writing to the hard disk. Should I first get the processor type and speed, and the hard disk RPM, from the BIOS (the last one I'm not sure about), wait for the computed average access time, and then check the hard disk status? Or should I just let the PIC handle this one?


I have absolutely no idea why 1 is pushed onto the FPU stack; we should ask Intel. And it's really cool that we are number 1, I've always liked being number 1. As for the hard disk access: since I assume you are still using real mode (you earlier stated that you hadn't dabbled in protected mode), I would simply let the interrupts take care of it. There is a way (though I haven't needed the hard drive for anything other than opening text files, which is easiest with DOS interrupts) to give the hard drive a task, continue processing until the task is completed, and then finish up with that task using interrupts. If you enter protected mode, I would let the PIC handle it, because the PIC sends an interrupt when the hard drive finishes, so just create an interrupt handler for that. That would be the most efficient for an OS, but the harder method. And I apologize for not putting a tutorial up on Sunday; I have been really busy, so I'm working on it now, and I'm going to post at least one other this week and, if I have time, a special VGA intro.
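The interrupt-driven approach can be caricatured with a queue standing in for the PIC's IRQ line. This is purely an analogy for the flow of control; real handlers are installed in the interrupt vector table and run on the actual IRQ:

```python
# Analogy (not real hardware code) for interrupt-driven disk access:
# the controller signals completion, and a handler runs when the signal arrives.
import queue

irq_line = queue.Queue()            # stands in for the PIC's IRQ line

def disk_controller_finishes(sector):
    irq_line.put(sector)            # controller "raises the IRQ" when done

def irq14_handler(sector):          # our stand-in "interrupt handler"
    return f"read sector {sector} complete"

disk_controller_finishes(42)              # transfer completes in the background
done = irq14_handler(irq_line.get())      # handler fires when the IRQ arrives
print(done)   # read sector 42 complete
```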

