osknockout, posted February 14, 2005:
I know this is almost a fundamental topic, but does anyone know a good instruction clocking method in x86 asm? I know there's a million programs which do that, but any source code?
lordofthecynics, posted February 15, 2005:
I saw this post and thought it was about OVERclocking CPU chips. Whooops. And no, I don't know anything about clocking in x86 ASM. What is that, anyway? I'm kinda curious now. Is that like, Linux-related, or something? If it is, I'm pretty much lost on Linux, so I'm no help AT ALL. Sorry.
vizskywalker, posted February 16, 2005:
I'll reply to both posts in this topic. To the first one, I don't know of any programs that do it, but I know many sites that list the CPU clocks each instruction takes on various chips, ranging from the 286 or 386 to the Pentium and sometimes beyond. This is one of the better ones: https://packetstormsecurity.com/programbly/opcode.html. You can pair this with the Intel instruction set opcode reference found here, http://www.intel.com/content/www/us/en/design/resource-design-center.html, to figure out how long each instruction will take. I can also try to write one for you, but my best guess is that it would take at least a year, and that's if my friend helps me. I know I once had a program that recorded the number of clock ticks a program took, but I can't for the life of me remember what it was, sorry.

For the second post, x86 asm (properly known as x86 assembly) is the mnemonic set for the machine code instructions (the ones and zeros a computer actually uses). The x86 means it runs on chips that implement Intel's x86 architecture, named after the part numbers ending in 86 (8086, 80286, 80386, 80486). The first of the set is the 8086; the Pentium continues the line and is often informally called the 586, although Intel never sold it under that number. AMD chips also support this architecture, while Apple's PowerPC-based computers do not.

Each instruction requires a certain number of clock ticks to be carried out. A clock tick is like the second hand of a clock jumping from one second to two, then three, etc. Every time it jumps, it makes a clock tick. But on computers clock ticks occur millions of times a second: a 1 MHz computer undergoes 1,000,000 clock ticks per second (mega here means 10^6, not 2^20 = 1,048,576). It has nothing to do with the operating system the computer is running on.
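To make the tick arithmetic concrete, here is a minimal sketch (not from the thread) that converts a cycle count into wall time, taking 1 MHz in its SI sense of 10^6 ticks per second. The function name and the example figures are made up for illustration.

```python
MHZ = 1_000_000  # SI mega: 10**6 ticks per second, not 2**20


def cycles_to_seconds(cycles: int, clock_hz: float) -> float:
    """Time taken by `cycles` clock ticks at `clock_hz` ticks per second."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return cycles / clock_hz


if __name__ == "__main__":
    # Hypothetical example: a 12-tick instruction on a 4.77 MHz clock.
    print(cycles_to_seconds(12, 4.77 * MHZ))
```

The same division works in reverse: multiply a measured time by the clock rate to estimate how many ticks elapsed.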
osknockout, posted February 16, 2005:
Quote (vizskywalker): To the first one, I don't know of any programs that do it, but I know many sites that list the CPU clocks that each instruction takes on various chips ranging from the 286 or 386 to the Pentium and sometimes beyond.

Yeah, I know. If you happen to know ones for the Pentium II+, do tell. Ah well, how is it done directly anyway? I mean, there's no way to get the clock cycle count without executing instructions to count it (unless you're somehow overriding the PCI with some crazy code). But thanks for your help.

Quote (lordofthecynics): What is that, anyway? I'm kinda curious now. Is that like, Linux-related, or something?

Well good. Most of the code I see nowadays is in Linux drivers, esp. those using a nice _inline in C++. But like vizskywalker said, it's OS independent. Try to learn x86 asm if you can and join us.
vizskywalker, posted February 17, 2005:
The clock ticks can be determined by looking at the assembler code and using the page that lists clock ticks to count the number of ticks each instruction takes. The Intel manual is there to determine how many bytes each instruction takes, if necessary. And I will keep an eye out for a Pentium II+ listing and/or a program to generate the clock cycles.
osknockout, posted February 17, 2005:
I know, I know. I was asking how THEY measured it. Thanks for the links by the way.
vizskywalker, posted February 17, 2005:
If by "they" you mean the people who arrive at the clock ticks, then it works like this. A basic computer, if it starts with a blank processor, does something like this:

Tick #   Action 1                        Action 2
1        Load Instruction #1             N/A
2        Load Data for Instruction #1    Preread Instruction #2
3        Perform Instruction #1          Preread Data for Instruction #2
4        Perform Instruction #2          Preread Instruction #3
5        Load Data for Instruction #3    Preread Instruction #4
6        Perform Instruction #3          Preread Data for Instruction #4
etc.

Each step in either action one or action two that an instruction requires adds one to the tick count that instruction requires. Some instructions don't require data, or require two clock ticks to get the data or perform the operation. If a preread needs to access data that is being accessed by the current instruction, it must wait until that instruction finishes. That is why optimization tutorials say not to write code like this:

    mov ax, 0a000h
    mov es, ax
    mov si, 0000h

but like this:

    mov ax, 0a000h
    mov si, 0000h
    mov es, ax

In the first version, mov es, ax needs the value of ax that the instruction right before it is still writing. In the second, the independent mov si sits between the two, so the preread of ax does not have to wait.
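The reordering advice above can be sketched as a toy check. This is only an illustration of the simplified model in this post, not a real pipeline simulator: it assumes a wait happens only when an instruction reads a register written by the instruction immediately before it. The Instr type and the function name are made up for the example.

```python
from typing import List, NamedTuple, Optional


class Instr(NamedTuple):
    text: str
    writes: Optional[str]  # register this instruction writes, if any
    reads: Optional[str]   # register this instruction reads, if any


def count_adjacent_stalls(program: List[Instr]) -> int:
    """Count instructions that read a register written by the one just before."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        if cur.reads is not None and cur.reads == prev.writes:
            stalls += 1
    return stalls


# The two orderings from the post.
slow = [
    Instr("mov ax, 0a000h", writes="ax", reads=None),
    Instr("mov es, ax", writes="es", reads="ax"),
    Instr("mov si, 0000h", writes="si", reads=None),
]
fast = [slow[0], slow[2], slow[1]]

if __name__ == "__main__":
    print(count_adjacent_stalls(slow))  # 1: mov es, ax waits on ax
    print(count_adjacent_stalls(fast))  # 0: mov si separates the pair
```

Real processors track dependencies in far more detail, so treat the count as a hint about ordering, not as a cycle figure.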
osknockout, posted February 18, 2005:
Quote (vizskywalker): Each step in either action one or action two that an instruction requires adds one to the tick count that instruction requires. Some instructions don't require data or require two clock ticks to get the data or perform the operation. If a preread needs to access data that is being accessed by the current instruction, it must wait until that instruction finishes.

All right, that makes sense. I have read AoA by the way, so I'm familiar with your load information - I just didn't realize they did the same thing.

By the way... isn't it mov ax, a000h instead of mov ax, 0a000h? Technicalities.
lordofthecynics, posted February 19, 2005:
Oh, ASSEMBLY, ok. The last time I tried anything with that was when I was in the PC lab in college and the prog I was using crashed, and Win2000 asked if I wanted to "debug it," so I thought, why not? So I clicked OK and it brought up VisualBasic, then gave me another error, and closed. WTF. So yeah. I let it go and forgot about it. Too confusing.
vizskywalker, posted February 20, 2005:
As far as mov ax, 0a000h goes, adding leading zeros doesn't change the value in most assemblers. I use TASM (well, I'm writing my own assembler, but that's what I use now), and TASM treats anything beginning with a letter as a symbol for a variable. So I have to prefix every hex number that begins with a letter with a 0. And if you want to learn assembly, lordofthecynics, there are plenty of good tutorials out there, such as Art of Assembly. And I'm planning on putting some assembly tutorials in the Tutorial section of this forum soon.
osknockout, posted February 21, 2005:
That's one odd incident, lordofthecynics.

Ok, so 0a000h is acceptable at least to TASM. I'd like to see your new assembler and your tutorials, vizskywalker. Are they going to be newbie, intermediate, or advanced? By the way, what's the base number of clock cycles needed for a LEA, assuming reg,mem?
vizskywalker, posted February 21, 2005:
The tuts are going to start at newbie level and go as far as I can, hopefully including VESA, but I'm having a problem with refresh rates.

LEA takes 2 + EA clocks on an 808x, 3 on a 286, 2 on a 386, 1 on a 486, and 1 on a Pentium. EA is the number of cycles it takes to calculate the effective address:

    5   base
    5   index
    6   displacement
    7   BP + DI or BX + SI
    8   BX + DI or BP + SI
    11  BP + DI + displacement or BX + SI + displacement
    12  BX + DI + displacement or BP + SI + displacement

Add an extra 2 for a segment override.
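As a rough sketch, the figures in this post can be put into a small lookup. It uses only the numbers quoted above; the dictionary keys, the CPU labels, and the function name are hypothetical and chosen for the example.

```python
# Effective-address (EA) clock counts for the 808x, as quoted in the post.
EA_CLOCKS = {
    "base": 5,
    "index": 5,
    "disp": 6,
    "bp+di": 7, "bx+si": 7,
    "bx+di": 8, "bp+si": 8,
    "bp+di+disp": 11, "bx+si+disp": 11,
    "bx+di+disp": 12, "bp+si+disp": 12,
}
SEGMENT_OVERRIDE_CLOCKS = 2

# Flat LEA clock counts quoted for the later processors.
LEA_FIXED = {"286": 3, "386": 2, "486": 1, "pentium": 1}


def lea_clocks(cpu: str, mode: str = "base", segment_override: bool = False) -> int:
    """Clock count for LEA reg,mem according to the figures in the post."""
    cpu = cpu.lower()
    if cpu in LEA_FIXED:
        return LEA_FIXED[cpu]  # no EA term on these chips
    if cpu != "808x":
        raise ValueError(f"no figure listed for CPU {cpu!r}")
    clocks = 2 + EA_CLOCKS[mode]  # 2 + EA on the 808x
    if segment_override:
        clocks += SEGMENT_OVERRIDE_CLOCKS
    return clocks


if __name__ == "__main__":
    print(lea_clocks("808x", "bx+si"))             # 2 + 7 = 9
    print(lea_clocks("808x", "bx+di+disp", True))  # 2 + 12 + 2 = 16
    print(lea_clocks("386"))                       # 2
```

Keeping the table as data makes it easy to swap in figures from another listing if a different source disagrees.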
osknockout, posted February 23, 2005:
In that case, I'll need to read those tuts. Can't help you with the refresh rates though.

Quote (vizskywalker): 5 for base, 5 for index, 6 for displacement, 7 for BP + DI or BX + SI, 8 for BX + DI or BP + SI, 11 for BP + DI + displacement or BX + SI + displacement, 12 for BX + DI + displacement or BP + SI + displacement. Add an extra 2 for a segment override.

Now when you say segment override, do you mean a 64 K boundary? Sorry if I sound like I'm high, but what do you mean by it? It doesn't make sense that it would take a different number of clock cycles for different registers, but I'll take that info as valid, as I'm not too familiar with BP, not having had to really study it yet.
vizskywalker, posted February 23, 2005:
I'm not too familiar with the segment override feature as I have never used it, but I think it works something like this. The lodsw instruction uses ds:si as a pointer to where the word of data is. But if for whatever reason you would rather use fs:si, you can use the fs segment override to change the pointer from ds:si to fs:si. And I don't understand why it takes more time to calculate the offset for certain arrangements of registers either. It probably has to do with the way Intel set up their processors and may not apply on newer Pentiums.
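The ds:si versus fs:si point can be sketched numerically. This assumes real mode, where the address is segment * 16 + offset and the 16-bit offset is what limits one segment to 64 K. The register values and function names below are made up for the example, and fs only exists on the 386 and later.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode linear address: segment * 16 + offset (both 16-bit values)."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must fit in 16 bits")
    return (segment << 4) + offset


def lodsw_source(regs: dict, override: str = "") -> int:
    """Address lodsw reads from: ds:si by default, or <override>:si."""
    seg = override or "ds"
    return linear_address(regs[seg], regs["si"])


# Hypothetical register contents.
regs = {"ds": 0x1000, "fs": 0xA000, "si": 0x0010}

if __name__ == "__main__":
    print(hex(lodsw_source(regs)))        # 0x10010, read through ds:si
    print(hex(lodsw_source(regs, "fs")))  # 0xa0010, read through fs:si
```

The override changes only which segment register supplies the base; si and the instruction itself stay the same.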
osknockout, posted February 23, 2005:
Is EA applicable for the other processors as well? (Seems like it, but that would not explain why a 286 takes more time than an 808x.) Hmm... I think the new Pentiums might change the number of clock cycles with every new architecture setup.