Chapter 1
Syllabus
Catalog Description: Computer structure, machine representation of data,
addressing and indexing, computation and control instructions, assembly
language and assemblers; procedures (subroutines) and data segments,
linkages and subroutine calling conventions, loaders; practical use of an
assembly language for computer implementation of illustrative examples.
Course Goals
0 Knowledge of the basic structure of microcomputers – registers, memory,
addressing, I/O devices, etc.
1 Knowledge of most non-privileged hardware instructions for the Ar-
chitecture being studied.
2 Ability to write small programs in assembly language
3 Knowledge of computer representations of data, and how to do simple
arithmetic in binary & hexadecimal, including conversions
4 Ability to implement a moderately complicated algorithm in
assembler, with emphasis on efficiency.
5 Knowledge of procedure calling conventions and interfacing with high-
level languages.
Optional Text: Kip Irvine, Assembly Language for the IBM PC, Prentice
Hall, 4th or 5th edition
Additional References: Intel and DOS API documentation as presented
in Intel publications and online at www.x86.org; lecture notes (to be sup-
plied as we go).
Prerequisites by Topic. Working knowledge of some programming lan-
guage (102/103: C/C++); Minimal programming experience
Major Topics Covered in the Course:
1 Low-level and high-level languages; why learn assembler?
2 How does one study a new computer: the CPU, memory, addressing
modes, operation modes.
3 History of the Intel family of microprocessors.
4-5 Registers; simple arithmetic instructions; byte order; arithmetic and
logical operations.
6 Implementing longer integer type support; carry and overflow.
7 Shifts, multiplication and division.
8 Memory layout.
9 Direct video memory access; discussion of the first project.
10 Assembler syntax; how to use the tools.
11-13 Conditional & unconditional jumps; loops; emulating high-level lan-
guage constructions; Stack; call and return; procedures
14-15 String instructions: efficient memory-to-memory operations.
16 Interrupts overview: interrupt table; how do interrupts work;
classification.
17 Summary of the most important interrupts.
18-20 DOS interrupt; File I/O functions; file-copy program; discussion of
the second project
21 Interrupt handlers; keyboard drivers; timer-driven processes; viruses
and virus-protection software.
22 Debug interrupts; how do debuggers and profilers work.
23-24 (Optional) Interfacing with high-level languages; protected mode
fundamentals
Grading The grade is based on two projects: the midterm project is 49%
and the final is 51%. Please note that the projects are individual; submitting
projects that are similar to the submissions of others and/or are essentially
downloads from the Web will result in a failing grade.
Office Hours My hours this term for CSc 210 will be 3:45 to 4:45 on
Mondays.
Zoom links:
11am https://ccny.zoom.us/j/85378437821
2pm https://ccny.zoom.us/j/87625527827
Chapter 2
Preliminary material
Why assembler?
• Why take this class?
• Why program assembler?
• Why know assembler?
NOTE: think Binary!
Why binary?
Binary numbers (WIKI)
(brief answer: because this is easy to implement)
Why hex?
Hexadecimal numbers (WIKI)
(brief answer: because it is much easier to work with shorter strings)
What about DNA?
https://en.wikipedia.org/wiki/Hexadecimal
https://en.wikipedia.org/wiki/Binary_number
2.1 Introduction #1: looking at new hardware
• CPU, general purpose (arithmetic) registers
– How large?
– How many?
– Are they all the same?
– Modes?
• Memory Model
– Is all memory the same?
– Flat?
– Segmented?
– Paged?
• Other hardware (peripherals)
• OS
• Special features
2.2 Introduction #2: History
Intel Processors Over the Years
The History Of Intel CPUs
-1971 before Intel
1971 4004
• Intention
• Name
• Usage
• What can you do with 4 bits?
1972 8008
– Doubling – what can you do with 8 bits
1974 8080
1975 8085
1975 Z80
1974 CP/M – Digital Research, Gary Kildall
1978 8086 – x86 architecture. 8 8-bit registers, 8(+6) 16-bit registers. 1 MB
limit. 1 MB mystery?
1979 8088 – cost cutting
1981 iAPX 432 – an attempted 32 bit processor
1982 80186 – minor improvements/corrections
1981 IBM PC
https://en.wikipedia.org/wiki/CP/M
https://www.tomshardware.com/picturestory/710-history-of-intel-cpus.html
https://www.businessnewsdaily.com/10817-slideshow-intel-processors-over-the-years.html
2.3 Introduction #3: Fundamentals
Data types
1 bit
4 nibble
8 byte
16 word
32 dword, doubleword
64 qword, quadword
80 tenbyte
2.4 x86 CPU
Registers Overlap!
Problem: Let AH = 2,AL = 3. What is AX?
Solution:
00000010 00000011
AH=00000010b=02h
AL=00000011b=03h
AX=0000001000000011b = 0203h = 515d
Note: the suffixes b, h and d are part of the assembly language syntax;
d(ecimal) is the default. The assignment syntax, however, is not; it is
used here only for illustration.
Fast solution:
AX = 2*256+3
AX = (2<<8)+3
Problem: Let AX = 2020. What are AL and AH ?
Registers BX,CX, DX are divided similarly.
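The overlap can be mimicked in C with shifts and masks. This is a minimal sketch, not how the hardware is built; the helper names make_ax, high_byte and low_byte are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the high and low 8-bit halves into the 16-bit register value. */
uint16_t make_ax(uint8_t ah, uint8_t al) {
    return (uint16_t)(((unsigned)ah << 8) | al);
}

/* Extract the halves back out of the 16-bit value. */
uint8_t high_byte(uint16_t ax) { return (uint8_t)(ax >> 8); }
uint8_t low_byte(uint16_t ax)  { return (uint8_t)(ax & 0xFF); }

/* Reproduces the worked problem: AH = 2, AL = 3 gives AX = 0203h = 515. */
void check_register_overlap(void) {
    uint16_t ax = make_ax(2, 3);
    assert(ax == 0x0203 && ax == 515);
    assert(high_byte(ax) == 2 && low_byte(ax) == 3);
}
```

Calling check_register_overlap() from main runs the checks; the same two helpers can be used to work out the AX = 2020 problem by hand and then confirm the answer.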
General purpose aka Arithmetic registers:
Sequence A,B,C,D is an illusion, these letters stand for Accumulator,
Base, Count, Data.
8 8-bit registers: AH,AL,BH,BL,CH,CL,DH,DL
8 16-bit registers: AX,BX,CX,DX and SI,DI,BP(?),SP(??)
SP generally cannot be used for calculations, BP usually cannot be used
either.
(32 bit to be described later)
IP – Instruction pointer
points to the first byte of the current instruction.
Code:
B B B B B B B B B B B B B B B B B B
Code is essentially a one dimensional array of bytes (in C/C++ – un-
signed char type).
IP initially is 0, after one instruction is executed it should be 2, then 5,
then 6, ....
Simplified logic (one instruction)
byte code[MAXCODE];
byte opcode;
opcode=code[IP++];
switch (opcode) {
case 0x00: ...
case 0x01: ...
...
case 0xFF: ...
}
Each case reads additional bytes, if needed, to complete reading of
the instruction.
Simplified logic (full execution)
byte code[MAXCODE];
byte opcode;
while(true) {
opcode=code[IP++];
switch(opcode) {
case 0x00: ...
....
case 0xFF: ...
}
}
Why is this simplified?
• CS is also used.
• how do we terminate?
• how do we change the executed sequence?
What do we do with this?
Is switch efficient?
Question: what would IP=k do (if such an instruction exists)?
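The simplified loop above can be turned into something that compiles. The sketch below is a toy, under stated assumptions: it knows only NOP (90h, covered in the next chapter) and treats F4h (the x86 HLT opcode) as a stop marker, which is a choice made for this illustration. It ignores CS and has no jumps, so it leaves the questions above open. The name run is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t byte;

/* Toy fetch-decode-execute loop. Returns the value of IP after the
   stopping instruction, or -1 on an opcode this sketch does not know. */
int run(const byte *code, size_t size) {
    size_t ip = 0;
    while (ip < size) {
        byte opcode = code[ip++];   /* fetch, then advance IP */
        switch (opcode) {
        case 0x90:                  /* NOP: nothing besides the IP update */
            break;
        case 0xF4:                  /* stop marker (assumption of this sketch) */
            return (int)ip;
        default:                    /* anything else is not modelled here */
            return -1;
        }
    }
    return (int)ip;                 /* ran off the end of the array */
}

void check_run(void) {
    const byte prog[] = { 0x90, 0x90, 0xF4, 0x90 };
    assert(run(prog, sizeof prog) == 3);  /* two NOPs, then stop */
}
```

A real decoder would have each case read further bytes from code[] before breaking, exactly as described above.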
FLAGS register
Should be seen not as a single 16-bit register but as a collection of 16
1-bit registers.
More important ones: ZF, SF, CF, DF
Neither FLAGS nor the names above are keywords.
Segment registers : CS, DS, SS, ES – specify where segments (“parts”) of
the program are located.
• CS Code Segment
• DS Data Segment
• SS Stack Segment
• ES Extra Segment
2.5 8086 registers – full list
• AX Accumulator eXtended
• AL Accumulator Low
• AH Accumulator High
• BX Base eXtended
• BL Base Low
• BH Base High
• CX Count eXtended
• CL Count Low
• CH Count High
• DX Data eXtended
• DL Data Low
• DH Data High
• SI Source Index
• DI Destination Index
• BP Base Pointer
• SP Stack Pointer
• CS Code Segment
• DS Data Segment
• SS Stack Segment
• ES Extra Segment
• IP Instruction Pointer (not a keyword)
• Flags Flags (not a keyword)
2.6 General addressing scheme
Three distinct ways to address memory:
• Absolute address : mem[offset] (flat model–generally cannot be done)
• Segmented address : mem[f(seg,offset)] (done by hardware). Usual
notation: ssss:oooo (hex digits)
• Expressing segmented address in assembly syntax – to be covered
later
The f(seg,offset) function is mode-dependent.
In real mode, f(seg,offset)=seg*16+offset.
This allows building 20-bit addresses out of 16-bit quantities.
Examples
0000:0000 => 00000
1234:5678 => 179B8
    12340
  + 05678
  -------
    179B8
The mapping is not one-to-one! Different (seg,offset) pairs may point to
the same address.
0000:0100 => 00100
0010:0000 => 00100
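The real-mode mapping is easy to check in C. This is a minimal sketch of f(seg,offset)=seg*16+offset only; the function name linear_address is made up here, and the sketch does not model what the hardware does when the sum needs more than 20 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode mapping from the text: f(seg, offset) = seg*16 + offset. */
uint32_t linear_address(uint16_t seg, uint16_t offset) {
    return ((uint32_t)seg << 4) + offset;
}

/* The examples worked out above. */
void check_examples(void) {
    assert(linear_address(0x0000, 0x0000) == 0x00000);
    assert(linear_address(0x1234, 0x5678) == 0x179B8);
    /* Two different (seg, offset) pairs, one address. */
    assert(linear_address(0x0000, 0x0100) == linear_address(0x0010, 0x0000));
}
```

Multiplying by 16 is written as a shift by 4 bits, which is the same as appending one hex zero to the segment value.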
Puzzle
FFFF:FFFF => ?????
(ref: A20 address line)
Code segment is effectively mem[f(CS,i)], Data segment is effectively
mem[f(DS,i)]
Protected memory addressing function uses Segment Descriptor Table
lookup. Fields include Base, Limit, Access Rights.
Implication: instructions Segment<-value are very costly in protected
mode.
2.7 Back to History: Original IBM PC (1981)
Distorted:
Timeline
IBM’s brand recognition, along with a massive marketing
campaign, ignites the fast growth of the personal computer mar-
ket with the announcement of its own personal computer (PC).
The first IBM PC, formally known as the IBM Model 5150, was
based on a 4.77 MHz Intel 8088 microprocessor and used
Microsoft's MS-DOS operating system. The IBM PC revolutionized
business computing by becoming the first PC to gain widespread
adoption by industry. The IBM PC was widely copied (“cloned”)
and led to the creation of a vast “ecosystem” of software, pe-
ripherals, and other commodities for use with the platform.
Better:
WIKIPEDIA article
Additional link (on reaction):
Orson Scott Card’s novel
https://en.wikipedia.org/wiki/Lost_Boys_(novel)
https://en.wikipedia.org/wiki/IBM_Personal_Computer
https://www.computerhistory.org/timeline/1981/
No OS !
Three options:
• CP/M-86 (Control program for Microcomputers), see also DR page
• UCSD p-System
• PC DOS/MS DOS, see also 86-DOS
See also: PL/M
Introduction #2: History (cont)
1982 80186, 80188
1982-1991 80286
1985-2007 80386
80186 : almost never used in PCs; many improvements in instructions
(kept).
80286 : 16 MB protected mode; promise not fulfilled.
Real mode −→−→ Prot mode
XENIX
https://en.wikipedia.org/wiki/Xenix
https://en.wikipedia.org/wiki/Intel_80386
https://en.wikipedia.org/wiki/Intel_80286
https://en.wikipedia.org/wiki/Intel_80186
https://en.wikipedia.org/wiki/PL/M
https://en.wikipedia.org/wiki/86-DOS
https://en.wikipedia.org/wiki/IBM_PC_DOS
https://en.wikipedia.org/wiki/UCSD_Pascal
http://www.digitalresearch.biz/CPM.HTM
https://en.wikipedia.org/wiki/CP/M
80386
• 32 bit
• 2 additional modes
• misc enhancements (debugging)
Doubling of registers again
EAX = xxxxxxxxxxxxxxxx ahahahah alalalal
Flags register becomes EFLAGS :
Additionally:
• Control Registers CR0..CR7 (CR0=MSW(Machine Status Word) on
80286)
• Test Registers TR0..TR7
• Debug Registers DR0..DR7
64 bit mode adds RAX,...
On paging
Virtual memory allows programs larger than physical memory to be
executed.
Generally it cannot be controlled by the programmer; paging algorithms
are implemented by the OS.
Page replacement algorithms
Application algorithms can be tailored for paging environment.
Example:
#define N 1024
int x[N][N],y[N][N],z[N][N];
int i,j;
for (int i=0; i
[
where
the opcode
;comment – any text, ignored up to the EOL.
Trivial example:
lab: ; this line does not do anything
The symbolic representation of an instruction corresponds to a particular
sequence of bytes, which is what is actually executed.
3.2 The NOP instruction
NOP (do nothing)
Binary representation: one byte, hex value 90h.
Execution:
Before:
bb bb bb bb bb bb bb bb bb 90 bb bb bb bb bb bb
                           ↑IP
After:
bb bb bb bb bb bb bb bb bb 90 bb bb bb bb bb bb
                              ↑IP
IP is incremented by 1; no other register is changed
WHY have it?
• delay?
• padding for sloppy compilers
• patching (code deletion)
• reserving space for patching(code addition)
3.3 The MOV instruction
MOV dst,src (copy src to dst)
Example:
MOV AL,BL
;
; before : AL=3 BL=7
; after : AL=7 BL=7
Example:
MOV DL,CH
MOV DL,DL
MOV AX,CX
MOV AX,SP
MOV SP,CX ; very dangerous
MOV EDI,EDI
MOV EDI,ESP
MOV AL,CX ; illegal
MOV EDI,CX ; illegal
MOV IP,AX ; illegal
MOV AX,CS ; ok, special case (see below)
MOV DS,AX ; ok, special case (see below)
MOV CS,DX ; special case, illegal
MOV DS,EDI ; illegal
MOV CR0,EAX ; privileged
MOV DR0,EAX ; ok, special case (see below)
RULE #1: size of src and dst must match
Most instructions support only gp registers
Argument types:
• (r)egister
• (m)emory
• (i)mmediate
• (s)pecial register
Argument size:
• (b)yte
• (w)ord
• (d)oubleword
• …
MOV DL,CH ; brr instruction
General template for 2-arg instructions:
r m i
r . . .
m . . .
i . . .
Move-specific template:
r m i s
r . . . .
m . . . .
i . . . .
s . . . .
Right now:
r m i
r X . .
m . . .
i . . .
Examples:
MOV AL,[100] ; brm
MOV BX,[200] ; wrm
MOV EDI,[400] ; drm
MOV [100],AL ; bmr
MOV [200],BX ; wmr
MOV [400],EDI ; dmr
Thus
r m i
r X X .
m X . .
i . . .
What does [#] really mean?
Answer: the bytes beginning with byte #.
In
MOV AX,[100]
which byte goes where?
Examples:
MOV AL,1 ; bri
MOV DX,2 ; wri
MOV EDI,4 ; dri
r m i
r X X X
m X . .
i . . .
Examples:
MOV AL,97 ;
MOV AL,61h ; all four lines are equivalent
MOV AL,01100001b
MOV AL,'a' ;
…
MOV AL,1000 ; ???
No storing into immediates, this would be like
1=x;
in C.
Thus:
r m i
r X X X
m X . .
i × × ×
Important: MOV with immediate is a fundamentally different operation
from the rr,rm, mr forms.
RULE #2: no memory-to-memory
(2 exceptions later)
Thus:
r m i
r X X X
m X × ?
i × × ×
MOV [100],1 ; should not compile
RULE #3: size must be known
Correct syntax:
MOV byte ptr [100],1
MOV word ptr [100],1
MOV dword ptr [100],1
MOV qword ptr [100],1 ; 64 bit only
MOV tbyte ptr [100],1 ; ???
What about
MOV [100],AL
MOV byte ptr [100],AL ; unneeded
MOV word ptr [100],AL ; will not compile
Final result:
r m i
r X X X
m X × X
i × × ×
Full table (MOV only):
r m i s
r X X X X
m X × X X
i × × × ×
s X X × ×
3.3.1 Examples
Here is how C/C++ assignments may be compiled:
char c1,c2; c1=c2;
——————-
MOV AL,c2
MOV c1,AL
short s1,s2; s1=s2;
——————-
MOV AX,s2
MOV s1,AX
int x,y; x=y;
——————-
MOV EAX,y;
MOV x,EAX;
int x,y,z; x=y=z;
——————-
MOV EAX,z;
MOV x,EAX;
MOV y,EAX;
int x; x=0;
——————-
MOV x,0;
int x,y,z; x=y=z=0;
——————-
MOV x,0
MOV y,0
MOV z,0
perhaps, a better implementation?
MOV EAX,0 ; could be even better
MOV x,EAX
MOV y,EAX
MOV z,EAX
Exercise: Exchange bytes in [100] and [101]
MOV AL,[100]
MOV AH,[101]
MOV [100],AH
MOV [101],AL
can this be done in fewer lines of code?
MOV AX,[100]
MOV [100],AH
MOV [101],AL
Note: Byte order matters.
3.3.2 Byte order
Consider:
MOV [100],AX
Does
LE,reversed AL go into [100] and AH into [101] or, instead:
BE,normal AH go into [100] and AL into [101]
More than you want to know on Endianness
LE,reversed : Intel, Dec
BE,normal : IBM mainframe, Motorola, Sun
Practical implications:
• it is important to know the endianness of the hardware and the data.
• it is important to be able to swap.
• it is important to be able to determine the endianness. How?
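One possible way to do the last two items in C is sketched below. It is an illustration under stated assumptions, not the only approach: it inspects the byte stored at the lowest address of a 16-bit value, and the names is_little_endian and swap16 are made up here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the least significant byte is stored first (LE), else 0. */
int is_little_endian(void) {
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* look at the byte at the lowest address */
    return first == 1;
}

/* Swap the two bytes of a 16-bit value. */
uint16_t swap16(uint16_t v) {
    return (uint16_t)((v << 8) | (v >> 8));
}

void check_swap(void) {
    assert(swap16(0x0102) == 0x0201);
    assert(swap16(swap16(0xBEEF)) == 0xBEEF);  /* swapping twice is the identity */
}
```

On Intel hardware is_little_endian() is expected to return 1; data written on a machine of the other kind would need swap16 applied to every 16-bit field after reading.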
Specific example of the importance of byte order:
short s=1;
FILE *f=fopen("try.dat","wb");
if (!f) { … error handling … }
fwrite(&s,1,sizeof(s),f);
fclose(f);
Should create a 2-byte file try.dat.
Now,
https://en.wikipedia.org/wiki/Endianness
short s;
FILE *f=fopen("try.dat","rb");
if (!f) { … error handling … }
fread(&s,1,sizeof(s),f);
fclose(f);
cout << s;
should print the value of s – indeed 1.
But: what will happen if we run the writing program on an Intel computer,
move the data file to a Sun, and run the reading program there?
Exercise: Can a high-level program be written that determines the order
of bytes?
3.4 The XCHG instruction
XCHG dst,src (exchange src with dst)
XCHG r m i
r X X ×
m X × ×
i × × ×
Segment and other non-gp registers are not supported.
The syntax and examples from MOV apply, except that non-gp registers
and immediates cannot be used.
Examples (which of the following are valid?)
XCHG AL,AH
XCHG AX,SP
XCHG EAX,EDI
XCHG AL,[400]
XCHG [400],AL ;same as above
XCHG AL,DI
XCHG DI,DS
XCHG EAX,7
XCHG [100],[101]
XCHG AX,AX ; nop?
XCHG DI,DI ; nop?
XCHG CL,CL ; nop?
Can a better version of the byte swap program now be written?
Better:
MOV AX,[100]
XCHG AL,AH
MOV [100],AX
Yet better:
XCHG AX,[100]
XCHG AL,AH
XCHG [100],AX
Q: can a shorter program be written (perhaps with another instruction)?
3.4.1 Binary encoding of XCHG
We only consider accumulator exchanges now.
Instructions XCHG AX,reg are extra optimized in the Intel architecture.
90h XCHG AX,AX
91h XCHG AX,CX
92h XCHG AX,DX
93h XCHG AX,BX
94h XCHG AX,SP
95h XCHG AX,BP
96h XCHG AX,SI
97h XCHG AX,DI
Q: Why is the number of registers a power of 2?
A: Because this allows a register to be represented in a fixed number of bits.
16-bit register representation:
000b AX
001b CX
010b DX
011b BX
100b SP
101b BP
110b SI
111b DI
An emulator may use code like
unsigned short regs[8];
#define AX regs[0]
#define CX regs[1]
#define DX regs[2]
#define BX regs[3]
#define SP regs[4]
#define BP regs[5]
#define SI regs[6]
#define DI regs[7]
Notes:
• this is just an example!
• 8 bit registers have their own 3-bit keys
• 32 bit registers parallel 16 bit registers
• 64 bit registers use 4-bit keys
• The above code should define 8-bit regs properly (e.g. setting AX should set AL and AH too!)
• The above code should be modified to support 32 bit registers
XCHG AX,AX is NOP.
General encoding scheme of XCHG (with accumulator):
1 0 0 1 0 reg
This idea is used in other instructions.
XCHG without accumulator uses a lengthier encoding, with first byte 86h/87h.
XCHG encoding: https://c9x.me/x86/html/file_module_x86_id_328.html
NOTE: MOV has several different forms, including optimized forms for the
accumulator.
A similar scheme is used for the segment registers:
00b ES
01b CS
10b SS
11b DS
3.5 The ADD instruction
ADD dst,src (dst += src)
(proper name should be increment by.)
General 2-operand instruction layout applies:
ADD r m i
r X X X
m X × X
i × × ×
Given that the syntax of ADD is largely similar to MOV, the examples are similar:
ADD AX,BX
ADD EAX,ESP
ADD DL,CL
ADD AX,[100]
ADD [150],EAX
ADD AX,DS ; illegal
ADD AX,DL ; illegal
ADD [10],5 ; syntax error
ADD word ptr [10],5 ; fine
C example:
int x,y,z; x=y+z;
-----
MOV EAX,y
ADD EAX,z
MOV x,EAX
int x,y,z; x=x+y;
-----
MOV EAX,y
ADD x,EAX
int x,y,z; x=x+25;
-----
ADD x,25
(NOTE: size specification is not required if x is declared to be a double word)
Consider
ADD AL,AL
Generally, multiplication by 2 should not be done as multiplication
(generally about 3x slower than addition). Writing
int x; x=2*x;
is wrong! One should use either addition or a shift (if available).
(What is better depends on the situation and hardware).
Q: Should we replace multiplication by addition in:
int f(int); int x; x=2*f(x);
More simple examples:
Consider
ADD AL,0 ; nop ?
ADD AL,1 ; increment ?
ADD AL,-1 ; decrement ?
ADD AL,AL ; double
MOV AL,1 ; AL=1
ADD AL,AL ; AL=2
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
MOV AL,1  ; AL=1   binary  |00000001
ADD AL,AL ; AL=2   binary  |00000010
ADD AL,AL ; AL=4   binary  |00000100
ADD AL,AL ; AL=8   binary  |00001000
ADD AL,AL ; AL=16  binary  |00010000
ADD AL,AL ; AL=32  binary  |00100000
ADD AL,AL ; AL=64  binary  |01000000
ADD AL,AL ; AL=128 binary  |10000000
ADD AL,AL ; AL=0   binary 1|00000000 << overflow
ADD AL,AL ; AL=0   binary  |00000000
Is this an assembler problem?
unsigned char c;
c=1; printf("%d",c);
c=c+c; printf("%d",c);
c=c+c; printf("%d",c);
c=c+c; printf("%d",c);
c=c+c; printf("%d",c);
c=c+c; ....
Note: if you like C++ and cout<<, make sure to cast!
Q: what would be the output if we use char rather than unsigned char?
Is this a size problem?
Try
MOV AX,1
ADD AX,AX
...
OR
MOV EAX,1
ADD EAX,EAX
...
OR C/C++ versions.
Unlike MOV and XCHG, ADD is an arithmetic instruction: it sets flags.
Warning: the discussion of the flags is slightly simplified, I'm not
considering the OF. Thus there are slight differences between the behavior
described and the actual behavior of the processor. This makes no difference
for most programs, but there are rare instances where this matters.
In particular, I will consider JS and JL as equivalent, in reality they are not
exactly the same.
ZF Zero Flag
SF Sign Flag
CF Carry Flag
OF Overflow Flag
                                      ; ZF SF CF
MOV AL,1  ; AL=1   binary  |00000001    ?  ?  ?
ADD AL,AL ; AL=2   binary  |00000010    0  0  0
ADD AL,AL ; AL=4   binary  |00000100    0  0  0
ADD AL,AL ; AL=8   binary  |00001000    0  0  0
ADD AL,AL ; AL=16  binary  |00010000    0  0  0
ADD AL,AL ; AL=32  binary  |00100000    0  0  0
ADD AL,AL ; AL=64  binary  |01000000    0  0  0
ADD AL,AL ; AL=128 binary  |10000000    0  1  0
ADD AL,AL ; AL=0   binary 1|00000000    1  0  1 << overflow
ADD AL,AL ; AL=0   binary  |00000000    1  0  0
WARNING: This is slightly simplified (there is also OF)
Flags can be used to
• implement conditionals (IF, WHILE,...)
• implement "long" arithmetic
• check for overflow
3.5.1 Overflow detection
unsigned int x,y,z;
....
x=y+z; // concern about overflow
unsigned int x,y,z;
....
y=0x90000000;
z=0x90000000;
x=y+z; // overflow will occur here, result will be incorrect.
Can we check for it like this?
unsigned int x,y,z;
....
if (y+z>0xFFFFFFFF)
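error("overflow");
x=y+z;
No: with a 32-bit unsigned int the sum y+z has already wrapped around, so
it can never be greater than 0xFFFFFFFF and the test is always false.
Correct way: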
unsigned int x,y,z;
….
if (y>0xFFFFFFFF-z)
error("overflow");
x=y+z;
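The same test can be packaged as a function. This is a minimal sketch: the name add_unsigned_checked is made up for this illustration, and UINT_MAX from limits.h is used in place of the literal 0xFFFFFFFF so that the check does not assume a 32-bit unsigned int.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 and stores y+z in *sum if it fits; returns 0 on overflow. */
int add_unsigned_checked(unsigned int y, unsigned int z, unsigned int *sum) {
    if (y > UINT_MAX - z)     /* UINT_MAX - z cannot wrap, unlike y + z */
        return 0;
    *sum = y + z;
    return 1;
}

void check_unsigned_add(void) {
    unsigned int s = 0;
    assert(add_unsigned_checked(1u, 2u, &s) && s == 3u);
    assert(!add_unsigned_checked(UINT_MAX, 1u, &s));
}
```

The key point is that the comparison is rearranged so that no intermediate value can overflow.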
Exercise: what about signed types?
A: you will need to check both for "positive" overflow (adding two large
positive numbers) and for "negative" overflow (adding two large negative
numbers).
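One way to write both checks in C is sketched below, assuming the same rearrangement trick as in the unsigned case. The function name add_signed_checked is made up for this illustration; INT_MAX and INT_MIN come from limits.h.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 and stores y+z in *sum if it fits in an int; 0 otherwise. */
int add_signed_checked(int y, int z, int *sum) {
    if (z > 0 && y > INT_MAX - z) return 0;   /* "positive" overflow */
    if (z < 0 && y < INT_MIN - z) return 0;   /* "negative" overflow */
    *sum = y + z;
    return 1;
}

void check_signed_add(void) {
    int s = 0;
    assert(add_signed_checked(-5, 3, &s) && s == -2);
    assert(!add_signed_checked(INT_MAX, 1, &s));
    assert(!add_signed_checked(INT_MIN, -1, &s));
}
```

Adding numbers of opposite sign can never overflow, which is why each test is guarded by the sign of z.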
In assembler, flags report overflow condition – no need for extra check-
ing!
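To see what the flags report, an 8-bit ADD can be emulated in C. This is a sketch under the same simplification as the discussion of ADD in these notes (ZF, SF and CF only, no OF); the type add8_result and the function add8 are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t result; int zf, sf, cf; } add8_result;

/* Emulate ADD dst,src on 8-bit operands and report ZF, SF and CF. */
add8_result add8(uint8_t dst, uint8_t src) {
    unsigned wide = (unsigned)dst + src;   /* the full 9-bit sum */
    add8_result r;
    r.result = (uint8_t)wide;              /* what ends up in the register */
    r.zf = (r.result == 0);
    r.sf = (r.result >> 7) & 1;            /* top bit of the result */
    r.cf = (wide > 0xFF);                  /* the bit that did not fit */
    return r;
}

void check_doubling(void) {
    add8_result r = add8(64, 64);          /* 128: sign bit set, no carry */
    assert(r.result == 128 && r.zf == 0 && r.sf == 1 && r.cf == 0);
    r = add8(128, 128);                    /* 256 does not fit: 0 with carry */
    assert(r.result == 0 && r.zf == 1 && r.sf == 0 && r.cf == 1);
    r = add8(0, 0);
    assert(r.result == 0 && r.zf == 1 && r.sf == 0 && r.cf == 0);
}
```

The three checked cases correspond to the last three ADD AL,AL steps of the doubling sequence: the carry bit is exactly the unsigned overflow that the C code has to test for by hand.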
3.6 The SUB instruction
SUB dst,src (dst -= src)
(proper name should be decrement by.)
General 2-operand instruction layout applies:
SUB r m i
r X X X
m X × X
i × × ×
Given that syntax of SUB is identical to ADD, syntax examples are similar
and omitted.
ADD AX,100
SUB AX,-100 ; same as above
;
ADD AX,-100
SUB AX,100 ; same as above
What do these instructions do?
ADD AX,0
SUB AX,0
What does this instruction do?
SUB EAX,EAX
Answer: it is the most efficient way to zero out a register.
What is the difference between the two instructions below?
SUB EAX,EAX
MOV EAX,0
Answer: the former is more efficient; the latter is rarely used, only in
the situations when flags must be preserved. (an example, involving an if,
will be given later.)
Revising the example we saw above gives more efficient code:
int x,y,z; x=y=z=0;
——————-
SUB EAX,EAX
MOV x,EAX
MOV y,EAX
MOV z,EAX
With SUB, the Carry flag indicates a borrow.
3.7 The INC instruction
INC dst (dst++)
Do we write
ADD AX,1
ADD byte ptr [10],1
A: yes, we can, but usually we would use the optimized form
General 1-operand instruction layout applies:
INC r m i
X X ×
(Same format applies to three more instructions, explained later).
Register form is optimized to one-byte encoding:
40h INC AX
41h INC CX
42h INC DX
43h INC BX
44h INC SP
45h INC BP
46h INC SI
47h INC DI
Other forms of INC are encoded in a lengthier way, beginning with 0FFh
and 0FEh.
Warning: this encoding applies to BOTH 16-bit and 32-bit registers!
What is better?
INC AX
INC AX
;or
ADD AX,2
A: the former. But do not do this with memory arguments.
3.8 The DEC instruction
DEC dst (dst--)
DEC r m i
X X ×
Comments on INC above are applicable.
Optimized form:
48h DEC AX
49h DEC CX
4Ah DEC DX
4Bh DEC BX
4Ch DEC SP
4Dh DEC BP
4Eh DEC SI
4Fh DEC DI
Other forms of DEC are encoded in a lengthier way, beginning with 0FFh
and 0FEh.
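The one-byte INC and DEC tables share a pattern: the low three bits select the register and the upper five bits select the operation. A minimal C sketch of a decoder for just these forms follows. The function name decode_one_byte and the table REG16 are made up for this illustration, and only the 16-bit forms 40h-47h and 48h-4Fh listed above are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Register names in encoding order, as in the tables above. */
const char *const REG16[8] = { "AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI" };

/* Returns the mnemonic for a one-byte INC/DEC opcode and stores the
   register name in *reg; returns NULL for any other opcode. */
const char *decode_one_byte(uint8_t opcode, const char **reg) {
    *reg = REG16[opcode & 7];        /* low three bits: register number */
    switch (opcode & 0xF8) {         /* upper five bits: operation */
    case 0x40: return "INC";
    case 0x48: return "DEC";
    default:
        *reg = NULL;                 /* not one of the forms handled here */
        return NULL;
    }
}

void check_decode(void) {
    const char *reg;
    assert(strcmp(decode_one_byte(0x43, &reg), "INC") == 0 && strcmp(reg, "BX") == 0);
    assert(strcmp(decode_one_byte(0x4C, &reg), "DEC") == 0 && strcmp(reg, "SP") == 0);
}
```

An emulator built around a switch on the opcode could handle all sixteen opcodes with two cases instead of sixteen by masking in this way.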
3.9 The NEG instruction
NEG dst (dst=-dst)
NEG r m i
X X ×
How to negate without using NEG?
NEG EAX
;is the same as
SUB EBX,EBX
SUB EBX,EAX
MOV EAX,EBX
Solve equation x = −x ?
MOV AL,x
NEG AL
If AL did not change, does this mean x is 0?
No, it is either 0 or 128!
For short, the solutions are 0 and 2^15 = 8000h, for 32-bit they are ….
#include <stdio.h>
int main() {
short s,s1;
// for each value, negate and report whether the value changed
s=0; s1=-s; if (s!=s1) printf("for s=%d, changes;\n",s);
else printf("for s=%d, does not change;\n",s);
s=1000;s1=-s; if (s!=s1) printf("for s=%d, changes;\n",s);
else printf("for s=%d, does not change;\n",s);
// 0x8000 becomes -32768 when stored in a 16-bit short
s=0x8000;s1=-s; if (s!=s1) printf("for s=%d, changes;\n",s);
else printf("for s=%d, does not change;\n",s);
return 0;
}
results in
for s=0, does not change;
for s=1000, changes;
for s=-32768, does not change;
3.10 The CMP instruction
CMP dst,src (computes dst-src and sets the flags; the result is discarded)
General 2-operand instruction layout applies:
CMP r m i
r X X X
m X × X
i × × ×
Conditionals (if,while) are implemented in two stages:
• compute flags
• do (or not) the operation depending on a flag(or flags) – this is later.
Examples:
; if (x==0) …
;
; compute ZF from value of x
; if (x!=0) …
;
; compute ZF from value of x
; if (x<0) ...
;
; compute SF from value of x
; if (x<=0) ...
;
; compute SF and ZF from value of x
What about
; if (x==y) ...
;
;
to set the flags, we compute the difference
; if (x==y) ...
;
MOV EAX,x
SUB EAX,y
; this sets ZF for our use
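The SUB above destroys the copy of x in EAX; CMP computes the same flags without storing the difference. The C sketch below emulates that for 32-bit operands, again leaving out OF as in the simplified flag discussion in these notes. The type cmp_flags and the function cmp32 are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int zf, sf, cf; } cmp_flags;

/* CMP dst,src: compute dst - src, keep only the flags, discard the result. */
cmp_flags cmp32(uint32_t dst, uint32_t src) {
    uint32_t diff = dst - src;       /* wraps modulo 2^32, like the hardware */
    cmp_flags f;
    f.zf = (diff == 0);              /* set when dst == src */
    f.sf = (int)((diff >> 31) & 1);  /* top bit of the difference */
    f.cf = (dst < src);              /* a borrow was needed */
    return f;
}

void check_cmp(void) {
    cmp_flags f = cmp32(7, 7);       /* if (x==y): ZF is set */
    assert(f.zf == 1 && f.sf == 0 && f.cf == 0);
    f = cmp32(3, 5);                 /* 3 - 5 borrows; the difference is negative */
    assert(f.zf == 0 && f.sf == 1 && f.cf == 1);
}
```

Since dst and src are passed by value, neither operand changes, which mirrors the fact that CMP leaves its destination untouched.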
; if (x