assembly programming

Chapter 1

Syllabus

Catalog Description: Computer structure, machine representation of data,
addressing and indexing, computation and control instructions, assembly
language and assemblers; procedures (subroutines) and data segments,
linkages and subroutine calling conventions, loaders; practical use of an
assembly language for computer implementation of illustrative examples.

Course Goals

0 Knowledge of the basic structure of microcomputers – registers,
memory, addressing I/O devices, etc.

1 Knowledge of most non-privileged hardware instructions for the
architecture being studied.

2 Ability to write small programs in assembly language

3 Knowledge of computer representations of data, and how to do simple
arithmetic in binary & hexadecimal, including conversions

4 Ability to implement a moderately complicated algorithm in
assembler, with emphasis on efficiency.

5 Knowledge of procedure calling conventions and interfacing with high-
level languages.

Optional Text: Kip Irvine, Assembly Language for the IBM PC, Prentice
Hall, 4th or 5th edition

Additional References: Intel and DOS API documentation as presented
in Intel publications and online at www.x86.org; lecture notes (to be
supplied as we go).

Prerequisites by Topic. Working knowledge of some programming
language (102/103: C/C++); minimal programming experience.

Major Topics Covered in the Course:

1 Low-level and high-level languages; why learn assembler?

2 How does one study a new computer: the CPU, memory, addressing
modes, operation modes.

3 History of the Intel family of microprocessors.

4-5 Registers; simple arithmetic instructions; byte order; arithmetic and
logical operations.

6 Implementing longer integer type support; carry and overflow.

7 Shifts, multiplication and division.

8 Memory layout.

9 Direct video memory access; discussion of the first project.

10 Assembler syntax; how to use the tools.

11-13 Conditional & unconditional jumps; loops; emulating high-level
language constructions; stack; call and return; procedures

14-15 String instructions: efficient memory-to-memory operations.

16 Interrupts overview: interrupt table; how do interrupts work;
classification.

17 Summary of the most important interrupts.

18-20 DOS interrupt; File I/O functions; file-copy program; discussion of
the second project

21 Interrupt handlers; keyboard drivers; timer-driven processes; viruses
and virus-protection software.

22 Debug interrupts; how do debuggers and profilers work.

23-24 (Optional) Interfacing with high-level languages; protected mode
fundamentals

Grading: The grade is based on two projects: the midterm project counts
for 49% and the final project for 51%. Please note that the projects are
individual; submitting a project that is similar to someone else's
submission and/or is essentially a download from the Web will result in a
failing grade.

Office Hours: My hours this term for CSc 210 will be 3:45-4:45 on
Mondays.

Zoom links:

11am https://ccny.zoom.us/j/85378437821

2pm https://ccny.zoom.us/j/87625527827

Chapter 2

Preliminary material

Why assembler?
• Why take this class?

• Why program assembler?

• Why know assembler?

NOTE: think Binary!
Why binary?
Binary numbers (WIKI): https://en.wikipedia.org/wiki/Binary_number
(brief answer: because this is easy to implement)
Why hex?
Hexadecimal numbers (WIKI): https://en.wikipedia.org/wiki/Hexadecimal
(brief answer: because it is much easier to work with shorter strings)
What about DNA?

2.1 Introduction #1: looking at new hardware

• CPU, general purpose (arithmetic) registers

– How large?

– How many?

– Are they all the same?

– Modes?

• Memory Model

– Is all memory the same?

– Flat?

– Segmented?

– Paged?

• Other hardware (peripherals)

• OS

• Special features

2.2 Introduction #2: History
Intel Processors Over the Years

The History Of Intel CPUs

-1971 before Intel

1971 4004

• Intention

• Name

• Usage

• What can you do with 4 bits?

1972 8008

– Doubling – what can you do with 8 bits

1974 8080

1975 8085

1975 Z80

1974 CP/M – Digital Research, Gary Kildall

1978 8086 – x86 architecture. 8 8-bit registers, 8(+6) 16-bit registers.
1 MB limit. 1 MB mystery?

1979 8088 – cost cutting

1981 iAPX 432 – an attempted 32 bit processor

1982 80186 – minor improvements/corrections

1981 IBM PC

https://en.wikipedia.org/wiki/CP/M

https://www.tomshardware.com/picturestory/710-history-of-intel-cpus.html

https://www.businessnewsdaily.com/10817-slideshow-intel-processors-over-the-years.html

2.3 Introduction #3: Fundamentals
Data types

1 bit

4 nibble

8 byte

16 word

32 dword, doubleword

64 qword, quadword

80 tenbyte

2.4 x86 CPU

Registers Overlap!
Problem: Let AH = 2,AL = 3. What is AX?
Solution:

00000010 00000011

AH=00000010b=02h

AL=00000011b=03h

AX=0000001000000011b = 0203h = 515d

Note: the suffixes b, h and d are part of the assembly language syntax;
d(ecimal) is the default. The assignment syntax, however, is different; it is
used here only for illustration.
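The way AH and AL overlap AX can also be sketched in C. This is only an illustration of the arithmetic, not how the hardware is built; the helper names make_ax, high_byte and low_byte are made up for this sketch.

```c
#include <stdint.h>

/* Combine a high byte and a low byte into one 16-bit value,
   the way AH and AL together form AX. */
uint16_t make_ax(uint8_t ah, uint8_t al) {
    return (uint16_t)(((unsigned)ah << 8) | al);
}

/* Upper 8 bits of a 16-bit value (the AH part). */
uint8_t high_byte(uint16_t ax) {
    return (uint8_t)(ax >> 8);
}

/* Lower 8 bits of a 16-bit value (the AL part). */
uint8_t low_byte(uint16_t ax) {
    return (uint8_t)(ax & 0xFF);
}
```

For example, make_ax(2, 3) gives 0203h, and high_byte and low_byte undo the combination.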

Fast solution:

AX = 2*256+3

AX = (2<<8)+3

Problem: Let AX = 2020. What are AL and AH?

Registers BX, CX, DX are divided similarly.

General purpose aka Arithmetic registers: the sequence A, B, C, D is an illusion; these letters stand for Accumulator, Base, Count, Data.

8 8-bit registers: AH, AL, BH, BL, CH, CL, DH, DL

8 16-bit registers: AX, BX, CX, DX and SI, DI, BP(?), SP(??)

SP generally cannot be used for calculations; BP usually cannot be used either.

(32 bit to be described later)

IP – Instruction Pointer: points to the first byte of the current instruction.

Code: B B B B B B B B B B B B B B B B B B

Code is essentially a one dimensional array of bytes (in C/C++ – unsigned char type).

IP initially is 0, after one instruction is executed it should be 2, then 5, then 6, ....

Simplified logic (one instruction)

byte code[MAXCODE];
byte opcode;
opcode=code[IP++];
switch (opcode) {
case 0x00: ...
case 0x01: ...
...
case 0xFF: ...
}

Each subcase will read additional bytes if needed to complete reading of the instruction.

Simplified logic (full execution)

byte code[MAXCODE];
byte opcode;
while(true) {
  opcode=code[IP++];
  switch(opcode) {
  case 0x00: ...
  ....
  case 0xFF: ...
  }
}

Why is this simplified?

• CS is also used.

• how do we terminate?

• how do we change the executed sequence?

What do we do with this? Is switch efficient?

Question: what would IP=k do (if such an instruction exists)?

FLAGS register

Should be seen not as a single 16-bit register but as a collection of 16 1-bit registers.

More important ones: ZF, SF, CF, DF

Neither FLAGS nor the names above are keywords.

Segment registers: CS, DS, SS, ES – specify where segments (“parts”) of the program are located.
• CS Code Segment
• DS Data Segment
• SS Stack Segment
• ES Extra Segment

2.5 8086 registers – full list

• AX Accumulator eXtended
• AL Accumulator Low
• AH Accumulator High
• BX Base eXtended
• BL Base Low
• BH Base High
• CX Count eXtended
• CL Count Low
• CH Count High
• DX Data eXtended
• DL Data Low
• DH Data High
• SI Source Index
• DI Destination Index
• BP Base Pointer
• SP Stack Pointer
• CS Code Segment
• DS Data Segment
• SS Stack Segment
• ES Extra Segment
• IP Instruction Pointer (not a keyword)
• Flags Flags (not a keyword)

2.6 General addressing scheme

Three distinct ways to address memory:

• Absolute address: mem[offset] (flat model – generally cannot be done)

• Segmented address: mem[f(seg,offset)] (done by hardware). Usual notation: ssss:oooo (hex digits)

• Expressing segmented address in assembly syntax – to be covered later

The f(seg,offset) function is mode-dependent. In real mode, f(seg,offset)=seg*16+offset. This allows building 20-bit numbers out of 16-bit quantities.

Examples

0000:0000 => 00000

1234:5678 => 179B8

  12340
+ 05678
-------
  179B8

The mapping is not one-to-one! Different (seg,offset) pairs may point to the same address.

0000:0100 => 00100
0010:0000 => 00100

Puzzle

FFFF:FFFF => ????? (ref: A20 address line)

Code segment is effectively mem[f(CS,i)], Data segment is effectively mem[f(DS,i)]

Protected memory addressing function uses Segment Descriptor Table lookup. Fields include Base, Limit, Access Rights. Implication: instructions Segment<-value are very costly in protected mode.

2.7 Back to History: Original IBM PC (1981)

Distorted: Timeline

IBM’s brand recognition, along with a massive marketing campaign, ignites the fast growth of the personal computer market with the announcement of its own personal computer (PC). The first IBM PC, formally known as the IBM Model 5150, was based on a 4.77 MHz Intel 8088 microprocessor and used Microsoft’s MS-DOS operating system.
The IBM PC revolutionized business computing by becoming the first PC to gain widespread adoption by industry. The IBM PC was widely copied (“cloned”) and led to the creation of a vast “ecosystem” of software, peripherals, and other commodities for use with the platform.

Timeline: https://www.computerhistory.org/timeline/1981/

Better: WIKIPEDIA article (https://en.wikipedia.org/wiki/IBM_Personal_Computer)

Additional link (on reaction): Orson Scott Card’s novel (https://en.wikipedia.org/wiki/Lost_Boys_(novel))

No OS! Three options:

• CP/M-86 (Control Program for Microcomputers, https://en.wikipedia.org/wiki/CP/M), see also DR page (http://www.digitalresearch.biz/CPM.HTM)

• UCSD p-System (https://en.wikipedia.org/wiki/UCSD_Pascal)

• PC DOS/MS DOS (https://en.wikipedia.org/wiki/IBM_PC_DOS), see also 86-DOS (https://en.wikipedia.org/wiki/86-DOS)

See also: PL/M (https://en.wikipedia.org/wiki/PL/M)

Introduction #2: History (cont)

1982 80186, 80188

1982-1991 80286

1985-2007 80386

80186 (https://en.wikipedia.org/wiki/Intel_80186): almost not used in PCs, many improvements in instructions (kept).

80286 (https://en.wikipedia.org/wiki/Intel_80286): 16 MB protected mode – promise not fulfilled. Real mode → Prot mode

XENIX (https://en.wikipedia.org/wiki/Xenix)

80386 (https://en.wikipedia.org/wiki/Intel_80386)

• 32 bit
• 2 additional modes
• misc enhancements (debugging)

Doubling of registers again

EAX = xxxxxxxxxxxxxxxx ahahahah alalalal

Flags register becomes EFLAGS

Additionally:

• Control Registers CR0..CR7 (CR0=MSW (Machine Status Word) on 80286)
• Test Registers TR0..TR7
• Debug Registers DR0..DR7

64 bit mode adds RAX,...

On paging

Virtual memory allows executing programs larger than physical memory. It generally cannot be controlled by the programmer; paging algorithms are implemented by the OS.

Page replacement algorithms

Application algorithms can be tailored for a paging environment. Example:

#define N 1024
int x[N][N],y[N][N],z[N][N];
int i,j;
for (int i=0; i

[label:] opcode [operands] [;comment]

[label:] [;comment]

where

label – optional label (any identifier that is not a keyword or defined
otherwise).

opcode – name of the instruction (keyword)

operands – comma-separated operands, if any; their number (0-3) depends
on the opcode

;comment – any text, ignored up to the EOL.

Trivial example:

lab: ; this line does not do anything

The symbolic representation of an instruction corresponds to a particular
sequence of bytes, which is what is actually executed.

3.2 The NOP instruction

NOP (do nothing)

Binary representation: one byte, hex value 90h.
Execution:
Before:
bb bb bb bb bb bb bb bb bb 90 bb bb bb bb bb bb
                           ↑IP

After:
bb bb bb bb bb bb bb bb bb 90 bb bb bb bb bb bb
                              ↑IP

IP is incremented by 1; no other register is changed.

WHY have it?
• delay?

• padding for sloppy compilers

• patching (code deletion)

• reserving space for patching (code addition)

3.3 The MOV instruction

MOV dst,src (copy src to dst)

Example:

MOV AL,BL

;

; before : AL=3 BL=7

; after : AL=7 BL=7

Example:

MOV DL,CH

MOV DL,DL

MOV AX,CX

MOV AX,SP

MOV SP,CX ; very dangerous

MOV EDI,EDI

MOV EDI,ESP

MOV AL,CX ; illegal

MOV EDI,CX ; illegal

MOV IP,AX ; illegal

MOV AX,CS ; ok, special case (see below)

MOV DS,AX ; ok, special case (see below)

MOV CS,DX ; special case, illegal

MOV DS,EDI ; illegal

MOV CR0,EAX ; privileged

MOV DR0,EAX ; ok, special case (see below)

RULE #1: size of src and dst must match

Most instructions support only general-purpose (gp) registers

Argument types:

• (r)egister

• (m)emory

• (i)mmediate

• (s)pecial register

Argument size:

• (b)yte

• (w)ord

• (d)oubleword

• …

MOV DL,CH ; brr instruction

General template for 2-arg instructions (rows: dst, columns: src;
X = allowed, × = not allowed, . = not yet filled in):

  r m i
r . . .
m . . .
i . . .

Move-specific template:
r m i s

r . . . .
m . . . .
i . . . .
s . . . .

Right now:

r m i
r X . .
m . . .
i . . .

Examples:

MOV AL,[100] ; brm

MOV BX,[200] ; wrm

MOV EDI,[400] ; drm

MOV [100],AL ; bmr

MOV [200],BX ; wmr

MOV [400],EDI ; dmr

Thus

r m i
r X X .
m X . .
i . . .

What does [#] really mean?
Answer: the bytes beginning with byte number #.
In

MOV AX,[100]

which byte goes where?

Examples:

MOV AL,1 ; bri

MOV DX,2 ; wri

MOV EDI,4 ; dri

r m i
r X X X
m X . .
i . . .

Examples:

MOV AL,97 ;

MOV AL,61h ; all four lines are equivalent

MOV AL,01100001b

MOV AL,'a' ;

…

MOV AL,1000 ; ???
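The equivalence of the four spellings, and the question of which immediates fit into a byte register, can be checked with a few lines of C. This is a sketch under assumptions: the names DEC_FORM, HEX_FORM, BIN_FORM, CHAR_FORM and fits_in_byte are invented here, 'a' is 97 only in an ASCII character set, and how a particular assembler reacts to an oversized immediate is tool-specific.

```c
#include <stdbool.h>

/* Four spellings of the same byte value. C before C23 has no binary
   literals, so 01100001b is built from its set bits. */
enum {
    DEC_FORM  = 97,
    HEX_FORM  = 0x61,
    BIN_FORM  = (1 << 6) | (1 << 5) | 1,  /* 01100001b */
    CHAR_FORM = 'a'                       /* assumes ASCII */
};

/* True if value can be written into an 8-bit register, read either as
   an unsigned byte (0..255) or as a signed byte (-128..127). */
bool fits_in_byte(long value) {
    return value >= -128 && value <= 255;
}
```

With this helper, fits_in_byte(97) holds while fits_in_byte(1000) does not, since 1000 needs more than 8 bits.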

No storing into immediates, this would be like

1=x;

in C.
Thus:

r m i
r X X X
m X . .
i × × ×

Important: MOV with an immediate is a fundamentally different operation
from the rr, rm and mr forms.

RULE #2: no memory-to-memory
(2 exceptions later)

Thus:

r m i
r X X X
m X × ?
i × × ×

MOV [100],1 ; should not compile

RULE #3: size must be known

Correct syntax:

MOV byte ptr [100],1

MOV word ptr [100],1

MOV dword ptr [100],1

MOV qword ptr [100],1 ; 64 bit only

MOV tbyte ptr [100],1 ; ???

What about

MOV [100],AL

MOV byte ptr [100],AL ; unneeded

MOV word ptr [100],AL ; will not compile
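Why the size must be known can be seen by counting how many bytes of memory a store of the value 1 actually changes. The sketch below models memory as a small byte array; store_one and bytes_touched are hypothetical helpers written for this note, and sizes other than 1, 2 and 4 are simply ignored.

```c
#include <stdint.h>
#include <string.h>

/* Store the value 1 at mem+offset with an operand size of 1, 2 or 4
   bytes, loosely mirroring MOV byte/word/dword ptr [offset],1. */
void store_one(uint8_t *mem, size_t offset, int size) {
    uint8_t  b = 1;
    uint16_t w = 1;
    uint32_t d = 1;
    switch (size) {
    case 1: memcpy(mem + offset, &b, sizeof b); break;
    case 2: memcpy(mem + offset, &w, sizeof w); break;
    case 4: memcpy(mem + offset, &d, sizeof d); break;
    default: break;  /* unsupported size: store nothing */
    }
}

/* Fill a scratch buffer with FFh, store 1 with the given size,
   and count how many bytes no longer hold FFh. */
int bytes_touched(int size) {
    uint8_t mem[8];
    int touched = 0;
    memset(mem, 0xFF, sizeof mem);
    store_one(mem, 0, size);
    for (size_t i = 0; i < sizeof mem; i++)
        if (mem[i] != 0xFF) touched++;
    return touched;
}
```

The same immediate changes 1, 2 or 4 bytes depending on the size, which is exactly the information the assembler cannot guess from `[100],1` alone.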

Final result:
r m i

r X X X
m X × X
i × × ×

Full table (MOV only):
r m i s

r X X X X
m X × X X
i × × × ×
s X X × ×
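The full MOV table can be written down as a lookup table. The sketch assumes rows are the destination and columns the source (as the "no storing into immediates" row suggests); the type operand_kind and the function mov_allowed are illustrative names, and the table deliberately ignores the size rule and special cases such as MOV CS,DX.

```c
#include <stdbool.h>

typedef enum { OP_REG, OP_MEM, OP_IMM, OP_SPECIAL } operand_kind;

/* Rows: dst, columns: src, both in the order r, m, i, s. */
static const bool mov_table[4][4] = {
    /* dst r */ { true,  true,  true,  true  },
    /* dst m */ { true,  false, true,  true  },
    /* dst i */ { false, false, false, false },
    /* dst s */ { true,  true,  false, false },
};

/* True if the table permits MOV dst,src for these operand kinds. */
bool mov_allowed(operand_kind dst, operand_kind src) {
    return mov_table[dst][src];
}
```

For instance, mov_allowed(OP_MEM, OP_MEM) is false (RULE #2), while mov_allowed(OP_SPECIAL, OP_REG) is true, matching MOV DS,AX.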

3.3.1 Examples

Here is how C/C++ assignments may be compiled:

char c1,c2; c1=c2;

——————-

MOV AL,c2

MOV c1,AL

short s1,s2; s1=s2;

——————-

MOV AX,s2

MOV s1,AX

int x,y; x=y;

——————-

MOV EAX,y;

MOV x,EAX;

int x,y,z; x=y=z;

——————-

MOV EAX,z;

MOV x,EAX;

MOV y,EAX;

int x; x=0;

——————-

MOV x,0;

int x,y,z; x=y=z=0;

——————-

MOV x,0

MOV y,0

MOV z,0

perhaps, a better implementation?

MOV EAX,0 ; could be even better

MOV x,EAX

MOV y,EAX

MOV z,EAX

Exercise: Exchange bytes in [100] and [101]

MOV AL,[100]

MOV AH,[101]

MOV [100],AH

MOV [101],AL

can this be done in fewer lines of code?

MOV AX,[100]

MOV [100],AH

MOV [101],AL

Note: Byte order matters.

3.3.2 Byte order

Consider:

MOV [100],AX

Does

LE, reversed: AL go into [100] and AH into [101]

or, instead:

BE, normal: AH go into [100] and AL into [101]?

More than you want to know on Endianness: https://en.wikipedia.org/wiki/Endianness

LE, reversed: Intel, DEC

BE, normal: IBM mainframe, Motorola, Sun

Practical implications:

• it is important to know the endianness of the hardware and the data.

• it is important to be able to swap.

• it is important to be able to determine the endianness. How?
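One common way to address these three points in C is sketched below. The function names are invented for this note; is_little_endian inspects the lowest-addressed byte of a 16-bit value, swap16 exchanges the two bytes, and read_le16 decodes two bytes in a fixed (little-endian) order regardless of the host.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the host stores the low byte first (little endian). */
int is_little_endian(void) {
    uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);  /* byte at the lowest address */
    return first == 1;
}

/* Exchange the two bytes of a 16-bit value. */
uint16_t swap16(uint16_t v) {
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Decode two bytes stored low byte first, independent of the host. */
uint16_t read_le16(const uint8_t *bytes) {
    return (uint16_t)(bytes[0] | (bytes[1] << 8));
}
```

Decoding byte by byte, as read_le16 does, is usually the safer choice for data files, because the result no longer depends on the machine that reads them.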

Specific example of byte order importance:

short s=1;
FILE *f=fopen("try.dat","wb");
if (!f) { … error handling … }
fwrite(&s,1,sizeof(s),f);
fclose(f);

Should create a 2-byte file try.dat.
Now,

short s;
FILE *f=fopen("try.dat","rb");
if (!f) { … error handling … }
fread(&s,1,sizeof(s),f);
fclose(f);

cout << s;

should print the value of s – indeed 1.

But: what will happen if we run the writing program on an Intel computer, move the data file to a Sun, and run the reading program there?

Exercise: Can a high-level program be written that determines the order of bytes?

3.4 The XCHG instruction

XCHG dst,src (exchange src with dst)

XCHG r m i
r X X ×
m X × ×
i × × ×

Segment and other non-gp registers are not supported.

The syntax and examples from MOV apply, except for non-use of non-gp registers and immediates.

Examples (which of the following are valid?)

XCHG AL,AH
XCHG AX,SP
XCHG EAX,EDI
XCHG AL,[400]
XCHG [400],AL ;same as above
XCHG AL,DI
XCHG DI,DS
XCHG EAX,7
XCHG [100],[101]
XCHG AX,AX ; nop?
XCHG DI,DI ; nop?
XCHG CL,CL ; nop?

Can a better version of the byte swap program now be written?

Better:

MOV AX,[100]
XCHG AL,AH
MOV [100],AX

Yet better:

XCHG AX,[100]
XCHG AL,AH
XCHG [100],AX

Q: can a shorter program be written (perhaps with another instruction)?

3.4.1 Binary encoding of XCHG

We only consider accumulator exchanges now. Instructions XCHG AX,reg are extra optimized in the Intel architecture.

90h XCHG AX,AX
91h XCHG AX,CX
92h XCHG AX,DX
93h XCHG AX,BX
94h XCHG AX,SP
95h XCHG AX,BP
96h XCHG AX,SI
97h XCHG AX,DI

Q: Why is the # of registers a power of 2?

A: Because this allows registers to be represented in a fixed number of bits.

16-bit register representation:

000b AX
001b CX
010b DX
011b BX
100b SP
101b BP
110b SI
111b DI

An emulator may use code like

unsigned short regs[8];
#define AX regs[0]
#define CX regs[1]
#define DX regs[2]
#define BX regs[3]
#define SP regs[4]
#define BP regs[5]
#define SI regs[6]
#define DI regs[7]

Notes:

• this is just an example!
• 8 bit registers have their own 3-bit keys
• 32 bit registers parallel 16 bit registers
• 64 bit registers use 4-bit keys
• The above code should define 8-bit regs properly (f.e. setting AX should set AL,AH too!)
• The above code should be modified to support 32 bit registers

XCHG AX,AX is NOP.

General encoding scheme of XCHG (with accumulator):

1 0 0 1 0 r e g

This idea is used in other instructions.

XCHG without accumulator uses a lengthier encoding, with first byte 86h/87h.

XCHG encoding: https://c9x.me/x86/html/file_module_x86_id_328.html

NOTE: MOV has several different forms, including optimized forms for the accumulator.

A similar scheme is used for the segment registers:

00b ES
01b CS
10b SS
11b DS

3.5 The ADD instruction

ADD dst,src (dst += src)

(proper name should be increment by.)

General 2-operand instruction layout applies:

ADD r m i
r X X X
m X × X
i × × ×

Given that the syntax of ADD is largely similar to MOV, the examples are similar:

ADD AX,BX
ADD EAX,ESP
ADD DL,CL
ADD AX,[100]
ADD [150],EAX
ADD AX,DS ; illegal
ADD AX,DL ; illegal
ADD [10],5 ; syntax error
ADD word ptr [10],5 ; fine

C example:

int x,y,z; x=y+z;
-----
MOV EAX,y
ADD EAX,z
MOV x,EAX

int x,y,z; x=x+y;
-----
MOV EAX,y
ADD x,EAX

int x,y,z; x=x+25;
-----
ADD x,25

(NOTE: size specification is not required if x is declared to be a double word)

Consider

ADD AL,AL

Generally, multiplication by 2 should not be done as multiplication (generally about 3x slower than addition). Writing

int x; x=2*x;

is wrong! One should use either addition or a shift (if available). (What is better depends on the situation and hardware).

Q: Should we replace multiplication by addition in:

int f(int); int x; x=2*f(x);

More simple examples: Consider

ADD AL,0 ; nop ?
ADD AL,1 ; increment ?
ADD AL,-1 ; decrement ?
ADD AL,AL ; double

MOV AL,1 ; AL=1
ADD AL,AL ; AL=2
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
ADD AL,AL ; AL=?
MOV AL,1  ; AL=1    binary  |00000001
ADD AL,AL ; AL=2    binary  |00000010
ADD AL,AL ; AL=4    binary  |00000100
ADD AL,AL ; AL=8    binary  |00001000
ADD AL,AL ; AL=16   binary  |00010000
ADD AL,AL ; AL=32   binary  |00100000
ADD AL,AL ; AL=64   binary  |01000000
ADD AL,AL ; AL=128  binary  |10000000
ADD AL,AL ; AL=0    binary 1|00000000 << overflow
ADD AL,AL ; AL=0    binary  |00000000

Is this an assembler problem?

unsigned char c;
c=1; printf("%d",c);
c=c+c; printf("%d",c);
c=c+c; printf("%d",c);
c=c+c; printf("%d",c);
c=c+c; printf("%d",c);
c=c+c;
....

Note: if you like C++ and cout<<, make sure to cast!

Q: what would be the output if we use char rather than unsigned char?

Is this a size problem? Try

MOV AX,1
ADD AX,AX
...

OR

MOV EAX,1
ADD EAX,EAX
...

OR C/C++ versions.

Unlike MOV and XCHG, ADD is an arithmetic instruction: it sets flags.

Warning: the discussion of the flags is slightly simplified, I’m not considering the OF. Thus there are slight differences between the behavior described and the actual behavior of the processor. This makes no difference for most programs, but there are rare instances where this matters. In particular, I will consider JS and JL as equivalent, in reality they are not exactly the same.

ZF Zero Flag
SF Sign Flag
CF Carry Flag
OF Overflow Flag

;                                       ZF SF CF
MOV AL,1  ; AL=1    binary  |00000001   ?  ?  ?
ADD AL,AL ; AL=2    binary  |00000010   0  0  0
ADD AL,AL ; AL=4    binary  |00000100   0  0  0
ADD AL,AL ; AL=8    binary  |00001000   0  0  0
ADD AL,AL ; AL=16   binary  |00010000   0  0  0
ADD AL,AL ; AL=32   binary  |00100000   0  0  0
ADD AL,AL ; AL=64   binary  |01000000   0  0  0
ADD AL,AL ; AL=128  binary  |10000000   0  1  0
ADD AL,AL ; AL=0    binary 1|00000000   1  0  1  << overflow
ADD AL,AL ; AL=0    binary  |00000000   1  0  0

WARNING: This is slightly simplified (there is also OF)

Flags can be used to

• implement conditionals (IF, WHILE,...)
• implement “long” arithmetic
• check for overflow

3.5.1 Overflow detection

unsigned int x,y,z;
....
x=y+z; // concern about overflow

unsigned int x,y,z;
....
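y=0x90000000; z=0x90000000;
x=y+z; // overflow will occur here, result will be incorrect.

Can we check for it like this?

unsigned int x,y,z;
....
if (y+z>0xFFFFFFFF)
  error("overflow");
x=y+z;

(No: with a 32-bit unsigned int the sum y+z has already wrapped around, so it can never exceed 0xFFFFFFFF.)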

Correct way:

unsigned int x,y,z;
....
if (y>0xFFFFFFFF-z)
  error("overflow");
x=y+z;
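The same test can be packaged as a small function. This is a sketch that assumes 32-bit operands and therefore uses uint32_t and UINT32_MAX instead of unsigned int and a literal constant; the function names are made up for this note.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if y + z does not fit into 32 bits. The comparison itself
   cannot overflow, because UINT32_MAX - z is always representable. */
bool add_would_overflow_u32(uint32_t y, uint32_t z) {
    return y > UINT32_MAX - z;
}

/* Adds y and z into *sum; returns false and leaves *sum untouched
   if the addition would overflow. */
bool checked_add_u32(uint32_t y, uint32_t z, uint32_t *sum) {
    if (add_would_overflow_u32(y, z))
        return false;
    *sum = y + z;
    return true;
}
```

With y = z = 90000000h the check reports an overflow before any addition is performed.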

Exercise: what about signed types?

A: you will need to check both for “positive” overflow (adding two large
positive numbers) and for “negative” overflow (adding two large negative
numbers).
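A possible signed version, again assuming 32-bit operands, is sketched below; the name add_would_overflow_i32 is illustrative. Each comparison is arranged so that it cannot itself overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if y + z lies outside the range of a 32-bit signed integer. */
bool add_would_overflow_i32(int32_t y, int32_t z) {
    if (z > 0 && y > INT32_MAX - z)
        return true;   /* "positive" overflow */
    if (z < 0 && y < INT32_MIN - z)
        return true;   /* "negative" overflow */
    return false;
}
```

Operands of opposite sign can never overflow, which is why each branch first looks at the sign of z.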

In assembler, the flags report the overflow condition – no need for extra
checking!

3.6 The SUB instruction

SUB dst,src (dst -= src)

(proper name should be decrement by.)
General 2-operand instruction layout applies:

SUB r m i
r X X X
m X × X
i × × ×

Given that syntax of SUB is identical to ADD, syntax examples are similar
and omitted.

ADD AX,100

SUB AX,-100 ; same as above

;

ADD AX,-100

SUB AX,100 ; same as above

What do these instructions do?

ADD AX,0

SUB AX,0

What does this instruction do?

SUB EAX,EAX

Answer: it is the most efficient way to zero a register.

What is the difference between the two instructions below?

SUB EAX,EAX

MOV EAX,0

Answer: the former is more efficient; the latter is rarely used, only in
situations where the flags must be preserved (an example, involving an if,
will be given later).

Revisiting the example we saw above, here is more efficient code:

int x,y,z; x=y=z=0;

——————-

SUB EAX,EAX

MOV x,EAX

MOV y,EAX

MOV z,EAX

With SUB, the Carry flag indicates borrowing.
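The borrow can be illustrated with an 8-bit model of SUB in C. As in the simplified flag discussion, OF is left out; the struct sub_result and the function sub8 are hypothetical names for this sketch.

```c
#include <stdint.h>

typedef struct {
    uint8_t result;  /* new value of dst */
    int zf;          /* 1 if the result is zero */
    int sf;          /* 1 if the top bit of the result is set */
    int cf;          /* 1 if a borrow was needed */
} sub_result;

/* 8-bit dst - src together with the ZF, SF and CF it would produce. */
sub_result sub8(uint8_t dst, uint8_t src) {
    sub_result r;
    r.result = (uint8_t)(dst - src);  /* wraps modulo 256 */
    r.zf = (r.result == 0);
    r.sf = (r.result & 0x80) != 0;
    r.cf = (dst < src);               /* unsigned dst smaller: borrow */
    return r;
}
```

For example, sub8(3, 5) yields the result 254 with CF set, while sub8(5, 3) yields 2 with CF clear.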

3.7 The INC instruction

INC dst (dst++)

Do we write

ADD AX,1

ADD byte ptr [10],1

A: yes, we can, but usually we would use the optimized form.
General 1-operand instruction layout applies:

INC r m i
X X ×

(Same format applies to three more instructions, explained later).
Register form is optimized to one-byte encoding:

40h INC AX

41h INC CX

42h INC DX

43h INC BX

44h INC SP

45h INC BP

46h INC SI

47h INC DI

Other forms of INC are encoded in a lengthier way, beginning with 0FFh
and 0FEh.

Warning: this encoding applies to BOTH 16-bit and 32-bit registers!

What is better?

INC AX

INC AX

;or

ADD AX,2

A: former. But do not do this with memory arguments.

3.8 The DEC instruction

DEC dst (dst--)

DEC r m i
X X ×

Comments on INC above are applicable.
Optimized form:

48h DEC AX

49h DEC CX

4Ah DEC DX

4Bh DEC BX

4Ch DEC SP

4Dh DEC BP

4Eh DEC SI

4Fh DEC DI

Other forms of DEC are encoded in a lengthier way, beginning with 0FFh
and 0FEh.
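The one-byte INC and DEC encodings listed above follow one pattern: the low three bits select the register and the next bit distinguishes INC from DEC. A decoder for just these sixteen opcodes might look as follows; the names inc_dec_kind, classify_opcode and operand_register are invented for this sketch, and the register names assume 16-bit operand size.

```c
#include <stddef.h>

/* Register names in the order of their 3-bit keys (000b..111b). */
static const char *const REG16[8] = {
    "AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"
};

typedef enum { NOT_INC_DEC, IS_INC, IS_DEC } inc_dec_kind;

/* Classify an opcode byte: 40h..47h is INC, 48h..4Fh is DEC. */
inc_dec_kind classify_opcode(unsigned char opcode) {
    if ((opcode & 0xF8) == 0x40) return IS_INC;
    if ((opcode & 0xF8) == 0x48) return IS_DEC;
    return NOT_INC_DEC;
}

/* Name of the register operand, or NULL for any other opcode. */
const char *operand_register(unsigned char opcode) {
    if (classify_opcode(opcode) == NOT_INC_DEC) return NULL;
    return REG16[opcode & 0x07];
}
```

So 41h decodes as INC CX and 4Fh as DEC DI, while 90h (NOP) is rejected by this decoder.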

3.9 The NEG instruction

NEG dst (dst=-dst)

NEG r m i
X X ×

How to negate without using NEG?

NEG EAX

;is the same as

SUB EBX,EBX

SUB EBX,EAX

MOV EAX,EBX

Solve equation x = −x ?

MOV AL,x

NEG AL

if AL did not change, does this mean x is 0?

No, it is either 0 or 128!
For short, the solutions are 0 and 2^15=8000h, for 32-bit they are ….
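The 8-bit claim is small enough to check exhaustively. The sketch below models NEG on a byte as 0 - x modulo 256; neg8, is_neg_fixed_point and count_neg_fixed_points are illustrative names.

```c
#include <stdint.h>

/* 8-bit negation, computed modulo 256 like NEG on a byte register. */
uint8_t neg8(uint8_t x) {
    return (uint8_t)(0u - x);
}

/* 1 if negating x gives x back. */
int is_neg_fixed_point(uint8_t x) {
    return neg8(x) == x;
}

/* Count all byte values that are unchanged by negation. */
int count_neg_fixed_points(void) {
    int count = 0;
    for (unsigned v = 0; v <= 0xFF; v++)
        if (is_neg_fixed_point((uint8_t)v))
            count++;
    return count;
}
```

Trying all 256 values this way finds exactly two fixed points, 0 and 128.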

#include <stdio.h>

int main() {
  short s,s1;
  s=0; s1=-s;
  if (s!=s1) printf("for s=%d, changes;\n",s);
  else printf("for s=%d, does not change;\n",s);
  s=1000; s1=-s;
  if (s!=s1) printf("for s=%d, changes;\n",s);
  else printf("for s=%d, does not change;\n",s);
  s=0x8000; s1=-s;
  if (s!=s1) printf("for s=%d, changes;\n",s);
  else printf("for s=%d, does not change;\n",s);
  return 0;
}

results in

for s=0, does not change;

for s=1000, changes;

for s=-32768, does not change;

3.10 The CMP instruction

CMP dst,src (computes dst-src and sets the flags; the result itself is discarded)

General 2-operand instruction layout applies:
CMP r m i
r X X X
m X × X
i × × ×

Conditionals (if,while) are implemented in two stages:

• compute flags

• do (or not) the operation depending on a flag(or flags) – this is later.
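The two stages can be mimicked in C: first derive the flags from the difference, then let a later decision look only at the flags. This is a simplified 16-bit model without OF; cmp_flags, cmp16, equal_u16 and below_u16 are names made up for this sketch.

```c
#include <stdint.h>

typedef struct { int zf, sf, cf; } cmp_flags;

/* Stage 1: compute the flags of dst - src; neither operand changes. */
cmp_flags cmp16(uint16_t dst, uint16_t src) {
    uint16_t diff = (uint16_t)(dst - src);  /* used only for the flags */
    cmp_flags f;
    f.zf = (diff == 0);
    f.sf = (diff & 0x8000) != 0;
    f.cf = (dst < src);                     /* borrow: unsigned dst < src */
    return f;
}

/* Stage 2: decisions that consult only the flags. */
int equal_u16(uint16_t x, uint16_t y) { return cmp16(x, y).zf; }
int below_u16(uint16_t x, uint16_t y) { return cmp16(x, y).cf; }
```

Here equal_u16 corresponds to testing ZF after the comparison, and below_u16 to testing CF for an unsigned "less than".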

Examples:

; if (x==0) …

;

; compute ZF from value of x

; if (x!=0) …

;

; compute ZF from value of x

; if (x<0) ...
;
; compute SF from value of x

; if (x<=0) ...
;
; compute SF and ZF from value of x

What about

; if (x==y) ...
;
; to set the flags, we compute the difference

; if (x==y) ...
;
MOV EAX,x
SUB EAX,y ; this sets ZF for our use

; if (x
