Rational PIC Assembler

Joe Bentley
rbentley@atlas.kennesaw.edu
31 October 2006

This assembler generates code compatible with Microchip's mid-range microcontrollers but is incompatible with their assembler. It should feel familiar to any PC assembly programmer. The instruction mnemonics and operand order are Intel style (i.e. 'right', as opposed to 'wrong'). The code is distributed under version 2.0 of the GPL.

Additional requirements

You will also need

Command line syntax

	ratasm [ options ] [ filename ]
	   options
	      -c | --console
	         read input from stdin
	         write object to stdout
	         write listing to stderr
	      -d | --device name
	         specify the target processor
	      -l | --listing filename
	         override the default listing filename
	      -l- | --no-listing
	         disable listing
	      -o | --object filename
	         override the default object filename
	      -I | --include directory
	         specify include directory
	      -v | --version
	         output assembler version
	      -h | --help
	         display help

Source file format

The input is a sequence of lines each of which contains one or more of the following fields

	label	instruction	expression	; comment

The label and comment are optional. The expression required depends on the instruction. Operands in an expression list are given target first and source second.

The assembler is case sensitive, even for instructions. However, both uppercase and lowercase versions of the reserved words are predefined.

Expressions

Operands

Valid operands to instructions include

Constants

Hex values can be specified with C-style "0x"[[:xdigit:]]+. Binary values can be specified with "0b"[01]+. Decimal values require no prefix as decimal is the default base. For compatibility with Microchip header files, hex values may also be specified with "H'[[:xdigit:]]+'". For example:

	; four copies of the same value
	dw	0x1fe, 0b000111111110, 510, H'01FE'
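
The four literal forms can be modelled with a few regular expressions. This is only a sketch in Python of the rules described above, not the assembler's own lexer; the name `parse_constant` is hypothetical.

```python
import re

# Hypothetical helper mirroring the four literal forms described above.
_PATTERNS = [
    (re.compile(r"0x([0-9A-Fa-f]+)"), 16),   # C-style hex
    (re.compile(r"0b([01]+)"), 2),           # binary
    (re.compile(r"H'([0-9A-Fa-f]+)'"), 16),  # Microchip-style hex
    (re.compile(r"([0-9]+)"), 10),           # decimal, the default base
]

def parse_constant(text):
    """Return the integer value of one numeric literal, or raise ValueError."""
    for pattern, base in _PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            return int(match.group(1), base)
    raise ValueError("not a numeric constant: " + text)

values = [parse_constant(t) for t in ("0x1fe", "0b000111111110", "510", "H'01FE'")]
print(values)  # [510, 510, 510, 510]
```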

String constants are specified by enclosing zero or more characters and escaped characters within matching single or double quotes. Valid escape characters are '\n', '\r', '\t', '\b', '\0', '\x'[[:xdigit:]][[:xdigit:]], and '\\'. String constants generate one character constant for each character in the string. There is no trailing zero stored. String constants of one character can be used anywhere an integer can. For example:

	mov	w, 'h'
	add	w, "a" - "A"
	db	"Hello 'world'\n", 0, 'a', 'b', '\r', '\n', '\t'
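
As a rough illustration, the expansion of a string constant into character codes might look like the Python sketch below. The function name `expand_string` is hypothetical, and the sketch ignores the embedded-nul limitation listed under known bugs.

```python
# Hypothetical model of string-constant expansion; not the assembler's code.
SIMPLE_ESCAPES = {"n": 10, "r": 13, "t": 9, "b": 8, "0": 0, "\\": 92}

def expand_string(body):
    """Map the text between the quotes to a list of character codes."""
    codes = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            codes.append(ord(ch))
            i += 1
        elif body[i + 1] == "x":
            # '\x' is followed by exactly two hex digits
            codes.append(int(body[i + 2:i + 4], 16))
            i += 4
        else:
            codes.append(SIMPLE_ESCAPES[body[i + 1]])
            i += 2
    return codes

print(expand_string(r"Hi\n"))  # [72, 105, 10], with no trailing zero
print(expand_string("a")[0] - expand_string("A")[0])  # 32, the "a" - "A" operand
```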

Labels

A label is a sequence of alphanumeric characters (including underbar and period) that refers to a constant position in the emitted code. If a label is in the first column of a line, then that label will be defined equal to the org position of that line. Every nonlocal label must be defined exactly once. Labels do not have colons.

Labels can appear anywhere in an expression that an integer can. They evaluate to the org of the line that defines them. They can be referenced before their definition, and the assembler will make however many passes are necessary to resolve all the labels. The symbol '$' is a special label that refers to the org position of the current line. It can appear anywhere in an expression that a label can, but cannot appear in the first column of a line.

A label where the first character is a period is a local label. The scope of a local label is from the line with the previous nonlocal label up to, but not including, the line with the next nonlocal label. For example:

		jmp	.1
	.1	jmp	.1	; first local
	foo	jmp	.1
	.1	jmp	.1	; second local
		jmp	.1
	bar	jmp	.1
	.1	jmp	.1	; third local
		jmp	.1

In this example, the first local is only visible for the first two lines before the declaration of 'foo'. The second local is visible from the declaration of 'foo' to before the declaration of 'bar'. The third local is visible from the declaration of 'bar' to the end. Locals cannot be referenced outside of their scope.
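
The scoping rule can be checked with a small Python model. It is a sketch only: it tracks labels, not instructions, and the function name `resolve_locals` is hypothetical.

```python
# Hypothetical model of local-label scoping; labels only, one entry per line.
def resolve_locals(labels):
    """labels[i] is the label in column one of line i, or None.
    Returns, per line, the line number that a reference to '.1' resolves to."""
    scope = 0
    scope_of_line = []
    definitions = {}            # (scope, name) -> defining line
    for line, label in enumerate(labels):
        if label is not None and not label.startswith("."):
            scope += 1          # a nonlocal label opens a new scope
        scope_of_line.append(scope)
        if label is not None and label.startswith("."):
            definitions[(scope, label)] = line
    return [definitions[(s, ".1")] for s in scope_of_line]

# The eight lines of the example above, numbered from zero.
example = [None, ".1", "foo", ".1", None, "bar", ".1", None]
print(resolve_locals(example))  # [1, 1, 3, 3, 3, 6, 6, 6]
```

Each entry is the line whose '.1' the corresponding 'jmp .1' would reach, which matches the three scopes described above.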

Operators

To make the expression syntax easy to learn and remember, the operators and precedence levels were chosen to be as close as possible to C.

Directives

Data can be declared. The declarator takes the place of the opcode and is followed by a list of expressions. Each expression corresponds to one word in the output code regardless of the declarator type.

db
each operand is AND-ed with 0xff before being stored
dw
full 14 bit word definition
dt
each operand is AND-ed with 0xff and OR-ed with 0x3400 (the return-with-value opcode). This allows generation of case tables. You can add the accumulator ('w') to the offset of the table; the processor will then branch to that location in the table and return with an eight-bit result in the accumulator.

For instance:

		db	1,2,3
		dw	0x3fff, 0x3ff * 16 + 15, -1
		dt	0b001, 0b010, 0b100
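
The words these three lines would emit can be worked out with the Python sketch below. It assumes that 'full 14 bit word' means the value is masked to 14 bits; the function names simply mirror the declarators.

```python
RETLW = 0x3400  # return-with-value opcode, as stated above

def db(value):
    return value & 0xFF

def dw(value):
    # Assumption: a "full 14 bit word" is the value masked to 14 bits.
    return value & 0x3FFF

def dt(value):
    return (value & 0xFF) | RETLW

print([hex(db(v)) for v in (1, 2, 3)])                      # ['0x1', '0x2', '0x3']
print([hex(dw(v)) for v in (0x3FFF, 0x3FF * 16 + 15, -1)])  # ['0x3fff', '0x3fff', '0x3fff']
print([hex(dt(v)) for v in (0b001, 0b010, 0b100)])          # ['0x3401', '0x3402', '0x3404']
```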

Equates are named expressions. They can be defined with 'equ'. For example:

	led_1	equ	0x100 | 1
	led_2	equ	0x100 | 2
	combo	equ	( led_1 ) | ( led_2 )

Defines are named sequences of tokens. They can be defined with 'define'. They differ from equates in that an equate must be a valid expression and is evaluated at the point of definition, whereas a define is substituted as text at each point of use and evaluated there. For example:

		org	0
	msg1	db	'hello', 0
	size1	equ	$-msg1
	msg2	db	'world', 0
	size2	define	$-msg2

		org	0x100
		dw	size1, size2

In this example, 'size1' is evaluated at its point of definition to the integer 6. Every subsequent occurrence of 'size1' will be replaced with the integer 6. But 'size2' is replaced with the text '$-msg2', which is evaluated where it is used, leading to different values stored in the final declaration.
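
The difference is the same as that between a value computed once and a function called at each use. The Python sketch below models it under the assumption that '$' inside a substituted define takes the org of the line where the define is used (0x100 here); the variable names are only illustrative.

```python
# Hypothetical model: '$' is passed in as the org of the line being assembled.
msg1 = 0                   # org of 'hello', 0 (six words)
msg2 = 6                   # org of 'world', 0

size1 = 6 - msg1           # equ: evaluated once, where '$' was 6

def size2(here):
    """define: the text '$-msg2' is evaluated again at each use."""
    return here - msg2

here = 0x100               # org of the final 'dw' line
print(size1, size2(here))  # 6 250
```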

Other directives:

org expression
Set the target location to store emitted code
include "filename"
Save current assembly context and begin assembling specified file
messg string
Output string constant to stdout
error string
Output string constant to stderr
nolist
Turn listing output off
list
Turn listing output back on. 'nolist' / 'list' may be nested
noexpand
Turn off macro expansion
expand
Turn macro expansion back on. 'noexpand' / 'expand' may be nested
__maxram, __badram
Ignored so as not to generate an error assembling Microchip include files
if expression
Assemble block only if expression evaluates to true
ifdef symbol
Assemble block only if symbol is defined
ifndef symbol
Assemble block only if symbol is not defined
elif expression
Assemble block only if previous sibling blocks have not been assembled and expression is true
else
Assemble block only if previous sibling blocks have not been assembled
endif
End the block

Macros

The assembler has a relatively sophisticated pattern-matching macro system. Macros can be defined with the syntax:

	name	macro	expression_list
		.. body lines ..
		endm

They can be instantiated like regular instructions. They can be overloaded, and can even have the same name as instructions. When defined, unknown symbols in the expression list are taken to be arguments. When instantiated, the defining expression is matched against the instance expression to determine substitutions for the arguments. Nonargument tokens in the defining expression must match tokens exactly in the instance expression. For example:

	foo	macro	[x], y
		mov	w, [y]
		dw	x+y
		endm
		foo	[100],200
expands to
		mov	w, [200]
		dw	100+200
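
The matching step can be pictured as a token-by-token comparison. The Python sketch below is a deliberate simplification, not the assembler's algorithm: it assumes the expressions are already split into tokens and that every argument binds exactly one token. The name `match_macro` is hypothetical.

```python
# Hypothetical, simplified matcher: every argument binds exactly one token.
def match_macro(pattern, instance, arguments):
    """Return {argument: token} if instance fits pattern, else None."""
    if len(pattern) != len(instance):
        return None
    bindings = {}
    for expected, actual in zip(pattern, instance):
        if expected in arguments:
            bindings[expected] = actual   # substitute for the argument
        elif expected != actual:
            return None                   # nonargument tokens must match exactly
    return bindings

pattern = ["[", "x", "]", ",", "y"]       # the defining expression of 'foo'
print(match_macro(pattern, ["[", "100", "]", ",", "200"], {"x", "y"}))
# {'x': '100', 'y': '200'}
print(match_macro(pattern, ["100", ",", "200"], {"x", "y"}))
# None, because the brackets are missing
```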

To make the assembler more comfortable for x86 assembly programmers, many instructions are defined as macros. For instance, there is no instruction in the PIC instruction set to increment the accumulator directly. So there is a macro defined in "include.asm":

	inc	macro	w
		add	w, 1
		endm

When the opcode 'inc' is encountered, the assembler will first match the expression list against the macro since it was declared more recently than the native instruction. Failing that, it will match against the native instruction.

Many of the instructions that are implemented as macros require equates defined in a Microchip include file. These files can be found in the gputils package or at Microchip.

Instructions

add
Add source to target

Examples:

	add	w, [123]	; w += memory[123]
	add	[123], w	; memory[123] += w
	add	w, 123		; w += 123
	
and
Binary AND the source to the target

Examples:

	and	w, [123]	; w &= memory[123]
	and	[123], w	; memory[123] &= w
	and	w, 123		; w &= 123
	
bclr
Clear a bit in a register

Example:

	bclr	[123], 7	; memory[123] &= 0x7f
	
bset
Set a bit in a register

Example:

	bset	[123], 7	; memory[123] |= 0x80
	
btsz
Skip the next instruction if a specified bit is false

Example:

	btsz	[123], 0	; if( !( memory[123] & 1 ) ) skip()
	
btsnz
Skip the next instruction if a specified bit is true

Example:

	btsnz	[123], 7	; if( memory[123] & 128 ) skip()
	
call
Save program counter, branch to target

Example:

	call	somewhere	; somewhere()
	
clc
Clear carry flag

Example:

	clc			; /* macro */
	
cli
Clear interrupt flag

Example:

	cli			; /* macro */
	
clr
Zero the target

Examples:

	clr	w		;  /* macro */
				; w = 0
	clr	[123]		; memory[123] = 0
	clr	w, [123]	; not useful
	
clrwdt
Clear the watchdog timer

Example:

	clrwdt
	
cmc
Complement carry flag

Example:

	cmc			; /* macro */
	
dec
Decrement the register

Examples:

	dec	w		;  /* macro */
				; --w
	dec	[123]		; --memory[123]
	dec	w, [123]	; w = memory[123] - 1
	
decsz
Decrement the register and skip if zero

Examples:

	decsz	w		; /* macro */
				; if( !--w ) skip()
	decsz	[123]		; if( !--memory[123] ) skip()
	decsz	w, [123]	; if( !( w = memory[123] - 1 ) ) skip()
	
inc
Increment the register

Examples:

	inc	w		;  /* macro */
				; ++w
	inc	[123]		; ++memory[123]
	inc	w, [123]	; w = memory[123] + 1
	
incsz
Increment the register and skip if zero

Examples:

	incsz	w		; /* macro */
				; if( !++w ) skip()
	incsz	[123]		; if( !++memory[123] ) skip()
	incsz	w, [123]	; if( !( w = memory[123] + 1 ) ) skip()
	
ja | jnbe
Jump if above

Example:

	ja	somewhere	; /* macro */
	
jae | jnb | jnc
Jump if above or equal

Example:

	jae	somewhere	; /* macro */
	
jb | jnae | jc
Jump if below

Example:

	jb	somewhere	; /* macro */
	
jbe | jna
Jump if below or equal

Example:

	jbe	somewhere	; /* macro */
	
je | jz
Jump if equal

Example:

	je	somewhere	; /* macro */
	
jne | jnz
Jump if not equal

Example:

	jne	somewhere	; /* macro */
	
jmp
Branch to target

Example:

	jmp	somewhere	; goto somewhere
	
loop
Decrement and jump to target if not zero

Examples:

	loop	w, somewhere		; /* macro */
					; if( --w ) goto somewhere
	loop	[123], somewhere	; /* macro */
					; if( --memory[123] ) goto somewhere
	
mov
Copy source to target

Examples:

	mov	w, [123]	; w = memory[123]
	mov	[123], w	; memory[123] = w
	mov	w, 123		; w = 123
	mov	[123], 45	; /* macro */
				; memory[123] = 45
	
nop
Do nothing for one cycle

Example:

	nop
	
not
One's complement the register

Examples:

	not	w		;  /* macro */
				; w = ~w
	not	[123]		;  /* macro */
				; memory[123] = ~memory[123]
	not	w, [123]	;  /* macro */
				; w = ~memory[123]
	
or
Binary OR the source to the target

Examples:

	or	w, [123]	; w |= memory[123]
	or	[123], w	; memory[123] |= w
	or	w, 123		; w |= 123
	
ret
Return from subroutine, optionally with a value in accumulator

Examples:

	ret			; return
	ret	123		; w = 123
				; return
	
reti
Return from interrupt

Example:

	reti
	
rlc
Rotate the target left through the carry flag

Examples:

	rlc	w		;  /* macro */
				; new_carry = ( w & 0x80 ) != 0
				; w = ( w << 1 ) | carry_flag
				; carry_flag = new_carry
	rlc	[123]		; new_carry = ( memory[123] & 0x80 ) != 0
				; memory[123] = ( memory[123] << 1 ) | carry_flag
				; carry_flag = new_carry
	rlc	w, [123]	; new_carry = ( memory[123] & 0x80 ) != 0
				; w = ( memory[123] << 1 ) | carry_flag
				; carry_flag = new_carry
	
rrc
Rotate the target right through the carry flag

Examples:

	rrc	w		;  /* macro */
				; new_carry = w & 1
				; w = ( w >> 1 ) | ( carry_flag ? 0x80 : 0 )
				; carry_flag = new_carry
	rrc	[123]		; new_carry = memory[123] & 1
				; memory[123] = ( memory[123] >> 1 ) | ( carry_flag ? 0x80 : 0 )
				; carry_flag = new_carry
	rrc	w, [123]	; new_carry = memory[123] & 1
				; w = ( memory[123] >> 1 ) | ( carry_flag ? 0x80 : 0 )
				; carry_flag = new_carry
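
The rotate pseudo-code for 'rlc' and 'rrc' can be run as the Python sketch below. It assumes eight-bit registers, so the left rotate masks its result to eight bits, which the comments above leave implicit.

```python
def rlc(value, carry):
    """Rotate an 8-bit value left through carry; returns (result, new_carry)."""
    new_carry = (value & 0x80) != 0
    result = ((value << 1) | int(carry)) & 0xFF   # keep only eight bits
    return result, new_carry

def rrc(value, carry):
    """Rotate an 8-bit value right through carry; returns (result, new_carry)."""
    new_carry = (value & 1) != 0
    result = (value >> 1) | (0x80 if carry else 0)
    return result, new_carry

print(rlc(0x81, False))  # (2, True)
print(rrc(0x81, True))   # (192, True)
```

Since the register and the carry flag together form a nine-bit ring, nine rotates in the same direction restore both.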
	
shl
Shift left

Examples:

	shl	w		;  /* macro */
				; w <<= 1
	shl	[123]		;  /* macro */
				; memory[123] <<= 1
	shl	w, [123]	;  /* macro */
				; w = memory[123] << 1
	
shr
Shift right

Examples:

	shr	w		;  /* macro */
				; w >>= 1
	shr	[123]		;  /* macro */
				; memory[123] >>= 1
	shr	w, [123]	;  /* macro */
				; w = memory[123] >> 1
	
sleep
Put the processor to sleep until woken

Example:

	sleep
	
stc
Set carry flag

Example:

	stc			; /* macro */
	
sti
Set interrupt flag

Example:

	sti			; /* macro */
	
sub
Subtract the accumulator from a register or literal

Examples:

	sub	w, [123]	; w = memory[123] - w
	sub	[123], w	; memory[123] -= w
	sub	w, 123		; w = 123 - w
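
Note that the accumulator is always the value subtracted, whichever operand is the target. A minimal Python sketch of that rule, assuming eight-bit wraparound (the helper name is only for illustration):

```python
def sub(operand, w):
    """Model of 'sub': the result is operand - w, wrapped to eight bits."""
    return (operand - w) & 0xFF

w = 10
print(sub(123, w))   # 113, what 'sub w, 123' leaves in w
print(sub(5, w))     # 251, wrapping when w is larger than the operand
```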
	
swap
Swap nibbles in source and store in target

Examples:

	swap	[123]		; hi = ( memory[123] >> 4 ) & 0x0f
				; lo = memory[123] & 0x0f
				; memory[123] = ( lo << 4 ) | hi
	swap	w, [123]	; hi = ( memory[123] >> 4 ) & 0x0f
				; lo = memory[123] & 0x0f
				; w = ( lo << 4 ) | hi
	
test
Set the Zero flag if the source is equal to zero

Example:

	test	[123]		;  /* macro */
				; z_flag = ( memory[123] == 0 )
	
xchg
Exchange the two operands

Examples:

	xchg	w, [123]	;  /* macro */
				; int i = memory[123]
				; memory[123] = w
				; w = i
	xchg	[123], w	;  /* macro */
				; int i = memory[123]
				; memory[123] = w
				; w = i
	
xor
Binary XOR the source to the target

Examples:

	xor	w, [123]	; w ^= memory[123]
	xor	[123], w	; memory[123] ^= w
	xor	w, 123		; w ^= 123
	

Known bugs, missing features and future direction

Changing the org in the middle of emitted code in a macro can cause the later emitted code to not appear in the listing file if macro expansion is turned off. The assembler makes an effort to migrate emitted code from nonlisted lines to listed lines, but will get confused if there is more than one noncontiguous section of code. For example:

		org	0
	foo	macro
		dw	1
		org	100
		dw	2
		endm

		noexpand
		foo

In this example, '1' will show up in the listing as code for 'foo'. However, '2' will not appear. This bug does not affect the object output.

Strings are stored internally as C-strings. Embedded nuls in string constants terminate the string. Neither the first nul nor anything after it is stored. This could be fixed by using a separate string data type.

The preprocessor is currently integrated with the parser. I would like to make it a separate layer. Doing so will allow implementation of an 'undef' directive and 'defined' predicate.

Local labels defined inside of a macro are translated to have two leading periods. It is recommended that user-defined local labels not have more than one leading period to avoid confusing the assembler.

There is no capability for structures or other complex data types. I don't think this is a serious drawback for an assembler for microcontrollers.

A 'repeat' directive would be handy and not difficult to implement. I also intend to write a disassembler and instruction generator to automate testing.