
Why we need new PIC mnemonics

MPASM is the PIC code Assembler. It's available (only) as part of the MPLAB-X "IDE" (Integrated Development Environment) package (370Mb or so) from Microchip. You can run the Assembler without the rest of the IDE (which is the usual over-bloated commercial focused rubbish, but at least IT'S FREE !)

Few of the PIC instruction "names" (mnemonics) make any sense and there are way too many 'special cases' - for example, why is it necessary to define 3 separate 'CLR' instruction mnemonics (CLRF, CLRW, CLRWDT), when "CLR x" (where x= one of 'File'(register)/'W reg'(Accumulator)/'WDT' (Watch-Dog-Timer)) would have sufficed ? There are also 3 different 'copy W to destination' instructions (MOVWF, TRIS and OPTION) and a whole set of 'W reg' instructions that 'duplicate' the function of the 'f reg' instructions.

Some instructions are misleading (the 'W reg' is not a register, 'COMF' is actually INVERT (and not Complement, which most programmers understand to mean "2's Complement") and SUBWF is 'W=reg-W' or 'reg=reg-W' and never W-f (see later for why this matters)) and some are just 'missing' (for example, you can bit set/clr or test and skip on a 'f' (register) bit, but not on the 'W register' - or can you ? (see AND/OR later))

Perhaps it was the 'marketing' department ? after all, '33 (baseline CPU) instructions' (or 35 or 49) seems so much more impressive than '20 with lots of special cases'

Well, it's my toy and I'm not putting up with it ! No more 'illogical' mnemonics for me - and whilst I'm at it, I want those 'missing' instructions too !

Fortunately, MPASM is a 'Macro Assembler', which means I can just define my own 'instruction set' by renaming and combining the '33'** non-intuitive originals == so that's exactly what I have done ! For more, read on below

**Note. The baseline CPU supports 12 bit op-codes with only 33 instructions - however this really only applies to the 'ancient' 4MHz limited 16F5x PIC's. Almost all other PIC's (including most 12F and 16Fxxx) have 14 bit op-codes using the 'mid-range' 35 instruction CPU or the 'enhanced mid-range' 49 instruction set CPU (the 12F1xxx and 16F1xxx devices)
Note also, I ignore the 'advanced' PIC18Fxxx with its 16 bit op-code (75 instruction) CPU's and all later devices. All the 16 bit op-code (and 16 bit ALU) PIC's have sufficient program space to support 'C' coding and it's highly unlikely you will ever need to program any of these in Assembler

Why replace the MPASM mnemonics ?

There's nothing much wrong with MPASM itself = it's the stupid, non-intuitive brain numbing 'mnemonics' (names) chosen for the assembly code instructions that make the PIC such a real pain to program by any 'English Language' speaker

I knew I would have problems with the Assembler mnemonics when I first saw the "Instruction set summary" (WTF are 'W' and 'f' ???). Well, it turns out that in 'PIC speak' register addresses are denoted with an 'f' (for 'file') and the Accumulator is known as the 'W register' (despite the fact that it is NOT a 'register' = if they had to find a name other than the 'Accumulator', 'W latch' would have been a better (more descriptive) choice) ...

If I was a 'conspiracy theorist' I might wonder if the 'mnemonics' had been chosen deliberately to make life as hard as possible for the assembly language user - or perhaps Microchip felt that using 'r' for 'register' and 'a' (or 'Acc') for accumulator would somehow 'infringe' Intel's 'copyrighted' x86 instruction set ??

What's more, I was taught to 'think' in the perfectly logical sequence of 'start to end', 'left to right', 'cause before effect' and (especially) 'source before destination' (for why this is fundamental to our grasp on reality and understanding of the Universe, see wikipedia, Causality).

Admittedly this is not the case for everyone - some cultures (for example, Japan) often 'think' in what I would regard as 'reverse' (me "What is this ?", my Japanese friend "This is what ?"). You might think these are the same ("commutative") .. and most of the time that's correct .. however whilst I might say "Is this the correct button to push ?" my foreign language co-worker might say "This, the correct button to push, is ?" (and well before he reached 'is ?', I would have pushed the button ..)
Why does it matter ? Well, subtraction is NOT "commutative" (i.e. one minus zero (1 - 0) is NOT the same as zero minus one (0 - 1)).
So order counts. In 'PIC speak' the subtract instruction is "SUBWF f,d", where 'd' is the 'destination'. To the Logically thinking this means 'Subtract source from destination'.
Thus, when the destination is f, we expect f = f-W and when the destination is W, we expect W = W-f (of course).
WRONG .. it's ACTUALLY ALWAYS f-W (so f=f-W or W=f-W) .. or, to be exact, it's actually (-W)+f, which is significant, as you will discover when you look at how the Status 'Carry' (and 'Digit Carry') bits are set on the result !
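
A minimal sketch of the point above, using the original mnemonics. The name rTemp, its address and the labels are placeholders invented for this example (pick any free RAM location on your own device), and STATUS and C would normally come from the device include file rather than being defined by hand :-

```asm
STATUS  equ     0x03        ; Status register (same address on baseline and mid-range)
C       equ     0           ; Carry flag is bit 0 of STATUS
rTemp   equ     0x20        ; placeholder scratch register address (an assumption)

        movlw   0x05
        movwf   rTemp       ; rTemp = 5
        movlw   0x03        ; W = 3
        subwf   rTemp,0     ; W = rTemp - W = 5 - 3 = 2 (destination 0 = W)
                            ; worked out as rTemp + (-W), so Carry = 1 means NO borrow
        btfss   STATUS,C    ; skip the goto when Carry is set (no borrow)
        goto    Borrowed
NoBorrow                    ; reached when rTemp >= W
        nop
        goto    Done
Borrowed                    ; reached when rTemp < W
        nop
Done
```

Swap the two values (rTemp = 3, W = 5) and the result is 0xFE with Carry = 0, i.e. a 'borrow' is signalled by Carry being CLEAR.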

Then there's the problem of multiple different CPU's. There are (at least) 5 different '8 bit' PIC CPU's (baseline, extended (or enhanced) baseline (16F527, 16F570 with 4 level stack, 1 interrupt and MOVLB (bank select)), mid-range, enhanced and 'PIC18'), so, unless you code for the 'lowest possible common denominator' your code can't be 'transported' = and if you do 'code down to the lowest', then your code will never be 'optimal' for the higher end CPU's ..

.. unless, of course, you define your own instruction set and code your own macro 'interpreter' = which means you can 'tailor' the coding of your own instructions to whichever CPU is being used !

My final irritation is how some English words with well established meanings have been used as a PIC assembler name. Unless you remember how the meaning has been 'bent' in PIC-speak, you will inevitably end up confused and frustrated when your code fails to 'work' as expected.

SWAP, for example, means two 'things' will be swapped - so the Accumulator and a register, perhaps ? no, it means 'rotate by 4 bits' (or, in PIC speak, 'nibble swap').
COMF is not "2's complement" - and Rotate uses the Carry as a 9th bit (rather than rotating within the 8 bit byte).
I also find 'MOVE' a bit confusing, in so far as it suggests that a value is 'removed' from the 'source' (so 0 is left behind) = why they chose 'MOVE' (rather than the more obvious - and less doubt creating - 'COPY') is another one for the Conspiracy Theorists ...
Then we have INDF and FSR .. FSR (which I'm guessing means 'File Select Register') is the indirect register POINTER (address latch), and selecting INDF will get you access to the register 'pointed to' by the FSR.
Indirect addressing is always a pain to get your head around, so having nice clear names is vital - for example 'Indirect register Pointer Latch' (IregPL) and 'Indirect Register' (Ireg) ...
Finally there is the OPTION command - no, it's nothing to do with optional instructions or even CPU 'modes' = it copies the 'W register' to the pre-scaler (counter) control bits latch (who on earth is ever going to remember that ??)

Of course some things just can't be 'fixed'

For example, not all the bits in the 'special' registers exist, and some that do can't be written (others can't be read) = and reading the i/o port 'register' doesn't (it reads the actual i/o pins, not the register) whilst reading the Tri-state pin control latch just isn't supported

I find it hard enough to keep my mind on the 'goal' I'm trying to achieve without having to 'think backwards' (or remember that SUBWF w means SUBFW) when trying to program the d*mn chip.

Fortunately MPASM is a 'macro assembler' = so I can define my own mnemonics as macros and need never bother to learn the 'real' names at all ! For those thinking of 'doing it themselves', here are the details of why (and how) I did it = just read on ... (or go to the next page and download my macro include file)

Always use 'easy to remember' names - and a systematic approach to parameters

You can choose whatever new instruction 'names' (and 'parameter' sequence order) you like, but to reduce doubt it's a 'good idea' to adopt a 'systematic approach'.

So, for example, if all your new instruction names are (say) 3 letters long, then we know the 'GOTO' replacement can't be JUMP or even BRANCH - it must be JMP (or BRA). On the other hand, if all are 4, then it's GOTO or JUMP (and not JMP or BRA).
Next we come to parameter order. I ALWAYS adopt the 'from' (source) 'to' (destination) approach.
When multiple byte registers or a bit number etc. have to be specified, I adopt 'high to low' order = so msb,lsb and register before bit (so it's reg,bit not bit,reg)
Finally, if an instruction allows 'source=destination', then I accept that the user may simply 'leave out' the 'duplicate' value (destination)

It's also a 'good idea' to eliminate (or at least minimise) the 'special cases'

Thus, for example, MOVWF, OPTION and TRIS are all replaced by the more generic "COPY source,destination" instruction (which covers COPY Acc,reg and COPY reg,Acc and even COPY regA,regB).
However, I also support a 'MODE' command (for those who wish to 'highlight' use of the TRIS/OPTION case)

Even after careful consideration of names, it's still quite possible that the user will get lazy and use the 'wrong' instruction

The good thing about Macro's is that you can just 'redirect' any likely mistaken names or usage to the 'correct' Macro.
So for example, to load a value into the Acc or a register, my new instruction is 'LOAD 0xNN,destination'.
However, if you forget and use "MOVE 0xNN,destination" you will discover 'MOVE' is redirected to 'COPY'. Then, when 'COPY 0xNN,dest' is 'detected' (by my 'COPY s,d' Macro), it will simply cope (i.e. it will find the source is a value (0xNN) and redirect again, this time to the LOAD macro)

Use descriptive names

It is VITAL that the instruction names used are not only easy to remember but also 'descriptive' (rather than misleading)

Take the PIC-33 'COMF', the 'complement' instruction. Given that the PIC CPU ALU uses 2's complement arithmetic for SUBtract (as stated in the Specification sheets), most people would be justified in expecting 'COMF' to mean "2's complement".
WRONG (and gotcha) = the COMF instruction ACTUALLY performs a "1's complement" or (as is usually known in the real world), a bit-wise INVert (or 'flip', if you prefer)
Further the COMF instruction only inverts registers - so you might think that there is no (easy, single instruction) way to invert the Accumulator .. but actually, there is ... in 'PIC speak' this is written as "XORLW 0xFF" :-)
My 'INV s,d' Macro deals with all this ..
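
For comparison, here is what the original mnemonics look like for each case. This is only a sketch (it is not the 'INV s,d' Macro itself), each fragment is meant to be read on its own, and rTemp and its address are placeholders invented for the example :-

```asm
rTemp   equ     0x20        ; placeholder scratch register address (an assumption)

        ; invert a register (1's complement, every bit flipped)
        comf    rTemp,1     ; rTemp = NOT rTemp (destination 1 = the register itself)

        ; invert the Accumulator, for which there is no 'COMW'
        xorlw   0xFF        ; W = W XOR 0xFF = NOT W

        ; a real 2's complement (negate) of a register takes two instructions
        comf    rTemp,1     ; flip every bit ..
        incf    rTemp,1     ; .. then add one, giving rTemp = 0 - rTemp
```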

Define 'complete' instructions

When defining Macro's for a logical instruction set, make sure to 'cover all cases'.

To place a value into a register, the PIC-33 set defines multiple hard-to-remember 'special case' instructions. These are CLRF, CLRW, CLRWDT (place 0x00 into a 'file', W 'reg' or WatchDogTimer), MOVF, MOVWF, OPTION, TRIS (copy 'file' to W (or file to itself !) or W to a 'file', Option latch or one of the TRIS latches) and the MOVLW (place value into W). All 8 of these 'limited special cases' can be replaced with a single generic "COPY source to destination" macro !

The 'sources' are a specified value ('literal' in PIC-speak) the Accumulator ('W reg' in PIC-speak) or a Register ('file' in PIC speak). Possible destinations would be Accumulator, Register or one of the 'control' latches (TRIS etc).
Of course not every 'combination' can be supported in a single PIC-33 instruction, however there is nothing to stop you writing a Macro that expands into multiple instructions - see, for example, "COPY regX,regY".
Some combinations can't be supported at all = for example, the WDT latch can only be cleared, but there's nothing to stop the user specifying WDT as a general 'destination'.
However the advantage of a Macro is that it can be written to 'detect' the 'special case' and use the minimum code (for example, "COPY 0xNN,RegX" can detect when 0xNN is zero and use the CLRF PIC-33 instruction rather than have to save the Acc, load 0xNN to Acc, copy Acc to RegX and then restore Acc). When a combination is found that can't be resolved, the Macro can 'throw an error' (and insert a 'NOP').
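
A minimal sketch of 'special case' detection, assuming a made-up macro name (LOADREG) and a placeholder register (rTemp). Unlike the full approach described above, this cut-down version makes no attempt to save and restore the Acc, so W is overwritten whenever the value is non-zero :-

```asm
LOADREG macro   value, reg      ; hypothetical name, for illustration only
        if (value) == 0
        clrf    reg             ; zero needs no help from W, so W is untouched
        else
        movlw   value           ; W = value (the old W is lost in this simple version)
        movwf   reg             ; reg = W
        endif
        endm

rTemp   equ     0x20            ; placeholder scratch register address (an assumption)

        LOADREG 0x00, rTemp     ; expands to:  clrf rTemp
        LOADREG 0x5A, rTemp     ; expands to:  movlw 0x5A  then  movwf rTemp
```

The 'if' is resolved when the code is assembled, so only one of the two branches ever ends up in the chip.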

Each instruction that deals with 'source' and 'destination' should be defined in the exact same way - i.e. ADD, SUBtract, INCrement, DECrement, AND, OR, XOR, 'ROTate with Cy' are all cases of the generic "action source,destination" instruction.

The PIC SWAPF instruction is actually a 'special case' of a more generic 'Rotate N bits, source to destination' instruction, however SWAPF has no effect on Cy (unlike RLF, RRF)
So, rather than create a (complex) 'Rotate n bits' instruction**, I decided to use "NIBS s,d" (Nibble Swap) - where s=reg, d=Acc or reg - which is (almost) a 'special case'.
**There's also a problem of names = I use ROTR, ROTL (Rotate Right, Left) for rotate via Cy. A 'rotate N bits in place' Macro would have to be something totally different - say 'nRL' (for n bits left)
Update: Added nRR reg,n (and nRL) costing 2,4,3,1 instructions (for 1,2,3,4 bits) = is this the 'minimum (instruction) count' solution for each case, or can you do better ?
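
One possible way of reaching the 2 and 1 instruction counts for the 1 bit and 4 bit cases is sketched below, using the original mnemonics. This is an illustration only (not taken from the macro include file); rTemp is a placeholder, and note that the 1 bit version overwrites both W and Carry :-

```asm
rTemp   equ     0x20        ; placeholder scratch register address (an assumption)

        ; rotate left by 1 bit, in place (2 instructions)
        rlf     rTemp,0     ; Carry = old bit 7 (the shifted copy left in W is not used)
        rlf     rTemp,1     ; shift left again, this time the old bit 7 arrives in bit 0

        ; rotate by 4 bits, in place (1 instruction)
        swapf   rTemp,1     ; swap the two nibbles, Carry is not affected
```

A 3 bit rotate can then be had as a nibble swap followed by a 1 bit rotate the other way (1 + 2 = 3 instructions), and a 2 bit rotate as two 1 bit rotates (4 instructions).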

There are a number of instructions which only have one parameter - for example, where 'source==destination' (so "action parameter") = INCrement, DECrement, and where there is a 'destination' only = CALL, GOTO and finally RETurn (with 0xNN, however a macro would support 'RETurn with Register')

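When the macro processes the 'generic case', sometimes it will generate more than one PIC-33 instruction - for example, "INC Acc" can be built from two subtractions, "SUBWF PCL,0" followed by "SUBWF PCL,0". The first leaves PCL-W in the Acc; the second runs one address later (so PCL is one higher) and leaves (PCL+1)-(PCL-W) = W+1, with the Z flag set from the result (Dc and Cy are also written, but they reflect the subtraction from PCL rather than the increment). Note that the 'mirror image' pair "ADDWF PCL,0", "SUBWF PCL,0" leaves 1-W (not W-1), so "DEC Acc" needs a different expansion (on the mid-range CPU's "ADDLW 0xFF" does the job in one instruction).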

Instructions that deal with 'bits' (Bit set, bit clear, bit test) are of a generic "action source,bit" type - i.e. BSF (Bit set), BCF (bit clear)

Make it easy to specify 'common' flags

The PIC-33 instruction set has a pair of 'Skip if bit set / bit clear' instructions, however if you want to 'skip on carry' etc. you have to consult the documentation to discover the 'f number' for the Status register and the bit position of the various flag bits (Zero, Carry and Nibble Carry - which is confusingly called 'Digit Carry', yet another bit of FUD (or at least it is for those who know what BCD is)). Worse, however, is trying to get your brain around dealing with a 'borrow' after a SUBtract.

Rather than two different instructions - BTFSC (skip if bit clear) and BTFSS (skip if bit set), a 'generic' instruction would be "SKIP reg,bit" where reg is 'optional' (in which case bit must be one of Z/nZ DCy/nDCy Cy/nCy Bw/nBw)
My 'Bw' (Borrow) and 'nBw' (no borrow) flags are defined as the inverse of the Cy/nCy flags (i.e. Bw = nCy, nBw = Cy) thus eliminating SUBtract confusion
Note that I retained the name DCy (Digit Carry) because using the more descriptive 'nibble carry' (NCY) would 'clash' with 'no carry' (nCy)
remember = MPASM can be set to IGNORE CASE (case sensitivity switched off) .. and then NCY and nCy are ACTUALLY THE SAME NAME !
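
The 'Borrow is the inverse of Carry' idea can be sketched as a pair of tiny macros. The names SKIPBW and SKIPNBW are invented for this example (the real thing is a single generic "SKIP" macro), as are rTemp and the Underflow label; STATUS and C would normally come from the device include file :-

```asm
STATUS  equ     0x03        ; Status register (same address on baseline and mid-range)
C       equ     0           ; Carry flag is bit 0 of STATUS
rTemp   equ     0x20        ; placeholder scratch register address (an assumption)

SKIPBW  macro               ; hypothetical: skip the next instruction if Borrow
        btfsc   STATUS,C    ; Borrow = Carry clear, so skip when C is clear
        endm

SKIPNBW macro               ; hypothetical: skip the next instruction if NO Borrow
        btfss   STATUS,C    ; no Borrow = Carry set, so skip when C is set
        endm

        subwf   rTemp,0     ; W = rTemp - W
        SKIPNBW             ; all well: step over the goto
        goto    Underflow   ; only reached when W was bigger than rTemp
        nop                 ; normal path continues here
Underflow
        nop                 ; borrow handling would go here
```

The point is that the 'which way round is Carry after a subtract ?' question is answered once, inside the macro, and never has to be thought about again.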

The Status flags

At some point every PIC programmer will be caught out by the fact that the 'Decrement/Increment (register) and skip if Zero' (INCFSZ and DECFSZ) instructions DO NOT affect the Status bits at all, and whilst both Increment and Decrement register (INCF, DECF) affect the Z bit, neither affects the Carry bit ! For those wanting to return a simple 'flag' from a Subroutine, 'RETURN with 0' does not affect the Z bit

Of course your 'new name' macros can 'fix' all this, HOWEVER it's going to cost extra instructions.
To avoid 'forcing' extra instructions on the user (who might not care what the Z or Cy status is), a 'flag' can be defined (eg 'setStatus' TRUE/FALSE) that the macro checks before adding a Zero test etc.

The 'missing' instructions

The most obvious 'missing' ones are ADD with Carry (ADDCy) and SUBtract with Borrow (SUBBw) plus a 'conditional' Jump / Call (Jump/Call on Carry etc). To complete the 'BIT' instructions, 'Bit Flip' would be 'good to have', as would be Shift and Rotate (without using Carry). Whilst the ability to TEST the Accumulator / a register for Zero would be 'nice', better would be SKIP / JUMP / CALL on Acc/reg zero / non-zero
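
To show why 'ADD with Carry' is missed, here is a sketch of a 16 bit add (b = b + a) written with the original mnemonics. All four register names and their addresses are placeholders invented for the example :-

```asm
STATUS  equ     0x03        ; Status register (same address on baseline and mid-range)
C       equ     0           ; Carry flag is bit 0 of STATUS
aLo     equ     0x20        ; placeholder addresses (assumptions), 16 bit value 'a'
aHi     equ     0x21
bLo     equ     0x22        ; 16 bit value 'b', which also receives the result
bHi     equ     0x23

        movf    aLo,0       ; W = low byte of a
        addwf   bLo,1       ; bLo = bLo + aLo, Carry = carry out of the low byte
        btfsc   STATUS,C    ; no carry: skip the increment
        incf    bHi,1       ; carry: push it into the high byte by hand
        movf    aHi,0       ; W = high byte of a
        addwf   bHi,1       ; bHi = bHi + aHi (plus the carry added above)
```

The 16 bit result is correct, but the final Carry is not a reliable '17th bit' (the INCF can wrap bHi from 0xFF to 0x00 without touching Carry), which is exactly the sort of detail a proper ADDCy macro would have to take care of.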

Macro's are an easy way to build 'extra instructions' using multiple PIC-33 'primitives'. You can even define things like "MULtiply" !
Since there is a 'nibble carry' (DC) Status bit (which is only set during Add and Subtract, not as a result of any INC or DEC), it might be a 'good idea' to define a set of BCD arithmetic instructions (to avoid the poor programmer having to spend time debugging code as the DC bit doesn't get set as expected).

You can find lots of good ideas for your own 'library' instructions on the web - for example, see here for some maths examples

Optimise the Macro expansion

When defining your own multi-instruction Macro, you can take the time to make an effort to minimise the instruction 'count'. Indeed, it's even possible to 'detect' the cpu 'type' (baseline etc) and 'tailor' the expansion to the available instruction set

Detecting the CPU type has the advantage that you can 'simulate' the 'expanded' instructions by using the baseline set
The two extra instructions in the 'expanded' (35-set) are 'ADDLW 0xNN' (ADD 'literal' to W reg) and 'SUBLW 0xNN' (Subtract Wreg from 'literal').
The Macro for 'ADDLW 0xNN' (using just the baseline set) would be "COPY Acc,rTemp LOAD 0xNN,Acc  ADD rTemp,Acc").
The Macro for 'SUBLW 0xNN' (0xNN-Wreg) is rather more complex. SUBWF is f-Acc, so copying Wreg to rTemp and loading 0xNN into the Acc will get us Wreg-0xNN, after which NEGate gets the 'right' answer BUT not the right Cy flag .....
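
The 'ADDLW' case, written out in original baseline mnemonics, might look like the sketch below. The macro name ADDLIT and the rTemp address are invented for the example, and no CPU type detection is attempted here :-

```asm
rTemp   equ     0x10        ; placeholder scratch register address (an assumption)

ADDLIT  macro   value       ; hypothetical stand-in for ADDLW on a baseline CPU
        movwf   rTemp       ; rTemp = W (save the Accumulator)
        movlw   value       ; W = the literal
        addwf   rTemp,0     ; W = rTemp + W = original W + literal
        endm

        ADDLIT  0x12        ; W = W + 0x12, at a cost of 3 instructions and rTemp
```

Because the final ADDWF adds the same two values that ADDLW would have added, the Z, Dc and Cy flags come out the same as well (which is what goes wrong in the SUBLW case, where the answer has to be negated afterwards).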

How MPASM deals with Macros

A Macro definition consists of a Name, the word 'macro' and a 'parameter list' ('local variable names', comma separated = in 'PIC speak' these are known as 'arguments' however 'local variable names' is rather more understandable). To use a Macro within your code, you just type its name followed by a (comma separated) list of values = you don't have to set values for all the parameters, any you don't set will be passed as "" ('empty string' or 'null')

When the macro name is found in your code by MPASM, the definition is fetched and processed one line at a time. To process a line, first the 'local variable names' (in the Macro definition) are REPLACED with the EXACT text 'values' you entered in your code (i.e. each occurrence of a 'local name' is replaced by the 'matching' value text - EVEN IF THAT IS 'blank', i.e. if you cut the list short or skipped (omitted) one (or more) of the values).

The resulting line is then 'processed' (i.e. 'expanded') and 'resolved'. Each line of the macro is processed, one after the other, until an 'exitm' or 'endm' is found

Note that the 'local variable' is replaced with the TEXT value from the invoking line BEFORE the macro line is 'run'. It is only when the line is 'run' that the actual value of the text is 'evaluated' and 'resolved', usually into a numeric value.
If the 'invoking' line has fewer parameter values than the list of 'local variables' in the macro definition, then any 'left over' local variable names will be replaced with 'nothing' (i.e. "" = blank).
This is key to how the 'conditional assembly' macro script IF instructions 'work' i.e. how the 'if' test is first processed into text and then resolved into zero (FALSE) or non-zero (TRUE). The IF result will 'choose' which of the macro code lines are processed and thus which are used to construct 'real' PIC-33 instructions

For example, the PIC-33 instruction set has a single 'ADD' instruction. This always ADD's the Accumulator (in PIC-speak the 'W (or 'Working') Register') to a Register (in PIC-speak, a 'file'), with the result being saved in either the Acc or the Reg. The (totally illogical) PIC-33 mnemonic for this is "ADDWF f,d" where "f" ('file') identifies the Register and "d" ('destination') is either '0' (for destination=Acc) or '1' (for dest=Reg).

I replaced this with a generic "ADD (source),(destination)" instruction. This is defined as ADD macro a,b, where 'a' and 'b' are the 'local variables' that will be used within the macro definition. To 'invoke' the macro, the user types :
ADD (source = value, Acc or RegNN), (destination = Acc or Reg (if not specified, Acc is assumed))

A few 'special cases' leap to mind - one is when the user types 'ADD 1,Acc (or Reg)' (which means 'Increment Acc' (or Reg)), the other is 'ADD Acc' (or 'ADD Acc,Acc') which means 'shift Acc left' = these are dealt with in the actual code

The 'Accumulator' will be the 'default destination', allowing the user to type 'ADD x' to mean 'Add x to Accumulator' (where x = Register or immediate value)

This means the macro has to 'detect' if the 'b' value is 'missing'.
One way to detect 'nothing' is by using a 'conditional test' of the form "IF (b + 0) == 0".
When the macro is evaluated, if no parameter is set for the 'b' variable, then the 'b' in "IF (b + 0) == 0" will be replaced by "" (empty string). On expansion, the 'IF' test becomes :-
IF (  + 0) == 0
Which is 'true' (so long as any 'real' value of 'b' is greater than 0) and means b is 'missing'.
Of course this "IF (b + 0) == 0" trick fails when the value of 'b' could actually be zero - and, unfortunately, there is one very common case where 'b' can indeed be 0 = and that's when b is the INDF 'register' i.e. the 'indirect pointer' register (which is register number 0).
One way around the INDF 'problem' is to 'cheat', for example we could define INDF as 0x20 (which will 'map' to 0 in every 'register bank' = see above 16F57/59 register addressing). Another method is to SET INDF to some non-zero value before the IF, then SET it back to zero after the IF
Since b="" (missing) leads to the same case as b=Acc, to save code, we just SET Acc=0
If one of the macro's needs to uniquely 'detect' the 'Acc', which is just a variable name, it can be SET to some value exceeding the 'normal' range of 'immediate' values (or Register addresses) = eg. 0x100 before doing the IF test.
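
A cut-down sketch of the 'missing parameter' test is shown below. The macro name ADDREG and the registers rA and rB are invented for the example, the source is assumed to always be a register, and it relies on MPASM accepting the "( + 0)" text exactly as described above. It also has the INDF weakness already mentioned (a destination at address 0 looks 'missing') :-

```asm
ADDREG  macro   a, b            ; hypothetical: a = source register, b = optional destination
        if (b + 0) == 0         ; with b omitted this line becomes "if ( + 0) == 0"
        addwf   a,0             ; no destination given: Acc = Acc + a
        else
        movf    a,0             ; Acc = a (the old Acc value is lost)
        addwf   b,1             ; b = b + a
        endif
        endm

rA      equ     0x20            ; placeholder register addresses (assumptions)
rB      equ     0x21

        ADDREG  rA              ; expands to:  addwf rA,0
        ADDREG  rA, rB          ; expands to:  movf rA,0  then  addwf rB,1
```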

To 'detect' if the user has specified an immediate value or a Register address (or Accumulator), the user must always specify a Register 'by name' (i.e. as a text string - for example 'rTemp', 'INDF', 'TRIS' or 'regN' (reg0, reg1, reg2, reg3 etc)), rather than by its actual address value

To 'detect' the use of a register 'name', all the names can be SET to some value outside the 'immediate value' range  = eg. 0x100 before the IF test, and reSET back to their correct values after the IF test.
Since macros can be 'nested' (up to 16 levels), this is (relatively) easy to do using another macro that has a single parameter. To SET the registers 'outside' their 'normal' values I use 'setReg FALSE', to SET them back to 'normal' values I use 'setReg TRUE' (a single setReg macro definition avoids the need to keep multiple setReg macros 'in step' when the register names change or are added to).

The WHILE and #v() directives

To repeat the evaluation of a code block, the 'while (expression), code block, endw' construct is used

Note that any non-zero value is interpreted as 'TRUE' (so only 0 == FALSE)

while expression  ;the following code block will be evaluated so long as expression == TRUE (i.e. non-zero)
code ...                     ;the code to be repeated
... block                    ;(any variables within the block are re-evaluated each time the block is repeated)
endw              ;marks the end of the code  block to be repeated

MPASM also supports a 'variable numeric value' using the "#v(variable name)" construct. The variable name will usually be a local variable that is 'confined' to within a macro, the value of which is modified by the macro code. When "#v(variable name)" is found (anywhere within a line of macro code), it is evaluated and the numeric result inserted into its place immediately after the other local variables have been replaced.

In MPASM, any text found at the start of a line == a label, so if the line starts with anything containing "#v()" then the numeric value becomes part of a label.
Anything after the first space in a line is taken as code (assembler instructions), so, if "#v()" is found in the code, the numeric becomes part of the code (note, by default, the numeric is a hex value) - unless the "#v(variable name)" is enclosed as "D'#v(variable name)'" (for Decimal) or "B'#v(variable name)'" (for Binary)
NB. If a ";" is found in the line, anything after that is assumed to be a comment (and is ignored)

For example :-

 local fred
fred set 3
 while fred
 CLR reg#v(fred)     ;use fred first ..
fred set fred-1      ;.. then count down (the other way round would give reg2, reg1, reg0)
 endw

will evaluate to :-

 CLR reg3
 CLR reg2
 CLR reg1

The #v() can be used anywhere in a label = this allows the 'auto-generation' of jump destination labels (for example)

For example :-
 local fred
fred set 1
label#v(fred) ...
 ...
fred set fred+1
label#v(fred) ...
will be expanded to :-
label1 ...
 ...
label2 ...

Next page :- New PIC 33 mnemonics - (macro code)