[cc65] Understanding inner workings of ca65

From: Spiro Trikaliotis <ml-cc651trikaliotis.net>
Date: 2013-07-03 21:18:54

As some of you might know, I have put a commented disassembly (which is
a source file for ca65) on my homepage at http://cbmrom.trikaliotis.net/

While at it, I wanted to fix an annoyance I have had.

Have a look at: http://cbmrom.trikaliotis.net/listings/c64-03.lst
especially at the lines:

000000r 2                       .include "token.s"
000000r 3                               init_token_tables
000000r 3               
000000r 3  rr rr 45 4E                  keyword_rts "END", bEND, TokEnd
000004r 3  C4 xx        

The macro keyword_rts is defined as follows:

000000r 2               ; optionally define token symbol
000000r 2               ; count up token number
000000r 2               .macro define_token token
000000r 2                       .segment "DUMMY"
000000r 2                               .ifnblank token
000000r 2                                       token := <(*-DUMMY_START)+$80
000000r 2                               .endif
000000r 2                               .res 1; count up in any case
000000r 2               .endmacro
000000r 2               
000000r 2               ; lay down a keyword, optionally define a token symbol
000000r 2               .macro keyword key, token
000000r 2                               .segment "KEYWORDS"
000000r 2                               htasc   key
000000r 2                               define_token token
000000r 2               .endmacro
000000r 2               
000000r 2               ; lay down a keyword and an address (RTS style),
000000r 2               ; optionally define a token symbol
000000r 2               .macro keyword_rts key, vec, token
000000r 2                       .segment "VECTORS"
000000r 2                               .word   vec-1
000000r 2                               keyword key, token
000000r 2               .endmacro

Thus, altogether, we get the equivalent:

.macro keyword_rts key, vec, token
    .segment "VECTORS"
        .word vec-1
    .segment "KEYWORDS"
        htasc key
    .segment "DUMMY"
        ; (the token symbol, if given, is defined here)
        .res 1
.endmacro

(BTW, these are definitions and code parts that Michael Steil used in his
famous MS BASIC disassembly on http://pagetable.com/.)

Thus, the macro writes a word, a string (last byte ORed with $80) and a
dummy value into three different segments. However, this is not
represented correctly in the listing file:

000000r 3  rr rr 45 4E                  keyword_rts "END", bEND, TokEnd
000004r 3  C4 xx        

The bytes themselves are correct ("rrrr" for vec-1; 45 4E C4 for "END";
and "xx" for .res 1). However, one cannot see that these bytes actually
go into different segments.

My approach to fix this was to make the listing file also output the
"executed" lines of macro expansions, as this seems to me to be the
easiest solution.

I did some debugging sessions in ca65, and I *believe* I found something
out that I would like to have confirmed.

If I understand it correctly, for macros, ca65 does *not* remember the
actual lines, but only the already tokenized input.
a. Is this correct?
b. If so, is there no way to recreate the exact source input line, or is
   this data hidden somewhere?
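
To make question b concrete, here is a minimal C sketch of why a token
list alone cannot give back the exact source line. It is a toy: the Tok
type and the tokenize() and render() functions are hypothetical and are
not ca65's scanner or data structures. It only assumes that tokens carry
no spacing or comment text.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical token: only the token text, no spacing, no comments. */
typedef struct {
    char text[32];
} Tok;

/* Split a line into whitespace-separated tokens and drop a trailing
 * ';' comment. Returns the number of tokens stored (at most max). */
static size_t tokenize(const char *line, Tok *out, size_t max)
{
    size_t n = 0;
    const char *p = line;
    while (*p != '\0' && *p != ';' && n < max) {
        while (*p != '\0' && isspace((unsigned char)*p)) ++p;
        if (*p == '\0' || *p == ';') break;
        size_t len = 0;
        while (*p != '\0' && *p != ';' && !isspace((unsigned char)*p)) {
            if (len + 1 < sizeof out[n].text) out[n].text[len++] = *p;
            ++p;
        }
        out[n].text[len] = '\0';
        ++n;
    }
    return n;
}

/* Rebuild a line from tokens, separated by single spaces. Indentation
 * and comments of the original line are not available here. */
static void render(const Tok *toks, size_t n, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    size_t used = 0;
    buf[0] = '\0';
    for (size_t i = 0; i < n; ++i) {
        int w = snprintf(buf + used, size - used, "%s%s",
                         i ? " " : "", toks[i].text);
        if (w < 0 || (size_t)w >= size - used) break; /* truncated */
        used += (size_t)w;
    }
}
```

Feeding the line "        .res 1; count up in any case" through
tokenize() and render() yields ".res 1": the instruction survives, but
the indentation and the comment are gone.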

Now, I have done some more tests. In CreateListing() in
src/ca65/listing.c, I called DumpLineInfos() (from src/ca65/lineinfo.c)
for each fragment and found that the information about which line invoked
which other line is actually available; that is, it is recorded which
macro was called, even for nested macros.

However, I only have this info for lines that actually emit something. I
do not have it for the lines "in between" (for example, the lines with
the .segment pseudo-opcodes).

I fail to see how I could benefit from this info. The struct Fragment
does not even contain the segment name, so I cannot even fake this info
with the help of this struct.

I am currently thinking about different approaches to get a "full"
listing:

a. add "somehow" dummy-fragments for lines in macros which do not emit
   data bytes; this way, the struct Fragment would contain line info for
   these lines, too.
b1. from the file name/line number info, recreate the input line by
    rereading the input file (this also applies to macro lines that
    actually emit data)
b2. alternative to b1: add code so that the full input line data is
    actually added, so that I do not need to reread the input line.
    (where, how is totally unclear to me ATM)
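
Approach b1 can be sketched in plain C as follows. This is only an
illustration: read_source_line() is a hypothetical helper that does not
exist in ca65, and it assumes that the file name and a 1-based line
number are available from the line info.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of ca65): copy line number `line`
 * (1-based) of the file `path` into buf, without the trailing newline.
 * Characters that do not fit into buf are dropped.
 * Returns 1 on success, 0 if the file cannot be opened or has fewer
 * lines than requested. */
int read_source_line(const char *path, unsigned line,
                     char *buf, size_t size)
{
    assert(path != NULL && buf != NULL && size > 0);
    FILE *f = fopen(path, "r");
    if (f == NULL) return 0;

    unsigned cur = 1;   /* number of the line currently being read */
    size_t len = 0;
    int c, found = 0;
    while ((c = fgetc(f)) != EOF) {
        if (cur == line) {
            found = 1;
            if (c == '\n') break;          /* end of the wanted line */
            if (len + 1 < size) buf[len++] = (char)c;
        } else if (c == '\n') {
            ++cur;
        }
    }
    fclose(f);
    buf[len] = '\0';
    return found;
}
```

Reopening and scanning the file for every listing line is slow, so a
real implementation would probably cache the file contents. Note also
that the text obtained this way is the macro definition line as written,
with the parameter names rather than the actual arguments.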

Alternative to this:
Use b2 throughout, that is, for each macro, remember the actual lines
that led to the tokens (by copying them) and generate the full listing
info from these copies.
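
A minimal sketch of this "copy the lines" idea, again in plain C. The
MacroLine and MacroBody types and both functions are hypothetical; how
ca65 actually stores a macro body is not shown here. The sketch only
assumes that the raw text of each body line is still at hand while the
macro is being defined.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one line of a macro body: a private copy of
 * the raw source text, kept for the listing. */
typedef struct MacroLine {
    char *source;
    struct MacroLine *next;
} MacroLine;

/* Hypothetical macro body: the lines in definition order. */
typedef struct {
    MacroLine *head;
    MacroLine *tail;
} MacroBody;

/* Append a copy of raw_line to the body. Returns 1 on success and 0 if
 * memory runs out. The caller may reuse its line buffer afterwards. */
int macro_body_append(MacroBody *body, const char *raw_line)
{
    assert(body != NULL && raw_line != NULL);
    MacroLine *ml = malloc(sizeof *ml);
    if (ml == NULL) return 0;
    size_t len = strlen(raw_line) + 1;
    ml->source = malloc(len);
    if (ml->source == NULL) {
        free(ml);
        return 0;
    }
    memcpy(ml->source, raw_line, len);
    ml->next = NULL;
    if (body->tail != NULL) body->tail->next = ml;
    else body->head = ml;
    body->tail = ml;
    return 1;
}

/* Release all stored lines and reset the body to empty. */
void macro_body_free(MacroBody *body)
{
    assert(body != NULL);
    MacroLine *ml = body->head;
    while (ml != NULL) {
        MacroLine *next = ml->next;
        free(ml->source);
        free(ml);
        ml = next;
    }
    body->head = body->tail = NULL;
}
```

During expansion, the listing code could then print the stored text of
every body line, including lines such as .segment that emit no bytes.
The price is one extra copy of each macro line in memory.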

Is this clear? Is this feasible? What would make the most sense?

Thanks in advance for any help.


Spiro R. Trikaliotis
Received on Wed Jul 3 21:19:06 2013
