After I had to come up with a name for the fastloader I decided to post this bog-standard routine I use in Paradroid Redux to depack waypoint and droid distribution data. The only reason it's worth of posting is that it doesn't use any other register than accumulator.
; in: ax=ptr
InitGetBits
stx .get+1
sta .get+2
lda #$80 ; start with end bit
sta cbits
rts
GetBits
.1 asl cbits ; 5
bne .3 ; 8
pha ;( 3)
.get
lda $1234 ;( 7)
inc .get+1 ;(13)
bne .2 ;(16) assume not taken
inc .get+2
.2 rol ;(18) rotate end bit in
sta cbits ;(21)
pla ;(25)
.3 rol ; 10
bcc .1 ; 13 assume taken
rts
13 cycles per bit unless new byte is needed, 25 cycles extra every 8th bit. This adds up to 16.125 per bit on average, not accounting for jsr/rts overhead.
Calling it requires loading A with a bit mask telling it when to stop. Usually this is single bit to return only input data, for example %00010000 to read four bits. However, as I need to ORA the four-bit Y coordinate with #$20, I can do it for free by doing "LDA #%00010010; JSR GetBits". The loop ends when the highest bit is rotated into the carry, so A is "0010yyyy" at that point. Even better, after I've stored the ORed value I need to multiply A by four and then add two, and this time I can use the fact that carry is always set on exit so it's just "ROL; ASL". Try to do that with a high-level language :)
Edit: oh bugger, IE6 collapses multiple line feeds into one even inside PRE tag. I hope IE7 is more intelligent.
1 comment:
For multiple lines, I do <newline><space><newline><space>
Seems to work better.
Post a Comment