Sunday, July 6, 2008

Time is of the Essence

After reading about Mike Dailly's raster split trouble I decided to write a bit about how the top split is done in Paradroid Redux.



There are three things to do when the playing area begins:

  • change background color
  • start character display
  • start sprite display
The first one is just a $D021 change; the other two require a bit of trickery.

To mask vertically scrolling characters one would usually use an illegal graphics mode (Extended Color Mode combined with Bit Map Mode and/or Multi Color Mode) or sprites. The first method produces black pixels, so it isn't usable unless the background is black, and the second one is unusable if sprites need to cross the split.

So, it's time for some font trickery. Raster interrupt several lines above the split changes to blank font, and then the actual split interrupt changes $D018 to display correct character data. This means "wasting" $0800 bytes for the blank font, but half of that memory is used as temporary buffer elsewhere so actual cost of clean split is one kilobyte.


Clipping sprites cleanly requires trickery as well, as there is no way to start sprite display from the middle of graphic data. One way to achieve clipping is clearing the top of the sprite graphics if it overflows the top split, but that would waste time both when clearing the memory and when running extra animator/digit generator rounds to restore the top of the sprite when more of it gets shown. Another way is to put the sprites at a non-visible x-coordinate and then move them into place at the correct line. However, there isn't enough time to change multiple registers on that line.

Guess what? $D018 comes to the rescue once again. There is an extra screen where the sprite pointers point to blank sprites. When the time comes, the split interrupt doesn't only change the font (bits 1-3 of $D018) but also the displayed screen (bits 4-7 of $D018). Nice and easy solution, but it requires a blank screen, another one kilobyte wasted. No, not really - there is no reason why one half of the blank font couldn't be used for the blank screen. What that means is that sprite clipping is practically free, as there is no extra memory required and $D018 needs to be written anyway for character blanking.
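As a worked example of the bit layout just described, here is a small Python sketch that packs a screen address and a font address into one $D018 value. The addresses used are hypothetical (the post doesn't say where Paradroid Redux keeps its screens and fonts); they are offsets inside the current 16K VIC bank and were picked only to show that a blank screen sharing its address with the blank font still gives a valid register value.

```python
def d018(screen_addr, font_addr):
    """Pack bank-relative screen and font addresses into a $D018 value.

    Bits 4-7 select the screen in $0400 steps, bits 1-3 select the
    font in $0800 steps. Bit 0 is left clear.
    """
    assert screen_addr % 0x400 == 0 and 0 <= screen_addr < 0x4000
    assert font_addr % 0x800 == 0 and 0 <= font_addr < 0x4000
    return ((screen_addr // 0x400) << 4) | ((font_addr // 0x800) << 1)

# Hypothetical layout, NOT taken from the game:
BLANK_FONT = 0x3800    # blank font, its first half doubles as blank screen
BLANK_SCREEN = 0x3800  # same address as the blank font
GAME_SCREEN = 0x0400
GAME_FONT = 0x3000

above_split = d018(BLANK_SCREEN, BLANK_FONT)  # blank chars, blank sprites
below_split = d018(GAME_SCREEN, GAME_FONT)    # real chars, real sprites
print(f"above: ${above_split:02X}  below: ${below_split:02X}")
```

With this layout a single store of the second value switches both the font and the sprite pointers, which is the whole point of the trick.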


There is one problem with the screen change though. While VIC-II reads font data and sprite pointers & sprite data every line, character pointers (screen data) are only read on every eighth line. This means that while sprite and font changes are immediate, screen change affects display 0-7 lines later. To overcome this the top line of playing area is copied onto blank screen so character pointers are correct when $D018 is changed.

As blank screen is located inside blank font this copying creates yet another problem. Blank font isn't that blank any more. The topmost line is at SCREEN + $140, which means that chars $28-$2c aren't blank and will produce garbage if they appear on the topmost line. The easiest way to avoid that is to not use those chars inside playing area at all, so that's what is done. The same problem happens because of blank sprite pointers at SCREEN + $3F8, char $7F. That one is unused as well.
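The character numbers above follow from simple arithmetic: each character definition takes 8 bytes, so any byte written into the font area dirties character number offset / 8. A short Python sketch of that calculation, assuming the blank screen starts at the same address as the blank font (which is what the $28-$2c and $7F figures imply):

```python
CHAR_BYTES = 8  # one character definition is 8 bytes

def dirty_chars(first_offset, length):
    """Character codes whose font data overlaps the written byte range.

    first_offset is measured from the start of the font, which is
    assumed to be the start of the blank screen as well.
    """
    first = first_offset // CHAR_BYTES
    last = (first_offset + length - 1) // CHAR_BYTES
    return list(range(first, last + 1))

top_row = dirty_chars(0x140, 40)         # one 40-byte screen row
sprite_pointers = dirty_chars(0x3F8, 8)  # eight sprite pointers

print([f"${c:02X}" for c in top_row])
print([f"${c:02X}" for c in sprite_pointers])
```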


And what does all this have to do with the timing problems Mike mentioned?

VIC-II doesn't have time to fetch sprite data when CPU is not using memory bus, so it has to stop CPU momentarily whenever sprites are active. Just how many cycles VIC-II steals from CPU depends on which sprites are active, and this causes a timing hell as you have to change registers when C64 is inside the side border area to avoid flicker. With $D021 change you have all the side border (23 cycles) to change it, but $D018 is trickier. Sprite data is fetched very early, the first three sprites get their data read at the end of previous line. This means that if $D018 write is late sprites will stay blank for one extra line.

To get register writes done at the very beginning of side border area the game uses CIA timer to stabilize raster timing. During game init CIA1 timer A is started, running through 63-65 cycles depending on VIC-II version. This means that $DC04 is always synchronized with current display X position. IRQ only needs to read $DC04 and skip that many cycles.
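Why the timer tracks the X position can be seen from a simplified model: a down-counter whose period equals one raster line returns to the same value at the same point of every line, and an interrupt that is served a few cycles late reads a value that is lower by exactly that many cycles. The Python sketch below is only such a model. The function name and the `started_at` parameter are made up for illustration, and it says nothing about how the game aligns the timer so that the IRQ reads values in the 1-15 range.

```python
def timer_low(cycle, cycles_per_line, started_at=0):
    """Value of a free-running down-counter at a given CPU cycle.

    The latch is cycles_per_line - 1, because a counter loaded
    with N underflows every N + 1 cycles.
    """
    latch = cycles_per_line - 1
    return latch - (cycle - started_at) % cycles_per_line

PAL_CYCLES = 63  # cycles per raster line on a PAL VIC-II

# Same X position, 100 lines apart: same timer value.
same_x = timer_low(20, PAL_CYCLES) == timer_low(20 + 100 * PAL_CYCLES, PAL_CYCLES)

# IRQ served 3 cycles later: the value read is 3 lower.
jitter = timer_low(20, PAL_CYCLES) - timer_low(23, PAL_CYCLES)
print(same_x, jitter)
```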


Did I forget bad lines? Every eighth line VIC-II needs to read character pointers and that's not possible without stopping CPU for most of the raster line. This means that there is absolutely no time for anything unnecessary. In case the split happens on a bad line the game triggers raster IRQ two lines above split, prepares next interrupt at the correct line and then preloads $D018/$D021 values and executes two-cycle instructions until IRQ happens. That guarantees minimum interrupt latency. When raster interrupt happens it will just write those two registers, clean up the stack (this second interrupt pushed status register and return address onto the stack) and jump into the common code.
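Whether the split lands on a bad line depends only on the vertical scroll value: a raster line inside the display window is a bad line when its low three bits equal YSCROLL (the low three bits of $D011). A minimal Python sketch of that test, simplified by assuming the display is enabled and YSCROLL isn't changed mid-frame; the split line 118 is the one used in the listing further down.

```python
def is_bad_line(raster, yscroll):
    """True if VIC-II fetches character pointers on this raster line.

    Simplified: assumes the display is enabled (DEN set) and ignores
    tricks that change YSCROLL in the middle of the frame.
    """
    return 0x30 <= raster <= 0xF7 and (raster & 7) == (yscroll & 7)

SPLIT_LINE = 118  # raster line of the split in the listing

bad_yscrolls = [y for y in range(8) if is_bad_line(SPLIT_LINE, y)]
print(bad_yscrolls)
```

Only one of the eight scroll positions hits the special case, so the faster but less flexible bad line path runs on a minority of frames.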



Nothing explains code better than source, so here it is. Only relevant parts are shown, and for clarity I've removed all assembly directives which were there to make sure branches don't span page boundaries (which would add one cycle to the branch).

; IRQ at line 95, prepare for split
 
...
 
lda #$10
ora _vScroll
sta $d011
cmp #$16
bne .057
 
; special case for bad line

lda #<Irq_116
sta $fffe
 
lda _d018+1
sta _d018b+1
lda _d021+1
sta _d021b+1
lda #116
bne .x1 ; jmp
 
; normal case
 
.057 lda #<Irq_118
sta $fffe
lda #118
 
.x1 sta $d012
 
...
 
;----------------------------------------------------------------
 
; this one used when ($d011 & 7) = 6, stuffs
; d018/d021 as fast as possible at raster 118
 
subroutine
Irq_116 pha
sty .yr+1
cld
 
lda #<Irq_118b
sta $fffe
lda #>Irq_118b
sta $ffff
lda #118
sta $d012
inc $d019
 
; 118/15 = 7.8 so this one is executed 8 times
 
sbc #15
bcs *-2 ; 8*5-1=39 cycles
 
; preload registers and execute 2-cycle
; instructions until next IRQ happens
 
_d018b lda #scr_GAME
_d021b ldy #0
cli
repeat 16
cli ; 32 cycles wasted
repend
 
; now is the time to write registers, we always enter
; via interrupt as the above code never runs this far
 
Irq_118b
sta $d018
sty $d021
 
; clean up stack and continue normal IRQ code
 
pla ; flags
pla ; PC lo
pla ; PC hi, always != 0
bne .irq0 ; jmp
 
;----------------------------------------------------------------
 
; normal case, use timer value to stabilize
; raster regardless of sprites over the split
 
Irq_118
pha
sty .yr+1
cld
 
lda $dc04 ; [1,15] ([2,15] if NTSC/Drean)
eor #$0f ; [14,0] ([13,0])
sta .j3+1
.j3 bpl *+2 ; jump into the delay code
 
; entering at offset 0 delays 16 cycles,
; entering at offset 14 delays 2 cycles
;
; OP_CMP_IMM is opcode for CMP #immediate (2 cycles),
; OP_CMP_ZP is opcode for CMP $zeropage (3 cycles)
 
cmp #OP_CMP_IMM
cmp #OP_CMP_IMM
cmp #OP_CMP_IMM
cmp #OP_CMP_IMM
cmp #OP_CMP_IMM
cmp #OP_CMP_IMM
cmp #OP_CMP_ZP
nop
 
_d021 ldy #0
_d018 lda #scr_GAME
sta $d018
sty $d021
 
; continue with interrupt
.irq0 ...
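
The delay code in Irq_118 works because its operand bytes are themselves valid opcodes, so the branch can land on any byte and still execute something sensible. The Python sketch below models only that 15-byte slide (not the branch or the timer read) and counts the cycles spent from each entry offset. The opcode values are the standard 6502 ones ($C9 for CMP #immediate, $C5 for CMP zeropage, $EA for NOP); the function names are made up for illustration.

```python
OP_CMP_IMM = 0xC9  # CMP #imm : 2 bytes, 2 cycles
OP_CMP_ZP = 0xC5   # CMP zp   : 2 bytes, 3 cycles
OP_NOP = 0xEA      # NOP      : 1 byte,  2 cycles

# opcode -> (length in bytes, cycles)
OPCODES = {OP_CMP_IMM: (2, 2), OP_CMP_ZP: (2, 3), OP_NOP: (1, 2)}

# Bytes of the slide as assembled: six "cmp #OP_CMP_IMM",
# one "cmp #OP_CMP_ZP", one "nop".
SLIDE = [OP_CMP_IMM, OP_CMP_IMM] * 6 + [OP_CMP_IMM, OP_CMP_ZP] + [OP_NOP]

def slide_delay(offset):
    """Cycles spent when execution enters the slide at byte `offset`."""
    pc, cycles = offset, 0
    while pc < len(SLIDE):
        length, cost = OPCODES[SLIDE[pc]]
        pc += length
        cycles += cost
    assert pc == len(SLIDE)  # every entry point falls out at the same place
    return cycles

def delay_for_timer(timer_low):
    """Delay for a given $DC04 reading; the offset is the reading EOR $0F."""
    return slide_delay(timer_low ^ 0x0F)

for offset in range(len(SLIDE)):
    print(offset, slide_delay(offset))
```

Every entry offset costs 16 minus the offset, so odd offsets are handled too: they run the operand bytes as CMP #immediate and finish with the $C5 byte acting as CMP $EA for three cycles.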

8 comments:

dzdt said...

You say you're spending $0800 of memory for a [nearly] blank font. Why not enable ECM mode as well for the split, and reduce that to $0200 of memory? Or consider enabling graphics mode, and just have to have the bitmap blanked out for the appropriate rasters in the split zone?

TNT said...

ECM/bitmap methods require one more register write ($d011), and I run out of time with just two ($d021 and $d018) when there are multiple sprites over the split happening near bad line.

dzdt said...

:^) Okay, here's a more complicated idea that might work. How about moving the bad line out of the way? Set up so the characters pointers for the first row of the play area are read 8 rasters before they are normally. Then use the double-line trick to force this character row to be re-displayed. The cost is two extra writes to $d011, one of which has to be timed pretty accurately between the end of the graphics row and start of sprite 0 data fetch. But now you're guaranteed no bad line at the split. That should give time to use ECM or BMM methods. Maybe?

dzdt said...

Is it a known bug that you can get ejected out into space? I managed to do this in the little front room on the bridge level, trying to shoot while standing on the lift. It ran me through the wall down and right. I have a vice snapshot saved if it happens to be any use...

Zarkov said...

Things seem to be going a bit slow around here lately, but I still thought I might mention an apparent bug I've come across in the current version.

What's with those type 834s and 883s that seem to be invulnerable to laser fire? I ran into those repeatedly while in the middle cargo hold and, just now, the quarters. Of course it's easy to just zap the bastards with a disruptor, but it seems impossible to kill them with a 751 or 834, which is a bit annoying. Just now I burned through three droids trying to take one down, just to see if it could be done.

TNT said...

I hope to get a new version out before xmas, but there is only so much time and I've been busy with life. Push-through-wall-when-exiting-from-lift should be history then...

You can't damage 834 or higher with type 1 laser, that's how the original worked. I posted about this on Lemon64 many moons ago.

Zarkov said...

Well, I'll be ... I must have zapped gazillions of those buggers and never realized they were immune to type 1 lasers. I guess I always had the sense to simply disrupt them.

That thread on Lemon64 looks like interesting reading; thanks for the pointer.

Also, hooray for a new version! Another thing to look forward to this Christmas, besides the next episode of Doctor Who. I really doubt the game can be (much) improved with additional features and stuff, now that we got statistics and radar (for the introduction of which I love you dearly), but it's sure nice to see the bugs going.

Guy said...

Hi TNT. Just to say that I check on your progress every now and again, and news of a new version around Xmas time is great (if you find the time)! What you've done already is fantastic.