ZX Spectrum Bubble Bobble: graphics routines

Bubble Bobble is an arcade game originally released by Taito in 1986. The ZX Spectrum port, which was developed by Mike Follin, must have been a challenge to write because the ZX Spectrum is quite underpowered for a game with so many sprites. In this article we will take a look at how some of the graphics routines work.

Overview

Let's take a look at a recording of how the screen is painted. In this video the emulation speed is reduced so we can see the exact order in which things are painted. The screen is also pre-filled with all pixels set and attributes set to white on black, to fully show what is going on.

Some things to note:

Sprites are painted directly to the screen.
Player and enemy sprites are erased before being painted. The erased area is exactly 16 pixels wide instead of the full 24 pixels that would be required for a 16-pixel sprite pre-shifted into 24 pixels. This indicates that the erase routine uses an AND mask when restoring the background.
Bubbles are not erased before painting.
The background is not repainted. It is kept on screen from the previous frame.

Even though sprites are painted directly to video memory, there is also a back buffer at address 0xE000. This buffer is updated when entering a new level and contains the background graphics only. It is used as source data when erasing sprites.

buffers

Sprite routine

Finding the sprite routine is very easy using the excellent Spectrum Analyser tool. You just click on the pixel and it will show you the routine that wrote to that pixel address.

spectrumanalyser

It turns out that this game has several different sprite routines. One of them is at 0xA6F3. Its structure can be split into four sections. Let us call them init, pre, mid and post. They are described below.

init makes some calculations and stores them in registers. This section is quite big so it is described in pseudocode below for brevity.


; Prepare to paint one masked 16*24 sprite:
; Sprite source graphics pointer = sprite gx + animation index * 96. Store in SP so sprite data can be read quickly from stack.
; Add 11 to raster line counter (to be explained later).
; Obtain screen address by looking up the sprite Y coord in a screen offset table. Store in HL.
; Determine number of lines to write in pre and post sections.

Each sprite animation frame takes 96 bytes: 3 bytes of graphics data plus 3 bytes of mask data per row, over 16 rows, gives 16*6 = 96.
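The pointer arithmetic from the init section can be modelled in a few lines of Python. This is only a sketch: `gfx_base` and the helper names are hypothetical, and the byte order (mask first, then graphics) is inferred from the listings in this article, where each POP DE is followed by AND E (mask) and OR D (graphics), and POP DE loads E from the lower address.

```python
# Sketch of the frame layout; names and the example base address are hypothetical.
ROWS = 16
COLUMNS = 3                       # a pre-shifted 16-pixel sprite spans 3 bytes
FRAME_SIZE = ROWS * COLUMNS * 2   # one mask byte + one graphics byte per column


def frame_address(gfx_base, animation_index):
    """Address of an animation frame: base + index * 96, as a 16-bit value."""
    return (gfx_base + animation_index * FRAME_SIZE) & 0xFFFF


def read_row(memory, address):
    """Return three (mask, graphics) pairs, the way three POP DE would read them."""
    pairs = []
    for column in range(COLUMNS):
        mask = memory[address + 2 * column]          # E <- (SP), low address
        graphics = memory[address + 2 * column + 1]  # D <- (SP+1)
        pairs.append((mask, graphics))
    return pairs
```

Pointing SP at the frame means every POP DE fetches one mask/graphics pair and advances the pointer by two in a single instruction.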

Then it enters the pre section which looks like this.


NrOfLinesMod:
    LD B, XXXX      ; Actual number is runtime modified to be 1-8
NextLine:
    POP DE          ; Load sprite and mask from stack
    LD A,(HL)       ; Load from screen
    AND E           ; AND mask data
    OR D            ; OR sprite data
    LD (HL),A       ; Write 8 pixels to screen
    INC L           ; Advance to next column
    
    POP DE          ; Load sprite and mask from stack
    LD A,(HL)       ; Load from screen
    AND E           ; AND mask data
    OR D            ; OR sprite data
    LD (HL),A       ; Write 8 pixels to screen
    INC L           ; Advance to next column
    
    POP DE          ; Load sprite and mask from stack
    LD A,(HL)       ; Load from screen
    AND E           ; AND mask data
    OR D            ; OR sprite data
    LD (HL),A       ; Write 8 pixels to screen
    DEC L           ; Back up two columns
    DEC L

    INC H           ; Next row
    DJNZ NextLine   ; And loop

Each iteration of the loop writes one row of 24 pixels with masked sprite data. The init section determines the number of rows to update and writes this number to the address NrOfLinesMod+1, which is the operand byte of the LD B, XXXX instruction, so that the instruction loads the actual number when it runs.
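As a rough model of what one loop iteration does to the screen, here is the same AND/OR masking in Python. The function name and the flat `bytearray` screen are assumptions made for illustration; the real routine works on the Spectrum's interleaved screen memory through HL.

```python
def blit_row(screen, offset, row_pairs):
    """Write one masked sprite row: screen = (screen AND mask) OR graphics."""
    for column, (mask, graphics) in enumerate(row_pairs):
        current = screen[offset + column]               # LD A,(HL)
        screen[offset + column] = (current & mask) | graphics  # AND E / OR D / LD (HL),A


# Usage: a fully set background with a 3-byte sprite row painted on top.
screen = bytearray([0xFF, 0xFF, 0xFF, 0xFF])
blit_row(screen, 0, [(0xF0, 0x0A), (0x00, 0x3C), (0x0F, 0x50)])
```

Bits that are set in the mask keep whatever was on screen, and bits that are clear are replaced by the sprite graphics.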

Now on to the mid section. It updates the screen target address in HL and then continues writing to screen in the same way as above.


    POP DE          ; See comments from section above
    LD A,(HL)
    AND E
    OR D
    LD (HL),A
    INC L
    
    POP DE
    LD A,(HL)
    AND E
    OR D
    LD (HL),A
    INC L

    POP DE
    LD A,(HL)
    AND E
    OR D
    LD (HL),A
    DEC L
    DEC L
    INC H
    
    ; ... the block above is repeated 7 more times

This is basically an unrolled loop that writes 8 rows of sprite data.

Then the post section.


NrOfLinesModPost:
    LD B, XXXX  ; Actual number (0 to 7) determined in init-section
    INC B       ; INC/DEC pair leaves B unchanged but sets Z if B is zero
    DEC B
    JR Z, Done  ; Skip if zero rows
    EXX         ; Determine screen address. HL' holds row address offset table
    LD A,(HL)   ; Load low byte of screen address from address offset table
    INC L       ; Increase row address offset pointer
    EXX
    ADD A,C     ; Add column from C
    LD L,A      ; Set low byte of screen address
    EXX
    LD A,(HL)   ; Load high byte of screen address from address offset table
    EXX
    LD H,A      ; Set high byte of screen address
    
; Entering loop. B=nr of rows, HL=screen, DE=graphics.
PostLoop:       ; Write a row to screen using same method as sections above

    ; ... same row-writing code as in the "pre" section, omitted here

    DJNZ PostLoop   ; Loop to next row
Done:

So why are the rows split up and painted in three different sections? The reason is that the developer has taken advantage of the fact that moving to the next row within the same 8-row character cell is faster than moving to a row in another cell.

Code that uses HL as the screen pointer can simply use INC H to move to the next line. This works because H is the high byte of HL, so incrementing it increases HL by 256. As soon as you move to the next cell, you need a more complicated address calculation that takes more CPU cycles.
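The reason INC H works inside a cell is the way the Spectrum interleaves its screen memory. The standard address formula can be written as a small Python function (the function name is just for illustration) to check the 256-byte step directly.

```python
def screen_address(x, y):
    """ZX Spectrum display file address of the byte holding pixel (x, y)."""
    return (0x4000
            | ((y & 0xC0) << 5)   # which third of the screen
            | ((y & 0x07) << 8)   # pixel row inside the character cell
            | ((y & 0x38) << 2)   # character row inside the third
            | (x >> 3))           # byte column


# Within one character cell, consecutive rows are exactly 256 bytes apart.
steps_inside_cell = [screen_address(0, y + 1) - screen_address(0, y) for y in range(7)]

# Crossing from row 7 to row 8 enters the next cell and breaks the pattern.
step_across_cells = screen_address(0, 8) - screen_address(0, 7)
```

The pixel row inside the cell lives in the low three bits of the high byte, which is why a single INC H is enough for seven out of every eight row steps.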

What about the unrolled mid section? Remember that all sprites are 16 rows high. This means that regardless of the sprite Y coordinate there will always be a section of 8 rows that lies within the same character cell. This is why the mid section can write 8 rows without any branches.

Let's try an example with a sprite that has Y coordinate 3. This gives 5 lines to update in the first loop (8 - (3 mod 8) ). Then 8 lines in the unrolled loop. Then finally 3 more lines (3 mod 8). In total this updates 16 lines.
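The same arithmetic as a small Python helper. The function name is made up, and this is a restatement of the example above rather than a transcription of the init code.

```python
def split_rows(y):
    """Split a 16-row sprite at Y into (pre, mid, post) row counts."""
    post = y % 8       # rows that spill into the third character cell (0 to 7)
    pre = 8 - post     # rows left in the first character cell (1 to 8)
    mid = 8            # always one full cell, handled by the unrolled code
    return pre, mid, post
```

For Y = 3 this gives (5, 8, 3), and for a Y coordinate that is a multiple of 8 it gives (8, 8, 0), which matches the 1-8 and 0-7 ranges noted in the listings.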

yalign

Raster aware sprites?

A note by Kevin Edwards caught my attention. He has access to the original commented source code of Bubble Bobble. The picture he posted indicates that the game features some kind of raster aware sprite routine code. I have not seen that before so I wondered how it works. The ZX Spectrum hardware does not provide a way to read the current raster scan line.

tweetpic1

The reason for wanting to know the scan line is to avoid sprite flickering. A display updates from top to bottom. If we move or erase a sprite while the raster is over it then this will result in flickering or tearing.

Let's take a look at the code. We find the routine at address 0xA3AD. It differs slightly from the picture above, most likely because the picture shows a version of the source code that is not quite final.


SpriteLoop:
    LD A,$01
    CALL AddAndReadRasterPosition
    LD B,A                  ; Top of scan
    ADD A,30                ; Note: was 20 in image above
    LD C,A                  ; Bottom of scan
    LD A,(IY+$06)           ; Load sprite Y position
    CALL CompareWithRasterPos
    JR C,NextSprite         ; Skip to next sprite if comparison returned true
    ADD A,16                ; Also test Y + 16
    CALL CompareWithRasterPos
    JR C,NextSprite
    LD A,(IY+$02)           ; Load sprite old Y position
    CALL CompareWithRasterPos
    JR C,NextSprite 
    ADD A,16                ; Also test old Y + 16
    CALL CompareWithRasterPos
    JR C,NextSprite
    BIT 5,(IY+$07)          ; Test DONE flag
    CALL Z,ProcessSprite    ; Process this sprite if DONE flag is not set
NextSprite:

The code above tests the sprite's current position and its old position for overlap with the current raster position and the 30 lines beneath it. What is missing from the picture above is the RASCHEK function and the definition of the RASTER label.


CompareWithRasterPos:           ; Named RASCHEK in original source
  ; On entry: A = Y-coordinate. B/C = Top/bottom line to compare against.
    CP C
    RET NC
    CP B
    CCF
    RET                         ; Returns with carry set if B <= A < C.
AddAndReadRasterPosition:
RasterPosition:                 ; RASTER label in original source
    ADD A,XXXX                  ; Add raster position to A
    LD (RasterPosition+1),A     ; Store updated value in raster position (modifies previous instruction)
    RET NC
    RET

Those are the raster check routines. It is surprising how simple they are. Nothing fancy is hidden here: the code simply updates a global variable (stored in the operand of a self-modified instruction). It turns out the code doesn't actually "know" the exact raster position at all. Instead it only tries to stay roughly in sync with the raster, and this is enough to avoid flickering.
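A Python model of the two routines makes the flag logic easier to follow. It is a sketch that mirrors the listings above with 8-bit unsigned values; the class and function names are invented, and the carry flag is represented by the boolean return value.

```python
def compare_with_raster_pos(a, top, bottom):
    """Model of RASCHEK: True (carry set) when top <= a < bottom."""
    if a >= bottom:      # CP C / RET NC: no carry when A >= C
        return False
    return a >= top      # CP B / CCF: carry ends up set when A >= B


class RasterEstimate:
    """Model of the self-modified ADD A,n operand that holds the raster estimate."""

    def __init__(self, position=0):
        self.position = position & 0xFF

    def add_and_read(self, amount):
        # ADD A,n wraps at 8 bits; the result is stored back as the new estimate.
        self.position = (self.position + amount) & 0xFF
        return self.position
```

Because the estimate is a single byte that is only ever added to, it wraps around on its own, so how well it tracks the real beam depends entirely on how well the added amounts match the time actually spent.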

An interesting part in the Kevin Edwards image is the line with out 254 that is commented out. This code changes the border color. Changing the border color is a way to visually see where the raster is. So if we set the border color to black before painting a sprite, then to white when we are done then we can see how long it takes. Fewer lines means a faster routine.

A guess is that the developer used the border color change, then measured with a ruler on the TV and took note of approximately how many lines were used, and then used this number in the code.

The code compares the sprite's old and new positions with the raster line to determine if it should paint the sprite in this loop iteration. This is to avoid flickering caused by erasing the old position or painting the new position when they collide with the raster position.

rasterscan
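The skip decision from the sprite loop can be sketched in Python as follows. It assumes the raster estimate has already been read, uses the 30-line window and 16-row sprite height from the listing, and invents the function names. Like the original, it only tests the top and bottom edges of the sprite.

```python
WINDOW_LINES = 30     # "ADD A,30" in the listing
SPRITE_HEIGHT = 16


def in_window(a, top, bottom):
    """Same result as the RASCHEK routine: top <= a < bottom."""
    return top <= a < bottom


def should_skip(y, old_y, raster):
    """True if the sprite's new or old position touches the raster window."""
    top = raster & 0xFF
    bottom = (top + WINDOW_LINES) & 0xFF   # 8-bit add, as in the Z80 code
    edges = (y, y + SPRITE_HEIGHT, old_y, old_y + SPRITE_HEIGHT)
    return any(in_window(edge & 0xFF, top, bottom) for edge in edges)
```

A skipped sprite is simply left for a later pass of the loop, by which time the estimate has moved on because other sprites have been processed.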

After painting or erasing a sprite, the raster line estimate is increased by calling AddAndReadRasterPosition.

Action	Raster line increase
Erase on aligned position	7
Erase non-aligned using AND-mask	20
Paint sprite on aligned position	8
Paint sprite on non-aligned position	11

Again, these numbers were probably chosen after roughly measuring the CPU cost of each action. An aligned position means an X coordinate that is evenly divisible by 8. At such a coordinate only two bytes per row need updating (8 pixels per byte, and the sprites are 16 pixels wide). A non-aligned write needs to update 3 bytes per row.
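The table and the alignment rule fit in a small lookup. This Python sketch uses the numbers from the table above; the dictionary layout and function names are assumptions made for illustration.

```python
# Raster line increase per action, keyed by (action, aligned).
RASTER_COST = {
    ("erase", True): 7,
    ("erase", False): 20,
    ("paint", True): 8,
    ("paint", False): 11,
}


def is_aligned(x):
    """An X coordinate is aligned when it is evenly divisible by 8."""
    return x % 8 == 0


def bytes_per_row(x):
    """A 16-pixel sprite covers 2 bytes when aligned, 3 bytes otherwise."""
    return 2 if is_aligned(x) else 3


def raster_cost(action, x):
    return RASTER_COST[(action, is_aligned(x))]


# Erasing and repainting one non-aligned sprite advances the estimate by 31 lines.
total_non_aligned = raster_cost("erase", 19) + raster_cost("paint", 19)
```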

xalign

Bubbles

The routine described above only handles the main sprites which are a maximum of 9 in total (2 player sprites and 7 enemies). There are also a lot of other moving objects in the game.

The bubbles are one category of objects that are handled separately, and they are painted in an interesting way: they blend with the background but not with each other. The reason is that the routine fetches the background from the back buffer, blends the bubble with it, and replaces what is on the screen with the result. To reduce the effect it uses AND masks on the left and right sides. Thanks to this blend technique it doesn't need to erase the previous location unless the bubble has a high movement speed (such as when launching a bubble or when a bubble with a captured enemy bursts).
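The difference from the main sprite routine can be illustrated for a single screen byte. This is a loose sketch: it assumes the blend is a plain OR of bubble pixels onto the back-buffer byte, which the disassembly discussed here does not confirm, and it ignores the AND-masked left and right edges. The function names are invented.

```python
def paint_sprite_byte(screen_byte, mask, graphics):
    """Main sprites: combine with whatever is currently on screen."""
    return (screen_byte & mask) | graphics


def paint_bubble_byte(background_byte, graphics):
    """Bubbles (assumed OR blend): combine with the back buffer, then overwrite the screen."""
    return background_byte | graphics


background = 0b10000001            # byte from the back buffer
bubble_a = 0b00111100
bubble_b = 0b00000110

screen = paint_bubble_byte(background, bubble_a)   # bubble A over the background
screen = paint_bubble_byte(background, bubble_b)   # bubble B replaces A in this byte
```

After the second write the byte holds only the background and bubble B. Bubble A's pixels are gone, which is both why overlapping bubbles do not blend and why a slowly moving bubble overwrites its own previous image without a separate erase step.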

The developer did not use the raster check routine for the bubbles. This is probably because flickering bubbles would be less annoying and noticeable for the player than flickering sprites. It also saves some important CPU cycles to not have to do this check.

Just as with the main sprites, there are different routines for painting bubbles depending on X alignment, and they again take advantage of the fact that an 8-row mid section will always be in the same character cell.

That sums up what I was able to decipher. The routines make heavy use of the shadow registers, with lots of EXX instructions inside the inner loops, which makes reverse engineering them a bit of a headache, so I'll stop here for now. It is interesting to consider how hard this code must have been to write at a time when development tools were very primitive (slow edit/build/test cycle with no debugger) and development was done by a single developer over a period of just a couple of months.

bubbleblend

Summary

On YouTube there are comments that this arcade conversion is not very good. I disagree, it is a great conversion given the hardware limitations. It even supports the 2-player mode which was very unusual for a Spectrum game. The developer had to use a lot of interesting techniques to make it possible to display as many sprites as possible and that is an impressive achievement.

@VilleKrum

Also see: ZX Spectrum Ghost 'n Goblins: graphics routines

Also see: Mike Follin interview