@ISSOtm
Member

@ISSOtm ISSOtm commented Oct 15, 2021

Also avoid describing SameBoy internals, instead relying on it when otherwise corroborated, or on schematics and/or test ROMs when possible.

Restructure the article to describe behavior more than components, especially in a way that is more friendly to someone not knowing what all the components are about.

Add a diagram, too, and move the mode timing diagram to the STAT article, where it belongs just as well, but where it will be more visible and thus more useful.

Fixes #377, fixes #408.

@ISSOtm ISSOtm added the content and enhancement labels Oct 15, 2021
@ISSOtm
Member Author

ISSOtm commented Sep 24, 2022

I rebased the branch and applied some editorial changes; the article is still not finished, and should not be reviewed yet. (At least not before #350 is merged, so we can focus on that.)

Member

@avivace avivace left a comment

I just gave it a quick look.

Looks nice so far!

src/STAT.md Outdated
::: tip TERMINOLOGY

A *dot* is the shortest period over which the PPU can output one pixel: is it equivalent to 1 T-state on DMG or on CGB single-speed mode or 2 T-states on CGB double-speed mode. On each dot during mode 3, either the PPU outputs a pixel or the fetcher is stalling the [FIFOs](<#Pixel FIFO>).
A *dot* is the shortest period over which the PPU can output one pixel: is it equivalent to 1 T-state on DMG or on CGB single-speed mode or 2 T-states on CGB double-speed mode. On each dot during mode 3, either the PPU outputs a pixel or the fetcher is stalling the [FIFOs](<#Rendering Internals>).
Member

Suggested change
A *dot* is the shortest period over which the PPU can output one pixel: is it equivalent to 1 T-state on DMG or on CGB single-speed mode or 2 T-states on CGB double-speed mode. On each dot during mode 3, either the PPU outputs a pixel or the fetcher is stalling the [FIFOs](<#Rendering Internals>).
A *dot* is the shortest period of time over which the PPU can output one pixel: is it equivalent to 1 T-state on DMG or on CGB single-speed mode or 2 T-states on CGB double-speed mode. On each dot during mode 3, either the PPU outputs a pixel or the fetcher is stalling the [FIFOs](<#Rendering Internals>).

<text x="99" y="419">(9-bit tile ID, Y offset)</text>
<rect x="90" y="429" class="legend pxrow"/>
<text x="99" y="433">Pixel row (2 bytes)</text>
<text x="465" y="405" class="right">(Some arrows have been</text>
Member

I would make this text much smaller (and maybe set it at an angle?)

- The BG FIFO is not empty

Once both conditions are fulfilled, the OBJ FIFO takes over, discarding the pixels slices already fetched.
Note that if the BG FIFO is empty, the Pixel Slice Fetcher immediately switches to [Get tile ID](<#Get tile ID>) when refilling it, so the OBJ fetcher will wait for 6 additional dots.

According to my understanding of the sprite timing, the maximum "penalty" for a sprite occurring too soon in the BG slice fetch sequence is 5 dots (making that sprite contribute at most 11 dots to mode 3). I'm not clear on the technical reason behind this, though.

This understanding is corroborated by https://gbdev.io/pandocs/STAT.html#properties-of-stat-modes and https://www.reddit.com/r/EmuDev/comments/59pawp/gb_mode3_sprite_timing/
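
A minimal sketch of the 6-to-11-dot range mentioned in this comment, assuming the per-OBJ rule described on the linked STAT page: a flat 6 dots, plus a variable part that shrinks as the OBJ's leftmost pixel sits further right within the BG tile it overlaps. The helper name `obj_penalty_dots` and its parameter are hypothetical, and the sketch ignores several OBJs sharing one tile, the window, and OBJs at X = 0.

```python
def obj_penalty_dots(pixel_in_tile: int) -> int:
    """Mode 3 dots added by one OBJ (simplified, hypothetical helper).

    pixel_in_tile: position (0-7) of the OBJ's leftmost pixel within the
    BG tile it overlaps, 0 being that tile's leftmost pixel.
    """
    if not 0 <= pixel_in_tile <= 7:
        raise ValueError("pixel_in_tile must be in 0..7")
    pixels_to_the_right = 7 - pixel_in_tile
    # Assumed rule: the variable part is that count minus 2, floored at 0.
    variable_part = max(0, pixels_to_the_right - 2)
    # Assumed rule: a flat 6 dots is always incurred.
    return 6 + variable_part


penalties = [obj_penalty_dots(p) for p in range(8)]
print(penalties)
```

Under these assumptions, the variable part never exceeds 5 dots, which matches the "at most 11 dots" figure above.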

- The Pixel Slice Fetcher is attempting to [push pixels](<#Push pixels>)
- The BG FIFO is not empty

Once both conditions are fulfilled, the OBJ FIFO takes over, discarding the pixels slices already fetched.

What pixel slices are discarded? Are they refetched later, and when is there time for that? I assumed the BG fetching sequence latches the fetched slices somewhere until it's able to successfully push them to the BG FIFO (for which it always has to wait at least 2 dots anyway).

@kapoisu

kapoisu commented Jan 13, 2023

I'd like to offer some feedback on why, as a novice, I've had a hard time understanding this document.

Especially in the Sprite part, there are terms like "X coordinate" whose meaning I'm not sure of. In my implementation, I reset the fetcher's counter when I enter the window. Hence I can't compare the sprite's X coordinate to it, and it's reasonable to think that these terms refer to a counter that tracks the number of pixels actually shifted out and pushed to the LCD.

The assumption above is valid only if I keep as little state as possible. What if I had an extra counter that tracks the number of pixels already fetched? The timing of the sprite check would change from "after a pixel is popped" to "before a pixel is fetched." This may not be true (to be honest, I'm not that confident), but it's a reasonable consequence: because I have to do the sprite check, I introduce this extra counter.

That is to say, I think the term is a little too terse, because all of the counters mentioned above could be considered some sort of "X coordinate."

In the OBJ fetcher part,

the OBJ fetcher waits to take control until two conditions are met:

  • The Pixel Slice Fetcher is attempting to push pixels
  • The BG FIFO is not empty

Firstly, what is the exact timing of "attempting to push pixels?" Is it the 6th, 7th, or 8th cycle of the fetcher? From my understanding, this would affect the number of cycles added to mode 3.

I've learned, from somewhere else, that there may be a delay of up to 11 cycles for each sprite. There are descriptions such as "shorten/lengthen mode 3 by n dots" spread across the document. My point is that they are not tidied up and summarized in an easy-to-understand way. How and why these conditions contribute to the maximum 11-cycle delay, and when these conditions are checked, isn't stated clearly.

Use this page as an example. 172-289 dots? Readers are not going to figure out 289 = 12 (before shifter is filled) + 160 (pixels per scanline) + 7 (max(SCX % 8)) + 11 (sprite delay) * 10 (number of sprites) from pure imagination (and I don't even know why a possible window restart is not included.)
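
The breakdown in the paragraph above can be written out as a small calculation. This is only a sketch of that arithmetic: the constant names and the `mode3_length` helper are hypothetical, and, as the comment notes, a possible window restart is not modelled.

```python
PIPELINE_FILL_DOTS = 12      # dots before the shifter is filled
PIXELS_PER_SCANLINE = 160    # one dot per visible pixel
MAX_OBJ_PENALTY_DOTS = 11    # worst-case delay for a single OBJ
MAX_OBJS_PER_SCANLINE = 10


def mode3_length(scx: int = 0, obj_penalties=()) -> int:
    """Estimated Mode 3 length in dots (window restart not modelled)."""
    if len(obj_penalties) > MAX_OBJS_PER_SCANLINE:
        raise ValueError("at most 10 OBJs are drawn per scanline")
    fine_scroll_dots = scx % 8  # 0..7 extra dots
    return (PIPELINE_FILL_DOTS + PIXELS_PER_SCANLINE
            + fine_scroll_dots + sum(obj_penalties))


shortest = mode3_length()
longest = mode3_length(
    scx=7, obj_penalties=[MAX_OBJ_PENALTY_DOTS] * MAX_OBJS_PER_SCANLINE)
print(shortest, longest)  # 172 289
```

The two results reproduce the 172 and 289 bounds quoted in the comment.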

With all the ambiguities combined, I can't even have a strong guarantee of the correctness of the overall control flow.

@ISSOtm ISSOtm force-pushed the rendering-internals branch from bbdc0ef to 15009a4 Compare July 2, 2023 10:03
Every time both FIFOs are clocked, the selector decides whether to retain the pixel from the BG or OBJ FIFO.

The selection follows the following rules:
1. **In CGB Mode**, if [`LCDC` bit 0 (priority enable)](<#CGB Mode: BG and Window master priority>) is reset, pick the BG pixel.

This rule appears to be simply disabling OBJ, which duplicates the functionality of rule 3 and doesn't seem correct. I believe a correct rule would be to ignore rules 4 and 5.

Shouldn't it pick "OBJ if not transparent, otherwise pick BG" rather than BG in the first place? (If changing the wording, make sure "otherwise" cannot be thought to be related to LCDC bit 0 being reset.)
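
One possible reading of the two comments above, written as a sketch rather than as the article's actual rule list: with `LCDC` bit 0 reset in CGB Mode, the BG priority checks are skipped, so a non-transparent OBJ pixel always wins. All names (`BgPixel`, `ObjPixel`, `select_pixel`) are hypothetical, and the treatment of `LCDC` bit 0 outside CGB Mode (BG treated as color 0) is an assumption based on commonly documented behavior.

```python
from dataclasses import dataclass


@dataclass
class BgPixel:
    color_id: int              # 0-3
    bg_priority: bool = False  # assumed: CGB BG map attribute bit 7


@dataclass
class ObjPixel:
    color_id: int              # 0 means transparent
    behind_bg: bool = False    # assumed: OAM attribute bit 7


def select_pixel(bg: BgPixel, obj: ObjPixel, *, cgb_mode: bool,
                 lcdc_bit0: bool, lcdc_bit1: bool) -> str:
    """Return "BG" or "OBJ" for one pair of pixels (sketch only)."""
    # OBJs disabled, or transparent OBJ pixel: only the BG can show.
    if not lcdc_bit1 or obj.color_id == 0:
        return "BG"
    # CGB Mode with LCDC bit 0 reset: priority bits are ignored.
    if cgb_mode and not lcdc_bit0:
        return "OBJ"
    # Outside CGB Mode, LCDC bit 0 reset blanks the BG to color 0.
    bg_color = bg.color_id if (cgb_mode or lcdc_bit0) else 0
    # Priority bits only matter over non-zero BG colors.
    if bg_color != 0 and (obj.behind_bg or (cgb_mode and bg.bg_priority)):
        return "BG"
    return "OBJ"


print(select_pixel(BgPixel(3, True), ObjPixel(1, True),
                   cgb_mode=True, lcdc_bit0=False, lcdc_bit1=True))
```

Phrasing the bit 0 case as an early "OBJ wins if opaque" branch keeps the word "otherwise" tied to OBJ transparency instead of to the state of `LCDC` bit 0.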

@ISSOtm ISSOtm force-pushed the rendering-internals branch from 15009a4 to fe13a13 Compare March 6, 2024 13:24
@ISSOtm ISSOtm added the help wanted label Mar 6, 2024
Contributor

@quinnyo quinnyo left a comment

I'm a bit of a noob with this part of the Game Boy, which is hopefully a good thing for someone reviewing this!
I've made a bunch of inline comments/suggestions that are all mostly writing/language based because I can't really correct technical points with much confidence.
Instead, I'll cover some general points and my issues with understanding the technical side here:

I can't get past "Pixel Slice Fetcher/Get Tile ID".

  • the Get Tile ID + BG + OBJ fetcher nested headings are confusing
  • The Pixel Slice Fetcher doesn't even do anything until Get Tile ID has been done?
  • we're getting the Tile ID but we're also getting attributes and doing ... something with them?
  • after the BG and OBJ fetchers have had a go, the Pixel Slice Fetcher does the actual pixel slice fetching -- which is only the Color ID part of the tile?

Some of the points about FIFO refill priority and the related potential delays are repeated several times in slightly different ways. This makes the differences seem like they should be very meaningful but I can't discern what the meaning is.

The BG FIFO, BG Fetcher, BG Fetcher with Window Fetcher hat, OBJ FIFO, OBJ Fetcher terminology makes it really difficult to follow just because they all look/sound like the same words. Sometimes it seems like the two fetchers are part of their respective FIFO or vice versa.

I think I want a clearer distinction between procedural, architectural, and conceptual information.
To be clear, the information mostly seems to be here, but the presentation sometimes jumps between these modes and I struggle to follow that. I'm sure others have both more and less difficulty with that than I do.

This is a really complicated thing you're trying to explain!


Once the last pixel has been output, the PPU releases the VRAM bus, and does nothing while it waits for the scanline to end.

The PPU embarks both a vertical counter (exposed as [`LY`](<#FF44 — LY: LCD Y coordinate \[read-only\]>)), *and* a horizontal counter, which will be referred to as "`LX`" henceforth.
Contributor

"embarks" doesn't seem to be the right word to use here.

There seems to be only one place where LX is used (#### BG Fetcher) so I think this should move there. I added my suggested change there.

Suggested change
The PPU embarks both a vertical counter (exposed as [`LY`](<#FF44 — LY: LCD Y coordinate \[read-only\]>)), *and* a horizontal counter, which will be referred to as "`LX`" henceforth.

</tr>
</tbody>
</table>

Contributor

Explain LX here instead of above.

Maybe I misunderstood the purpose, but I think this makes the point more clearly:

Suggested change
:::tip
`LX` refers to the PPU's horizontal counter, or *LCD X coordinate*, but it isn't the name of a register like its vertical counterpart [`LY`](<#FF44 — LY: LCD Y coordinate \[read-only\]>).
:::

@@ -0,0 +1,408 @@
# Rendering Internals

The Game Boy's PPU is the component responsible for feeding the LCD (= the screen) with pixels.
Contributor

"(= the screen)" is very informal compared to the rest of the text. I don't think you need to explain what the LCD is / is for.

Suggested change
The Game Boy's PPU is the component responsible for feeding the LCD (= the screen) with pixels.
The Game Boy's PPU is the component responsible for feeding the LCD with pixels.

::: tip Terminology

A "dot" is the unit of time within the PPU.
One "dot" is one 4 MiHz cycle, i.e. a unit of time equal to 1 ∕ 4194304 of a second.
Contributor

Don't need to emphasise dot every time after introducing it. (There are many of these, not just this one.)

Member

Agree here
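
To make the quoted definition of a dot concrete, here is a small sketch converting dot counts to wall-clock time. The clock rate comes from the quoted text; the figures of 456 dots per scanline and 154 scanlines per frame are not stated in this thread and are assumed from commonly documented Game Boy timings. All names are hypothetical.

```python
DOT_CLOCK_HZ = 4_194_304    # 4 MiHz, i.e. 2**22 dots per second
DOTS_PER_SCANLINE = 456     # assumption, not stated in this thread
SCANLINES_PER_FRAME = 154   # assumption, not stated in this thread


def dots_to_microseconds(dots: int) -> float:
    """Convert a number of dots to microseconds."""
    return dots * 1_000_000 / DOT_CLOCK_HZ


scanline_us = dots_to_microseconds(DOTS_PER_SCANLINE)
frame_rate_hz = DOT_CLOCK_HZ / (DOTS_PER_SCANLINE * SCANLINES_PER_FRAME)
print(f"one dot      : {dots_to_microseconds(1):.4f} us")
print(f"one scanline : {scanline_us:.2f} us")
print(f"frame rate   : {frame_rate_hz:.2f} Hz")
```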


#### BG fetcher

During this step, a tilemap is sampled to determine which tile to fetch.
Contributor

This, under the BG fetcher heading, sounds like the BG fetcher is the step.

Suggested change
During this step, a tilemap is sampled to determine which tile to fetch.
When the BG fetcher is active, a tilemap is sampled to determine which tile to fetch.


### Get tile ID

This step determines which background/window tile to fetch pixels from.
Contributor

Suggested change
This step determines which background/window tile to fetch pixels from.
This step determines which tile to fetch pixels from.


:::

A byte is read from the computed address, and is forwarded to the Pixel Slice Fetcher as a tile ID.
Contributor

Suggested change
A byte is read from the computed address, and is forwarded to the Pixel Slice Fetcher as a tile ID.
A tile ID is read from the computed address, and is forwarded to the Pixel Slice Fetcher.
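
The "computed address" in the lines under review is not spelled out in this excerpt. As a hedged illustration, the sketch below uses the commonly documented tilemap layout (two 32×32 maps at `$9800` and `$9C00`, selected by `LCDC` bit 3 for the BG and bit 6 for the window). The function names, and the way `LX` is combined with `SCX`, are assumptions made for this example, not the article's definition.

```python
def tilemap_address(lcdc: int, *, window: bool, tile_x: int, pixel_y: int) -> int:
    """Address of the tile ID byte for a given tile column and pixel row."""
    area_bit = 0x40 if window else 0x08   # LCDC bit 6 (window) or bit 3 (BG)
    base = 0x9C00 if lcdc & area_bit else 0x9800
    tile_y = (pixel_y // 8) & 31          # 32 rows of tiles per map
    return base + tile_y * 32 + (tile_x & 31)


def bg_tilemap_address(lcdc: int, scx: int, scy: int, lx: int, ly: int) -> int:
    """BG-mode variant: scroll registers offset the position, with wrap-around."""
    return tilemap_address(
        lcdc,
        window=False,
        tile_x=scx // 8 + lx // 8,
        pixel_y=(scy + ly) & 0xFF,        # the BG map is 256 pixels tall
    )


print(hex(bg_tilemap_address(0x99, scx=8, scy=250, lx=16, ly=10)))
```

In this example the vertical position wraps past 255, so the fetch lands in the map's first row.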

Comment on lines +141 to +146
::: tip Raster effects

Interestingly, unlike e.g. the NES' PPU, great care has been taken to ensure that the BG fetcher re-reads as many registers as possible (`SCY`, `LCDC`, etc.).
This may have been insight from the former console, on which [proper "raster splits" are quite tricky](https://www.nesdev.org/wiki/PPU_scrolling#Split_X_scroll) due to a lot of internal caching.

:::
Contributor

This is interesting but might be a bit out of scope?
It sounds like trivia, due to the framing with the NES PPU.

Member

I also believe this may be a bit out of scope right there, especially the second "speculation" part.


:::

This step takes 2 dots, with the VRAM access(es) being performed on the second.

This comment was marked as outdated.

@ISSOtm ISSOtm force-pushed the rendering-internals branch from 09c063f to 53c0f1d Compare December 17, 2024 21:44
avivace and others added 2 commits January 7, 2025 10:50
Co-authored-by: Estus <git@estus.dev>
Co-authored-by: Quinn <3379314+quinnyo@users.noreply.github.com>
@nummacway nummacway mentioned this pull request Nov 16, 2025
@nummacway

nummacway commented Nov 16, 2025

Thank you for your work so far.

I have two notes on this though:

so the OBJ fetcher will wait for 6 additional dots.

I think you're referring to the variable part of the OBJ penalty here, but isn't that 5? And shouldn't it mention here that it is reduced with every pixel that OBJ is further to the right?

Do you want to rewrite the "Mode 3 Operations" section or do you want to delete it? It seems kinda redundant.

@ecopinrox

Do you want to rewrite the "Mode 3 Operations" section or do you want to delete it? It seems kinda redundant.

This section can be renamed to something like "Pixel FIFO Operation", since the current title implies that the FIFO can operate in other modes as well. However, I think redistributing this section's information over the rest of the page would be a better course of action.

I am very new to emulator development, and the biggest problem I have with Pan Docs is that many pages reference information that is introduced only in later pages, providing no clear starting point. Related information is also spread across multiple pages and sections, which will surely cause inconsistencies.

@avivace avivace requested a review from nummacway November 18, 2025 09:50
@nummacway nummacway left a comment

Review probably not finished, but not much time right now.


During this step, a tilemap is sampled to determine which tile to fetch.

The address read depends on whether the BG fetcher is in ["BG mode" or "Window mode"](<#>):

We have a missing link here (there's another one linking to plain #). The way it sets up the window has not been discussed in the new part.

Member

I suggest simply removing the link for now. Let's add this point to a follow-up issue tracking the remaining things.

The Pixel Slice Fetcher continuously runs in parallel to refill the BG FIFO.
If the OBJ FIFO needs to be refilled, both FIFOs temporarily stop being clocked while the OBJ FIFO "steals" the Pixel Slice Fetcher to get its pixels.

Additionally, in the middle of the scanline, the window may be triggered; this is described in further detail [below](#).

Missing link to a yet-to-be-created "window" section.

Member

Same as my previous comment.

Comment on lines +170 to +177
[`LCDC` bit 1](<#LCDC.1 — OBJ enable>) toggles whether OBJs are displayed, but the implementation is very different on DMG and CGB.
On all models, `LCDC` bit 1 controls whether pixels from the OBJ FIFO are selected; however, **on monochrome models**, `LCDC` bit 1 being off also causes the OBJ fetcher to be disabled entirely.

This differs in two important ways:
- On DMG, clearing `LCDC` bit 1 causes OBJs not to incur any Mode 3 length penalties; on CGB, Mode 3 length is not affected by `LCDC` bit 1.
- Setting the bit back to 1 in the middle of an OBJ being (putatively) displayed will cause it to appear on CGB, but not on DMG, since its pixels aren't in the OBJ FIFO.

And importantly as well, **this behavior remains in the Color's compatibility mode**, making software behave potentially differently.

Mention the effect. On CGB, it behaves as if the color ID were 0 (transparent).

This is also the case for LCDC bit 0, at least in CGB DMG mode, except that BGP is applied, so you get whatever color ID 0 is mapped to. Since BG and window also cause a 12/6 dot delay by pure existence, did you check if LCDC bit 0 has the same difference between an actual DMG (fetch doesn't happen, no delay) and CGB's DMG mode (just assumes color ID 0)? So when you clear the bit, does mode 3 get shortened by 12 dots (18 if window would have been visible) on DMG hardware? (Let's ignore mid-scanline behavior here for a moment.) Because in CGB DMG mode, it does not affect mode 3 length.


{{#include imgs/src/ppu_overview.svg:2:}}

The Game Boy's rendering process, at its core, works using two queues of pixels, also known as the **pixel [FIFO](https://en.wikipedia.org/wiki/FIFO_(computing_and_electronics))s**: one for "background" pixels, one for [OBJ](#Objects) pixels[^real_fifos].

That's a dead internal link.


[^real_fifos]:
Actually, there are more than 2 FIFOs.
For example, on DMG, there are [two FIFOs for BG pixel indices](https://raw.githubusercontent.com/furrtek/DMG-CPU-Inside/master/Schematics/32_BG_PIXEL_SHIFTER.png), [two for OBJ pixels](https://raw.githubusercontent.com/furrtek/DMG-CPU-Inside/master/Schematics/33_SPRITE_PIXEL_SHIFTER.png), [one for OBJ palette bits](https://raw.githubusercontent.com/furrtek/DMG-CPU-Inside/master/Schematics/34_SPRITE_PALETTE_SHIFTER.png), and [one for OBJ-to-BG priority bits](https://raw.githubusercontent.com/furrtek/DMG-CPU-Inside/master/Schematics/26_BACKGROUND.png).

Wouldn't the CGB need to store the OAM index in some FIFO to resolve OBJ-on-OBJ priority?
Also, didn't we agree to not use furrtek's schematics anymore for some reason?

Contributor

@quinnyo quinnyo left a comment

Thank you for your efforts!
I've had a solid go at this now. I (re-)read through all the new material, from the top. The overall structure and information presented seems good. It made sense. 👌
Note (again) that I'm not an expert with PPU internals though, so I'm not the best fact checker for this material.

I added a bunch of suggested changes, which are mostly small things, and hopefully pretty self-explanatory.
That gets a bit noisy though, so I'll highlight a couple of more significant issues here:

  • I think the "fetchers" should be renamed.
    • The overlap/ambiguity between "fetcher" and "Pixel Slice Fetcher" slowed me down a lot.
    • The "fetchers" seem to be tightly coupled to their associated FIFO(s?) and also responsible for a bit more than just retrieving data.
    • They seem a bit less like a "fetcher" and a bit closer to a "queue manager".
    • So my idea of a better name would be along those lines (manage, monitor, wrangle) -- but I'm sure someone that understands the topic better probably can come up with something better (and isn't completely wrong 🤷‍♀️)
  • the term "FIFO" could be replaced with "queue" in many cases
    • It's tiring reading FIFO over and over again.
    • "queue" would be used to refer to the logical/conceptual, FIFO to refer to the physical/electronic.
    • With FIFO being the default term used, it stops being possible to use it to refer to specifically the electronic side of things.
    • May aid in dispelling any confusion about "real FIFOs"
  • The structure of the headings and steps (under Pixel Slice Fetcher) could be improved.
    • The pair of 'get tile row' steps in particular are presented in a way that makes understanding a bit more difficult than it needs to be.
      • Merging the pair of them as two parts of the same step seems like a reasonable choice. Effectively having 3 steps instead of 4 would look less intimidating.
    • The step headings could have the step ordinals included, to aid navigation
    • The 'BG fetcher' and 'OBJ fetcher' subheadings under 'Get tile ID' are a bit awkward.
      • It was easy to lose track of the steps while reading these sections.
      • I don't have a good suggestion for this, unfortunately.

::: warning Timings caution

Timings here are not tested by a single test ROM (made especially difficult by their resolution being finer than M-cycles).
The information here was largely obtained from an emulator that passes `intr_2_mode0*` from [this test suite](https://github.com/wilbertpol/mooneye-gb/tree/b78dd21f0b6d00513bdeab20f7950e897a0379b3/tests/acceptance/gpu), but not all of it has been verified from e.g. [hardware schematics](https://github.com/furrtek/DMG-CPU-Inside).
Contributor

Suggested change
The information here was largely obtained from an emulator that passes `intr_2_mode0*` from [this test suite](https://github.com/wilbertpol/mooneye-gb/tree/b78dd21f0b6d00513bdeab20f7950e897a0379b3/tests/acceptance/gpu), but not all of it has been verified from e.g. [hardware schematics](https://github.com/furrtek/DMG-CPU-Inside).
The information here was largely obtained from an emulator that passes `intr_2_mode0*` from [this test suite](https://github.com/wilbertpol/mooneye-gb/tree/b78dd21f0b6d00513bdeab20f7950e897a0379b3/tests/acceptance/gpu), but not all of it has been verified from e.g. [hardware schematics](https://github.com/msinger/dmg-schematics).

newer / corrected schematics -- the msinger project seems to be the preferred one, but I don't know all the differences/details.

The Game Boy's rendering process, at its core, works using two queues of pixels, also known as the **pixel [FIFO](https://en.wikipedia.org/wiki/FIFO_(computing_and_electronics))s**: one for "background" pixels, one for [OBJ](#Objects) pixels[^real_fifos].
(The Window largely piggybacks on the BG rendering mechanism, more on that below.)

Every "dot", one pixel is shifted off of both FIFOs, and one of them is selected for output.
Contributor

Suggested change
Every "dot", one pixel is shifted off of both FIFOs, and one of them is selected for output.
Both queues are clocked (advanced by one) on every dot.
The pixel at the front of each queue is shifted out, and one of the pixels (BG or OBJ) is selected for output.

- The BG FIFO is not empty

Once both conditions are fulfilled, the OBJ FIFO takes over, discarding the pixels slices already fetched.
Note that if the BG FIFO is empty, the Pixel Slice Fetcher immediately switches to [Get tile ID](<#Get tile ID>) when refilling it, so the OBJ fetcher will wait for 6 additional dots.
Contributor

There doesn't seem to be a "base" duration mentioned in this section (OBJ Fetcher). What are these 6 dots in addition to?

Comment on lines +216 to +219
### Get tile row (high)

Exactly the same as [Get tile slice (low)](<#Get tile row (low)>), except the following byte is fetched (i.e. bit 0 of the address is 1 instead of 0).
This step takes 2 dots as well.
Contributor

Considering both 'Get tile row' steps are almost identical, presenting them as sibling substeps of, e.g., a "Fetch"/"Get tile row" step would make sense.
The headings/structure can enhance the meaning in a more "tangible" way.
The 4 steps listed could be presented as 3 instead, which would look a lot less intimidating.
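
To illustrate why the two 'Get tile row' steps are near-identical, the sketch below computes both byte addresses for a BG/window tile row (they differ only in bit 0) and combines the two bitplanes into eight color IDs. It assumes the commonly documented 16-bytes-per-tile layout and `LCDC` bit 4 addressing; the function names are hypothetical, and OBJ fetches are not covered.

```python
def tile_row_addresses(tile_id: int, row: int, lcdc: int) -> tuple[int, int]:
    """Return the (low, high) bitplane addresses for one row of a BG tile."""
    if lcdc & 0x10:
        # LCDC bit 4 set: unsigned tile IDs, counted from $8000.
        base = 0x8000 + tile_id * 16
    else:
        # LCDC bit 4 reset: signed tile IDs, counted from $9000.
        signed_id = tile_id - 256 if tile_id >= 128 else tile_id
        base = 0x9000 + signed_id * 16
    low = base + (row & 7) * 2  # always even, so bit 0 is 0
    return low, low | 1         # the high read only differs in bit 0


def decode_pixel_row(low_byte: int, high_byte: int) -> list[int]:
    """Combine two bitplanes into 8 color IDs, leftmost pixel first."""
    return [
        (((high_byte >> bit) & 1) << 1) | ((low_byte >> bit) & 1)
        for bit in range(7, -1, -1)
    ]


print([hex(a) for a in tile_row_addresses(0x80, 3, 0x00)])
print(decode_pixel_row(0x3C, 0x7E))  # [0, 2, 3, 3, 3, 3, 2, 0]
```

Since both reads share everything but the lowest address bit, treating them as two halves of a single step, as suggested above, also matches how the address is formed.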

Comment on lines +221 to +230
#### Bitplane desync

Interesting phenomena can be triggered by changing the address' "parameters" between the two bitplane reads, called "bitplane desyncing".
Since VRAM and OAM cannot be modified during Mode 3 (though OAM DMA can change what the PPU reads from OAM), the parameters that can be changed are [`SCY`](<#FF42–FF43 — SCY, SCX: Viewport Y position, X position>) and [`LCDC bit 4`](<#LCDC.4 — BG and Window tile data area>).

Modifying `SCY` causes the second bitplane (and also the first one, depending on timing) to be read from a different Y offset within the tile than normal.
This does not occur starting with CGB revision D, including AGBs: `SCY` is internally latched during the tilemap read, so both bitplanes are always read correctly.
(Compare [CGB-C](https://github.com/mattcurrie/mealybug-tearoom-tests/blob/70e88fb90b59d19dfbb9c3ac36c64105202bb1f4/expected/CPU%20CGB%20C/m3_scy_change.png) and [CGB-D](https://github.com/mattcurrie/mealybug-tearoom-tests/blob/master/expected/CPU%20CGB%20D/m3_scy_change.png).)

Modifying `LCDC` bit 4 exhibits much more complex behavior, [explained in this document by mattcurrie](https://github.com/mattcurrie/mealybug-tearoom-tests/blob/70e88fb90b59d19dfbb9c3ac36c64105202bb1f4/the-comprehensive-game-boy-ppu-documentation.md#tile_sel-bit-4).
Contributor

@quinnyo quinnyo Nov 18, 2025

(See also: above comment re. reorg 'Get tile row' steps)

Is this level of subheading nesting correct?
The hierarchy:

  • Get tile row (low)
  • Get tile row (high)
    • Bitplane desync
Or maybe this should be in a 'tip' box?

Suggested change
#### Bitplane desync
Interesting phenomena can be triggered by changing the address' "parameters" between the two bitplane reads, called "bitplane desyncing".
Since VRAM and OAM cannot be modified during Mode 3 (though OAM DMA can change what the PPU reads from OAM), the parameters that can be changed are [`SCY`](<#FF42–FF43 — SCY, SCX: Viewport Y position, X position>) and [`LCDC bit 4`](<#LCDC.4 — BG and Window tile data area>).
Modifying `SCY` causes the second bitplane (and also the first one, depending on timing) to be read from a different Y offset within the tile than normal.
This does not occur starting with CGB revision D, including AGBs: `SCY` is internally latched during the tilemap read, so both bitplanes are always read correctly.
(Compare [CGB-C](https://github.com/mattcurrie/mealybug-tearoom-tests/blob/70e88fb90b59d19dfbb9c3ac36c64105202bb1f4/expected/CPU%20CGB%20C/m3_scy_change.png) and [CGB-D](https://github.com/mattcurrie/mealybug-tearoom-tests/blob/master/expected/CPU%20CGB%20D/m3_scy_change.png).)
Modifying `LCDC` bit 4 exhibits much more complex behavior, [explained in this document by mattcurrie](https://github.com/mattcurrie/mealybug-tearoom-tests/blob/70e88fb90b59d19dfbb9c3ac36c64105202bb1f4/the-comprehensive-game-boy-ppu-documentation.md#tile_sel-bit-4).
::: tip Bitplane desync
Interesting phenomena can be triggered by changing the address' "parameters" between the two bitplane reads, called "bitplane desyncing".
Since VRAM and OAM cannot be modified during Mode 3 (though OAM DMA can change what the PPU reads from OAM), the parameters that can be changed are [`SCY`](<#FF42–FF43 — SCY, SCX: Viewport Y position, X position>) and [`LCDC bit 4`](<#LCDC.4 — BG and Window tile data area>).
Modifying `SCY` causes the second bitplane (and also the first one, depending on timing) to be read from a different Y offset within the tile than normal.
This does not occur starting with CGB revision D, including AGBs: `SCY` is internally latched during the tilemap read, so both bitplanes are always read correctly.
(Compare [CGB-C](https://github.com/mattcurrie/mealybug-tearoom-tests/blob/70e88fb90b59d19dfbb9c3ac36c64105202bb1f4/expected/CPU%20CGB%20C/m3_scy_change.png) and [CGB-D](https://github.com/mattcurrie/mealybug-tearoom-tests/blob/master/expected/CPU%20CGB%20D/m3_scy_change.png).)
Modifying `LCDC` bit 4 exhibits much more complex behavior, [explained in this document by mattcurrie](https://github.com/mattcurrie/mealybug-tearoom-tests/blob/70e88fb90b59d19dfbb9c3ac36c64105202bb1f4/the-comprehensive-game-boy-ppu-documentation.md#tile_sel-bit-4).
:::


Once the fetcher reaches this state, it will attempt to push the two bytes it read, plus associated metadata, into the target FIFO on every dot.

The BG FIFO will only accept pixels when it's empty.
Contributor

One could find this confusing because if it only accepts pixels when empty, how does it ever get more than one long?

Does this mean that it then gets filled completely as an atomic operation?
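
One way to read the sentence under review, shown as a sketch and not as confirmed hardware behavior: the fetcher retries its push on every dot, and the push succeeds only once the queue has drained, at which point a whole row of 8 pixels is loaded at once. The `BgFifo` class and its methods are hypothetical names for this illustration.

```python
from collections import deque


class BgFifo:
    """Toy BG pixel queue that only accepts a full row when empty."""

    def __init__(self) -> None:
        self._pixels: deque[int] = deque()

    def __len__(self) -> int:
        return len(self._pixels)

    def try_push_row(self, row: list[int]) -> bool:
        """Load 8 pixels in one go; refuse if any pixel is still queued."""
        if len(row) != 8:
            raise ValueError("a pixel row holds exactly 8 pixels")
        if self._pixels:
            return False  # the fetcher simply retries on the next dot
        self._pixels.extend(row)
        return True

    def pop(self) -> int:
        """Shift out the pixel at the front of the queue."""
        return self._pixels.popleft()


fifo = BgFifo()
first = fifo.try_push_row([0, 1, 2, 3, 3, 2, 1, 0])   # accepted: queue empty
second = fifo.try_push_row([3] * 8)                   # refused: 8 pixels queued
shifted = [fifo.pop() for _ in range(8)]              # 8 dots of output
third = fifo.try_push_row([3] * 8)                    # accepted again
print(first, second, third, len(fifo))
```

Under this reading, the queue length goes straight from 0 to 8, which would answer the question of how it ever gets more than one pixel long.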

Contributor

lgtm 👌

avivace and others added 3 commits November 18, 2025 18:46
Co-authored-by: Quinn <3379314+quinnyo@users.noreply.github.com>
Co-authored-by: Quinn <3379314+quinnyo@users.noreply.github.com>
Co-authored-by: Quinn <3379314+quinnyo@users.noreply.github.com>

Labels

content (Improvements or additions to documentation), enhancement (New feature or request), help wanted (Extra attention is needed)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Contradictions in PPU FIFO size
Get Tile step during the FIFO needs correction / improvement

8 participants