Emulator Issues #6936
PPC_FP Merge Master Issue
This issue is mainly to keep track of all of the things broken by the PPC_FP merge. Phire and I both thought this would be a good idea to have for when it (inevitably) gets fixed.
#2 Updated by a41pizza over 7 years ago
Sonic Colors has white graphical defects that show up on both D3D and OpenGL (I can't see any menus on OpenGL because of this other issue: https://code.google.com/p/dolphin-emu/issues/detail?id=6914)
#5 Updated by AlbertWesker1988 over 7 years ago
WWE 13 has a collision bug.
#10 Updated by phire over 7 years ago
OK, the problem is the lfs (load floating-point single) and stfs (store floating-point single) instructions.
The PowerPC (Gekko) only has double-precision registers, so it converts singles to doubles when they are loaded into the FP register file. But singles don't gain any extra precision; the single-precision operations still only use around 32 bits of the register. (This is not entirely true: there is some special handling for denormal singles.)
Dolphin emulates this behaviour by converting all singles to doubles using the CVTSS2SD instruction. Then, after each single-precision op (actually implemented as a double op), it executes a CVTSD2SS/CVTSS2SD pair to keep the precision of the float in the double register at single precision.
This gives Dolphin and the Gekko mostly the same behaviour.
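To make the rounding step concrete, here is a minimal Python sketch of "do the op in double, then squeeze it back through single". It is only a model: `round_to_single` and `fadds_emulated` are names made up for this illustration, not Dolphin functions, and the `struct` round trip merely stands in for the CVTSD2SS/CVTSS2SD pair.

```python
import struct

def round_to_single(x: float) -> float:
    """Round a double to the nearest single, then widen it back to a double.

    Stands in for a CVTSD2SS followed by a CVTSS2SD (illustrative model only).
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]

def fadds_emulated(a: float, b: float) -> float:
    """Hypothetical model of a single-precision add carried out in a double register."""
    return round_to_single(a + b)

# 2**-30 is far below the single-precision spacing near 1.0 (2**-23), so the
# extra double precision is discarded by the rounding step.
result = fadds_emulated(1.0, 2.0 ** -30)
print(result)
```

Without the rounding step the register would keep `1.0 + 2**-30`, a value no single-precision operation could have produced.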
The problem arises when the "denormal as zero" flag is enabled.
Because single <--> double conversions happen automatically on the Gekko when singles are moved in and out of memory, there is no need for explicit conversion instructions. Dolphin on x86, by contrast, has explicit single <--> double conversion instructions all over the place, including in the lfs and stfs implementations.
Because the x86's conversions are explicit, they count as operations and denormals are changed to zero, while the Gekko's conversions are implicit and don't trigger a denormal-as-zero conversion.
This means that on the Gekko you can safely copy a 32-bit value from one place in memory to another with just an lfs/stfs pair, while Dolphin with its new DAZ code will mutilate these values.
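A rough Python model of the failure, assuming that a denormal-as-zero conversion simply replaces a denormal single input with a zero of the same sign. The function names are invented for this sketch, and the `daz` parameter stands in for the MXCSR flag; none of this is Dolphin code.

```python
import struct

SIGN_MASK = 0x80000000
EXP_MASK = 0x7F800000
FRAC_MASK = 0x007FFFFF

def is_denormal_single(bits: int) -> bool:
    """True if the 32-bit pattern encodes a denormal single."""
    return (bits & EXP_MASK) == 0 and (bits & FRAC_MASK) != 0

def single_to_double(bits: int, daz: bool) -> float:
    """Model of an explicit single -> double conversion (assumed DAZ behaviour)."""
    if daz and is_denormal_single(bits):
        bits &= SIGN_MASK  # denormal input is treated as a signed zero
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def double_to_single(value: float) -> int:
    """Model of the narrowing conversion; returns the raw 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

def copy_via_lfs_stfs(bits: int, daz: bool) -> int:
    """Copy a 32-bit word through an emulated lfs/stfs pair."""
    return double_to_single(single_to_double(bits, daz))

# The integer 1, copied as if it were a float, has a denormal bit pattern.
print(hex(copy_via_lfs_stfs(0x00000001, daz=False)))
print(hex(copy_via_lfs_stfs(0x00000001, daz=True)))
```

Any small integer or other non-float data moved this way has a zero exponent field, so it looks like a denormal and is exactly the kind of value that gets wiped.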
#11 Updated by phire over 7 years ago
But why does JitIL still work?
JitIL has an optimisation which folds duplicate IL opcodes into each other and this optimisation (correctly?) assumes that a SingleToDouble followed by a DoubleToSingle is a nop and optimises it out.
So JitIL does the correct thing when an lfs/stfs pair is inside the same JIT block; however, the mutilated SingleToDouble value is still calculated and written to the register file.
If a game does this trick over multiple JIT blocks, JitIL will still generate an incorrect result.
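A toy sketch of that folding, to show why it only helps within one block. The IL here is a made-up list of `(opcode, dest, src)` tuples; apart from SingleToDouble and DoubleToSingle, the opcode names are hypothetical, and this is not how JitIL is actually written.

```python
def fold_conversions(block):
    """Fold DoubleToSingle(SingleToDouble(x)) into x within a single block.

    The SingleToDouble itself is kept, because its result may still be
    written to the register file.
    """
    widened_from = {}  # double temp -> the single temp it was widened from
    alias = {}         # folded-away temp -> the original single temp
    out = []
    for opcode, dest, src in block:
        src = alias.get(src, src)
        if opcode == "SingleToDouble":
            widened_from[dest] = src
        elif opcode == "DoubleToSingle" and src in widened_from:
            alias[dest] = widened_from[src]
            continue  # the narrowing conversion is dropped as a no-op
        out.append((opcode, dest, src))
    return out

# lfs and stfs in the same block: the store ends up using the untouched bits.
same_block = [
    ("LoadSingle", "t0", "mem_a"),
    ("SingleToDouble", "t1", "t0"),
    ("StoreFReg", "f1", "t1"),
    ("DoubleToSingle", "t2", "t1"),
    ("StoreSingle", "mem_b", "t2"),
]

# stfs in a later block: the value comes from the register file, nothing folds.
later_block = [
    ("LoadFReg", "t0", "f1"),
    ("DoubleToSingle", "t1", "t0"),
    ("StoreSingle", "mem_b", "t1"),
]

for op in fold_conversions(same_block):
    print(op)
```

In `same_block` the store reads `t0` directly, but `f1` still receives the converted value, which is what a later block would read back.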
#13 Updated by degasus over 7 years ago
phire: IMO we have to check what happens on real hardware when we load a denormalized single and use it as a double. Does it normalize them on the conversion, also in non-IEEE mode?
Maybe there is a fast way to correctly load a single without ctz (but a branch if ieee-mode is enabled)
#14 Updated by phire over 7 years ago
- Status changed from Accepted to Work started
degasus: The docs say that singles are normalized into the double registers and denormalized on the way back out. The docs (PowerPCProgEnv.pdf) also list the exact algorithm used.
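For reference, an integer-only widening in the spirit of that documented algorithm might look like the Python sketch below. It is written from memory of the normalized / denormalized / zero-infinity-NaN case split, so check it against PowerPCProgEnv.pdf before trusting it; `convert_to_double` and `reference_widen` are names chosen for this example.

```python
import struct

def convert_to_double(word: int) -> int:
    """Widen raw 32-bit single bits to raw 64-bit double bits with integer ops only."""
    sign = (word >> 31) & 1
    exp = (word >> 23) & 0xFF
    frac = word & 0x007FFFFF
    if 0 < exp < 255:
        # Normalized: rebias the exponent (127 -> 1023), pad the fraction with zeros.
        return (sign << 63) | ((exp + 896) << 52) | (frac << 29)
    if exp == 0 and frac != 0:
        # Denormal single: shift until the implicit leading one appears.
        e = -126
        while (frac & 0x00800000) == 0:
            frac <<= 1
            e -= 1
        frac &= 0x007FFFFF  # drop the now-implicit leading one
        return (sign << 63) | ((e + 1023) << 52) | (frac << 29)
    # Zero, infinity or NaN: exponent is all zeros or all ones, fraction is copied.
    wide_exp = 0x7FF if exp == 255 else 0
    return (sign << 63) | (wide_exp << 52) | (frac << 29)

def reference_widen(word: int) -> int:
    """Same conversion done by the host FPU via struct (non-NaN inputs only)."""
    value = struct.unpack("<f", struct.pack("<I", word))[0]
    return struct.unpack("<Q", struct.pack("<d", value))[0]

# Smallest denormal single, 2**-149, becomes a normalized double.
print(hex(convert_to_double(0x00000001)))
```

Because no floating-point instruction touches the value, a conversion like this would not be affected by the denormal-as-zero flag, at the cost of a branch and a normalization loop for denormal inputs.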
flacs was suggesting that we use the x87 hardware to do float to double conversions, as it is just sitting around doing nothing.
There is also the possibility that x86 processors will correctly pipeline changes to the mxcsr register. I've seen no evidence to the contrary, but I'd want to do a test program/microbenchmark before going down that path.
#17 Updated by degasus over 7 years ago
The next commits are likely not "minimal" ;)