Emulator Issues #8053

Shader Generation Slowdown/Framedrops/Stuttering

Added by JMC4789 over 5 years ago. Updated about 3 years ago.

Status: Fixed
Priority: Normal
Assignee: -
% Done: 0%
Operating system: N/A
Issue type: Bug
Milestone:
Regression: No
Relates to usability: No
Relates to performance: No
Easy: No
Relates to maintainability: No
Regression start:
Fixed in: 5.0-4869

Description

Game Name?

This affects most titles. Notably, in F-Zero GX, Metroid Prime 2/3 and several others, actual defects can be caused by the shader cache stuttering in dualcore mode. Other games just suffer from an inconsistent framerate that can be annoying to users as they play the game.

The Problem:

All modern graphics processors are flexible, programmable processors that run shaders as their programs. Dolphin uses shaders to emulate the entire fixed-function pipeline of the GameCube/Wii GPU.

Updating the state of the GPU is exceedingly cheap on the original hardware, so games do it freely, and Dolphin can end up having to generate a lot of shaders in short order.

It takes a noticeable amount of time to generate our shaders (some max out at 30ms according to my tests in F-Zero GX), which guarantees that Dolphin cannot generate those shaders within a single frame. A second problem is that even when the shaders take less time to generate, there are often 30 - 40 shaders being generated in short order. Even at 1 - 2ms each, that would still lead to stuttering.
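The arithmetic behind those numbers can be sketched as follows. This is only an illustration: the 60 fps target used in the usage note and the function names are assumptions, not something taken from Dolphin's code.

```cpp
#include <cassert>

// Time available per frame, in milliseconds, at a given frame rate.
constexpr double FrameBudgetMs(double fps)
{
  return 1000.0 / fps;
}

// How many frame periods a burst of synchronous shader compiles occupies.
// A result above 1.0 means at least one frame deadline is missed.
double FramesConsumed(int shader_count, double ms_per_shader, double fps)
{
  assert(fps > 0.0 && shader_count >= 0);
  return (shader_count * ms_per_shader) / FrameBudgetMs(fps);
}
```

Assuming a 60 fps target (a budget of roughly 16.7ms per frame), `FramesConsumed(1, 30.0, 60.0)` and `FramesConsumed(30, 1.0, 60.0)` both come out above 1.0, so a single slow shader and a burst of fast ones each cost at least one dropped frame.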

To partially mitigate this problem, Dolphin caches shaders after they are generated, so on subsequent uses they don't have to be generated again. This does not work on all hardware configurations or drivers, though, and doesn't really solve the problem. Currently, the cached shaders also have to be thrown out every time Dolphin updates.
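A minimal sketch of that caching behaviour, assuming a cache keyed by some identifier for the GPU state. `ShaderUid`, `ShaderCache` and the version strings in the usage note are hypothetical names for illustration, not Dolphin's actual classes.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the pipeline state that selects a shader.
using ShaderUid = std::uint64_t;

class ShaderCache
{
public:
  using Compiler = std::function<std::string(ShaderUid)>;

  ShaderCache(std::string build_version, Compiler compiler)
      : m_build_version(std::move(build_version)), m_compiler(std::move(compiler))
  {
    assert(static_cast<bool>(m_compiler));
  }

  // Returns the cached shader, running the slow compile only on a miss.
  const std::string& Get(ShaderUid uid)
  {
    auto it = m_entries.find(uid);
    if (it == m_entries.end())
    {
      ++m_compile_count;
      it = m_entries.emplace(uid, m_compiler(uid)).first;
    }
    return it->second;
  }

  // Entries built by a different version are discarded, mirroring the
  // "thrown out on update" behaviour described above.
  void OnBuildVersion(const std::string& version)
  {
    if (version != m_build_version)
    {
      m_entries.clear();
      m_build_version = version;
    }
  }

  int CompileCount() const { return m_compile_count; }

private:
  std::string m_build_version;
  Compiler m_compiler;
  std::unordered_map<ShaderUid, std::string> m_entries;
  int m_compile_count = 0;
};
```

The sketch also shows why a cache only helps partially: the first `Get` for any state still pays the full compile cost, and a version change empties the cache so every shader is compiled again.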


Related issues

Has duplicate Emulator - Emulator Issues #8618: Final Fantasy Crystal Chronicles: The Crystal Bearers Infinite shader caching. (Duplicate)

Has duplicate Emulator - Emulator Issues #8887: final fantasy cystal bearers slowdown (Duplicate)

Has duplicate Emulator - Emulator Issues #9081: Mario Kart Wii Race Start FPS Drop (Duplicate)

Has duplicate Emulator - Emulator Issues #9150: Metroid prime trilogy fps slowdown (Duplicate)

Has duplicate Emulator - Emulator Issues #9189: Direct3D backend microstuttering shader cache (Duplicate)

Has duplicate Emulator - Emulator Issues #9318: Beyond Good & Evil heavy stutter (Duplicate)

Has duplicate Emulator - Emulator Issues #10376: Lag/Stutter in Metal Arms with New System Hardware (Duplicate)

History

#1 Updated by magumagu9 over 5 years ago

The fact that F-Zero GX etc. crash is more because our dual-core synchronization is junk than because of anything about the shader cache (essentially, we should pause the CPU thread if the GPU thread is taking longer than expected to execute a draw call, no matter what the cause). That said, slow shader generation certainly makes the issue much easier to reproduce.

#2 Updated by JMC4789 over 5 years ago

Yeah, I should have clarified that better. Ishiiruka has a predictive fifo option that more or less fixes cases like that at a 10% cost of speed or something.

#3 Updated by magumagu9 over 5 years ago

One approach to solve this is to compile a more general shader, which depends on fewer parameters. (We can't build a single shader which handles every possible configuration, but we can get closer.) This would mean compiling fewer programs at the expense of increasing the GPU workload. This could be combined with some sort of shader recompilation on a separate thread to reduce the GPU workload.

Another approach is to speed up shader compilation. OpenGL allows linking together GLSL Objects, which could be faster than compiling a whole shader from scratch. D3D11 has a feature called dynamic linking. Probably not enough to solve this completely, but it could help.

Another approach is to build a database of the shader configurations a game needs, and compile them all at startup. This would be easy to implement, but building the database is a logistical nightmare, so it's probably a bad idea. :)

#4 Updated by JMC4789 over 5 years ago

Something interesting Ishiiruka is working on is making generic shaders that are slower, but pre-generated to fit every single situation of a game. Then they use their asynchronous shader feature to generate the faster, specialized shaders and dynamically change over to them as the game runs.
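A rough sketch of that hybrid idea: draw with a pre-built generic shader while the specialized one compiles in the background, then switch over once it is ready. All names here are hypothetical and shaders are modelled as strings; this is not Ishiiruka's or Dolphin's implementation.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <future>
#include <string>
#include <unordered_map>

using ShaderUid = std::uint64_t;

// Hypothetical slow compile of a shader specialized for one configuration.
inline std::string CompileSpecialized(ShaderUid uid)
{
  return "specialized_" + std::to_string(uid);
}

class HybridShaderSelector
{
public:
  // Never blocks: returns the specialized shader if it is ready, otherwise
  // the generic shader while the specialized one compiles in the background.
  std::string Select(ShaderUid uid)
  {
    auto ready = m_ready.find(uid);
    if (ready != m_ready.end())
      return ready->second;

    auto pending = m_pending.find(uid);
    if (pending == m_pending.end())
    {
      m_pending.emplace(uid, std::async(std::launch::async, CompileSpecialized, uid));
      return m_generic;
    }

    assert(pending->second.valid());
    if (pending->second.wait_for(std::chrono::seconds(0)) != std::future_status::ready)
      return m_generic;

    std::string shader = pending->second.get();
    m_pending.erase(pending);
    m_ready.emplace(uid, shader);
    return shader;
  }

  // Helper for tests or shutdown: block until the compile for uid finishes.
  void WaitFor(ShaderUid uid)
  {
    auto pending = m_pending.find(uid);
    if (pending != m_pending.end())
      pending->second.wait();
  }

private:
  std::string m_generic = "generic";
  std::unordered_map<ShaderUid, std::string> m_ready;
  std::unordered_map<ShaderUid, std::future<std::string>> m_pending;
};
```

The trade-off is the one described above: the generic shader costs more GPU time per draw, but the render thread never waits on a compile.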

#5 Updated by Armada over 5 years ago

magumagu9, the D3D backend already uses dynamic linking, it's just the OpenGL backend that still compiles all stages in one big shader.

#6 Updated by ZephyrSurfer over 5 years ago

How exactly would we do recompilation faster? It would still be compiling from scratch, right, not manipulating the general one on recompilation?

Or is it that the general shader would catch most cases and only a minority would have to be recompiled, so it's faster overall?

#7 Updated by Armada over 5 years ago

The general shader would catch more cases meaning we don't need to compile as many shaders.

Compilation won't be faster, we'll just have to do it less often.

#8 Updated by ZephyrSurfer over 5 years ago

Is there a test or something already made that we could use to see exactly how long building a certain set of shaders takes?

A benchmark, if you will.
If not, wouldn't it be helpful?
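
One possible shape for such a benchmark, as a sketch only: time each compile job with a monotonic clock and record the total and the slowest single run. `CompileStats` and `TimeCompiles` are hypothetical names, and each job is assumed to wrap one real shader compile.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

struct CompileStats
{
  double total_ms = 0.0;
  double worst_ms = 0.0;
  int count = 0;
};

// Runs each job once and records the total time and the slowest single run.
CompileStats TimeCompiles(const std::vector<std::function<void()>>& jobs)
{
  using Clock = std::chrono::steady_clock;
  CompileStats stats;
  for (const auto& job : jobs)
  {
    assert(static_cast<bool>(job));
    const auto start = Clock::now();
    job();
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    stats.total_ms += elapsed.count();
    if (elapsed.count() > stats.worst_ms)
      stats.worst_ms = elapsed.count();
    ++stats.count;
  }
  return stats;
}
```

The worst-case figure matters as much as the total here, since a single compile that exceeds the frame time is enough to cause a visible stutter.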

#9 Updated by rodolfoosvaldobogado over 5 years ago

Shader compilation time depends on a lot of factors: API, CPU, GPU and platform.
On Windows D3D, which is where I have the most experience, compilation has 3 stages. The first stage turns the code into bytecode using the Microsoft HLSL compiler; this generates driver-independent bytecode. The second stage is the first step of the real compilation: once you load the bytecode, it is compiled to GPU-dependent code that is not yet optimized. That first code is used immediately to keep rendering and avoid stalls. A third stage then starts, in which the code is recompiled and optimized inside the driver on a different thread; once it is finished, it replaces the first, unoptimized version.
Of these 3 stages, the first is the most critical: it can take from 1 millisecond for really simple shaders to more than a second (always speaking about Dolphin shaders; in reality you could make a shader that takes more than 10 minutes to compile).
The second stage always takes less than 5 ms, and the third stage can take more than 20 ms.
The second and third stages are not controllable from our perspective; they produce increased frame latency but still a tolerable framerate. The first stage is the critical one for Dolphin.
All the times here were measured with a 3 GHz AMD CPU as a reference.

#10 Updated by ZephyrSurfer over 5 years ago

I read in PSTextureEncoder.cpp that dynamic linking was disabled. Has that been fixed since?

#11 Updated by Armada over 5 years ago

That's the wrong file. Each shader stage has its own cache and they are definitely linked dynamically. (The code would break without dynamic linking.)

#12 Updated by MayImilae over 5 years ago

  • Status changed from New to Accepted

Accepting this, as it's a legit problem. However it is unlikely that this will be addressed any time soon. I've been waiting a loooong time, and will probably have to wait even longer.

#13 Updated by ZephyrSurfer over 5 years ago

Jules.B...@gmail.com then can you explain this //#define USE_DYNAMIC_MODE

I don't get it then. Shouldn't this be uncommented now, since dynamic linking is working?

#14 Updated by Armada over 5 years ago

Dynamic linking is probably just not enabled in PSTextureEncoder.cpp, but that's just the texture encoding shaders which are very simple shaders which are not cached. The large shaders used for actual rendering are in the shader caches and they are dynamically linked.

#15 Updated by rodolfoosvaldobogado over 5 years ago

Dynamic linking is not used inside Dolphin for actual GC/Wii shaders; that would require a complete change in the structure of the shaders.

#16 Updated by Armada over 5 years ago

rodolfoosvaldobogado, are you sure about that? I've worked with the shaders a lot recently, introducing a geometry shader stage. In the DX11 backend, shaders are cached separately and only linked at runtime. I thought that is what dynamic linking means in the context of shaders?

#17 Updated by Armada over 5 years ago

Also the "gc/wii shaders" have consistent interfaces, so a change of the structure is not necessary for dynamic linking.

#18 Updated by rodolfoosvaldobogado over 5 years ago

Dynamic linking in DX11 is a technique to reuse blocks of functionality across different shaders, avoiding the classic need to recompile every instance of a shader when a small part changes. It is based on declaring common interfaces that are used inside the “base shader”. Once the base shader is compiled, you get references to those interfaces (using reflection) and then set the implementing classes at runtime. For Dolphin to be able to do this, the different parts of the shader first need to be abstracted using interfaces, and then every possible combination of implementation classes needs to be coded. That is in my future plans for Dolphin, but it is far from my current schedule, as it is a really big refactoring of the shader generation code.
For reference, see here:
http://msdn.microsoft.com/en-us/library/windows/desktop/ff471420%28v=vs.85%29.aspx

#19 Updated by Armada over 5 years ago

In that case I was confusing dynamic linking with GL_ARB_separate_shader_objects in OpenGL.

#20 Updated by JMC4789 over 5 years ago

issue 8618 has been merged into this issue.

#21 Updated by JMC4789 about 5 years ago

issue 8887 has been merged into this issue.

#22 Updated by JosJuice almost 5 years ago

#23 Updated by JosJuice almost 5 years ago

#24 Updated by JosJuice almost 5 years ago

#25 Updated by JMC4789 over 4 years ago

#26 Updated by eckso over 4 years ago

"Beyond Good & Evil" runs smooth in master if you don't enable VSync... so do you still think this is a shader cache issue?

#27 Updated by JMC4789 about 3 years ago

#28 Updated by JMC4789 about 3 years ago

#29 Updated by JMC4789 about 3 years ago

#30 Updated by emmausssss about 3 years ago

I guess this can be closed now, with ubershaders merged.

#31 Updated by MayImilae about 3 years ago

  • Status changed from Accepted to Fixed

Fixed by 5.0-4869.

#32 Updated by JosJuice about 3 years ago

  • Fixed in set to 5.0-4869
