Emulator Issues #12961

closed

Vulkan backend multithreading not actually running in parallel

Added by TellowKrinkle almost 2 years ago. Updated over 1 year ago.

Status: Fixed
Priority: Normal
Assignee: -
% Done: 0%
Operating system: N/A
Issue type: Bug
Milestone:
Regression: No
Relates to usability: No
Relates to performance: No
Easy: No
Relates to maintainability: No
Regression start:
Fixed in: 5.0-17728

Description

Game Name?

Legend of Zelda Skyward Sword

Game ID? (right click the game in the game list, Properties, Info tab)

SOUE01

MD5 Hash? (right click the game in the game list, Properties, Verify tab, Verify Integrity button)

89387d670395b2a2c32f77f167763115

What's the problem? Describe what went wrong.

Backend multithreading doesn't actually result in any processing happening in parallel

What steps will reproduce the problem?

  1. Use a debug build of Dolphin if possible; otherwise your stack traces may be missing data
  2. Launch Dolphin from Instruments collecting a system trace
  3. Run the game with the Vulkan renderer
  4. Load a save state in Skyloft
  5. Stop the system trace and look at it

Is the issue present in the latest development version? For future reference, please also write down the version number of the latest development version.

5.0-16704

Is the issue present in the latest stable version?

Not tested; hopefully the provided information makes that unnecessary

What are your PC specifications? (CPU, GPU, Operating System, more)

CPU: i9-9980HK
GPU: AMD Radeon Pro 5600M
OS: macOS 12.4 (21F79)

Is there anything else that can help developers narrow down the issue? (e.g. logs, screenshots,
configuration files, savefiles, savestates)

Check the attached "SystemTrace.png" to see the system trace (taken of a release build). In it, you can see submissions are about as threaded as a Python program, with the GPU thread immediately waiting for the submission thread after almost every submission. The attached "StackTrace.png" is a stack trace of the GPU thread during one of its waits. It appears to be stuck where the command buffer manager calls WaitForCommandBufferCompletion to wait for a different command buffer, but WaitForCommandBufferCompletion decides it has to wait for the submission thread to finish before it does anything at all.


Files

SystemTrace.png (88.4 KB), TellowKrinkle, 06/24/2022 12:12 AM
StackTrace.png (82.9 KB), TellowKrinkle, 06/24/2022 12:12 AM
#1

Updated by TellowKrinkle almost 2 years ago

Adding the following code to the beginning of SubmitCommandBuffer so that the wait happens before submitting the next command buffer fixes the issue, but feels incredibly hacky.

  const u32 next_buffer_index = (m_current_frame + 1) % NUM_COMMAND_BUFFERS;
  FrameResources& next_resources = m_frame_resources[next_buffer_index];

  // Wait for the GPU to finish with all resources for this command buffer.
  if (next_resources.fence_counter > m_completed_fence_counter)
    WaitForCommandBufferCompletion(next_buffer_index);
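
To make the condition in that snippet easier to follow, here is a minimal standalone model of the fence-counter check. It is only a sketch, not Dolphin's actual command buffer manager: the CommandBufferRing struct, its Submit and Complete helpers, and the NUM_COMMAND_BUFFERS value of 2 are assumptions made for illustration. Only the index arithmetic and the comparison against the completed counter mirror the code above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the members named in the snippet above.
// The real types also hold Vulkan handles and other state.
constexpr std::uint32_t NUM_COMMAND_BUFFERS = 2;  // assumed value for illustration

struct FrameResources
{
  // Counter stamped on this buffer when it was last submitted (0 = never submitted).
  std::uint64_t fence_counter = 0;
};

struct CommandBufferRing
{
  std::array<FrameResources, NUM_COMMAND_BUFFERS> frame_resources{};
  std::uint32_t current_frame = 0;
  std::uint64_t completed_fence_counter = 0;
  std::uint64_t next_fence_counter = 1;

  // Same index arithmetic as the snippet: the buffer that will be reused next.
  std::uint32_t NextBufferIndex() const
  {
    return (current_frame + 1) % NUM_COMMAND_BUFFERS;
  }

  // True when the next buffer has been submitted but its fence has not completed yet,
  // i.e. the case in which the snippet calls WaitForCommandBufferCompletion.
  bool NextBufferNeedsWait() const
  {
    return frame_resources[NextBufferIndex()].fence_counter > completed_fence_counter;
  }

  // Model of a submission: stamp the current buffer, then move to the next one.
  void Submit()
  {
    frame_resources[current_frame].fence_counter = next_fence_counter++;
    current_frame = NextBufferIndex();
  }

  // Model of the GPU finishing all work up to and including `counter`.
  void Complete(std::uint64_t counter)
  {
    assert(counter < next_fence_counter);  // cannot complete work that was never submitted
    if (counter > completed_fence_counter)
      completed_fence_counter = counter;
  }
};
```

In this model, a wait is only needed when the ring wraps around to a buffer whose work is still in flight; a buffer that was never submitted, or whose fence has already completed, can be reused immediately.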
#2

Updated by golivax over 1 year ago

TellowKrinkle wrote:

Adding the following code to the beginning of SubmitCommandBuffer so that the wait happens before submitting the next command buffer fixes the issue, but feels incredibly hacky.

  const u32 next_buffer_index = (m_current_frame + 1) % NUM_COMMAND_BUFFERS;
  FrameResources& next_resources = m_frame_resources[next_buffer_index];

  // Wait for the GPU to finish with all resources for this command buffer.
  if (next_resources.fence_counter > m_completed_fence_counter)
    WaitForCommandBufferCompletion(next_buffer_index);

This is interesting. I have some anecdotal evidence: on Android, every game that I have (e.g., Wii Sports) runs better (higher framerate) on Vulkan than on OpenGL, EXCEPT for Skyward Sword. It could be a coincidence, who knows. Do you think the issue you're reporting applies to Skyward Sword only, or is it general? If you ever build an APK with your patch, I'd be glad to test it.

#3

Updated by JMC4789 over 1 year ago

  • Status changed from New to Fixed
  • Fixed in set to 5.0-17728