Emulator Issues #3829

closed

Speed improvement for OpenGL Plugin

Added by Metzelmaennchen about 14 years ago.

Status: Fixed
Priority: Normal
Assignee:
% Done: 0%
Operating system: N/A
Issue type: Bug
Milestone:
Regression: No
Relates to usability: No
Relates to performance: No
Easy: No
Relates to maintainability: No
Regression start:
Fixed in:

Description

Hello Devs!
As I like your project, I wanted to contribute my 2 cents :)
I was always annoyed by the heavy slowdown on my machine when watching the Wind Waker intro. At the moment the island pops into the viewport, the framerate drops to ~14 fps :(
Looking at the GL calls with gDEBugger, the reason was quite obvious... there are around 180000 OpenGL commands executed per frame :( Most of them are for vertex shader programs.

Well, attached you will find a patch with tweaked SetMultiVSConstant functions. The patch is backward compatible: if the GL_EXT_gpu_program_parameters extension isn't supported, the old, slow version is used :(
If it is available, the plugin is now about as fast as the DX version :)
Now it's also more fun to watch the MP1 intro :)
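
For readers without the attachment, here is a rough sketch of the idea, not the patch itself. GL_EXT_gpu_program_parameters adds batched entry points such as glProgramEnvParameters4fvEXT, which upload several vec4 constants in one call, where glProgramEnvParameter4fvARB uploads one per call. Whether the patch uses the Env or the Local variant is an assumption here. FakeGL, SetOne, SetMany and UploadVSConstants are hypothetical names that only count calls, so the sketch runs without a GL context.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the driver: counts calls instead of issuing real GL commands.
struct FakeGL {
    static constexpr unsigned kMaxRegs = 256;  // assumed register count, for illustration only
    bool has_batched_upload = false;           // models GL_EXT_gpu_program_parameters support
    std::size_t calls = 0;
    std::vector<float> regs = std::vector<float>(kMaxRegs * 4, 0.0f);

    // Models glProgramEnvParameter4fvARB: one vec4 per call.
    void SetOne(unsigned index, const float* v) {
        ++calls;
        for (unsigned i = 0; i < 4; ++i) regs[index * 4 + i] = v[i];
    }

    // Models glProgramEnvParameters4fvEXT: `count` consecutive vec4s per call.
    void SetMany(unsigned index, unsigned count, const float* v) {
        ++calls;
        for (unsigned i = 0; i < count * 4; ++i) regs[index * 4 + i] = v[i];
    }
};

// Upload `count` vec4 constants starting at register `first`.
// Takes the batched path when available, otherwise falls back to one call per constant.
void UploadVSConstants(FakeGL& gl, unsigned first, unsigned count, const float* data) {
    assert(data != nullptr);
    assert(first + count <= FakeGL::kMaxRegs);
    if (gl.has_batched_upload) {
        gl.SetMany(first, count, data);
    } else {
        for (unsigned i = 0; i < count; ++i) {
            gl.SetOne(first + i, data + i * 4);
        }
    }
}
```

With the fallback, uploading N constants costs N calls; with the batched path it costs one, and the register contents end up the same.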

Actions #1

Updated by james.jdunne about 14 years ago

  • Status changed from New to Accepted

I will apply your patch locally and test. If it does what you say it does, I just might commit it :)

Actions #2

Updated by james.jdunne about 14 years ago

  • Status changed from Accepted to Work started

Sorry, I don't have Wind Waker. I tried your patch and didn't notice any difference while playing NSMBW. I tested against the World 8 map and consistently get ~40FPS with and without your patch.

Perhaps my hardware is unaffected by your changes? nVidia GeForce 9800 GTX+

Since it's such a small patch and doesn't seem (to me) to cause any harm, I'll commit it.

Actions #3

Updated by james.jdunne about 14 years ago

  • Status changed from Work started to Fixed

Patch applied in r6713

Actions #4

Updated by Metzelmaennchen about 14 years ago

9800 GTX+ is a heavy shader cruncher... go back and try an HD 4870 ;)
But you should see an effect in an OpenGL debugger (e.g. gDEBugger, which is free). To put it in numbers:
Before, there was a max of 190k OGL calls for the Wind Waker intro (isn't there a demo flying around?) and now the max is around 35k :)
Looking at NSMBW (intro), there were 24k OGL calls, which now drops to 8.5k :)

Actions #5

Updated by james.jdunne about 14 years ago

Well your patch is in so I'll just go ahead and trust you :)

Actions #6

Updated by Metzelmaennchen about 14 years ago

That's nice that you'll trust me ;)
Now I'll go for the redundant Get/Set changes... my profiler reports around 95% unnecessary changes. As they are not cheap, I think there is more to gain with a bitmask tracking the states.
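
A minimal sketch of what such bitmask tracking could look like, assuming simple on/off states only. StateBit, StateCache and the counter are hypothetical names for illustration; real code would issue glEnable/glDisable where the counter is bumped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical set of toggleable states, one bit each.
enum StateBit : std::uint32_t {
    kDepthTest = 1u << 0,
    kBlend     = 1u << 1,
    kCullFace  = 1u << 2,
};

struct StateCache {
    std::uint32_t enabled = 0;     // bit set => state is believed to be enabled in the driver
    std::uint32_t known = 0;       // bit set => the cached value for that state is valid
    std::size_t driver_calls = 0;  // stands in for the real glEnable/glDisable calls

    // Forward the change to the driver only if it differs from the cached value.
    void Set(StateBit bit, bool on) {
        assert(bit != 0);
        const bool cached_on = (enabled & bit) != 0;
        if ((known & bit) != 0 && cached_on == on) {
            return;  // redundant change, skip the driver call
        }
        ++driver_calls;  // real code: on ? glEnable(...) : glDisable(...)
        known |= bit;
        if (on) {
            enabled |= bit;
        } else {
            enabled &= ~bit;
        }
    }
};
```

The `known` mask is there so the first Set for a state always reaches the driver, since the initial driver state is not assumed.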

Actions #7

Updated by NeoBrainX about 14 years ago

fwiw, don't all decent gfx drivers optimize out redundant (i.e. duplicate) state changes anyway?
At least that's what I heard for D3D, not sure if they do that for OGL, too.

Actions #8

Updated by Metzelmaennchen about 14 years ago

Hello NeoBrain!
Very interesting point you are raising, and you are totally right (for DX this only works in pure mode, if I remember correctly). Hopefully every driver writer implements such a thing :)
But there is still the cost of sending calls to the driver. At first glance there are three functions which account for 26% of all calls... reported to be 95% useless.
Cutting down these calls shows up as lower usage of the corresponding driver; at the moment there is a gain of 1.6% for the fglrx_dri module :)

Actions #9

Updated by sl1nk3.s about 14 years ago

I didn't see much of a difference here using a 4870 on Win7 x64, but that's a good patch tho.
