OSDev.org

The Place to Start for Operating System Developers
All times are UTC - 6 hours

 Post subject: Re: Is my GUI design terrible?
PostPosted: Wed Sep 20, 2017 3:58 am 

Joined: Sat Oct 16, 2010 3:38 pm
Posts: 587
onlyonemac wrote:
mariuszp wrote:
Yes, I had the problem with an event queue before, and fixed it in (almost) exactly the same way as you just described.
Good. How did you fix it?


By having the OS provide a blocking "wait for mouse" call which, when it returns, reports only the most recent mouse state. Earlier states are not queued up.
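
To illustrate the idea of reporting only the latest state, here is a minimal single-threaded sketch in C. All names (MouseState, MouseSlot, mouse_post, mouse_take) are hypothetical and not taken from the actual OS. The blocking itself is left out: where mouse_take() returns false, a real kernel would put the caller to sleep until the driver posts a new state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mouse state; the field names are illustrative only. */
typedef struct {
    int      x, y;
    unsigned buttons;
} MouseState;

/* A single slot instead of a queue: a new report overwrites the old one. */
typedef struct {
    MouseState latest;
    bool       pending;   /* true while 'latest' has not been consumed */
} MouseSlot;

/* Driver side: store the newest state, discarding any unread one. */
void mouse_post(MouseSlot *slot, MouseState st)
{
    assert(slot != NULL);
    slot->latest  = st;
    slot->pending = true;
}

/* Consumer side: copy out the newest state if there is one.
 * Returns false when nothing new has arrived; this is the point where
 * a blocking implementation would sleep instead of returning. */
bool mouse_take(MouseSlot *slot, MouseState *out)
{
    assert(slot != NULL && out != NULL);
    if (!slot->pending)
        return false;
    *out = slot->latest;
    slot->pending = false;
    return true;
}
```

If the driver posts several states while the GUI is busy drawing, the next mouse_take() sees only the last of them, so the GUI never works through a backlog of stale positions.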

Also, it seems most of my performance problem is in text rendering, which I do using FreeType. Rendering a 5-paragraph "lorem ipsum" 10 times (in DejaVu Sans, size 20) takes 12 seconds...


 Post subject: Re: Is my GUI design terrible?
PostPosted: Wed Sep 20, 2017 5:51 am 

Joined: Thu May 17, 2007 1:27 pm
Posts: 999
Your ddiWritePen() function is horribly inefficient.

  • You're calling FT_Load_Glyph in a loop without reusing the result. It's probably worth caching the result of that function.
  • The same might also apply to FT_Render_Glyph. You might want to look into FreeType's caching API or roll your own cache.
  • Instead of directly modifying the target surface you're drawing every single pixel with a ddiFillRect(). ddiFillRect() thus handles special cases like negative offsets for every single pixel! Just manipulate the surface memory directly. Also swap the x and y loops to get better caching behavior.
  • ddiWritePen() performs a realloc() for each character in the text!
  • ddiFillRect() computes the surface pitch on the fly, which involves a multiplication. Change that to a shift or store the pitch as part of the surface.
  • ddiFillRect() calls into ddiFill(), which calls into ddiCopy() once per pixel! ddiCopy() seems to be a roll-your-own memcpy() (that still calls memcpy?). Apart from being incorrect (accessing the buffer as uint64_t violates aliasing rules), I also expect it to be slower than GCC's internal optimized SSE memcpy().
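
As a rough sketch of the "manipulate the surface memory directly" suggestion, the following C code blends an 8-bit coverage bitmap onto a 32-bit surface. The Surface struct, blit_glyph() and blend() are hypothetical names, and the four-byte pixel layout is an assumption; the real DDI types may look different. The pitch is stored in the surface, clipping is done once per glyph rather than once per pixel, and the loops run row by row so memory is touched sequentially. Pixels are read and written with a 4-byte memcpy(), which stays within the aliasing rules and is normally compiled to a single load or store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical surface layout; the real DDI surface type may differ. */
typedef struct {
    uint8_t *data;    /* first byte of the pixel buffer           */
    int      width;   /* in pixels                                */
    int      height;  /* in pixels                                */
    size_t   pitch;   /* bytes per row, computed once at creation */
} Surface;

/* Blend each 8-bit channel of 'src' over 'dst' with coverage a (0..255). */
static uint32_t blend(uint32_t dst, uint32_t src, uint8_t a)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t d = (dst >> shift) & 0xFF;
        uint32_t s = (src >> shift) & 0xFF;
        out |= ((s * a + d * (255u - a) + 127u) / 255u) << shift;
    }
    return out;
}

/* Draw a cw x ch coverage bitmap at (dx, dy) in a solid colour. */
void blit_glyph(Surface *dst, int dx, int dy,
                const uint8_t *cov, int cw, int ch, size_t cpitch,
                uint32_t color)
{
    assert(dst != NULL && cov != NULL);

    /* Clip once, up front, instead of checking every pixel. */
    int x0 = dx < 0 ? -dx : 0;
    int y0 = dy < 0 ? -dy : 0;
    int x1 = (dx + cw > dst->width)  ? dst->width  - dx : cw;
    int y1 = (dy + ch > dst->height) ? dst->height - dy : ch;

    for (int y = y0; y < y1; y++) {            /* rows in the outer loop */
        const uint8_t *src = cov + (size_t)y * cpitch;
        uint8_t *row = dst->data + (size_t)(dy + y) * dst->pitch;
        for (int x = x0; x < x1; x++) {        /* sequential within a row */
            uint8_t a = src[x];
            if (a == 0)
                continue;                      /* fully transparent */
            uint8_t *p = row + (size_t)(dx + x) * 4;
            uint32_t old, out;
            memcpy(&old, p, sizeof old);       /* aliasing-safe load  */
            out = blend(old, color, a);
            memcpy(p, &out, sizeof out);       /* aliasing-safe store */
        }
    }
}
```

A glyph that lies partly or entirely outside the surface simply ends up with a smaller or empty loop range, so the inner loop needs no special cases.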

_________________
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].


 Post subject: Re: Is my GUI design terrible?
PostPosted: Wed Sep 20, 2017 1:57 pm 

Joined: Sat Oct 16, 2010 3:38 pm
Posts: 587
Korona wrote:
Your ddiWritePen() function is horribly inefficient.

  • You're calling FT_Load_Glyph in a loop without reusing the result. It's probably worth caching the result of that function.
  • The same might also apply to FT_Render_Glyph. You might want to look into FreeType's caching API or roll your own cache.
  • Instead of directly modifying the target surface you're drawing every single pixel with a ddiFillRect(). ddiFillRect() thus handles special cases like negative offsets for every single pixel! Just manipulate the surface memory directly. Also swap the x and y loops to get better caching behavior.
  • ddiWritePen() performs a realloc() for each character in the text!
  • ddiFillRect() computes the surface pitch on the fly, which involves a multiplication. Change that to a shift or store the pitch as part of the surface.
  • ddiFillRect() calls into ddiFill(), which calls into ddiCopy() once per pixel! ddiCopy() seems to be a roll-your-own memcpy() (that still calls memcpy?). Apart from being incorrect (accessing the buffer as uint64_t violates aliasing rules), I also expect it to be slower than GCC's internal optimized SSE memcpy().


The library is compiled with -O3, which appears to optimise ddiCopy(). But either way, I'll look into all the issues one by one and see how much I can boost the performance.


 Post subject: Re: Is my GUI design terrible?
PostPosted: Fri Sep 22, 2017 8:39 am 

Joined: Sat Oct 16, 2010 3:38 pm
Posts: 587
WOW!

The loop and other micro-optimisations didn't do much, but the caching certainly did. Rendering 5 paragraphs of "lorem ipsum" ten times previously took approximately 12000 ms; now it takes an average of 470 ms. That is more than 25 times faster!
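
For anyone reading along, the kind of caching discussed here can be sketched roughly as below. This is not the actual code from my library: GlyphCache, glyph_cache_get() and the RenderFn callback are hypothetical names, and the 256-slot direct-mapped layout is just an assumption for the example. In real code the callback would be the place that calls FT_Load_Glyph and FT_Render_Glyph and copies the resulting bitmap; demo_render() below is only a stand-in so the sketch does not depend on FreeType.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GLYPH_CACHE_SLOTS 256   /* assumed size; tune for real workloads */

typedef struct {
    uint8_t *pixels;            /* owned copy of the 8-bit coverage bitmap */
    int      width, rows;
} GlyphBitmap;

/* Hypothetical rasteriser callback: fills 'out' with a malloc'd bitmap
 * and returns 0 on success. */
typedef int (*RenderFn)(uint32_t glyph, uint32_t px_size,
                        GlyphBitmap *out, void *user);

typedef struct {
    int         valid;
    uint32_t    glyph, px_size;
    GlyphBitmap bitmap;
} CacheEntry;

typedef struct {
    CacheEntry    slots[GLYPH_CACHE_SLOTS];
    RenderFn      render;
    void         *user;
    unsigned long hits, misses;
} GlyphCache;

void glyph_cache_init(GlyphCache *c, RenderFn render, void *user)
{
    assert(c != NULL && render != NULL);
    *c = (GlyphCache){0};
    c->render = render;
    c->user   = user;
}

/* Return the cached bitmap, rasterising only on a miss. The pointer
 * stays valid until another glyph evicts the same slot. */
const GlyphBitmap *glyph_cache_get(GlyphCache *c, uint32_t glyph,
                                   uint32_t px_size)
{
    CacheEntry *e = &c->slots[(glyph * 31u + px_size) % GLYPH_CACHE_SLOTS];
    if (e->valid && e->glyph == glyph && e->px_size == px_size) {
        c->hits++;
        return &e->bitmap;
    }
    GlyphBitmap fresh;
    if (c->render(glyph, px_size, &fresh, c->user) != 0)
        return NULL;
    if (e->valid)
        free(e->bitmap.pixels);    /* evict the previous occupant */
    e->valid   = 1;
    e->glyph   = glyph;
    e->px_size = px_size;
    e->bitmap  = fresh;
    c->misses++;
    return &e->bitmap;
}

/* Stand-in rasteriser for demonstration only: produces a solid square
 * and counts how often it is called through 'user' (an int *). */
int demo_render(uint32_t glyph, uint32_t px_size, GlyphBitmap *out, void *user)
{
    (void)glyph;
    size_t n = (size_t)px_size * px_size;
    out->pixels = malloc(n);
    if (out->pixels == NULL)
        return -1;
    memset(out->pixels, 0xFF, n);
    out->width = out->rows = (int)px_size;
    ++*(int *)user;
    return 0;
}
```

Since a page of text reuses the same few dozen glyphs over and over, nearly every lookup after the first lines is a hit, which fits the speed-up I measured. FreeType also ships its own cache sub-system, which may be a better fit than a hand-rolled table like this one.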

I'll continue to look for possible optimisations, and will get on to implementing partial updates in the compositor.

