Why SMP?

Octacone · Post by **Octacone** » Mon Aug 12, 2019 10:20 am

Are there any real benefits when using SMP instead of single core tasking?
Do I magically get 75% more performance (assuming 4 cores, 1 started by default, each core 25%?) after enabling SMP?
I'm talking about hobby OSes, is it worth the hassle?

Korona · Post by **Korona** » Mon Aug 12, 2019 10:51 am

It's not clear what this question really asks.

Do you get a 4x speedup on all workloads? No, some workloads are just inherently sequential or bounded by I/O.

Does it give a measurable speedup, e.g., on boot time? Yes.

nullplan · Post by **nullplan** » Mon Aug 12, 2019 12:07 pm

SMP relaxes some restrictions you'd otherwise have. For instance, if you have a dynamic tick system, you only need to enable the tick if more tasks are runnable than cores available (otherwise the tasks can just keep running). To be sure, it requires planning (especially the scheduler), but hey, you knew this hobby would be challenging when you went in.
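As a rough illustration of the dynamic-tick rule described above, here is a minimal sketch in Python rather than kernel code. The function name and the task/core counts are made up for this example, and a real scheduler would more likely make this decision per core.

```python
def tick_needed(runnable_tasks: int, cores: int) -> bool:
    """Return True when the periodic scheduler tick must stay enabled.

    The tick exists to preempt tasks, and preemption is only needed
    when tasks have to share a core.
    """
    return runnable_tasks > cores


# Single core: two runnable tasks already force time slicing.
print(tick_needed(runnable_tasks=2, cores=1))  # True
# Four cores: up to four tasks can each keep a core to themselves.
print(tick_needed(runnable_tasks=4, cores=4))  # False
print(tick_needed(runnable_tasks=5, cores=4))  # True
```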

Octacone · Post by **Octacone** » Wed Aug 14, 2019 12:24 pm

I was thinking about something specific, for example would it speed up my GUI or make it more "fluent" in a way?
Something like:
1 core = 14 FPS
4 cores = 56 FPS
Without changing the code that runs the GUI, only adding SMP support.

Another example: will it calculate some number "X" times faster?

Schol-R-LEA · Post by **Schol-R-LEA** » Wed Aug 14, 2019 12:57 pm

Octacone wrote:I was thinking about something specific, for example would it speed up my GUI or make it more "fluent" in a way?
Something like:
1 core = 14 FPS
4 cores = 56 FPS
Without changing the code that runs the GUI, only adding SMP support.

In a single process with a single thread[1], without changes to the code? No, at least not in most instances without some sort of compiler sleight-of-hand. What it will do is allow you to have multiple processes - or multiple kernel-supported threads within a process, if your kernel scheduler allows it - to run in parallel on different cores, truly simultaneously, rather than interleaved via time slicing.

IOW, while it won't make a single strictly sequential program any faster, it will let you run several programs at once faster than could be done if they were all splitting time on a single core, and a program which uses multithreading may run faster if the kernel scheduler is thread-aware and capable of floating threads across multiple cores (and the threads are compute-bound rather than I/O-bound or IPC-bound). To put it another way, if there are 4 programs running on a four-core system, such that each gets a core scheduled more or less to itself, they will each run at something close to what they would if they were the only program running.

In any case, the CPU cores might not be the best thing for speeding up the GUI - the real answer for that specific case most likely is to have a good GPU driver (which has enough cores, and enough specialized architectural operations, to make a difference), rather than trying to use software rendering. Keep in mind that internally, the GPU is a specialized form of symmetrical multiprocessor (mostly, it's a bit more complicated than that but it's close enough for this discussion), one that is designed for that particular task.

Of course, this leads to the problem of having the necessary documentation for the GPU, which is its own can of worms.

Octacone wrote:Another example: will it calculate some number "X" times faster?

No, not in the general case. For specific cases, where the problem is broken up over multiple threads, it depends on how parallelizable the problem is, but even under the best case scenario it won't scale linearly, as there will be IPC overhead, added scheduling overhead, etc.
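One common way to put numbers on "it depends on how parallelizable the problem is" is Amdahl's law, which nobody in this thread named, so treat it as an editorial aside. The sketch below assumes a fixed fraction of the work can be split evenly across cores and ignores the IPC and scheduling overhead mentioned above, so real results would be somewhat lower.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


for fraction in (0.0, 0.5, 0.9, 1.0):
    print(f"{fraction:.0%} parallel on 4 cores: {amdahl_speedup(fraction, 4):.2f}x")
# 0% parallel on 4 cores: 1.00x
# 50% parallel on 4 cores: 1.60x
# 90% parallel on 4 cores: 3.08x
# 100% parallel on 4 cores: 4.00x
```

Even a workload that is 90% parallel tops out at roughly 3x on four cores, which is why a flat "4x faster" is not a realistic expectation.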

If the problem is something like, 'compute a single spreadsheet formula', then no, almost certainly not. If it is 'compute several spreadsheet cells in parallel', maybe, if the answers don't cascade (that is, they can each be computed without knowing the results of any of the other cells' computations). If it is 'add five thousand numbers together into one final result', well, there are ways to use multiple threads on multiple cores to speed it up, yes, through some sort of divide and conquer approach, but it won't be a linear speed-up[2], and it would have to be programmed to do it somewhere (there are specialized compilers which can parallelize that sort of thing, but your typical GCC back-end isn't going to without a lot of tweaking) - and again, this is more something you'd ideally use GPGPU programming for if the dataset is large enough to justify it (5000 probably isn't).
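The "add five thousand numbers" case can be sketched as follows. This is only an illustration of the divide-and-combine structure, with a made-up function name: each worker sums one slice, then the partial sums are added sequentially. Note that in CPython the threads share the interpreter lock, so this particular code will not actually run faster; on an SMP kernel with native threads, each slice could go to its own core.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked_sum(values, workers=4):
    """Sum `values` by giving each worker one contiguous slice, then combining."""
    if not values:
        return 0
    chunk = -(-len(values) // workers)  # ceiling division
    slices = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, slices))  # one partial sum per slice
    return sum(partials)  # the final combine step is sequential


numbers = list(range(1, 5001))
print(chunked_sum(numbers))  # 12502500
```

The sequential combine step and the cost of handing out the slices are exactly the overheads that keep the speedup below linear.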

[1] It helps to keep the distinction between 'thread' and 'process' in this topic, even if you don't use multithreading - every process has at least one implicit thread of execution, even if there are no explicitly created threads. Technically, the process is the protected memory environment in which the process's threads run, not the implicit thread of execution itself.
[2] Algorithmically, such a divide and conquer approach is potentially better than linear, even for a strictly sequential algorithm; the so-called 'Russian peasant method' of multiplication is just such an algorithm. However, when you take both setup time and IPC overhead into account, and the fact that you'd have more threads than cores for the general case (or possibly the reverse for manycore or GPGPU programming), it's pretty much a coin flip whether it would be better or worse than applying this sort of tree reduction as a single thread or not.
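For readers who have not met the 'Russian peasant method' mentioned in [2], here is a minimal single-threaded sketch (the function name is just for illustration). It multiplies by repeated halving and doubling, so it needs about log2(b) steps instead of the b additions that naive repeated addition would take.

```python
def peasant_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers by repeated halving and doubling."""
    result = 0
    while b > 0:
        if b & 1:        # odd: this doubled copy of a contributes to the product
            result += a
        a <<= 1          # double a
        b >>= 1          # halve b, discarding the remainder
    return result


print(peasant_multiply(13, 238))  # 3094
```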

Octacone · Post by **Octacone** » Sat Aug 17, 2019 1:17 am

@Schol-R-LEA that was super helpful, thanks for explaining it thoroughly. Others too.
My conclusion is that I don't need SMP at the moment and implementing it would be a hassle, without much performance gain.

Schol-R-LEA · Post by **Schol-R-LEA** » Sat Aug 17, 2019 9:48 am

OK, your call, sure. You still might want to at least consider how your scheduler is designed, and what it will take to make it SMP-capable at some point in the future, as some things which work fine with a single processor can go very awry when used in a multi-core environment, especially in task synchronization and IPC.

And keep in mind that, no matter what, if you have a multi-core CPU, you'll still have the unused cores there, using a certain amount of power even when idle. You might as well put them to use at some point.

Also, to really answer the question accurately, I'd need to know more about how your OS runs as it is right now. For example, if you are running a lot of background processes (and many systems have tons of them, especially if they use a GUI, as they often use them to update things which aren't touched directly by the running applications, such as the clock display), then it would indeed improve responsiveness - it won't make the rendering faster, but it might mean that it is less likely to bog down when some compute-heavy operation is hogging a core.

I don't recall if you have posted a link to your repo before or not (and I couldn't find a reference to one on an admittedly cursory search of your posts). Could you give us one, so we can take a quick peek at your code?

Octacone · Post by **Octacone** » Sat Aug 17, 2019 10:51 am

@Schol-R-LEA
Right now as we speak I'm working on my multitasking code. I'm trying to refine it from the ground up. So I figured it's the right time to ask this question.
To be honest, I'm still learning about multitasking and not everything has clicked together yet. It would be silly of me to try to implement SMP at this point.
I figured that the main reason why I lose motivation is that I always set too high of a goal, eventually fail and lose interest. It's better to go step by step, one small goal at a time.
I don't have a repository (yet), waiting to buy a laptop (so I can create a repository and then work on both my PC and laptop at the same time, while keeping everything in sync).

Schol-R-LEA · Post by **Schol-R-LEA** » Sat Aug 17, 2019 12:07 pm

Octacone wrote:@Schol-R-LEA Right now as we speak I'm working on my multitasking code. I'm trying to refine it from the ground up. So I figured it's the right time to ask this question. To be honest, I'm still learning about multitasking and not everything has clicked together yet. It would be silly of me to try to implement SMP at this point.

Ah, OK, that makes sense.

Octacone wrote:I figured that the main reason why I lose motivation is that I always set too high of a goal, eventually fail and lose interest. It's better to go step by step, one small goal at a time.

Yeah, I have that problem. I always tend to overreach, with the result that I never actually get very far. If you can restrain that impulse, it should help a lot.

Octacone wrote:I don't have a repository (yet), waiting to buy a laptop (so I can create a repository and then work on both my PC and laptop at the same time, while keeping everything in sync).

*record scratch* Well, crud. That's not good.

Stop. Stop whatever you are doing and set one up. Right Now. Don't wait. Every moment you delay is one more moment in which your hard drive could fail, you could accidentally overwrite good code with bad, or you could find yourself having to backtrack to where a bug snuck in and not have any clue when and where the code was changed. If nothing else, a free repo host such as GitHub, GitLab, or CloudForge would give you another backup of your code. Trust me, even for a hobby project, losing your code sucks.

Not to mention that DVCS is now ubiquitous, so learning how to use it is an important programming-related skill. Getting practice helps. I know you know GitHub, as you've mentioned reading other projects' code when studying how to write your own, so I probably don't need to say this, but it's worth repeating.

Repos are about a lot more than synchronization. The main purpose of a Version Control System is in the name for this class of software - it gives you an audit trail on your code changes, so you can see how it changed over time, and so you can retain a record of what you have changed over time. All the other functions - backup, merging of separate modifications, sync over multiple systems - are secondary, with most of them being an outgrowth of the main function rather than a feature in and of themselves.

Hell, prior to the introduction of CVS in 1990, this was pretty much the sole reason for having version control at all - most of the earlier tools such as SCCS or RCS were for local revision control on a single system, for a single developer, with the support for multiple users (if any) being bolted on as an afterthought in many cases. CVS, which was widely adopted (it was the de facto standard from 1998 up to the arrival of Subversion in 2003), was a massive improvement over RCS in several ways, especially IRT concurrent users (the name even stands for 'Concurrent Versions System'), and several web front-ends for it were developed. It was refined further with Subversion, but the adoption of distributed VCS systems such as BitKeeper, Mercurial and Git changed the focus towards sync features, which is where a lot of this misunderstanding comes from.

Octacone · Post by **Octacone** » Sun Aug 18, 2019 10:34 am

@Schol-R-LEA
The main reason I don't have a GitHub repository (GitHub is my favorite btw) is because it didn't use to have free private repositories. Since Microsoft owns GitHub now, that is no longer the case.
I just didn't feel like publishing the entire project at once, it's like skipping 1000 commits. I wanted to have detailed file-by-file commits, but that was too much work anyway. Also I have an SSD.
TLDR: I'm lazy

Schol-R-LEA · Post by **Schol-R-LEA** » Sun Aug 18, 2019 11:14 am

Oh, well, your funeral. On a related note, if you do decide you would like one or more of us to review the code for you (many eyes make all bugs shallow etc.) you might want to find some way to post the relevant portions of the code for us some other way (as file attachments, on Pastebin, etc.). But that's your call if you even want that.

nullplan · Post by **nullplan** » Sun Aug 18, 2019 10:27 pm

With git you can create a repository in any directory on your hard disk by opening a shell in there and doing:

```shell
git init
git add *.c #and whatever other files you really need under source control
git commit -m "Initial commit"
```

Then you can work with that repository until you are satisfied that it can be published, go to whatever site, create an empty repo, and do this:

```shell
git remote add origin <user>@<host>:repo.git
git push -u origin master
```

The URL for the "remote add" step is the same as the clone URL of the repo, and will be provided to you by the site you use.

Anyway, even a local repo with no external mirror already allows you to undo accidental deletions. And really useful features like "git bisect" can help you find a regression that snuck in somewhere.

Octacone · Post by **Octacone** » Tue Aug 20, 2019 2:46 am

@Schol-R-LEA sure thing
@nullplan yeah I know, I used it with GitLab for my bootloader.

loonie · Post by **loonie** » Fri Aug 23, 2019 4:33 am

Octacone wrote:I was thinking about something specific, for example would it speed up my GUI or make it more "fluent" in a way?
Something like:
1 core = 14 FPS
4 cores = 56 FPS
Without changing the code that runs the GUI, only adding SMP support.

You have one mouse & one keyboard and events need to be processed in order - so there won't be any speedup here.

If you have several windows drawing into different portions of the screen - then yes, double speedup with 2 cores, assuming you are not using a video card and you have reasonably fast synchronization implemented (topic for another thread).
If you are dragging one window and the window is reasonably large, then yes again - double speedup with 2 cores, but you need even better synchronization implemented to draw the same window on 2 CPUs.

As far as changing your existing code - it depends on what you already have. Maybe you can simply break the window into 2 rects, ask each CPU to redraw its own rect, and wait until each CPU has finished its task. A lot depends on what you already have and how it's implemented.
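The "break the window into 2 rects" idea could look roughly like this. It is a sketch with made-up names, using a tiny list-of-lists as the framebuffer and Python threads standing in for CPUs, not anyone's actual GUI code.

```python
import threading


def split_rect(x, y, w, h):
    """Split a rectangle into left and right halves as (x, y, w, h) tuples."""
    left_w = w // 2
    return (x, y, left_w, h), (x + left_w, y, w - left_w, h)


def redraw(framebuffer, rect, colour):
    """Stand-in for a real redraw: fill every pixel of `rect`."""
    x, y, w, h = rect
    for row in range(y, y + h):
        for col in range(x, x + w):
            framebuffer[row][col] = colour


width, height = 8, 4
framebuffer = [[0] * width for _ in range(height)]
halves = split_rect(0, 0, width, height)

# One worker per half; the halves do not overlap, so no locking is needed.
workers = [threading.Thread(target=redraw, args=(framebuffer, rect, 7))
           for rect in halves]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()  # wait until each worker has finished its rect

print(all(pixel == 7 for row in framebuffer for pixel in row))  # True
```

Because the two rects are disjoint, the only synchronization in this sketch is the final join; overlapping windows or shared state would need more than that.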

There used to be a post on this forum:
The guy copied his GUI data structure with interrupts disabled (simplified: a linked list of z-ordered RECTs that are windows, plus some other info). Then he was able to do his GUI processing on any CPU he liked (any CPU that was free). By GUI processing I mean breaking the screen into small parts, determining which window occupies certain rects, and finding the window ID under the cursor.
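A rough sketch of that snapshot approach, as I understand it from the description (the original post is not available, so every name here is hypothetical). A lock stands in for "interrupts disabled", and a plain list stands in for the linked list of z-ordered RECTs.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    window_id: int
    x: int
    y: int
    w: int
    h: int


windows_lock = threading.Lock()      # stands in for "interrupts disabled"
windows = [                          # z-ordered, topmost first
    Window(3, 50, 50, 100, 80),
    Window(2, 20, 20, 200, 150),
    Window(1, 0, 0, 640, 480),
]


def snapshot_windows():
    """Copy the shared list quickly so the slow work can run unlocked."""
    with windows_lock:
        return list(windows)


def window_under_cursor(snapshot, cx, cy):
    """Return the id of the topmost window containing (cx, cy), or None."""
    for win in snapshot:
        if win.x <= cx < win.x + win.w and win.y <= cy < win.y + win.h:
            return win.window_id
    return None


snap = snapshot_windows()
print(window_under_cursor(snap, 60, 60))    # 3
print(window_under_cursor(snap, 25, 25))    # 2
print(window_under_cursor(snap, 700, 10))   # None
```

The point of the copy is that the critical section stays short: once a CPU holds its own snapshot, the hit testing can run anywhere without touching the shared list.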

If you start thinking about multi-CPU early then you have less to rewrite later. You'll make mistakes, see certain design issues, and may try to work around them in your existing single-CPU code. But it's sort of boring and doesn't yield any immediate improvements.

eekee · Post by **eekee** » Sun Aug 25, 2019 3:54 pm

Octacone wrote:Do I magically get 75% more performance (assuming 4 cores, 1 started by default, each core 25%?) after enabling SMP?

You do if everything is shell scripts.

I wrote a shell-script web server to run a friend's shell-script CMS. SMP made a huge difference to them. I'm not quite sure why, because you'd think they'd be IO bound, but they weren't. Perhaps it was the markdown converter in awk, or something. There was a lot of awk. Oh wait, when tested on single-core it used a Perl markdown converter, possibly with slow regexps. Performance was appalling on 466MHz PPC single-core, but fine on 2.4GHz Core 2 Quad.

OSDev.org
