For converting coordinates on the screen to coordinates on the window:
Position in window = position on screen - window's position
E.g. if your window is at 100,100 and you click at 150,150 on the screen, that will be 50,50 in the window.
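As a minimal sketch of that subtraction (the `Point` struct and the `ScreenToWindow` name are made up for illustration, and I'm assuming the window's position is its top-left corner in screen coordinates):

```cpp
// Hypothetical 2D integer point; not taken from any particular codebase.
struct Point {
  int x;
  int y;
};

// Converts a screen-space position into window-space by subtracting the
// window's top-left corner (itself given in screen coordinates).
Point ScreenToWindow(Point screen, Point window_origin) {
  return {screen.x - window_origin.x, screen.y - window_origin.y};
}
```

With the window at 100,100, `ScreenToWindow({150, 150}, {100, 100})` gives 50,50. A negative result, or one past the window's width or height, means the click landed outside the window.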
For compositing window managers (where each window is drawn into its own buffer, and the window manager then combines them together), the easiest way is to draw the windows back-to-front, so that nearer windows overwrite the ones behind them wherever they overlap. All of this is drawn into the window manager's back buffer; once all the windows are drawn to it, it is copied into the graphics device's framebuffer.
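Roughly, the back-to-front approach could look like the sketch below. The `Buffer`, `Window`, `Blit` and `Composite` names are all hypothetical, pixels are assumed to be 32-bit and row-major, and the back buffer and framebuffer are assumed to be the same size:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical pixel buffer: row-major, 32 bits per pixel.
struct Buffer {
  int width;
  int height;
  std::vector<uint32_t> pixels;
};

// Hypothetical window: its top-left corner on screen plus its own buffer.
struct Window {
  int x;
  int y;
  Buffer contents;
};

// Copies `src` into `dst` with its top-left at (dst_x, dst_y), clipped to `dst`.
void Blit(const Buffer& src, int dst_x, int dst_y, Buffer& dst) {
  int x0 = std::max(0, dst_x);
  int y0 = std::max(0, dst_y);
  int x1 = std::min(dst.width, dst_x + src.width);
  int y1 = std::min(dst.height, dst_y + src.height);
  for (int y = y0; y < y1; ++y) {
    for (int x = x0; x < x1; ++x) {
      dst.pixels[y * dst.width + x] =
          src.pixels[(y - dst_y) * src.width + (x - dst_x)];
    }
  }
}

// Draws every window into the back buffer, back-most first, so later
// (nearer) windows overwrite earlier ones. The finished frame is then
// copied to the framebuffer in one go.
void Composite(const std::vector<Window>& back_to_front, Buffer& back_buffer,
               Buffer& framebuffer) {
  for (const Window& window : back_to_front) {
    Blit(window.contents, window.x, window.y, back_buffer);
  }
  std::copy(back_buffer.pixels.begin(), back_buffer.pixels.end(),
            framebuffer.pixels.begin());
}
```

Drawing into the back buffer first means the screen never shows a half-composited frame, at the cost of one extra full-screen copy.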
Now, you might have noticed there's a lot of overdraw (pixels being copied from background windows that are then covered up by foreground windows). If you are comfortable doing rectangle slicing, you can cut one rectangle out of another. That way you don't copy pixels that aren't shown, and you can avoid the need for a window manager's back buffer.
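One way to do the slicing is to subtract the covering rectangle and keep up to four leftover strips. This is only a sketch; `Rect` and `Subtract` are hypothetical names, and rectangles are assumed to be half-open (`right` and `bottom` are one past the last pixel):

```cpp
#include <algorithm>
#include <vector>

// Hypothetical half-open rectangle: covers [left, right) x [top, bottom).
struct Rect {
  int left;
  int top;
  int right;
  int bottom;
};

// Returns the parts of `a` that are not covered by `b` as up to four
// non-overlapping rectangles: a strip above, a strip below, and pieces to
// the left and right of the overlap.
std::vector<Rect> Subtract(const Rect& a, const Rect& b) {
  // Work out the overlapping region first.
  int il = std::max(a.left, b.left);
  int it = std::max(a.top, b.top);
  int ir = std::min(a.right, b.right);
  int ib = std::min(a.bottom, b.bottom);
  if (il >= ir || it >= ib) return {a};  // No overlap: `a` survives whole.

  std::vector<Rect> pieces;
  if (a.top < it) pieces.push_back({a.left, a.top, a.right, it});        // Above.
  if (ib < a.bottom) pieces.push_back({a.left, ib, a.right, a.bottom});  // Below.
  if (a.left < il) pieces.push_back({a.left, it, il, ib});               // Left.
  if (ir < a.right) pieces.push_back({ir, it, a.right, ib});             // Right.
  return pieces;
}
```

E.g. cutting a window at 5,5 to 15,15 out of one at 0,0 to 10,10 leaves a strip 0,0 to 10,5 and a piece 0,5 to 5,10. If the front window covers the back one completely, you get nothing back and the back window is never copied.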
---
This is what I do: I iterate through the windows back-to-front, and call Draw on each window.
https://github.com/AndrewAPrice/Percept ... positor.cc
But the Draw function doesn't really draw; it calls the CopyTexture function, which adds a rectangle to copy to the quad tree.
Here's my quad tree that slices up/removes rectangles behind it:
https://github.com/AndrewAPrice/Percept ... uad_tree.h
Then my window manager sends a collection of "copy rectangle from a to b" commands to the graphics driver.
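To give a rough idea of the flow without the quad tree, here is a simplified stand-in that keeps the pending commands in a flat vector. This is not the code from the links above: `CopyCommand`, `KeepUncovered`, `CommandList` and this `CopyTexture` signature are all invented for the sketch, and a flat list has to check every pending command against each new rectangle, which is the cost a quad tree is there to cut down.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical command: copy a width x height block from (src_x, src_y) in
// texture `texture` to (dst_x, dst_y) on the screen.
struct CopyCommand {
  int texture;
  int src_x, src_y;
  int dst_x, dst_y;
  int width, height;
};

// Appends to `out` the parts of `cmd` that are NOT covered by the screen
// rectangle [l, r) x [t, b), shifting the source offset to match each piece.
void KeepUncovered(const CopyCommand& cmd, int l, int t, int r, int b,
                   std::vector<CopyCommand>& out) {
  int cl = cmd.dst_x, ct = cmd.dst_y;
  int cr = cl + cmd.width, cb = ct + cmd.height;
  int il = std::max(cl, l), it = std::max(ct, t);
  int ir = std::min(cr, r), ib = std::min(cb, b);
  if (il >= ir || it >= ib) {  // No overlap: keep the command untouched.
    out.push_back(cmd);
    return;
  }
  auto piece = [&](int pl, int pt, int pr, int pb) {
    if (pl < pr && pt < pb) {
      out.push_back({cmd.texture, cmd.src_x + (pl - cl), cmd.src_y + (pt - ct),
                     pl, pt, pr - pl, pb - pt});
    }
  };
  piece(cl, ct, cr, it);  // Strip above the overlap.
  piece(cl, ib, cr, cb);  // Strip below the overlap.
  piece(cl, it, il, ib);  // Piece left of the overlap.
  piece(ir, it, cr, ib);  // Piece right of the overlap.
}

// Collects copy commands. Call CopyTexture back-to-front: each call removes
// whatever the new rectangle covers from the commands queued before it.
class CommandList {
 public:
  void CopyTexture(int texture, int dst_x, int dst_y, int width, int height) {
    std::vector<CopyCommand> kept;
    for (const CopyCommand& cmd : commands_) {
      KeepUncovered(cmd, dst_x, dst_y, dst_x + width, dst_y + height, kept);
    }
    kept.push_back({texture, 0, 0, dst_x, dst_y, width, height});
    commands_ = std::move(kept);
  }

  const std::vector<CopyCommand>& commands() const { return commands_; }

 private:
  std::vector<CopyCommand> commands_;
};
```

After every window has been added, `commands()` holds non-overlapping copies, so each screen pixel is written at most once and the list can go straight to the graphics driver.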