OSDev.org

The Place to Start for Operating System Developers
It is currently Tue Sep 25, 2018 4:47 pm

All times are UTC - 6 hours




 Post subject: Overcomplicated C/C++ build systems
PostPosted: Tue Sep 11, 2018 3:46 pm 
Joined: Mon Jun 05, 2006 11:00 pm
Posts: 1991
Location: USA (and Australia)
Does anyone else feel like C/C++ build systems are generally overcomplicated? I love the language but none of the build systems.

I'd like a system where, at the root level of my project, I can say: here's a list of libraries I depend on, and here are some preprocessor definitions (depending on whether it's a debug, release, or unit test build). Now build every .cc file in every subdirectory. Maybe with a simple blacklist of files/directories:
Code:
if !WIN32 exclude "platforms/win32/"


Then in my home directory have a config file that has a library library (a database of libraries) that has a list of installed libraries. For each library it has:
- a unique name
- the directory to include
- the file to link with

Perhaps some system-wide definitions, e.g. MAC_OS_X, X86, etc

Then my project's build file is simply:

Code:
type executable
output "my program"
library SDL
library libcxx
if !WIN32 exclude "platforms/win32"
if !MAC_OS_X exclude "platforms/mac_os_x"


For including libraries, you can substitute it with a path to another project with "type library", and it'll make sure to build that before yours.

For building libraries, you could whitelist the files/folders that become the public includes.

This is just an idea I have in my head. It would be fun to build a proof of concept.
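
A proof of concept along these lines could start with nothing more than a reader for the build file. Here is a minimal Python sketch, assuming one directive per line, double-quoted values, and an optional "if NAME" / "if !NAME" guard in front of a directive. BuildFile and parse_build_file are hypothetical names made up for illustration; they are not part of any existing tool.

```python
import shlex
from dataclasses import dataclass, field


@dataclass
class BuildFile:
    """In-memory form of the (hypothetical) build file described above."""
    kind: str = "executable"
    output: str = ""
    libraries: list = field(default_factory=list)
    excludes: list = field(default_factory=list)


def parse_build_file(text, defines):
    """Parse the build-file text.

    `defines` is the set of system-wide definitions that are switched on,
    e.g. {"MAC_OS_X", "X86"}.
    """
    build = BuildFile()
    for line_no, raw in enumerate(text.splitlines(), start=1):
        words = shlex.split(raw)  # handles "quoted values"
        if not words:
            continue
        # Optional guard: `if NAME directive value` or `if !NAME directive value`.
        if words[0] == "if":
            if len(words) < 3:
                raise ValueError(f"line {line_no}: incomplete 'if'")
            condition, words = words[1], words[2:]
            negated = condition.startswith("!")
            if (condition.lstrip("!") in defines) == negated:
                continue  # guard is false, so skip this directive
        if len(words) != 2:
            raise ValueError(f"line {line_no}: expected 'directive value'")
        directive, value = words
        if directive == "type":
            build.kind = value
        elif directive == "output":
            build.output = value
        elif directive == "library":
            build.libraries.append(value)
        elif directive == "exclude":
            build.excludes.append(value)
        else:
            raise ValueError(f"line {line_no}: unknown directive {directive!r}")
    return build
```

Looking up each library name in the per-user library database and turning the result into compiler and linker flags would be the next step; that part is left out here.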

_________________
My OS is Perception. (1 2)


 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Tue Sep 11, 2018 4:30 pm 
Joined: Fri Oct 27, 2006 9:42 am
Posts: 1311
Location: Athens, GA, USA
EDIT: Never mind, I was misreading your intention. Sorry for that. I'll keep what I wrote here below, but I don't think it is applicable after all.

Perhaps I am misconstruing what you are saying, but isn't that exactly what make(1) is? I know it sucks in a lot of ways (the 'requiring tabs at the start of a rule' thing is particularly notorious, and odious), but it does have the advantage of being relatively simple compared to, say, ANT, and readily available if you care to use it (being bundled in some form or another with nearly all C/C++ compilers).

I mean, the example Makefile given by Wikipedia is just:

Code:
edit: main.o kbd.o command.o display.o
	cc -o edit main.o kbd.o command.o display.o
main.o: main.c defs.h
	cc -c main.c
kbd.o: kbd.c defs.h command.h
	cc -c kbd.c
command.o: command.c defs.h command.h
	cc -c command.c
display.o: display.c defs.h
	cc -c display.c

clean:
	rm edit main.o kbd.o command.o display.o


As ugly as it is, it is pretty much just what you seem to be asking for.

(Interestingly, this example code didn't have tabs on the Wicked-Pedo page, presumably because Mediawiki chokes on tabs or something like that - not that lacking support for tabs is any great loss, except in this instance).

Mind you, the typical Makefile defines names for things such as the compiler, assembler, options list, and so on, to reduce repetition and to keep changes to them in a single location. Furthermore, the GNU version of make has a bunch of additional features as well (including an option, .RECIPEPREFIX, to replace the rule prefix tab with another character). Still, at the barest level, this is what it is.

Also, you might consider just why this was seen as inadequate for larger builds by some people later, leading them to develop the more baroque build tools you were complaining about. I am not defending them, but I am guessing that were you to write a build system of your own, you'd end up finding yourself adding similar ornamentation as you ran into the limits of your design.

Keep in mind, too, that some of the build formats are designed to make generating them with tools easier, and aren't really meant to be hand-edited. Whether this is a good design principle or a bad one is left as an exercise for the reader.

I am not saying you don't have a point, but I think the point goes deeper than merely, "why is this so complicated?" There's a lot more to the question than it might appear at first.

_________________
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
μή εἶναι βασιλικήν ἀτραπόν ἐπί γεωμετρίαν
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.


 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Tue Sep 11, 2018 9:08 pm 
Joined: Mon Jun 05, 2006 11:00 pm
Posts: 1991
Location: USA (and Australia)
Thanks for the reply. That makefile is very simple, but it's mostly boilerplate. Most of it could be eliminated, no? I don't blame make, because it's designed to be language agnostic. But a build system for C++ could be higher level and infer that this is a directory of C++ source files, so build everything in here and automatically calculate dependencies.

I know make can do it, but with the boilerplate.
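
To make "automatically calculate dependencies" concrete, here is a rough Python sketch that follows quoted #include lines transitively. It assumes headers are looked up relative to the including file only, and it ignores conditional compilation and include search paths, so it is an approximation rather than what a real compiler does (GCC and Clang can emit accurate dependency lists themselves, for example with -MMD). The function name header_dependencies and its read_text parameter are made up for this example.

```python
import re
from pathlib import Path

# Matches `#include "file.h"` but not `#include <file.h>`.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def header_dependencies(path, read_text=lambda p: p.read_text()):
    """Return the set of quoted headers reachable from `path`, transitively.

    `read_text` maps a Path to that file's contents; it defaults to reading
    from disk and can be swapped out for testing.
    """
    found = set()
    pending = [Path(path)]
    while pending:
        current = pending.pop()
        for name in INCLUDE_RE.findall(read_text(current)):
            header = current.parent / name
            if header not in found:  # also guards against include cycles
                found.add(header)
                pending.append(header)
    return found
```

A build tool would rebuild an object file whenever its source file or any header in this set is newer than the object.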

 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Tue Sep 11, 2018 11:23 pm 
Joined: Mon Jul 25, 2016 6:54 pm
Posts: 124
Location: Adelaide, Australia
IMO the big problem is that flexibility and convenience are often mutually exclusive goals.
To get more specific though, systems like Make were designed decades ago, and are maintained with the assumption that they could be running on Windows 10 in WSL, or on a PDP-11 with Berkeley Unix using the same source code, and that they should give sane responses even in insane environments. Also that any design decisions made by uni students 60 years ago should be preserved forever.
It bugs me to no end that writing even a very simple C application requires such verbose scripts to build.


 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Wed Sep 12, 2018 12:40 am 
Joined: Wed Mar 30, 2011 12:31 am
Posts: 254
Make's biggest problem is that so many people write Makefiles by copying existing ones and never really learn how Make works.

This is the complete Makefile for one of my old projects:
Code:
LDLIBS := -lpthread
CFLAGS := -g -pedantic -std=c99

all: cgiserver
It never even specifies how to build cgiserver from its sources - because it doesn't need to! Make can figure that out itself, and will use the LDLIBS and CFLAGS variables in its recipes!

In fact, with the right command line invocation you don't even need a Makefile for Make to do its job! Try it out for yourself: Write up a hello.c and run make hello - it'll work!

Make's reputation has been tarnished by the likes of autotools and recursive Makefiles. Makefiles don't need to be complicated messes!

Make has built-in rules to turn C and C++ files into object files. It has built-in rules to turn multiple object files into a single binary. All you need to do is define your additional dependencies!
Code:
awesome-app: stuff.o things.o
If you have awesome-app.c, stuff.c, things.c then Make will do the right thing! Without ever specifying anything else!

Make will even default to the first target you specify - all is just a convention!

Do you want parallelism in your build? Define your dependencies correctly and make -jN for up to N concurrent build processes!

Let's talk about GNU Make, specifically.

You want to detect your sources automatically? Make can do that!
Code:
awesome-app: $(patsubst %.c,%.o,$(wildcard *.c))
You've got directories? Make can do that too, and without any recursive nonsense if you do it right!
Code:
awesome-app: $(patsubst %.c,%.o,$(wildcard *.c) $(wildcard */*.c))
Don't know how deep your directories go? Think all those wildcard calls are ugly? Maybe find can help you!
Code:
awesome-app: $(patsubst %.c,%.o,$(shell find . -name '*.c'))
You want to build different sources for different platforms? We can do that!
Code:
SRCS=$(wildcard common/*.c)
ifneq (,$(findstring Microsoft,$(shell uname -r)))
   SRCS += $(wildcard platform/win32/*.c)
else
   SRCS += $(wildcard platform/linux/*.c)
endif
OBJS= $(patsubst %.c,%.o,$(SRCS))

awesome-app: $(OBJS)
But why stop with Make? Let's talk about pkg-config while we're here!

You've got dependencies on third-party libraries? We can do that!
Code:
LDLIBS=$(shell pkg-config --libs cairo)
CFLAGS=$(shell pkg-config --cflags cairo)

SRCS=$(wildcard src/*.c)
ifneq (,$(findstring Microsoft,$(shell uname -r)))
   SRCS += $(wildcard platform/win32/*.c)
else
   SRCS += $(wildcard platform/linux/*.c)
endif
OBJS= $(patsubst %.c,%.o,$(SRCS))

awesome-app: $(OBJS)
Make can be powerful and simple at the same time, you just need to know how to use it!

_________________
gitlab | twitter | ToaruOS | PonyOS 5.0 | ToaruOS-NIH - a completely-from-scratch ToaruOS distribution


Last edited by klange on Wed Sep 12, 2018 1:03 am, edited 1 time in total.

 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Wed Sep 12, 2018 12:47 am 
Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7266
Location: Germany
I would like to hear your take on CMake, when it comes with some pre-configuration. More specifically, what do you think of JAWS?

I know that it can be called overcomplicated at the other end of the spectrum, as it includes much you don't really think about when starting a project. That was kind of the point of it, though -- it's much harder to retrofit these things (like automated unit testing) than having plumbing in place from the beginning.

As for just taking all the C++ source in a directory and compile it... no. For one, any non-trivial project has its source subdivided into modules. You do not want them all lumped together. You also have test drivers (which don't get compiled into release code), you have obsolete / deprecated sources in there, something you dropped there for temporary reference etc.; so no, globbing for *.cpp and just compiling everything is not a good idea.

But yes, adding new source files should be really simple. That's why I isolated these things in JAWS, having the "lists of things" you will touch every day in one file (CMakeLists.txt), and all the "logic" you should not have to touch more than a couple of times in another (cmake/JAWS.cmake).

_________________
Every good solution is obvious once you've found it.


 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Wed Sep 12, 2018 7:02 am 
Joined: Mon Jun 05, 2006 11:00 pm
Posts: 1991
Location: USA (and Australia)
Thank you for the replies everyone.

I'm leaning towards the opinion that you ought to keep a clean house. Code under development can be hidden behind compile time or run time flags, but obsolete code ought to be deleted. A version control system allows you to browse past revisions. Any major VCS gives you a web interface for exploring your source repository, which gives you a permalink to file+revision+line that you can bookmark or share via e-mail, paste in a bug, paste into a doc, etc.

Regarding modules:
The build system should let you blacklist specific files/directories. In keeping with the principle of keeping a clean house, these should be few and far between. So you could blacklist "src/modules". Or you could just have separate source directories for the master program and the modules.
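
As a sketch of what "build everything except the blacklist" might look like, here is a small Python walker that prunes excluded directories before descending into them. The names is_excluded and find_sources are hypothetical, and the ".cc" default extension is only an assumption taken from the first post.

```python
import os
from pathlib import PurePath


def is_excluded(rel_path, excluded_dirs):
    """True if rel_path is one of excluded_dirs or lies inside one of them."""
    path = PurePath(rel_path)
    return any(
        path == PurePath(d) or PurePath(d) in path.parents
        for d in excluded_dirs
    )


def find_sources(root, excluded_dirs=(), extensions=(".cc",)):
    """Return source files under `root` (relative paths), skipping the blacklist."""
    sources = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        # Editing dirnames in place stops os.walk from entering excluded directories.
        dirnames[:] = [
            d for d in sorted(dirnames)
            if not is_excluded(os.path.join(rel_dir, d), excluded_dirs)
        ]
        for name in filenames:
            if name.endswith(tuple(extensions)):
                sources.append(str(PurePath(rel_dir, name)))
    return sorted(sources)
```

Usage would be something like find_sources(".", excluded_dirs=["src/modules"]).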

 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Wed Sep 12, 2018 9:13 am 
Joined: Mon Jun 05, 2006 11:00 pm
Posts: 1991
Location: USA (and Australia)
Solar wrote:
I would like to hear your take on CMake, when it comes with some pre-configuration. More specifically, what do you think of JAWS?


I'm interested in JAWS. Is there an example of what a setup looks like?

 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Fri Sep 14, 2018 8:24 am 
Joined: Tue Jan 02, 2018 12:53 am
Posts: 15
Location: Australia
In an ideal build system, there would be no need for a make utility or makefiles at all. That functionality should be built into the compiler for the language in question, and all information needed to successfully and efficiently build the program would be contained within the source and header files themselves. Dependencies, platform specific modules, etc, could either be inferred automatically, or explicitly defined using compiler directives or metaprogramming if needed. This would make source files completely self-contained and would drastically simplify the build process.


 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Fri Sep 14, 2018 9:35 am 
Joined: Sat Jan 15, 2005 12:00 am
Posts: 8515
Location: At his keyboard!
Hi,

Qbyte wrote:
In an ideal build system, there would be no need for a make utility or makefiles at all. That functionality should be built into the compiler for the language in question, and all information needed to successfully and efficiently build the program would be contained within the source and header files themselves. Dependencies, platform specific modules, etc, could either be inferred automatically, or explicitly defined using compiler directives or metaprogramming if needed. This would make source files completely self-contained and would drastically simplify the build process.


My theory is that ideally it would all be designed more like an online game.

A group of people can join the same Minecraft server and collaborate to build a complex machine (in redstone, etc) in real-time; with no revision control, no compiler, no build system and no time waiting for the complex machine to be processed before it can be used. Why can't a group of programmers join the same server and collaborate to build a complex piece of source code in real-time; with no revision control, no compiler, no build system and no time waiting for the source code to be processed before it can be executed?

I think the reason our tools suck is "componentization" - almost nobody sees it as a single system to create software, they see it as isolated components with specific roles. Someone might create a new component (a new compiler, a new editor, a new build system) but almost nobody tries to create a whole system.


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Sat Sep 15, 2018 2:35 am 
Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7266
Location: Germany
MessiahAndrw wrote:
Solar wrote:
I would like to hear your take on CMake, when it comes with some pre-configuration. More specifically, what do you think of JAWS?


I'm interested in JAWS. Is there an example of what a setup looks like?


Just follow the link, export the sources and have a look at the README.

 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Sat Sep 15, 2018 8:11 am 
Joined: Fri Oct 27, 2006 9:42 am
Posts: 1311
Location: Athens, GA, USA
Brendan wrote:
My theory is that ideally it would all be designed more like an online game.

A group of people can join the same Minecraft server and collaborate to build a complex machine (in redstone, etc) in real-time; with no revision control, no compiler, no build system and no time waiting for the complex machine to be processed before it can be used. Why can't a group of programmers join the same server and collaborate to build a complex piece of source code in real-time; with no revision control, no compiler, no build system and no time waiting for the source code to be processed before it can be executed?


I honestly can't tell if you are being sarcastic here. While this idea is intriguing - and frankly, even at its worst it would still be an improvement over some of the sh(1)-shows I have had the misfortune to be involved in - it sounds pretty much like the exact opposite of all the things you've advocated in the past about having a strong, central hand in control of the development and ensuring things got done. I am not saying they are actually contradictory goals, but I think meshing them might be difficult.

I am also not sure if most corporate IT departments would sign off on this approach, despite the fact that I have (as I mentioned) seen far worse being made Official Company Policy in the past. The management of companies, both big and small, is mostly ego-driven; managers rarely like the idea of a system that doesn't let them put their fingerprints (and maybe certain other body parts) all over everything, even (or perhaps especially) when they have no idea of what is going on. The very fact that the myth of 'Waterfall development' - which was never a workable model (it was introduced as a straw-man argument), and hence was never actually used by the developers of any project, anywhere - persists in management circles even now, is proof that IT management and planning are rarely hindered by reality.

This brings me to one problem I see with this, at least as you are presenting it: I have seen the problems that arise from 'cloud' (ugh) based development systems, the ShillFarce system in particular. The fact that none of those I've seen in professional use have done it well doesn't mean it is a bad idea, but I am wary of it.

I am particularly concerned about the idea of a centralized, rather than distributed, approach to hosting such a system; the usual result of this is that the code and data are in effect held for ransom by the company doing the hosting, and once committed to a host or system, you rarely have any viable exit strategy except to start over from scratch. It also requires you to trust that they are competent at keeping everything up, load-balanced, secured, and backed up; while some are pretty good about this, a lot more do a lackluster or worse job of it. To be fair, this is a sore spot for me personally, as this was exactly the scenario I saw at my last position, and dealing with this day after day was part of what led to my meltdown towards the end.

But I don't see any reason why the system would need to be centralized to that degree. Indeed, I suspect your goals would be better served by a peer-to-peer approach, so that a) a given developer can continue working even if entirely offline, or unable to reach a server and/or other devs for some other reason (which happens a lot more often than you would expect, even in the best systems), with the system automatically updating everywhere (as a 'branch') once the developer is connected again; b) it would shift the system from a single primary point of failure to many smaller ones, where no failure of a single node would be fatal; and c) it would distribute the loading automatically, so that while each node would be sending and receiving broadcast updates, none of them would act as a bottleneck for the others.

Finally, I am assuming that all of this applies to the 'Development server', and that while all of this is going on there is an equally automatic process - presumably managed by someone, but requiring little intervention from most of the individual devs - for pushing the program through a series of unit, integration, and all-up tests, and some way for the project leads to decide which stages of the project to pass to UX testing, acceptance ('alpha') and release ('beta') testing. Surely you don't mean to have the developers working directly on the user release version... do you? Because that's some pretty mid-1990s web dev, cowboy Bovine Excrement right there. No sane developers - not even web developers - work directly in production these days, and trust me, there are good reasons for this.

(Note the 'sane' qualifier. One of the Snailfarts insultants in my last job got the ax when we caught him doing exactly that. Piece of advice: if you find yourself in a job where making unregulated changes to Production isn't a firing offense, walk away, for the sake of your own sanity.)

Brendan wrote:
with no revision control, no compiler, no build system and no time waiting for the source code to be processed before it can be executed?


I suspect that you don't mean that the tools wouldn't be there, but that the tools would be mostly invisible, running in an online mode, operating automatically without direct action from the programmers. I can certainly get behind this generally, as it more or less parallels my own ideas. As I would see it, the 'compiler' would operate in a compile-and-go parti-operation mode, recompiling the code as it is being edited and maintaining a dynamic set of possible outcomes; since this is in fact how some Lisp 'interpreters' actually work, it is a natural fit to my ideas.

Similarly, since a part of my goal is to have something similar to Xanadu, I mean to have 'revision control' as an inherent part of the storage system; the storage system, which records changes to a document as a series of branching paths and never deletes the earlier versions, would serve as a rolling log of changes (smaller series of changes would be journalled to some degree for efficiency's sake, but the sequence would always be present), with the 'version control' only accessed directly if you needed to roll something back or bump a milestone - the VCS would mostly just be an application for browsing the document's history, and annotating different 'branches' as 'main branch' and so on. The biggest complication would be in integrating parallel work (which is going to happen, is going to be necessary, even for the kind of collaboration you are talking about; if nothing else, you need to have 'maintenance' branches, as you will still have people working with older versions which need to get bugs fixed and such-like). The main role of the lead developer would be to decide which changes get merged into the final 'release'.
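
A toy Python illustration of that kind of storage: every change is appended as a new revision pointing at its parent, nothing is ever deleted, and 'branches' are just labels on revisions. For simplicity this assumes each revision stores the full text rather than a journalled delta, and the class and method names (Revision, Document, commit, history) are invented for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Revision:
    rev_id: int
    parent: Optional[int]  # None only for the initial revision
    text: str


class Document:
    """Append-only store: revisions form a tree and are never removed."""

    def __init__(self, text=""):
        self._revisions = [Revision(0, None, text)]
        self.labels = {"main": 0}  # label -> revision ID

    def commit(self, parent, text, label=None):
        """Record a change on top of `parent`; optionally move a label to it."""
        rev = Revision(len(self._revisions), parent, text)
        self._revisions.append(rev)
        if label is not None:
            self.labels[label] = rev.rev_id
        return rev.rev_id

    def history(self, rev_id):
        """Return the chain of revisions from `rev_id` back to the root."""
        chain = []
        current = rev_id
        while current is not None:
            rev = self._revisions[current]
            chain.append(rev)
            current = rev.parent
        return chain
```

Committing twice on top of the same parent yields two branches; merging them is the hard part mentioned above and is not attempted here.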

Of course, my approach also focuses on individual elements, not on applications (as an 'application' in my system would be just a collection of separate functions/tools which somebody bundled together - no one piece of software is separate from the rest). If each 'project' is only a small collection of functions or objects, or a 'framework' of several other projects tied together, there is a lot less need for complex project management for the individual projects. Integration - and a single overarching and integrated 'system' - is the name of the game, if I ever do get that far (not likely).

 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Sat Sep 15, 2018 12:21 pm 
Joined: Sat Jan 15, 2005 12:00 am
Posts: 8515
Location: At his keyboard!
Hi,

Schol-R-LEA wrote:
Brendan wrote:
My theory is that ideally it would all be designed more like an online game.

A group of people can join the same Minecraft server and collaborate to build a complex machine (in redstone, etc) in real-time; with no revision control, no compiler, no build system and no time waiting for the complex machine to be processed before it can be used. Why can't a group of programmers join the same server and collaborate to build a complex piece of source code in real-time; with no revision control, no compiler, no build system and no time waiting for the source code to be processed before it can be executed?


I honestly can't tell if you are being sarcastic here. While this idea is intriguing - and frankly, even at its worst it would still be an improvement over some of the sh(1)-shows I have had the misfortune to be involved in - it sounds pretty much like the exact opposite from all the things you've advocated in the past about having a strong, central hand in control of the development and ensuring things got done. I am not saying they are actually contradictory goals, but I think meshing them might be difficult.


I'm quite serious.

Let's start with something more traditional. For almost all source code (for almost all languages) there's a set of "top level things" (e.g. typedefs, global data, global functions, class definitions, ...). If the language was designed with a specific keyword for each of these things and didn't allow nesting (e.g. no type definitions in the middle of a function, etc) then you could scan through the source code relatively quickly to find the start and end of each top level thing and build a list of the top level things, then determine signatures/types of each top level thing in parallel, then convert each top level thing into IR in parallel (while doing all sanity checks and doing partial optimisation).
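
As a rough illustration of that first step, here is a Python sketch for a made-up toy language where every top level thing starts with a keyword in column 0 ("type", "data" or "func"), ends with a line containing only "end", and may not be nested. The keywords, the TopLevel tuple and both function names are assumptions for the example; the `lower` callback stands in for the real "convert to IR" work.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import NamedTuple

KEYWORDS = {"type", "data", "func"}  # hypothetical top level keywords


class TopLevel(NamedTuple):
    kind: str
    name: str
    start: int  # first line, 1-based
    end: int    # line holding the closing "end"


def scan_top_level(source):
    """One cheap pass that finds where each top level thing starts and ends."""
    things = []
    current = None
    for line_no, line in enumerate(source.splitlines(), start=1):
        words = line.split()
        first = words[0] if words else ""
        at_column_0 = not line[:1].isspace()
        starts_thing = at_column_0 and first in KEYWORDS
        if current is None:
            if starts_thing:
                if len(words) < 2:
                    raise SyntaxError(f"line {line_no}: missing name")
                current = (first, words[1], line_no)
            elif words:
                raise SyntaxError(f"line {line_no}: expected a top level keyword")
        elif at_column_0 and line.strip() == "end":
            things.append(TopLevel(*current, line_no))
            current = None
        elif starts_thing:
            raise SyntaxError(f"line {line_no}: nesting is not allowed")
    if current is not None:
        raise SyntaxError(f"{current[1]}: missing 'end'")
    return things


def lower_all(source, lower):
    """Apply `lower` to the text of every top level thing, in parallel."""
    lines = source.splitlines()
    things = scan_top_level(source)
    bodies = ["\n".join(lines[t.start - 1:t.end]) for t in things]
    with ThreadPoolExecutor() as pool:
        return dict(zip((t.name for t in things), pool.map(lower, bodies)))
```

Because the scan never has to look inside a body, it stays cheap even for large sources, and each body can then be handed to a separate worker.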

Now, let's get rid of the "plain text source code" insanity and have a single binary file containing all of the source code, which also keeps track of various pieces of metadata (the list of top level things, which top level things depend on which other top level things, etc), and also caches the "already sanity checked and partially optimised IR" for all the top level things. When a top level thing (e.g. a global function) is modified you'd invalidate the previously cached IR for it (and if its signature changed also invalidate the IR for things that depended on it) and put the invalidated top level thing/s in a queue of top level things that need to be regenerated, and have threads running in the background to regenerate whatever was invalidated. This means that when any code is changed it doesn't take long to regenerate the IR for it (even without a huge "compile farm" regenerating the IR for each separate top level thing in parallel, although that wouldn't hurt either) simply because you're not regenerating all the IR for the entire program (and it'd be finer granularity than people currently get from "make" and object files).
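
The invalidation rule can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: everything lives in memory rather than in a binary file, the queue is drained synchronously instead of by background threads, and `lower` is a stand-in for real IR generation. The Project class and its method names are made up.

```python
from collections import deque


class Project:
    """Tracks top level things, their dependencies, and cached IR."""

    def __init__(self):
        self.signature = {}   # name -> signature text
        self.body = {}        # name -> body text
        self.depends_on = {}  # name -> set of names it uses
        self.ir_cache = {}    # name -> cached IR
        self.queue = deque()  # names waiting to be regenerated

    def _invalidate(self, name):
        self.ir_cache.pop(name, None)
        if name not in self.queue:
            self.queue.append(name)

    def update(self, name, signature, body, depends_on=()):
        """Record an edit and invalidate whatever it affects."""
        signature_changed = self.signature.get(name) != signature
        self.signature[name] = signature
        self.body[name] = body
        self.depends_on[name] = set(depends_on)
        self._invalidate(name)
        if signature_changed:
            # Only a changed signature forces users of `name` to be redone.
            for other, deps in self.depends_on.items():
                if other != name and name in deps:
                    self._invalidate(other)

    def regenerate(self, lower):
        """Drain the queue, rebuilding IR only for invalidated things."""
        done = []
        while self.queue:
            name = self.queue.popleft()
            self.ir_cache[name] = lower(self.signature[name], self.body[name])
            done.append(name)
        return done
```

Editing only the body of a function queues just that function; changing its signature also queues the things that depend on it.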

Now let's add "client/server" and multi-user to it. When the client is started it displays some kind of diagram showing the top level things and their dependencies (which is information the server already knows/caches); plus a few buttons to add a new top level thing, delete an existing top level thing, etc. When the user clicks on a top level thing in the diagram the server gives the client the source for it and the server keeps track of the fact that the client is viewing that top level thing; and if the user starts editing anything the client tells the server to lock the top level thing (so that only one person can modify the same top level thing at a time) and if that succeeds (nobody else is editing) the client tells the server the changes the user makes and the server broadcasts changes to any other clients that the server knows are viewing that top level thing. Let's also assume that:
  • While the source for a top level thing is being modified, the client is converting it into tokens and doing syntax checking before sending it to the server and only sending "syntactically valid tokens", and the server is only double checking that it's valid.
  • When client asks server for a top level thing to be locked for writing, the server stores a copy of the current version, and (if the user doesn't cancel their changes and does commit their changes) the server stores the replaced old version of the top level thing somewhere (likely with some other info - who modified it when, and maybe a commit message), so that people can roll back changes later.
  • As part of tokenising, the client would convert names (function names, variable names, type names, etc) into a "name ID" to put in the token; which means that when anything is renamed the server would only need to update a "which name to use for which name ID" structure and inform all the other clients, so that changing the name of anything costs almost nothing (no need for programmers to update all the source code where the old name was used, no need for server to regenerate IR everywhere, etc)
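
The "name ID" idea from the last point can be shown with a small Python sketch. The tokeniser here is deliberately crude (it treats every identifier-shaped word as a name, keywords included) and all of the names used (NameTable, tokenise, render) are invented for the example.

```python
import re

TOKEN_RE = re.compile(r"(?P<name>[A-Za-z_]\w*)|(?P<text>\d+|\S)")


class NameTable:
    """Maps name IDs to their current spelling, and back."""

    def __init__(self):
        self._names = []  # name ID -> current spelling
        self._ids = {}    # current spelling -> name ID

    def intern(self, name):
        """Return the ID for `name`, allocating one on first use."""
        if name not in self._ids:
            self._ids[name] = len(self._names)
            self._names.append(name)
        return self._ids[name]

    def rename(self, name_id, new_name):
        """Change a spelling; tokens that hold the ID are untouched."""
        if new_name in self._ids:
            raise ValueError(f"{new_name!r} is already in use")
        del self._ids[self._names[name_id]]
        self._names[name_id] = new_name
        self._ids[new_name] = name_id

    def spelling(self, name_id):
        return self._names[name_id]


def tokenise(source, names):
    """Turn source text into tokens, storing name IDs instead of names."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        if match.lastgroup == "name":
            tokens.append(("name", names.intern(match.group())))
        else:
            tokens.append(("text", match.group()))
    return tokens


def render(tokens, names):
    """Turn tokens back into text using the current spellings."""
    return " ".join(
        names.spelling(value) if kind == "name" else value
        for kind, value in tokens
    )
```

After names.rename(names.intern("count"), "item_count"), every stored token stream that mentions the old name renders with the new one, without any of those streams being rewritten.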

Of course with the server only dealing with "already tokenised and checked for syntax" source code; the previous "doesn't take long to regenerate the IR because you're not regenerating all the IR for the entire program" becomes even quicker. Specifically, for most cases (e.g. only a single function was modified) it'd be so fast that it can appear to be "almost instant".

So.. now we've got an environment where multiple programmers can collaborate, that auto-generates sanity checked and partially optimised IR in an "often almost instant" way. The next step is to add an interpreter to execute that sanity checked and partially optimised IR, so that it can be executed "almost instantly". Of course you'd also have a proper compiler running in the background doing the full "whole program optimisation" thing to generate an actual executable; but programmers wouldn't need to wait for that to be finished before they can test the code in the interpreter (with all the additional debugging support).

However, often (especially with multiple people simultaneously editing source code) you'd have to assume that the program isn't in a state where it can be executed, and testing a whole program is fairly inconvenient anyway (especially for things like services or libraries that don't have a nice user interface or anything). The obvious solution is to add a new kind of "top level thing" for unit tests; so that programmers can test a small piece of the program (execute individual unit test/s only, using the interpreter/debugger) even when other unrelated pieces aren't in a consistent state.

Note that I've mostly been assuming a simple language like C, where only one layer of "top level things" is enough; but there wouldn't be much reason why you can't have 2 or more layers (e.g. "top level packages, second level classes, third level methods"). Also; I should admit that I mostly focus on "an application is a collection of processes working together" (where each source code project for each process is relatively small and has a well defined purpose) and don't care much about "application is a single massive blob of millions of lines of code" (which would be harder/slower to deal with).

Of course it wouldn't be easy to design or implement a system like this, and there would also be multiple "unexpected problems" and/or additional tricks or features that my simple overview doesn't cover, and you'd probably want to augment it with other ideas (starting with some kind of chat system so users can communicate with each other); but (with enough work) I think it's entirely possible in practice.

In other words (as far as I can tell), you can have a group of programmers join the same server and collaborate to build a complex piece of source code in real-time; with no revision control, no compiler, no build system and no time waiting for the source code to be processed before it can be executed.

The only thing that's really stopping this is that people are obsessed with recycling horrible ideas from half a century ago (e.g. "source code as collection of plain text files").

Schol-R-LEA wrote:
I am also not sure if most corporate IT departments would sign off on this approach, despite the fact that I have (as I mentioned) seen far worse being made Official Company Policy in the past. The management of companies, both big and small, is mostly ego-driven; managers rarely like the idea of a system that doesn't let them put their fingerprints all over everything, even (or perhaps especially) when they have no idea of what is going on. The very fact that the myth of 'Waterfall development' - which was never a workable model (it was introduced as a straw-man argument), and hence was never actually used by the developers of any project, anywhere - persists in management circles even now, is proof that IT management and planning are rarely hindered by reality.


I'm not suggesting that the entire world would immediately abandon everything they're using as soon as the first ever attempt at this kind of tool is released. ;)

Schol-R-LEA wrote:
This brings me to one problem I see with this, at least as you are presenting it: I have seen the problems that arise from 'cloud' (ugh) based development systems, the ShillFarce system in particular. The fact that none of those I've seen in professional use have done it well doesn't mean it is a bad idea, but I am wary of it.

I am particularly concerned about the idea of a centralized, rather than distributed, approach to hosting such a system; the usual result of this is that the code and data are in effect held for ransom by the company doing the hosting, and once committed to a host or system, you rarely have any viable exit strategy except to start over from scratch. It also requires you to trust that they are competent at keeping everything up, load-balanced, secured, and backed up; while some are pretty good about this, a lot more do a lackluster or worse job of it. To be fair, this is a sore spot for me personally, as this was exactly the scenario I saw at my last position, and dealing with this day after day was part of what led to my meltdown towards the end.


There's no need for cloud and no need for a third party to host the server. Anyone that feels like it could run their own server on their own private LAN, and anyone that feels like it (who has a publicly accessible IP address) can run a server that isn't restricted to a private LAN. Of course there's also no reason a third party couldn't provide "server as a service" either. It could be the same as (e.g.) git (where anyone can run the server but companies like GitLab and GitHub also exist).

Schol-R-LEA wrote:
But I don't see any reason why the system would need to be centralized to that degree. Indeed, I suspect your goals would be better served by a peer-to-peer approach, so that a) a given developer can continue working even if entirely offline, or unable to reach a server and/or other devs for some other reason (which happens a lot more often than you would expect, even in the best systems), with the system automatically updating everywhere (as a 'branch') once the developer is connected again; b) it would shift the system from a single primary point of failure to many smaller ones, where no failure of a single one would be fatal; and c) it would distribute the loading automatically, so that while each node would be sending and receiving broadcast updates, none of them would act as a bottleneck for the others.


You're right - it could be more "peer-to-peer" (and at a minimum I'd want to support things like redundancy); it's just easier to describe (and probably easier to design and implement) as "client/server" because of the synchronisation involved (complex peer-to-peer systems tend to get messy due to the consensus problem).

Schol-R-LEA wrote:
Finally, I am assuming that all of this applies to the 'Development server', and that while all of this is going on there is an equally automatic process - presumably managed by someone, but requiring little intervention from most of the individual devs - for pushing the program through a series of unit, integration, and all-up tests, and some way for the project leads to decide which stages of the project to pass to UX testing, acceptance ('alpha') and release ('beta') testing. Surely you don't mean to have the developers working directly on the user release version... do you? Because that's some pretty mid-1990s web dev, cowboy Bovine Excrement right there. No sane developers - not even web developers - work directly in production these days, and trust me, there are good reasons for this.


Yes, just "development server" - you'd still do things like fork the project for each release, and have UX testing and alpha/beta/release candidate versions, etc. For automated testing, I'd want that integrated as much as possible, ideally so that test results can be fed straight back to the programmer/user soon after they commit changes.

Schol-R-LEA wrote:
Brendan wrote:
with no revision control, no compiler, no build system and no time waiting for the source code to be processed before it can be executed?


I suspect that you don't mean that the tools wouldn't be there, but that the tools would be mostly invisible, running in an online mode, operating automatically without direct action from the programmers. I can certainly get behind this generally, as it more or less parallels my own ideas. As I would see it, the 'compiler' would operate in a compile-and-go parti-operation mode, recompiling the code as it is being edited and maintaining a dynamic set of possible outcomes; since this is in fact how some Lisp 'interpreters' actually work, it is a natural fit to my ideas.


The underlying functionality (at least most of it, possibly combined with some that can't be found in existing tools) would still be there; but the individual tools themselves wouldn't exist in a recognisable way. It'd be like grabbing GCC, make, Git, an IDE (Eclipse?) and some kind of server (Apache) and throwing them into a wood chipper and then gluing the chips of wood together.

Schol-R-LEA wrote:
Similarly, since a part of my goal is to have something similar to Xanadu, I mean to have 'revision control' as an inherent part of the storage system; the storage system, which records changes to a document as a series of branching paths and never deletes the earlier versions, would serve as a rolling log of changes (smaller series of changes would be journalled to some degree for efficiency's sake, but the sequence would always be present), with the 'version control' only accessed directly if you needed to roll something back or bump a milestone - the VCS would mostly just be an application for browsing the document's history, and annotating different 'branches' as 'main branch' and so on. The biggest complication would be in integrating parallel work (which is going to happen, is going to be necessary, even for the kind of collaboration you are talking about; if nothing else, you need to have 'maintenance' branches, as you will still have people working with older versions which need to get bugs fixed and such-like). The main role of the lead developer would be to decide which changes get merged into the final 'release'.


For what I described you'd still be able to do the equivalent of creating a "diff" between different versions of one branch and then try to apply that diff to the current version of a different branch (with the same problems people get now when "try" doesn't mean "succeed").
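To make the "branching paths, never delete earlier versions" storage and the "diff between versions" idea a bit more concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions: the names History, Version, commit, fork, log and diff are made up for this example, whole snapshots are stored instead of journalled changes, and applying a diff to another branch (the part that can fail) is not attempted at all.

```python
import difflib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    """One immutable snapshot; 'parent' links form the branching paths."""
    id: int
    parent: Optional[int]
    lines: tuple


class History:
    """Append-only store (hypothetical): versions are added, never removed."""

    def __init__(self, initial_text: str) -> None:
        self.versions = [Version(0, None, tuple(initial_text.splitlines()))]
        self.branches = {"main": 0}   # branch name -> id of its newest version

    def commit(self, branch: str, text: str) -> int:
        """Record a new version on top of a branch and return its id."""
        parent = self.branches[branch]
        version = Version(len(self.versions), parent, tuple(text.splitlines()))
        self.versions.append(version)
        self.branches[branch] = version.id
        return version.id

    def fork(self, new_branch: str, from_version: int) -> None:
        """Start a branch (e.g. 'maintenance') at any earlier version."""
        self.branches[new_branch] = from_version

    def log(self, branch: str) -> list:
        """Walk parent links from the branch tip back to the first version."""
        ids = []
        current = self.branches[branch]
        while current is not None:
            ids.append(current)
            current = self.versions[current].parent
        return ids

    def diff(self, old: int, new: int) -> list:
        """Unified diff between any two stored versions."""
        return list(difflib.unified_diff(
            list(self.versions[old].lines),
            list(self.versions[new].lines),
            fromfile=f"v{old}", tofile=f"v{new}", lineterm=""))


history = History("a\nb\nc")
history.commit("main", "a\nB\nc")
history.fork("maintenance", 0)          # old release still needs bug fixes
history.commit("maintenance", "a\nb\nc\nd")
print("\n".join(history.diff(0, 1)))
```

A real system would store changes far more compactly and would need a merge step on top of this; the sketch only shows that "rolling back" and "maintenance branches" fall out of never throwing a version away.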


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Sat Sep 15, 2018 1:34 pm 

Joined: Fri Oct 27, 2006 9:42 am
Posts: 1311
Location: Athens, GA, USA
Fair enough; I think we agree on this more than disagree, frankly. I was most surprised because it seemed at odds with much of what you'd said before, but now you've explained why this doesn't contradict your previous statements.

_________________
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
μή εἶναι βασιλικήν ἀτραπόν ἐπί γεωμετρίαν
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.


 Post subject: Re: Overcomplicated C/C++ build systems
PostPosted: Sun Sep 16, 2018 12:43 pm 

Joined: Tue Jan 02, 2018 12:53 am
Posts: 15
Location: Australia
Schol-R-LEA wrote:
Fair enough; I think we agree on this more than disagree, frankly. I was most surprised because it seemed at odds with much of what you'd said before, but now you've explained why this doesn't contradict your previous statements.

I can also vouch for the effectiveness of what Brendan is talking about, since I worked on a similar but more rudimentary project a few years ago. Back then, I wrote a command line interface for a virtual machine I designed that made use of what I called an "incremental compiler". In this model, the programmer enters source code on the command line, and then submits it to the compiler incrementally, instead of as a single monolithic blob at the end. In my command line, pressing the enter key generated a regular whitespace character, whereas shift + enter submitted everything you had typed since your last submission. When you defined a function, you would simply type out that function definition, and then submit it. An interpreter would then parse the submission and compile the function into native machine code right there and then, as well as add its metadata to a data structure that maintained a list of all the things you had submitted, such as functions, global variables, etc, and their interdependencies. You could view that list and click on an entry within it to edit it. Each time you made an edit, the changes were stored in a piece table, so you could look at the entire history of changes that were made to that function, and rewind to any point in the past if needed. This was all saved as a session (or project), and you could work on multiple sessions concurrently.
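For anyone who hasn't met a piece table before, here is a minimal sketch of the idea in Python. It is not the code from my project: the class and method names (PieceTable, Piece, replace, text) are invented for this post, and keeping one immutable piece list per edit is just one simple way to get the "rewind to any point" behaviour. The key property is that the original text and the add buffer are never overwritten, so every old version stays readable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Piece:
    source: str   # "orig" or "add": which buffer the text lives in
    start: int    # offset into that buffer
    length: int   # number of characters


class PieceTable:
    """Text store where edits add pieces and nothing is overwritten."""

    def __init__(self, text: str) -> None:
        self._orig = text          # never modified
        self._add = ""             # only ever appended to
        first = (Piece("orig", 0, len(text)),) if text else ()
        self._history = [first]    # one immutable piece list per version

    def text(self, version: int = -1) -> str:
        """Return the text at a given version (default: the latest)."""
        buffers = {"orig": self._orig, "add": self._add}
        return "".join(buffers[p.source][p.start:p.start + p.length]
                       for p in self._history[version])

    @staticmethod
    def _split(pieces, pos: int):
        """Split a piece list into the parts before and after offset pos."""
        before, after = [], []
        offset = 0
        for p in pieces:
            if offset + p.length <= pos:
                before.append(p)
            elif offset >= pos:
                after.append(p)
            else:
                cut = pos - offset   # pos falls inside this piece
                before.append(Piece(p.source, p.start, cut))
                after.append(Piece(p.source, p.start + cut, p.length - cut))
            offset += p.length
        return before, after

    def replace(self, pos: int, count: int, new_text: str) -> int:
        """Replace count characters at pos; return the new version number."""
        before, rest = self._split(self._history[-1], pos)
        _, after = self._split(rest, count)
        middle = []
        if new_text:
            middle.append(Piece("add", len(self._add), len(new_text)))
            self._add += new_text
        self._history.append(tuple(before + middle + after))
        return len(self._history) - 1


source = "int add(int a, int b) { return a - b; }"
table = PieceTable(source)
table.replace(source.index("-"), 1, "+")   # fix the bug: version 1
print(table.text())     # latest version
print(table.text(0))    # rewind to the first submission
```

Insertion is replace with a count of 0 and deletion is replace with an empty string, so one method covers all three kinds of edit.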

The beauty of this was that you could also call a function you had already defined from within the command line simply by typing its name and supplying its arguments. The interpreter would recognize this as code to be executed immediately, and would call the machine code translation of the function, which would run at native speed. This removed the distinction between a general purpose language like C and a command language like Bash, as well as the distinction between a command line and a text editor, so a single language and environment could be used for everything on the system. It also made universal syntax highlighting much simpler and more efficient, since when you submitted a definition of a function, class or variable, that was kept in a table that would be scanned each time the user pressed a key, so all identifiers could be syntax highlighted, not just keywords or other built-ins. This was simpler than coding a truly real-time system where everything is monitored and updated at keystroke granularity.
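The "definition versus immediate call" split and the identifier table can be sketched roughly like this. To be clear about the assumptions: this is Python standing in for my own language, so "compile" here means Python bytecode and not native machine code, and Session, submit, entries and known_identifiers are names I made up for the example.

```python
import ast


class Session:
    """Toy incremental session: definitions are stored, other input runs at once."""

    def __init__(self) -> None:
        self.namespace = {}   # compiled functions and globals live here
        self.entries = {}     # name -> {"source": text, "uses": set of names}

    def submit(self, source: str):
        """Handle one submission; only immediate expressions return a value."""
        tree = ast.parse(source)
        if len(tree.body) == 1 and isinstance(tree.body[0], ast.Expr):
            # A bare expression such as "area(3, 4)" is executed immediately.
            expression = ast.Expression(body=tree.body[0].value)
            return eval(compile(expression, "<submission>", "eval"),
                        self.namespace)
        # Anything else is treated as a definition and compiled on the spot.
        exec(compile(tree, "<submission>", "exec"), self.namespace)
        for node in tree.body:
            used = {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}
            for name in self._defined_names(node):
                # Record which earlier submissions this one depends on.
                self.entries[name] = {
                    "source": source,
                    "uses": (used & set(self.entries)) - {name},
                }
        return None

    @staticmethod
    def _defined_names(node) -> list:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            return [node.name]
        if isinstance(node, ast.Assign):
            return [t.id for t in node.targets if isinstance(t, ast.Name)]
        return []

    def known_identifiers(self) -> set:
        """Names an editor could highlight in addition to keywords."""
        return set(self.entries)


session = Session()
session.submit("scale = 2")
session.submit("def area(w, h):\n    return w * h * scale")
print(session.submit("area(3, 4)"))
print(session.entries["area"]["uses"])
```

The entries table does double duty: it is the list of submitted things with their interdependencies, and its keys are what the highlighter looks up on each key press.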

Scaling this up to be suitable for large-scale development would be a fair bit of work, but it is certainly achievable and would bring the myriad benefits that Brendan mentioned.

