Muazzam wrote:
People, even those who are not programmers, casually claim to have read the code of Facebook/Linux/Bitcoin/etc. Here's someone who (casually) says he has read all of Bitcoin's code:
https://www.quora.com/Is-the-cryptocurr ... s-Altucher
I consider myself an above-average programmer, but still, 90% of code written by others, even simple 1000-line programs, makes no sense to me. Even worse is grasping how all the parts work together, and that's assuming I already know the technology and the language. I wonder if it's some programming version of dyslexia.
I have got to ask: do you have any trouble reading and understanding complex codebases? How long does it take, and do you face any initial trouble understanding how all the parts work together?
I see myself as another computer that doesn't care how a program is written; the only thing I care about is knowing, step by step, what data is generated and sent to me, its structure, and what it is for.
I achieve this by adding lines to a program so that it generates a UTF-8 HTML document with all of its data, step by step, with indentation, categorized, and with categories that can be enabled or disabled so that I can inspect only one small part. In this way, the program produces a study document that keeps becoming more humanly readable. If I want to understand something in a program, I concentrate on one part, take my time, make it generate HTML at run time, and then gradually understand what is happening. Then I can reimplement my own code without having to fully understand the original programming style, only its tricks.
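To give an idea of what I mean, here is a minimal sketch in C of that kind of instrumentation: categorized, indented logging into an HTML file. All the names here (doc_open, doc_log, etc.) are made up for this post; they are not the actual functions from my project, just the shape of the idea.

```c
/* Minimal sketch: categorized, indented HTML run-time logging.
   All names are hypothetical, chosen only for this example. */
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static FILE *doc;
static int   indent;
static const char *only;   /* when non-NULL, log just this category */

void doc_open(const char *path, const char *category_filter)
{
    doc  = fopen(path, "w");
    only = category_filter;
    fprintf(doc, "<!DOCTYPE html><html><head>"
                 "<meta charset=\"UTF-8\"></head><body><pre>\n");
}

void doc_close(void)
{
    fprintf(doc, "</pre></body></html>\n");
    fclose(doc);
}

/* Indentation follows the program's own nesting of stages. */
void doc_enter(void) { indent++; }
void doc_leave(void) { indent--; }

/* Log one line under a category; skipped when the category is disabled. */
void doc_log(const char *category, const char *fmt, ...)
{
    va_list ap;
    if (!doc || (only && strcmp(only, category) != 0))
        return;
    fprintf(doc, "%*s<b>[%s]</b> ", indent * 2, "", category);
    va_start(ap, fmt);
    vfprintf(doc, fmt, ap);
    va_end(ap);
    fputc('\n', doc);
}
```

A program would then call doc_log("parser", "declaration of '%s'", name) at interesting points, pass a category name to doc_open to focus on one subsystem, and open the resulting file in a browser.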
So you can make the analysis of the code as low-level and fragmented, or as high-level and oriented to end results, as you like.
Debuggers could be made to do the same, so that a running program or OS could be attached to an interface that knows the exact binary being analyzed and generates human-readable documentation from a pure binary as it runs.
You can see how I do this; the program where I've used it the most is my own C compiler attempt, and when I finish it I will apply this technique to other programs. I've used it only a little in others, like DOS Wolf3D or the DOS z26 Atari emulator, but it's powerful: it lets you see how a program does things, turning the computer into your teacher using your own words. And if there are tasks that use "magic" generated values you don't understand, you will at least learn that fact, that you don't know what they are using, which you couldn't have known without understanding how the program is written. Now you can investigate those values to complete the documentation the program generates, so that its data makes sense to you.
Remember that there is intermediate data specific to the implementation of a program, and data that corresponds to algorithms and standards. The objective is to reach that final data, know exactly what each value or set of values means, and explain it well enough to re-implement those known tricks without really having to understand every single idiom in the original code. The standard data is probably not cleanly generated until the last stage, for its intended use, but you can watch how it gets formed and then use your brain to think about how your own implementation could form something similar.
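The distinction between intermediate and standard data can be shown with a tiny invented example (not from my compiler): a two's-complement byte checksum, as used by several firmware and record formats. The running sum printed at each step is implementation-specific intermediate data; the final byte obeys a known rule, so once you have logged it you can re-implement it without reading the original loop at all.

```c
/* Hypothetical illustration: intermediate vs. standard data.
   The running sum is an implementation detail; the final byte follows
   a known rule (all bytes plus the checksum sum to 0 mod 256). */
#include <stdint.h>
#include <stdio.h>

uint8_t checksum(const uint8_t *data, int len)
{
    uint8_t sum = 0;
    for (int i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
        /* intermediate data: specific to this loop, not to the format */
        printf("step %d: byte 0x%02X, running sum 0x%02X\n",
               i, data[i], sum);
    }
    /* final, standard data: the value any correct implementation must emit */
    printf("checksum byte: 0x%02X\n", (uint8_t)(0u - sum));
    return (uint8_t)(0u - sum);
}
```

Logging both kinds side by side is what lets the generated document explain the format, not just this one program.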
You can adapt the functions and macros called "WriteBookFromProgram__XXX" to your own programs. I've also attached a source file that contains the logs of what I write and think, organized in categories. It shows extensively how to make a program human-readable by basically making it write a book with real data. Later, functions to generate image or sound files, etc., can be added to make it more complete than plain text/HTML.
I call it SourceDoc, as in source code and documentation at the same time, built at run time with real tasks and real data:
http://sourceforge.net/projects/c-compiler/files/