In an essay in 1955, Cyril Parkinson observed how bureaucracy expands over time regardless of outside pressures. One of the examples he used was how the number of employees in the British Colonial Office was at its highest when Great Britain had barely any colonies left to administer.
The development of a software project is similar. In the beginning, a lot of work is needed to build the first few versions of the project. The teams on the project grow in size and sooner or later the project is close to the original vision.
At this point the project is in a crucial phase. There is a lot less to do than there was in the beginning but all the employees hired to ramp up the project are still there. It is now necessary to downsize the project to save it. If you keep more people around than the project needs, they will invent their own work. Teams will build features that nobody uses, iterate over UI designs that are more annoying than beneficial, add a homegrown Common Lisp interpreter to generalize whatever harebrained functionality in the name of good design, and generally just muck around.
It should be the goal of software projects to reach a level of maturity at which they can be handed off to a maintenance staff that keeps the servers up and does the occasional bugfix. It should be the goal of people working on software to get their projects to this state of maturity so that they can move on and build the next great thing.
Most software could be like that. To get there, we need to acknowledge that it’s OK for software to be complete and it’s OK for employees to leave their project babies behind in search of new things to build.
Job seekers often wonder whether something they did matters to hiring managers. Does it matter that I attended an elite college? Does it matter that I didn’t? Does it matter that I volunteered for the UN? Does it matter that I chose advanced algebra over advanced calculus?
The answer to all of these questions is always “yes”. Everything you did in the past matters to hiring managers. For them, you are the sum of your past decisions. Everything they know about you shapes their opinion of you consciously or unconsciously. It also matters whether you disclose this information. Putting your Eagle Scout achievement on your resume is a signal but so is omitting it.
Now do any of your past decisions impact hiring managers positively or negatively? That’s difficult to say. It’s like asking what people like in their partners. Different people value different traits just like different hiring managers value different accomplishments.
There are certainly trends. It is usually better to attend an elite university than an unknown state college, but some hiring managers prefer the School of Hard Knocks. It is usually better to have some internships under your belt before applying for a full-time job, but some hiring managers never did internships themselves and don’t care. Some hiring managers prefer hackathon participation. Others prefer you spend your Sunday mornings at the local soup kitchen. Some hiring managers are freemasons. Play your mason membership right and you may get the secret handshake and an instant offer to join the company.
Every decision you have ever made matters to hiring managers but for any particular hiring manager it’s a crap shoot. If you don’t know about the hiring manager’s values, put accomplishments on your resume that you are proud of and hope they resonate.
Briefly speaking, the scientific method is the idea that you should formulate a hypothesis about the expected results of an experiment before looking at its actual results. Then compare the actual results to the expected results. In computer science – and software engineering, its applied field – the scientific method seems to be widely neglected though. I read a lot of computer science papers that don’t follow this method and it’s practically absent in software programming (notable exception: TDD).
This doesn’t have to be the case. The scientific method works very well even in the lowliest places of programming. The first time I remember consciously thinking about this topic was in college when I – then a TA – watched students step through code in their debugger. Students single-stepped over lines of code, observing what happened. That’s not how debugging works best. Rather, before you step over a line of code, form a hypothesis about what’s going to happen, and check the expected result against the actual result once you have stepped over the line.
Outside single-stepping through lines of code, the scientific method works in pretty much all parts of software development. Making a change to speed up a part of the program? Estimate how much faster the program is going to run; maybe 10%. If the actual result is far off, you need to backtrack to understand why your understanding of the program didn’t align with reality. Same with A/B testing of GUIs. Is a round button or a rectangular one going to lead to more click-throughs? Don’t just test it and see what happens. First, formulate a hypothesis about how many more click-throughs you expect from the button change. If the observed user behavior is far off, you don’t understand your users and need to backtrack.
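To make the speed-up example concrete, here is a minimal sketch of writing the prediction down first and only then looking at the measurement. The function names and the `tolerance` threshold are hypothetical and chosen for illustration; what counts as "far off" depends on the project.

```python
import time


def measure(fn, repeats=5):
    """Return the best wall-clock time in seconds over several runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def check_hypothesis(expected_speedup, baseline_s, changed_s, tolerance=0.5):
    """Compare a predicted speedup with the measured one.

    expected_speedup: the prediction made BEFORE measuring, e.g. 0.10 for 10%.
    tolerance: assumed threshold; the relative deviation from the prediction
               that we accept before calling the result a surprise.
    Returns (actual_speedup, surprised).
    """
    actual = (baseline_s - changed_s) / baseline_s
    surprised = abs(actual - expected_speedup) > tolerance * abs(expected_speedup)
    return actual, surprised
```

With a predicted speedup of 10%, a measured drop from 1.0 s to 0.9 s is unsurprising, while a drop to 0.5 s is flagged: the change worked far better than expected, which still means the mental model of the program was wrong.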
I didn’t go to a very famous college so in all the years I attended, we only ever had one industry guest speaker visit us. He was a higher-up at IBM Germany. One of the points he made was that software developers always send him excruciatingly detailed emails justifying decisions he doesn’t really care about, like ordering themselves a new laptop when their old one broke.
Instead he gave this advice: if you’re deciding between options, explore the options and arrive at a decision. Then set a deadline and send an email to stakeholders that you’re moving on with the decision unless they object before the deadline expires. This keeps projects moving on by default, rather than stalling because of missing decisions.
If you write emails this way, you don’t force stakeholders into the spotlight like you would asking for their opinions. Rather you allow them the easy out of staying silent, accepting your proposal, and archiving your email. They trust you enough to make this decision and are happy enough with it. If you solicit opinions instead, you have to keep pinging stakeholders to find out whether they don’t care enough to reply or they will reply later. That’s always awkward and takes a lot of time.
Avoid all of this by sending a brief email with the options you considered, the conclusion you arrived at, and a deadline for review. Stakeholders will be happy that you did most of the work yourself already, not taking time away from their own projects. If you’re about to make a bad decision, someone will speak up before the deadline. Guaranteed.
There are three kinds of candidates who fail software engineering interviews for technical reasons.
There are those that don’t know how to write code. Sorry but that’s a requirement for software engineering jobs.
There are those that can write code but are unable to come up with good ideas to solve the given problems. Difficult to judge why this goes wrong. Maybe just a bad interview.
There is a third kind of candidate that is particularly maddening to watch. Those are the candidates that propose great solutions and write confident and skilled code. However, the code they write doesn’t match their proposed solution at all. They might propose a perfect solution using a tree and breadth-first search but then write code that uses a binary search over a linked list.
Candidates of the third category are a rare breed but I see them often enough to wonder. How is it possible for evidently experienced and talented candidates not to see the complete disconnect between their proposed solutions and their implemented solutions? Is it just the nerves?
There are programmers who proudly proclaim ignorance of computer science facts because they can just search for definitions or code on the Internet. “Bubblesort? Why remember? You can Google that!” These people limit their own employability and career options. Fluency with the facts is required after a certain point.
One of the most interesting things I received in college was a cheat sheet of the minimal mathematical definitions of NP-complete problems. It was a double-sided single sheet of paper with maybe a total of 25 problem definitions. We used that to prepare for the final exam of one of the algorithm complexity classes. That was about ten years ago and I still remember plenty.
This was also the first time I realized that terms like “bin packing” or “maximum cut” in the language of computer science serve a similar function as terms like “dog” or “cat” in the English language. When you first acquire the language, you learn and rehearse these terms independently. After you have mastered the basic terms you can combine them to create meaningful sentences.
The vocabulary of computer science is like any other vocabulary. You have to learn it and if you don’t use it, you lose it. After mastery of terms and sentences, you can reach the next skill level: to see a problem and immediately recognize its core, ignoring the frills that make it different from textbook examples. That’s a skill that cannot be Googled because you don’t even know what terms to Google for.
I encourage every programmer to regularly brush up on the basics. You can do the minimum by reading Wikipedia. If you want to be an overachiever you can go to a page like the Dictionary of Algorithms and Data Structures and create a set of flashcards with terms to rehearse.
When I explain my idea of software design to people I draw a simple two-dimensional graph. The x-axis is the project size. The y-axis is the required effort to make a change to the project. There are two functions shown in the graph. One is slowly growing linearly and represents the cost of changes in a well-designed project. The other one is growing quadratically and represents the cost of changes in a badly designed project. The linear function starts above the quadratic function because a well-designed project requires more effort up front. Over time, a badly designed project requires more and more effort until the cost of changes exceeds the cost of changes in a well-designed project.
In the best case, the cost of change is independent of project size. This ideal cannot be reached. A change in a 100 million LOC project is always going to cost more than a similar change in a 100 LOC project. However, we should strive for this ideal. In fact, decoupling the cost of changes from project size might be the ultimate reason for thoughtful project design. Maybe all of its other benefits are just steps on the way.
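The graph can be sketched in a few lines of code. The coefficients below are made up for illustration, and the units of size and effort are arbitrary; only the shapes of the two curves and the existence of a crossover come from the description above.

```python
def well_designed_cost(size, upfront=50.0, slope=0.1):
    """Linear cost of a change: higher up-front effort, slow growth (assumed numbers)."""
    return upfront + slope * size


def badly_designed_cost(size, upfront=5.0, factor=0.001):
    """Quadratic cost of a change: cheap at first, fast growth (assumed numbers)."""
    return upfront + factor * size ** 2


def crossover_size(max_size=1_000_000):
    """Return the smallest project size at which the badly designed project
    becomes more expensive to change, or None if that never happens."""
    for size in range(max_size + 1):
        if badly_designed_cost(size) > well_designed_cost(size):
            return size
    return None


print(crossover_size())  # 268 with the assumed coefficients
```

Before the crossover, skipping design looks like the cheaper choice, which is why the argument for design is hard to make early in a project.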
There is a big-O analogy in here. Let’s define big-P as the function that provides the worst-case cost for making a change to a project of size n. In projects of P(1) the cost of changes is independent of project size. This is the theoretical ideal. Projects are in amazing shape if they are in P(log n), where a 10x larger code base adds only a fixed increment to the cost of a change instead of multiplying it. P(n) is still OK and probably better than most existing large projects. Beyond these complexity classes, the cost of changes quickly becomes prohibitive: desired changes become practically impossible.
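As a rough illustration of these classes, the sketch below takes n to be the project size and compares how much more a change costs after the code base grows. The cost models are idealized assumptions (no constants, base-10 logarithm) and `growth_factor` is a hypothetical helper, not an established metric.

```python
import math

# Idealized cost-of-change models, keyed by their big-P class (assumptions).
COST_MODELS = {
    "P(1)": lambda n: 1.0,
    "P(log n)": lambda n: math.log10(n),
    "P(n)": lambda n: float(n),
    "P(n^2)": lambda n: float(n) ** 2,
}


def growth_factor(model, small, large):
    """How many times more expensive a change is at size `large` than at `small`."""
    cost = COST_MODELS[model]
    return cost(large) / cost(small)


for name in COST_MODELS:
    # Growing from 1,000 to 10,000 lines of code.
    print(name, round(growth_factor(name, 1_000, 10_000), 2))
```

For growth from 1,000 to 10,000 lines this prints factors of 1.0, 1.33, 10.0 and 100.0. In the logarithmic model every further 10x of growth adds the same fixed increment to the cost, so the factor shrinks toward 1 as the project gets larger.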
There are some seriously old software projects around these days. Take SPSS which was first released in 1968 and is still updated. When a new person joins the SPSS team, how do they learn what has already been tried before? Are they bound to repeat bad ideas of the past? Is there a reference to tell them their idea was already explored and discarded in 1983?
You don’t need to have 50 years of project history to wonder how a project arrived at the status quo. Even if your project is just a few years old, new hires ask about past decisions. Often people who were on the team at the time of a decision cannot recall why it was made the way it was made.
I have noticed that development teams seem to be focused on the future. Their process, documents, and tools are optimized for what must happen next. This is great for focusing on shipping product. After shipping, the usefulness of the documents and tools seems to crumble. They are not maintained anymore and become outdated. It’s now difficult to recall questions that were asked and answered in 1993.
I feel that projects need scribes or archivists that track key decisions in a structured way. Their system would track key decisions and the reasons behind them. I want this system to be easily searchable to find decisions made years ago. I also want this system to be readable as prose in chronological order like a history novel. I want entries to be taggable and relatable, for example to highlight that a decision from 1997 was made obsolete by a newer decision from 2004.
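To show what such a system might store, here is a minimal in-memory sketch. Every name in it (`Decision`, `DecisionLog`, the fields, the methods) is hypothetical; it is not based on an existing tool, and a real system would need persistence and richer relations between entries.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class Decision:
    id: int
    made_on: date
    title: str
    reasoning: str
    tags: Tuple[str, ...] = ()
    supersedes: Optional[int] = None  # id of the older decision this one replaces


class DecisionLog:
    def __init__(self) -> None:
        self._entries: Dict[int, Decision] = {}

    def record(self, decision: Decision) -> None:
        self._entries[decision.id] = decision

    def chronicle(self) -> List[Decision]:
        """All decisions in chronological order, for reading as a history."""
        return sorted(self._entries.values(), key=lambda d: d.made_on)

    def search(self, text: Optional[str] = None, tag: Optional[str] = None) -> List[Decision]:
        """Case-insensitive text search over title and reasoning, optionally by tag."""
        hits = []
        for d in self.chronicle():
            if tag is not None and tag not in d.tags:
                continue
            haystack = (d.title + " " + d.reasoning).lower()
            if text is not None and text.lower() not in haystack:
                continue
            hits.append(d)
        return hits

    def superseded_by(self, decision_id: int) -> Optional[Decision]:
        """Return the newer decision that made `decision_id` obsolete, if any."""
        for d in self.chronicle():
            if d.supersedes == decision_id:
                return d
        return None
```

A new team member could then ask `log.superseded_by(old_id)` to learn that a decision from 1997 was replaced in 2004, and read the recorded reasoning for both.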
I haven’t found tool support for this idea. Maybe it’s been tried and discarded but nobody kept records.
Documentation that is closer to code is more likely to be read and updated. Information in a wiki or in shared documents in the cloud goes out of sync with code very fast. Keeping them in sync takes a level of effort and diligence that most programmers are unwilling to invest.
The obvious conclusion is to tie documentation very closely to code. However, this does not help. Programmers seem to have a kind of comment blindness. Even existing comments that specifically answer the programmer’s questions are often subconsciously ignored. And that’s just the reading part and doesn’t touch on the problem of updating comments. It doesn’t help that many IDEs visually de-emphasize comments over code.
This reminds me of code testing. In the beginning, programmers wrote code. As code became more complex they realized how hard it became to make changes without breaking something. Many programmers started to adopt testing frameworks to quickly learn about new defects in changed code. However, code and tests quickly went out of sync and test coverage decreased over time. Code coverage tools were created and some companies these days require minimum coverage percentages for changed code to make sure tests stay in sync with code. Many companies have dedicated test engineers who handle that.
Can the same be done with documentation? What if you made a change to a commented line and some part of the toolchain explicitly asked you afterwards if the old comment still applied? Make a change to a method and the toolchain asks if the method documentation is still appropriate. Same for classes, packages, and whole programs. Whenever you make a change, you have to confirm that the existing comments and documentation on the same abstraction level still apply.
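A very small version of the line-level check could look like the sketch below. It is an assumption-laden illustration, not an existing tool: it only understands `#` line comments, only looks at the single comment line directly above a changed line, and `comments_to_reconfirm` is a made-up name. It uses `difflib.SequenceMatcher` from the Python standard library to find the changed lines.

```python
import difflib


def is_comment(line):
    """True for a line that consists only of a `#` comment."""
    return line.lstrip().startswith("#")


def comments_to_reconfirm(old_lines, new_lines):
    """Return (comment_line_no, code_line_no) pairs, 1-based in new_lines,
    where a code line changed but the comment directly above it did not."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changed = set()
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            changed.update(range(j1, j2))

    flagged = []
    for idx in sorted(changed):
        line = new_lines[idx]
        if is_comment(line) or not line.strip():
            continue  # only changed code lines are of interest
        above = idx - 1
        if above >= 0 and is_comment(new_lines[above]) and above not in changed:
            flagged.append((above + 1, idx + 1))
    return flagged


old = ["# Retry three times before giving up", "retries = 3", "timeout = 10"]
new = ["# Retry three times before giving up", "retries = 5", "timeout = 10"]
print(comments_to_reconfirm(old, new))  # [(1, 2)]
```

A commit hook or review tool could refuse to proceed until the author confirms each flagged comment. Extending the idea to methods, classes, and packages would require parsing the code instead of comparing lines.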
Will we see companies with dedicated comment/documentation engineers?
The common wisdom is that syntax errors don’t matter when you’re up on the whiteboard for a programming interview. Who cares about unbalanced parentheses or a missing semicolon? Who cares if you forget a cast before passing a value to a standard function? The common wisdom is wrong.
I used to tell candidates not to worry about syntax mistakes. I did that until I had interviewed so many candidates that I realized that maybe 5% of them write whiteboard code without making syntax errors. The astuteness and precision of these candidates are inspiring. All else being equal, if you make syntax errors on the whiteboard, you’re already behind.
I noticed a second issue when writing up interview feedback. When I transcribe whiteboard code, tiny errors like a missing semicolon turn into comparatively large problem descriptions like “// missing semicolon”. That’s 20 characters of problem description for a one-character error. I have to jot that down so feedback reviewers don’t think I made that mistake during my transcription. On the other hand, if the candidate only finds the naive solution instead of the dynamic programming solution I need just one sentence for that. That’s maybe 500 characters of candidate weakness summarized in 100 characters of feedback; a description-to-problem multiplier of 0.2 instead of 20 for the missing semicolon.
Of course, only finding the naive solution to a problem is a much bigger issue than a missing semicolon but how do feedback reviewers subconsciously process that? Do they think of the missing semicolon as a much bigger issue because its description takes up comparatively more space?
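The arithmetic behind the two multipliers is simple enough to check in code. The 500 and 100 character figures are the rough estimates from above, not measurements, and the helper name is made up for this sketch.

```python
def description_to_problem_ratio(description_chars, problem_chars):
    """Characters of written feedback per character of the underlying problem."""
    return description_chars / problem_chars


# A one-character error needs a 20-character note in the transcript.
semicolon_note = "// missing semicolon"
semicolon_ratio = description_to_problem_ratio(len(semicolon_note), len(";"))

# Roughly 500 characters of weak code get summarized in about 100 characters.
naive_solution_ratio = description_to_problem_ratio(100, 500)

print(semicolon_ratio, naive_solution_ratio)  # 20.0 0.2
```

The trivial error gets a hundred times more feedback text per character of problem than the serious one.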
I now tell candidates that syntax is not the most important thing but they should still try to get it right.