I find it fascinating that people talk about "Having a history of what people did" in such emotive terms - "Cluttering", "Polluting".
What matters is that you end up with working systems. That a lot of change happened is just, well, what happened. It doesn't need to be prettied up and made to look like your development occurred in a clockwork march of cleanliness. It literally does not matter unless you spend a lot of time doing git-bisect.
Let it go. Accept that coding is not a smooth, robotic, endeavour, where everything is always tidy. And that's just fine.
I accepted this a decade ago.
I put my ego aside, and now I don't care if my git history doesn't look "beautiful" in the commit graph.
I've worked on dozens of projects since, and probably made thousands of commits. Some of those teams had dozens of developers working concurrently on the same codebase. We always merged the upstream branches into our development branches and never did any rebases.
I have NEVER ended up in a situation where I thought rebases would have been better.
The git tools and IDE integrations of our current age allow me to find any information I need from the history without pain.
Have you ever had to use git bisect? That's really where a 'clean' git history is important. Plenty of people never use git bisect, and that's fine too. That said it's a very useful tool when you do need it, and can drastically simplify finding when and where a regression was introduced.
You can `git bisect start --first-parent` and only bisect top-level merge commits. In most cases that gets you to the ballpark of "PR that introduced the bug" no matter how dirty the commit history inside that PR was, and you can still git bisect further in that branch if you need to. In my experience that is most of what you want anyway; "PR that introduced the bug" gives more than enough context.
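To make that two-stage search concrete, here is a toy model in Python rather than real git: a hypothetical mainline of merged PRs, each carrying its own branch commits. It assumes the history is effectively linear and that every commit after the buggy one is also bad, which is the same assumption bisect relies on. The PR names and commit ids are made up.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def first_bad(items: Sequence[T], is_bad: Callable[[T], bool]) -> int:
    """Binary search for the index of the first bad item.

    Assumes at least one item is bad and that everything after the
    first bad item is bad too.
    """
    lo, hi = 0, len(items) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(items[mid]):
            hi = mid      # first bad item is at mid or earlier
        else:
            lo = mid + 1  # first bad item is after mid
    return lo


def two_stage_bisect(
    merges: List[Tuple[str, List[str]]], buggy_commit: str
) -> Tuple[str, str]:
    """Find the bad PR on the mainline first, then the bad commit inside it."""
    order = [c for _, commits in merges for c in commits]
    bug_at = order.index(buggy_commit)

    def commit_is_bad(commit: str) -> bool:
        return order.index(commit) >= bug_at

    def merge_is_bad(merge: Tuple[str, List[str]]) -> bool:
        # Testing a merge means testing the tree after its last branch commit.
        return commit_is_bad(merge[1][-1])

    pr_name, pr_commits = merges[first_bad(merges, merge_is_bad)]
    return pr_name, pr_commits[first_bad(pr_commits, commit_is_bad)]


# Hypothetical history: (PR name, commits on that PR's branch), oldest first.
HISTORY = [
    ("PR-1", ["a1", "a2"]),
    ("PR-2", ["b1", "b2", "b3", "b4"]),
    ("PR-3", ["c1"]),
    ("PR-4", ["d1", "d2", "d3"]),
]

print(two_stage_bisect(HISTORY, "b3"))  # ('PR-2', 'b3')
```

The first stage only ever tests merge commits, so messy commits inside a PR never get checked out unless you choose to go on to the second stage.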
You can bisect across the more coarse merge commits, without “destroying” history and losing the ability to bisect across more granular constituent commits. Bisect is more robust when more information is preserved.
This exactly. I'd rather pinpoint the issue to a small commit with only a few changes vs. "well I know which feature caused the issue, now to wade through 65 changed files."
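For a rough sense of the trade-off, a back-of-the-envelope sketch in Python. The 65 files come from the comment above; splitting them into 13 commits of 5 files each is purely a made-up assumption.

```python
def worst_case_bisect_steps(candidates: int) -> int:
    """Tests needed to binary-search `candidates` commits for the first bad one.

    Equals ceil(log2(candidates)), computed with integers to avoid float rounding.
    """
    if candidates < 1:
        raise ValueError("need at least one candidate commit")
    return (candidates - 1).bit_length()


FILES_IN_PR = 65    # from the comment above
COMMITS_IN_PR = 13  # assumption: 13 commits touching 5 files each

extra_steps = worst_case_bisect_steps(COMMITS_IN_PR)
files_left = FILES_IN_PR // COMMITS_IN_PR
print(f"{extra_steps} more bisect steps narrow {FILES_IN_PR} files down to about {files_left}")
```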
The point of a clean git history is not to have a clean git history. The point is to make it possible to debug later, via bisect, or show, or even just a diff. The point is to make the workspace clean for the next guy.
Instead of letting it go, maybe we should have more discipline and organization in our lives and not less.
It's hard to tell what side you're on, because both sides refer to their stance as "clean history".
The pro-revisionists (squash, rebase) say they do what they do so the history looks clean (no intermediate commits breaking stuff, a "straight line" graph, etc).
The anti-revisionists say they do what they do so the history looks clean (can see the actual development, can safely diff different commits to see what changed in between, see the log in chronological order, etc).
> Instead of letting it go, maybe we should have more discipline and organization in our lives and not less.
Again, both sides could argue that they're the ones with more discipline.
> The point is to make it possible to debug later, via bisect, or show, or even just a diff.
This sounds anti-revisionist.
> The point is to make the workspace clean for the next guy.
This is one of the most common pro-revisionist arguments.
> > The point is to make it possible to debug later, via bisect, or show, or even just a diff.
> This sounds anti-revisionist.
That’s not how I see it. What makes debugging via bisecting easier is self-contained changes, not exactly chronological changes where you temporarily broke stuff and then fixed it before submitting your PR.
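A toy Python illustration of why that matters for bisect (a model, not git itself). Bisect assumes history flips from good to bad exactly once; a commit that was broken for unrelated reasons and fixed later violates that, and the search can land on it instead of the real regression. The two histories below are hypothetical.

```python
from typing import List


def bisect_first_failure(test_fails: List[bool]) -> int:
    """Binary search for the first failing commit.

    Assumes the last commit fails and that failures never turn back
    into passes; the first history below breaks that assumption.
    """
    lo, hi = 0, len(test_fails) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if test_fails[mid]:
            hi = mid
        else:
            lo = mid + 1
    return lo


# Chronological history: commit 1 broke things, commit 2 fixed them,
# and the real regression arrived in commit 3.
chronological = [False, True, False, True]

# Same work with the break and its fix folded into one self-contained
# commit, so the real regression is now commit 2.
self_contained = [False, False, True]

print(bisect_first_failure(chronological))   # 1, the temporary breakage
print(bisect_first_failure(self_contained))  # 2, the real regression
```

With real git you can `git bisect skip` a commit that cannot be tested, but that only helps once you have noticed the failure is unrelated.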
100% agree, but nobody gives a shit, and I’ve learned to just let it go. I’ve been in so many meetings, seen so many PSAs, and you know what happens every single time? Nothing. Maybe a couple of people learn what interactive rebase is for the first time, try it once, say “it lost all my code” and never try it again. Good luck explaining the reflog in these cases.
Did you notice, though, that rebase advocates use very "emotive" terminology when talking about git history? Like it's a subject they care about? Seems awfully touchy feely.
You say that like it's a bad thing. If there are two groups of people, and one of them is indicating (via words or behavior) that they don't care about something all that much and the other is indicating that they do care about that something quite a bit, why would I ever listen to the ones that don't care? It is almost tautological that the group that actually cares is going to have the more persuasive arguments and is thus far more likely to be right than the apathetic group.
I don't know if "emotive" is the right word, because to me this whole discussion is like trying to tell someone to be less sloppy because they make a mess when eating at their desk, knowing that the custodians will clean up after them.
> What matters is that you end up with working systems. That a lot of change happened is just, well, what happened. It doesn't need to be prettied up and made to look like your development occurred in a clockwork march of cleanliness. It literally does not matter unless you spend a lot of time doing git-bisect.
And git blame. And git checkout to a past state. It "doesn't matter" only if ease of understanding your project history doesn't matter.
How often is "understanding your project history" something that actually comes up for you? In all my years of working with projects in git, I have occasionally looked at my history to help me find a change that may have led to a bug, but it really only comes up for me once or twice a year, and even then it is rarely an extensive deep dive and never very far back in time.
> How often is "understanding your project history" something that actually comes up for you?
Frequently, for any long and complex project. Large amounts were written by people no longer working on it, and the history of how things came to be can help fill in documentation gaps and make intent clear.
By "frequently" I mean something like "I check history for about 2/3rds of bug fixes, and 1/4 of adding features" to understand the surroundings better, when writing or reviewing. Anything that makes that better saves me hours per week.
It catches and prevents more than enough subtle issues to be worth the effort.
I'm on a long and complex project. However, most of the previous folks were not very good, which is one reason I'm here to fix it. Their history is not particularly useful except to giggle at.
It's others' history that I'm usually interested in. I can easily follow the small diffs of individual commits, but have a much harder time grokking a wall of red and green.
When I’m on call and discover at 3 AM that we’re doing something weird, I need to know whether we meant to do that and especially why. In theory you could write all that down, but the people who aren’t doing that in git also won’t do it outside of git. The more you write down, the less likely it is that I need to page you to ask WTF.
It comes up often enough. I run `git blame` frequently to figure out why something odd-looking was introduced. It may not be a bug, but a WTF. This is in an environment with few code reviews, despite my attempts to introduce them. It is frustrating.
Sometimes. Once every few months. Sometimes it conveys useful information. Sometimes it just hits the "product imported from previous VCS a decade ago" commit.
I never use rebase, and I've never once had trouble understanding who did what where and when, even in a large project with 500+ users.
That being said, after reading this stuff, I may start using it on my local branches to clean up multiple commits into one tidy one, but that's about it.
Every time I try to blame or bisect and just end up stuck on an irrelevant megacommit I curse the Git maintainers that don't have the backbone to just get rid of --squash.
Every time I try to review a PR and the bookmark resets because they decided to force push I curse the Git maintainers that don't have the backbone to just get rid of rebase.
I think if the definition of a “good history” is “clean and not messy”, then yes I agree that’s pointless. If the definition is “a clear ability to see what changes were made, by who, and most importantly why” I think that’s incredibly necessary and would even go so far as to say it’s naive at best to not support.
The amount of time that has been saved in my life by someone leaving an explanation in their commit (for some weird edge case or context I’d have no way of gleaning because they’ve since left the company) is SO much more than the extra time I’ve put in to make sure the history has this extra info in it.
What's worse, the desire for cleanliness ends up making things like `git bisect` less useful.
If I had a bad day and introduced something stupid, I want a bisect to point me at the code I wrote on that bad day. If you squash liberally, perhaps because you want each commit to correspond with a release note, you're going to lose that debugging granularity.
The git history of a project is the main source of knowledge on that project, once the people that wrote it are gone. The git history answers questions such as "wtf is that supposed to do?", "what's this code connected to?", and "why did they do it that way?". You can use other kinds of documentation, but the git history is always there, so it makes sense to make it semi-useful.
This is such a strange thing to say. I'd be curious if you feel the same way about cleaning up your code, or cleaning up your room. I think you have an unfair advantage in this argument because it's difficult to defend such intangible benefits. We have to resort to making up logical explanations, or sounding unhinged or emotional as you suggest.
But it's simply intangible. My instinct tells me that it's helpful and that's okay. I don't owe anyone a justification for how I organize things, and there's nothing controversial about this. (Or maybe I could even come up with a logical example of a benefit, but that's a trap I'm not going to fall into.) And a lot of people agree, and they know what I mean, so it's not merely an individual preference. If I have to work with someone who has a strong preference against it, I'll worry about negotiating at that point.
> I'd be curious if you feel the same way about cleaning up your code, or cleaning up your room
Very genuinely: I do not care at all whether you clean your room starting from left and continuing to right. Or starting from the door and continuing toward the window. Or whether you clean it in a random order. I also do not care whether you clean every Friday or whenever you feel like it. That is the equivalent of git history. Because this excessive care about git history is just that: insisting that the room be cleaned from left to right, as if any other order were an issue.
The reason why it is hard to defend the tangible benefits of this or that git history strategy is that there are very few benefits.
> I do not care at all whether you clean your room starting from left and continuing to right
But you didn't say that you don't want it clean. It sounds like you're talking about how it's organized rather than whether it's organized.
> The reason why it is hard to defend
I'm talking about intangible benefits and no that's not the reason. Intangible benefits are inherently difficult to defend in words. Citing this as evidence of anything is akin to a debater's trick.
Insisting on a highly organized git history is like insisting on a particular order of cleaning. History is not the product itself. It is not the code itself. It is less important and matters only a little.
In the rare situation when I have to read it, I am perfectly ok looking at the previous commit too or whatever. It is still less overall work than what people describe in here.
Even with a room, I do not want my room infinitely clean. I am ok when books are not ordered by height and color, for example. I do not need t-shirts ordered by color either.
A clean git history on a pull request also makes it easier for the reviewer to understand your code. Small, concise commits tell the reviewers about your train of thought or what issues you ran into, making it easier to pick up the context. I start every code review by looking at the commit history.
I prefer not to have squash commits in our team for this reason. Squashing makes master look good, but usually nobody ever looks at the master commit history first; they look at the merged pull requests. However, everybody must look at the commits you made in a pull request. If you have squash commits, you are encouraged to have a messy commit history in your pull requests, leading to meaningless commit messages and even large commits (causing other problems...).
IMO the only advantage of squashing is that it makes it easy to roll forward when you accidentally deploy something that causes problems.
Yeah we use pull requests for the coarse-grained stuff and leave the small commits, which should also have good comments, intact. Maybe other shops use pull requests differently.
Agree, plus let's avoid having the CI pipeline create commits in the remote repo. I like CI/CD to be stateless with regard to the files in the repository. I tried to plead for this today with my colleagues, with very mixed results.