GSOC 2021:Progress Report of Improvements to clang-format | Haiku Project

Haiku · July 27, 2021, 7:17am

This blog will contain all the information about what I have done till now.

I started with the input preferences directory and started solving the issues according to haiku guidelines.

This is a companion discussion topic for the original entry at https://www.haiku-os.org/blog/saloni/2021-07-27_gsoc_2021progress_report_of_improvements_to_clang-format/

owenca · July 30, 2021, 6:15am

Hi, please see my comments below.

Preserve the indentation of comments

In the Haiku coding style, comments are sometimes indented one level more than the code (when they refer to the line above them) but clang-format doesn’t accept these one level indented comments and change their indentation to previous line indentation

I already implemented it in this commit last year. Also, your attempt doesn’t quite work. For instance, the comment in the following code should be outdented to align with the code below it:

int i;

	// comment
int j;

Removal of braces for multiline control statements

According to haiku guidelines, if there is single line control statement then the braces should be removed otherwise not.

I think you meant Insertion of braces here because Removal of redundant braces is a whole other beast. As I already warned before, the patch you used did not get merged into LLVM as it failed a lot when tested against a large code base.

Line break after return type in function definitions

Functions with one arguments were not properly detected. But the functions with no or more than one argument, code formatted correctly.

You could have saved tons of time if you had filed a bug report at bugs.llvm.org. Moreover, your fix doesn’t account for multiple abstract declarators, e.g. void f(BPoint, BPoint); it also breaks the clang-format unit tests. Anyway, I have fixed it upstream when fixing some K&R C style bugs. (See D105964 and D106112.)

Extra space before ++ operator in for loop

In for loop the extra space was added before the ++ operator.

Again, I already fixed it for Haiku in this commit last year. As I am not convinced that it’s a clang-format bug, I don’t have a plan to fix it upstream. If you are certain that this is indeed a clang-format bug, please file a bug report. If it gets confirmed by one of the LLVM developers, I will give it another look.

PulkoMandy · July 30, 2021, 7:00am

We don’t have indented comments after a blank line in the Haiku style. So this should not be a problem.

We have configured everything we could to use indentation and not alignment, and we have also configured alignment to use tabs, in case it’s still used. Yet, spaces keep showing up from time to time. To me this certainly looks like a bug, or at least, unexpected behavior.

Your patch mentions only K&R style but in fact it fixes the much more common case of having just the type and not the name of a (unused) variable?

owenca · July 30, 2021, 8:00am

Exactly my point: the customized clang-format should still fix the style just as it does before the customization.

I first thought so when I added the clang-format bug label to #16. I don’t remember what made me change my mind later as I changed the label to enhancement.

Yep!

owenca · August 3, 2021, 7:19pm

There is a new patch upstream that fixes D105964.

PulkoMandy · August 4, 2021, 1:30pm

I have rebased the branch at Commits · viveris/llvm-project · GitHub on a newer version of llvm (still about 20 days old, so I may need to rebase again). Let me know if I can help getting some of our changes in upstream llvm (probably not all the clang-format changes, but there are also various bug fixes to have llvm compile and run correctly on Haiku).

Coldfirex · August 4, 2021, 2:58pm

It looks like LLVM 13 was just branched for its first RC. I suppose these would miss that window so we could potentially see them in v14?

owenca · August 4, 2021, 9:23pm

You can file bug reports at Bugzilla and send patches to Phabricator (and add me as a reviewer/subscriber to expedite the review).

On the clang-format side, I recommended rebasing it to haiku-format (perhaps after rebasing haiku-format to LLVM 12.0.1 and applying the patch from @nielx). I still recommend it over keeping rebasing to the current LLVM main branch.

owenca · August 4, 2021, 9:34pm

After a patch is committed, the Details section will tell you which release(s) will include it. Here is an example.

PulkoMandy · August 5, 2021, 6:59am

For me it’s very complicated to use this repository which provides a patch that I need to apply to some checkout of LLVM. Also the git history is not that great inside the patch and it’s hard to separate the working and non-working parts. Why isn’t there just a normal git repository? That’s why I set up the one at Viveris which is easier to work with.

If there was a normal git repository and small independant commits, it would be easier to pick and test the ones we need, and track what’s already included in my version. I tried to reuse some of your work, but some parts were not usable (for example the patch to add/remove braces in single-if was generating code that didn’t even compile). And since the commit messages are not so clear in the patch, it’s hard to identify which parts we can still reuse.

owenca · August 5, 2021, 10:00am

As you already know, I started haiku-format as a customized version of clang-format 6.0.1 back in 2018. It was set up in such a way that it could be easily and quickly downloaded and built on Haiku x86_gcc2. Its main target was the users (Haiku devs and other contributors) of haiku-format.

Last year, @korli submitted a PR that ported haiku-format to LLVM 9.0.1 and combined all the changes I had made to clang-format 6.0.1 into a patchset. By the time I got around and merged the PR, the latest LLVM release had advanced to 10.0.1, and you had already merged haiku-format with the LLVM main trunk and filed a number of issues. So I went ahead and ported haiku-format to LLVM 10.0.1 and made further changes to address all the issues that I deemed suitable for haiku-format to handle.

None of the above would have been a problem if you had updated to haiku-format 10.0.1 from last year. Also, with a couple of exceptions, all the clang-format related commits at viv_haiku_format were already implemented in haiku-format 10.0.1. (So it’s not too late to rebase now. Let me know if you need any help.)

An obvious reason for not using the LLVM main trunk to customize clang-format is that it’s a moving target, and mixing haiku-format 6.0.1 (or even 10.0.1) with the current LLVM main trunk can hardly work.

PulkoMandy · August 5, 2021, 12:00pm

That’s not my experience from using it. I have tested it, found various problems in several places, and determined that some commits were not working and had to be removed. I have opened the corresponding tickets on your bugtracker, some of which you fixed by removing the changes yourself, and some you left open, or closed as invalid (which I don’t agree with).

I did the rebase and it was not that difficult at all. I’m a bit ocnfused with how you want to work here, on one side you say we shouldn’t be using the trunk version, and on the other you point us to changes that you merged there directly.

I think it makes sense to track stable releases rather than the random commits in the git tree, but there is no reason to stay locked to an older version of it if newer releases are available.

As for distribution, it seems simpler to me to have the changes in a git repository and then package it using haikuports, rather than the script and patch approach you have in your repository currently.

owenca · August 6, 2021, 11:13pm

I think you are confused with the timeline here. The haiku-format version you tested was based on LLVM 6.0.1 from 2018, and among the issues you created late last year, there was only one haiku-format bug (removing redundant braces) which @korli had already reported early last year and I reverted in haiku-format 10.0.1 late last year. All other issues you reported were either bugs in clang-format itself or additional customization requests (enhancements to haiku-format). Based on my experience working on clang-format upstream and customizing it for other companies for the past few years, I closed some of the issues created by you and others (including myself!) and explained why: they either were dangerous (removing redundant braces) or looked unrealistic (aligning class members). The rest were addressed in haiku-format 10.0.1 as of late last year.

The only reason I referred to that upstream patch (and a later patch that fixes it) was that saloni spent a lot of time fixing that bug and the fix was incomplete and would break unit tests. My suggestion of not working off the constantly changing LLVM main trunk still stands.

Agreed. That’s why I recommended either rebasing to haiku-format 10.0.1 or first bring it up to LLVM 12.0.1. (I will update haiku-format if you want 12.0.1, and apply the relevant patches from the main trunk if you want them.)

Agreed, but before then, the current setup allows easier and faster build of haiku-format, especially after building it for the first time.

PulkoMandy · August 7, 2021, 8:24am

I did the rebasing to LLVM 10 and Korli converted it back into your patch+script format.

Here is my view of the timeline:

You had a version based on llvm 6, that was unmaintained for a while (since august 2018: Commits · owenca/haiku-format · GitHub)
Korli did some fixes to it and ported it to LLVM 9 injanuary 2020. You did nothing with this patch for the whole of 2020
November 2020: I rebased it to llvm 11 and set up a normal git repository so it’s easier to work with, I announced this on the merge request: port to LLVM 9.0.1 by korli · Pull Request #12 · owenca/haiku-format · GitHub and sent an announcement to the haiku mailing list: [haiku-development] haiku-format update - haiku-development - FreeLists
I think at this point it is clear that I was using the latest available version
After that, you started working on rebasing your llvm 6 version to llvm 10 independently, and ignoring my work on rebasing to llvm 11 because it was not possible to build it under Haiku at the time
Later on, I imported the patches needed to fix the Haiku build (apparently sometime around april/may 2021) and we launched the GSoC project from this starting point

As I have already said, the patch+script format makes it very inconvenient to track what’s included and what is not, also because the patch was rebased and commits just removed or merged together.

So based on this, it looks like we need:

Fix #16 by indenting instead of aligning wrapped "for loop"s. · owenca/haiku-format@3387c83 · GitHub which has been re-integrated now in my branch,
Fix #14 by not changing the indent of comments for the code above. · owenca/haiku-format@c1fcbd5 · GitHub for which Saloni re-did the implementation already

In your git repository I don’t see anything else that I could be missing.

That is not the plan. But working on the latest release and rebasing when a new one becomes available, or when there is an important bugfix in trunk that we want to get immediately, sounds reasonable to me.

As long as your repository is in the patch+script format, it is complicated to use for cooperating on the work. Can we just have a normal git repository instead?

I just run “ninja clang-format” in the build directory and it generates new versions of clang-format in a few seconds, rebuilding just the few files that are modified. Your script is not easier or faster for me. And the extra work in generating the patch file when I could just use a sinmple git push is a waste of time and confuses things for everyone. This is why I set up a normal git repository where I can work in a normal way: cherry-pick patches, have multiple branches, use github pull requests if needed, etc.

owenca · August 7, 2021, 7:13pm

Rebasing haiku-format 6.0.1 to LLVM 10 wouldn’t work. There are clang-format (7.0.1 and later) changes that superseded some of the changes in haiku-format 6.

I think you mean unchanged instead of unmaintained. It was unchanged for a while because no new issues were reported during that time, until korli submitted a PR exactly 17 months after my last commit.

In the meantime, I worked on clang-format itself, fixing bugs that impacted haiku-format and landing a new feature required by Haiku’s coding style, which in turn required first fixing several bugs in clang-format. My goal was to minimize the diffs between clang-format and haiku-format.

As I said before, korli’s PR combined all my changes to clang-format 6.0.1 into a patchset and made it buildable with LLVM 9.0.1, but it didn’t fix anything. In fact, the PR would make haiku-format broken because some of the changes I made to clang-format invalidated some of the work in haiku-format. So it wouldn’t be as easy as simply merging the PR and expecting it to work with LLVM 9, and I couldn’t find enough time to sort it all out until some 10 months later. It was not a whole year but close.

I started updating haiku-format to 10.0.1 partially because you created a number of issues for haiku-format 6.0.1 last November. As explained above, it wouldn’t be simply rebasing haiku-format 6 to LLVM 10 (or to LLVM 11 as you did). It would involve reworking some of the haiku-format changes superseded by the clang-format ones. It would also include addressing most of the issues you reported. Even if you had rebased haiku-format to LLVM 10, I still wouldn’t be able to use yours.

My work of updating haiku-format from 6 to 10 was completed as of late last year whereas the version you rebased a month earlier remained outdated. Even worse, it was broken because of the conflicting changes in clang-format brought in from LLVM 11. Whether it could be built on Haiku was not important at that point. Anyway, this GSOC project should have launched from haiku-format 10, not haiku-format 6 merged with LLVM 11 (or 12 or the main trunk).

As I said earlier, the reason for the way the repository was set up is to reduce the burden on the users of haiku-format.

Please refer to the closed issues on GitHub, especially those created by you and saloni. At least #18, #21, and #22 are also among the issues that were closed late last year and saloni was still attempting to re-fix months later.

You should understand by now that unless you rebase to haiku-format 10, you will have conflicting changes from clang-format in LLVM 10 (or later). (Your last commit is just one example.) I strongly recommend that you fork the latest haiku-format and add the changes you and saloni want on top of that.

I can fork LLVM 10 and merge it with haiku-format 10. Meanwhile, can you create a release of haiku-format 10 via HaikuPorter? I don’t want to force people to clone the entire llvm-project if they just want to use haiku-format.

Well, you are the only one who complained, and you didn’t complain all these years until now. As mentioned above, the setup was for the users, not developers, of haiku-format. (I myself doesn’t really use it for development.)

PulkoMandy · August 7, 2021, 7:53pm

I missed the fact that you had made changes to llvm upstream, that was not mentionned anywhere until very recently in this thread. How could I have guessed? So from my point of view it was just some changes disapearing from your patch in the github repo without any explanation.

Again here is the info that I had: port to LLVM 9.0.1 by korli · Pull Request #12 · owenca/haiku-format · GitHub

In this pull request we can see:

Your only reply was “I will look into this sometime next week” then nothing for almost a month
Then there is a comment from me saying that I further rebased the changes on llvm 11. This is before you did any other work, unless you had some things in llvm upstream at the time, but that, I had no way to know.
Then you merged korli’s changes (without any further work)
And after that, you pushed a single commit saying “roll back”: Roll back to db13dc0 equivalent and update to clang-format 10.0.1. · owenca/haiku-format@680f261 · GitHub which just makes it very hard to see what’s changed. Nowhere I can see a mention of work being upstreamed to llvm. So I was quite confused by this, the patch from korli being merged but my version based on llvm 11 being ignored, and then the changes being undone to put in an apparently almost empty patch in its place. Moreover, the patch is just a diff file (not a patchset as haikuports would do, for example), and as a result it is very hard to track what each change is supposed to do. I did my best to understand what was going on and try to follow the changes, but I did not have the needed information to understand what was happening.

So, you were aware of my work on llvm 11 at this point, even before you started rebasing your repo to llvm 10. But you did not tell me anything about what I was missing. Neither in the pull request, nor in the mailing list thread (while you did reply to both, acknowledging that you had solved the issue with building llvm11 there). Why give the warning and the full explanation only months later? If I had known about it earlier, it would have saved time and frustration for everyone

Yes, I can work on that part. Probably after GSoC is over and we decide which part of the work done by Saloni is ready to include.

owenca · August 8, 2021, 12:19am

You didn’t have to guess as I did mention it in this email last year.

Like I said, it turned out that I didn’t have enough time to sort it all out at the time.

I just reviewed our comments on that PR, and the following transpired:
Because you mentioned that you merged your repo with LLVM 11, I did try to target LLVM 11 as well when updating haiku-format 6. As LLVM 11 could not be built on Haiku then, I had no choice but to target LLVM 10 instead.

I didn’t know you wanted your merge of haiku-format 6 with LLVM 11 to be somehow included because you didn’t create a PR. As explained above, I could not target LLVM 11 even if I wanted to. In the end, I couldn’t really use korli’s merge of haiku-format 6 with LLVM 9 either.

I thought it was obvious that you would fork LLVM 10.0.1 and apply the diff file from haiku-format 10 as you had not added anything of substance to either haiku-format 6 or clang-format 10 or 11 by then. However, your repo remained stale for at least a few months while no more real issues were created on mine either.

I didn’t really know what you wanted to do and how you mentored saloni. It appeared to me that you wanted to discard haiku-format 10 and start with a new customized clang-format based on LLVM 11 or the main trunk, perhaps cherry picking haiku-format 10 along the way. Nevertheless, I did give warnings and advice before (see here and here), but they were by and large ignored. Anyway, if you had asked, I would have explained. If I had known what you were up to, I would have warned.

It is what it is now, so let’s stop blaming each other and move forward. If you want my help, just let me know.

PulkoMandy · August 8, 2021, 6:42am

My goal is not to blame anyone, and I think the issue is mainly miscommunication, where all sides are probably responsible in some way. It doesn’t really matter anyway, what I want is fully understand the situation and see how we can better coordinate things. Another problem is lack of engagement from Saloni at the moment, as she communicates with GSoC mentors mostly through private Telegram chats, and blog posts are for a large part on her private blog that no one in Haiku even knows exists. I will discuss this with her. It is a recurring problem in Google Summer of Code that several students choose to discuss things in private with their mentors only, and I am not sure what to do about it.

In the mailing list thread there was a message from nielx explaining how to fix the issue, and you acknwledged that, but then still remained on llvm 10, with no further explanation as to why at the time.

Saloni created several issues on your repository. Some were closed as duplicates (rightfully so, apparently it was because of testing with my broken version as a base), but for others they were closed with this message:

This is not something that should be handled by clang-format/haiku-format. You may have to use clang-format off if you want to keep the indentations.

For example issues #19 and #20. However these are very important to being able to use Haiku-format, since otherwise, it goes completely against the Haiku coding guidelines, and I think turning clang-format off whenever we use these constructs (basically, whenever we declare a class) is not going to be usable. Saloni has made good progress in getting these to work already.

It was not obvious. Now I understand it because I have more information about what happened.

It’s a combination of:

Wanting to use a git repo normally so we can have branches, merge requests, and tracking of the history of commits (I think you can understand now, how not having this in your repository made things confusing for me as it is hard to understand the history of the repository and see what was done and why)
Wanting to use a more up to date version of llvm (but this requires more careful rebasing)
Not being aware of any problems with my rebasing effort (your mention that you were working direclty in upstream clang-format did not allow me to deduce that it would introduce regressions, and no one reported problems to me with my version either)
And when Saloni started creating bug reports against your repository with the main features we wanted to work on, you closed them with a message saying it’s not possible to do it, which also didn’t set a very good starting point for more collaboration.

I will create a branch in my repository including your patch now, and see how to progress from there.

owenca · August 8, 2021, 11:14pm

Agreed.

If you had asked, “Can we target LLVM 11 using the patch from nielx?” or “Why are we still targeting LLVM 10?” I would have explained: as you were aware, that patch had yet to land at LLVM. (In fact, it’s not even in the recently released LLVM 12.0.1.) Again, I thought it was an obvious choice that warranted no explanations if no one cared to ask.

I also pointed it out in Saloni’s first progress report and near the top of this blog. What really puzzled me was that after you created a number of issues for haiku-format 6 and I addressed them in haiku-format 10 last year, the same issues were created and different fixes were attempted in this GSOC project several months later. The only conclusion I could reach is that haiku-format 10 was not even cloned or forked.

You asked about this on GitHub, and I already explained there.

I still don’t quite understand why you didn’t use haiku-format 10 as a starting point, i.e., clone LLVM 10 and apply the diff file from haiku-format 10, which would give you a clean slate to add additional customizations. If you wanted to understand how and why I had done something in haiku-format since 2018, you could always ask. Nevertheless, as mentioned above, I have already agreed to cloning or forking LLVM 10 and applying the diff file for you after you package haiku-format 10 in HaikuPorter.

As a developer, when you update to a version from upstream, you should always run the unit tests at least. Even if I had not done anything to clang-format, changes by other LLVM developers could impact haiku-format in one way or another.

Please read my comments on GitHub again. I never said it would be impossible to do it. Anyway, if you didn’t agree why I closed an issue, you could just ask me to reconsider.

PulkoMandy · August 9, 2021, 7:06am

In my mind the LLVM 10 patch was just a rebase of the LLVM 6 patch to LLVM 10, just the same thing I had already done with LLVM 11, so I saw no reason to go back. I didn’t understand why you went back a version from my work. I was wrong because I missed some of the information. Now I understand,