RFC Coding Guidelines: Formatting Class Member Declarations | Haiku Project

This RFC proposes to change the Haiku coding guidelines to change the formatting of class and struct member declarations in class and struct definitions from a Table Class Member Declaration Style, to a Normalized Class Member Declaration Style. The arguments are that (1) the current format has severe limitations which limits the aesthetic value of the current formatting, especially when modern C++ language features are used, and (2) it is not a good use of the time of Haiku’s contributors to modify and maintain custom logic in the haiku-format tool (derived from clang-format). If the proposal is adopted, any new code contributions will have to use the new formatting style, and contributors are required to reformat any struct and class definitions that they modify. There will be an exception for shared headers.


This is a companion discussion topic for the original entry at https://www.haiku-os.org/blog/nielx/2024-02-05_rfc_coding_guidelines_formatting_class_member_declarations
10 Likes

Very well written! Thank you for putting in the time for the blog post.

I didn’t see anything for function attributes, such as [[nodiscard]] virtual bool DoesThingExist();.

Should any comments be added about their use?

1 Like

Like Zakero said, well met!

I might go so far as to say that consistent formatting is the goal. As such, if a file is touched, it must be reformatted to the new rules (provided that the formatting is automatic–not requiring manual intervention–and doesn’t require much extra effort on the part of the volunteers). Parts of a file can be escaped but must include short-ish comments explaining why that part retains the old formatting (to avoid someone in the future scratching her head wondering why it wasn’t reformatted).

In time, much/most of the source should contain the new formatting. Then a committee can decide whether or not to reformat the remaining sources.

Kind of a side issue that 1) relates only to the API include files, 2) doesn’t appear to be a problem here, and 3) has nothing to do with C++ coding – I occasionally entertain myself by working on software that generates API interfaces for other languages, starting with those include files.

I’m pretty sure the proposed format will parse just fine, so no difference there. It helps that the existing files are consistently formatted, but that’s more where there’s white space and where not. It would be great if there are C++ features that would allow you to explicitly denote things like

  • a return (“write”) parameter
  • a parameter passed by address that will be retained by the called function and should not be freed (e.g. BMessage passed to BMenuItem)
  • a function that will run in its own thread like BWindow::Show()
  • (… probably more)

Also …

Indeed, for me WebPositive can’t even operate the horizontal scrollbar that would bring the right side of the text into view.

Hello Niels; Again, thank you for your work on this. I would happy if you want to make this change. I sometimes find the “table layout” quite awkward to work with and would prefer the more compact style.

1 Like

In which website? Did you open a bug report?

To my knowledge both scrollbars work correctly.
Please open a bug if that isn’t the case: https://dev.haiku-os.org

Hello,

Here are my thoughts and opinion on this.

Going back to basics, the reason for the “table” style layout is to provide a clear overview. When you read through code, you can very easily find method names (they are all nicely aligned), which methods are virtual (looking at the modifier) or find a getter for some specific type of object (just looking at the return type).

Indeed, this ease of reading comes at the price of making it harder to write classes this way. Both for humans and for robots like haiku-format, since the rules are so complicated and almost into typesetting, rather than just text formatting.

It is also true that in modern C++, there are a lot more things to put in method declarations: attributes, override and final annotations, and so on.

So, can we find a way to still preserve some of the readability, while doing something simpler? For example, what about this?

class BHandler : public BArchivable
{
public:
	// The constructor doesn't have a return type and so it looks a bit strange. Is this a problem?
		BHandler(const char* name = NULL);
	virtual
		~BHandler();

	// BHandler guts.
	virtual void
		MessageReceived(BMessage* message);
	BLooper*
		Looper() const;
	const char*
		Name() const;

	// Fictional very long member function declaration
	// The parameters on next lines get indented one step further
	status_t
		Launch(const entry_ref* ref, const BMessage* initialMessage = NULL,
			team_id* _appTeam = NULL) const volatile noexcept;

// NOTE: compared to the example in the article, I have separates the private methods
// from the private members. I think this is not part of the changes being discussed here,
// in both the current and new style, these should be kept separated
private:
	friend inline int32
		_get_object_token_(const BHandler*);

private:
	// Do we want newlines also in the fields between the type and the name? or is it not needed?
	typedef BArchivable _inherited;
	friend class BLooper;

	int32 fToken;
	char* fName;
};

This proposal is a lot simpler: for method declarations in classes, the type goes on one line, and the method name goes in the next line, with one extra level of indentation.

It preserves the readability advantage of the previous style, but gives a lot more space for both the return type and the method name. And I think it is easier to implement.

It also solves another concert (raised in previous discussions by x512): now it is possible to display the code with the “wrong” tab size, and still get understandable results.

This is just one quick attempt at solving the problem. Maybe there are other, better ways.

2 Likes

It might make him happy to put the trailing “const volatile noexcept” on its own line, too (and then you’d have to decide whether the final semicolon would always be on its own line, or only when there are items following the parentheses.)

Consider what you’d get out of text searches. If the rules have enough flexibility to allow one-liners where they’re short enough to fit, then you get everything about that item when it matches a search key.

I’m only a small tiny time developer, but I like PulkoMandy’s suggested style much better than the one proposed in the RFC. It’s much easier to read and still does away with the indention issues.

The RFC should definitely be optimized for readability in mind.

What would be the downsides of just globally disabling haiku-format inside classes? i.e., for as long as it doesn’t have an automatic class formatting mechanism, it just ignores everything inside class declarations, and doesn’t touch that?

I think the advantages of the easy-readibility of the current format kind of outweigh the disadvantages of having to manually format classes. If we want to relax the rules a bit and make it optional outside .h files, I would potentially be open to that, but I wouldn’t vote for dropping it altogether.

I think the style proposed above by @PulkoMandy is interesting, but it’s not nearly as compact as the one we currently use, and I’m not sure I really like it too much, either…

I guess we should also solicit feedback from long-time developers (who may remember more about how the tabular style came about): @axeld, etc.

Then we have to explain it manually in code reviews for 3rd party contributors.

In my paid work project, where there is no time for perfect solutions because it’s a commercial thing, we set up a coding style that is acceptable given clang-format constraints. Then we have it integrated in our buildsystem, everyone just runs the “fix-format” script before pushing their changes for review, and there is 0 time spent in code review for style fixes, because whatever the tool decides to do is right, by definition.

This saves a ton of time. Then this time can be used for other code review aspects: reviewing the overall architecture of the code, checking for bugs, making sure the documentation is updated as needed, and so on.

If we have a tool that does only half the work, we lose that, we will still have to review that part manually, and need an extra round of review to get things right. Or several rounds, since there are no clear rules for this and we are unable to implement an automated formatter to do it, and so we can’t expect people who contribute for the first time to get it right by hand, either.

Given the choice of the two options nielx offered, I would take the automated formatting, and I will workaround the lack of readability by other means (for example, syntax highlighting in my text editor). However, if there is a “best of both worlds” option, where we have readable class declarations and a tool to automatically format them, I’ll take that. Even if the style is very different from what we have now. And I’m ready to compromise a little readability for a large improvement in code review flow and process, so I would accept a slightly less good formatting if it can be done by an automated tool.

5 Likes

The big downside is that very troublesome to do all that table formatting by hand. You will forget what you attempted to write.

1 Like

Thanks for reminding me of this! [[nodiscard]] is great and indeed cannot be accounted for at all in the current table-formatting style.

1 Like

Agreed, and like I acknowledged in my article, when the class definition does not have too many frills, it looks great!

An aspect I did not touch on in the article though, is that I also think the development tools have improved a lot since the time the style was introduced. Code editors have gotten a lot better at highlighting. IDEs with code completion can go a long way in helping you find methods you need without having to open the header. Even the code formatting in the article illustrates how color and font choice help with reading the function declaration. Furthermore, we have the Haiku Book which quite literally puts all the methods a table (see BApplication as an example).

If I were to describe the style into plain English, I’d say we were going for the rules that for class member function declarations we (1) always want to break after the return type and (2) indent all lines by one extra tab against the baseline, except for the first line of the declaration. Note that my rule 2 would not indent the constructor and a non-virtual destructor.

For pragmatic reasons, I am looking at the options that are currently in clang-format to see what we can do to come closer to the proposed style. As per the economic argument, I’d rather use something that already exists rather than modifying the clang-format where possible.

For rule 1 (always want to break after the return type), there is the aptly named AlwaysBreakAfterReturnType option. It is currently set to TopLevelDefinitions to get the desired style for function definitions according to our style guides. I think we would then go for the configuration All in order to accomplish this, though the side effect is that this would also apply for function declarations outside of classes and structs.

For rule 2 (extra indention of subsequent lines), I think the one that we should use is IndentWrappedFunctionNames, which is currently set to false.

Calling haiku-format -style="{BasedOnStyle: Haiku, SpaceAfterCStyleCast: false, IndentWrappedFunctionNames: true, AlwaysBreakAfterReturnType: All}" then results in:

class BHandler : public BArchivable
{
public:
		// The constructor doesn't have a return type and so it looks a bit strange. Is this a
		// problem?
	BHandler(const char* name = NULL);
	virtual ~BHandler();

	// BHandler guts.
	virtual void
		MessageReceived(BMessage* message);
	BLooper*
		Looper() const;
	const char*
		Name() const;

	// Fictional very long member function declaration
	// The parameters on next lines get indented one step further
	status_t
		Launch(const entry_ref* ref, const BMessage* initialMessage = NULL,
			team_id* _appTeam = NULL) const volatile noexcept;

	// NOTE: compared to the example in the article, I have separates the private methods
	// from the private members. I think this is not part of the changes being discussed here,
	// in both the current and new style, these should be kept separated
private:
	friend inline int32
		_get_object_token_(const BHandler*);

private:
		// Do we want newlines also in the fields between the type and the name? or is it not
		// needed?
	typedef BArchivable _inherited;
	friend class BLooper;

	int32 fToken;
	char* fName;
};


// Unintended side-effect: also definitions outside class declarations are affeced by
// IndentWrappedFunctionNames
void
	Foo::FooFunction()
{
}

So clang-format can somewhat accommodate these style changes, though it does not support making an exception for a break after a return type for non-member function declarations (which I am not sure is that bad), and it indents the function names also outside of class definitions (which I think is a departure of our current style).

The aim of using a code formatter is to introduce and maintain consistency in a code base, which is ultimately the aim of our coding guidelines as well. My assumption here is that using haiku-format will increase the consistency in the code base, which I think is evidenced by what I am seeing running it over the changes, even if it needs a little bit more tweaking.

If we then exclude class definitions from the scope of haiku-format, we would then exclude a non-trivial part of the code and would be counter to that aim. I am on the line of thinking as PulkoMandy, which is that ultimately the value of the code formatter outweighs dropping style preferences it does not support.

Note that I genuine think there are objective and subjective reasons to change the formatting style of class members, so I am tip-toeing around the discussion on whether automated formatting tools are the goal by and for itself here.

The proposal outlines the rule changes, which includes an opting into the old style for now if that’s preferred (or less bothersome than reformatting a big class). I do use the term opting in deliberately here, because I do want to propose adding the // clang-format off: legacy class member formatting explicitly to code that opts into the legacy style.

1 Like

I don’t think that current table formatting actually makes definitions easier to read. Maybe an opposite. It is a matter of personal taste.

2 Likes

Not sure I agree with that (but also not sure if I understand what you say exactly).

Basically we are balancing several costs and benefits here:

  • The cost of developping a tool to format code exactly as we do it now (which is complex, so high cost)
  • The cost of manually formatting and manually reviewing other people’s change for formatting problems (also relatively high, but more spread out over time)
  • The benefit of an easily readable format where return type, qualifiers, and method names are neatly separated (sure, it is a little subjective)
  • The cost of maintaining two coding styles in parallel and migrating from one to the other, either in one big “reformat everything” over a few weeks, or in a slower and more manual way
  • The cost of change: it is already “paid” by us all spending time in this discussion. If the new formatting is automated, there is not so much time wasted “learning” the new style, we can hopefully soon let haiku-format handle it for us.

I think once the new formatting is decided, we should rather aim to migrate quickly. It seems not so hard to run haiku-format on, say, headers/os, and make one big commit that reformats everything to the new style (or maybe split it into one commit for each subdirectory of headers/os). Then, it is done for a large majority of class declarations in the codebase. We then have a single new style again and we can move on. For me that is better than allowing both the old and new styles to co-exist for a possibly very long time.

1 Like

Global reformatting have an issue that it will invalidate many pending patches in Gerrit.

3 Likes