On The Merits Of LLMs

humdinger · May 30, 2026, 12:58pm

I’m not quite following…
If it’s decided people should include their original text when they use translation services, we know the true origin of the text, because that original text is the origin. If they don’t provide the original, we most probably won’t know they’ve used a translation service. Which leaves all of us (the poster included) blind to potential mistranslations that could be caught if the original language text were provided.

This is of course with the caveat provided by X512:

This is a good point I hadn’t yet considered.
Personally, I’m not that concerned about people using translation services, LLM or otherwise. People are capable of producing bad English all on their own, or of using a bad translation from some service. Their fellow forumites may choose to ignore their posts, if they tend to be gibberish or misleading.
Providing the original language text (if available) may help in those cases. It may even help the original poster when they try to make sense of what their translation service coughed up when they read it after a year or two…

phoudoin · May 30, 2026, 1:12pm

Isn’t that exactly what is happening here in this forum: open discussion with everyone?

Contrary to someone’s name-calling (gulag, fascism, censorship), the topic is HIGHLY debated, openly.

But for open discussion to happen, some rules must still be respected, and sometimes enforced: everyone should have space to express himself, which means it should be he himself who expresses his thoughts, not a machine on his behalf. And insulting people who disagree with you can’t be tolerated.

Considering the number of posts on the AI topic in this forum, even though it is a very hot topic, I fail to see how one could claim that the discussion is forbidden, censored, or blocked.

It’s just that, as with everything, there is no absolute freedom, and therefore no absolute freedom to do whatever one wants in order to defend one’s opinion. The rights of people who don’t share that opinion must also be protected. Stating that people who think software whose code was generated by AI must be identified as such are like those who forced others to wear a yellow star is, for example, the kind of violation that open discussion can’t allow. Calling people who don’t share your opinion Nazis was off limits, way off.

Open discussion is no problem, but beware: the Godwin point is never far away in any heated discussion.

Zardshard · May 30, 2026, 1:39pm

I like translations A and D the best. I can’t judge their similarity to the source text because I don’t know Russian, but they both sound more natural than B and C.

I will point out that if the community decides to ban LLMs for translation, I would appreciate it if you didn’t use them, even if it is virtually impossible to tell.

suhr · May 30, 2026, 1:42pm

People angrily express their most extreme opinions, but then they calm down and a decent discussion may start. But after a while there’s a new thread and the pattern continues.

It seems like Andrea loves to start heated discussions of this sort.

Akakor · May 30, 2026, 1:56pm

Nothing prevents you, or anyone else, from replying…

mohammedrattia · May 30, 2026, 2:10pm

I think that means translations can be wrong, so using multiple translations can help in understanding the meaning better. On the other hand, when I used Google Translate, it gave me an incorrect translation, which mostly means that LLMs translate better than traditional translators (at least in this respect, because LLMs sometimes change the meaning or hallucinate when they are merely supposed to translate).

In my opinion, people should be accountable for what they write, whether it is a translation or something they wrote themselves. “I didn’t mean to write that” should not be an acceptable excuse. If someone wants to use this excuse, he can choose to include the original text in the message.

However, I believe the problem never was with translation but with generation. Many people just don’t write anymore and use AI to write for them, not to translate for them. Frankly, I feel frustrated about that because it feels like I’m not talking to a human. But as we know, LLMs can also generate native text! I don’t know if this problem has a solution anyway.

PulkoMandy · May 30, 2026, 2:38pm

One of the developers is expressing his opinions (which indeed seem to be pretty extreme).

So far, nothing is decided, and we won’t decide things just because of one developer. If we really can’t agree, there will be a vote, but I think it is in everyone’s interest that we have a discussion and see if we can make some compromises.

Indeed, my position is evolving quite a bit as I read arguments here. That’s what a discussion is about.

There is no “the administration” and there is no one forcing their thoughts here. Merely expressing them.

What can ever be objective? The rules are defined by the community in the forum. That’s what we’re doing now in this very thread.

I wrote or assembled all of the “Haiku internals” documentation at “Welcome to Haiku internals’s documentation!” (Haiku internals documentation). Is it complete? No, certainly not. Does it contain all my forum posts? Fortunately not.

But the time I spend answering questions on the forum also gives me a good hint of which parts of the documentation I should prioritize writing or updating. When I answer the same question for the second or third time, it’s a pretty good indication that something deserves to be documented.

And it also works the other way around: when someone asks a question, sometimes the question itself is a hint that they are going in a “wrong” or at least unexpected direction. Sometimes, taking a step back and suggesting a completely different way to achieve what they’re trying to do is the right answer. Or sometimes it turns out they already tried the obvious approach and failed for a very good reason, then I learn something from their experience.

In general, talking to each other is how we build consensus. And that’s why I spend so much time on the forum, reading everything and answering a lot of things. For me, this is what preserves the unity and coherence of Haiku, as a single project where people have roughly aligned goals. Some people see it as me trying to push my own goals, but that’s not it. I also understand where other people want to go, and incorporate that into my own vision.

I think having both the original and the translation makes things worse. You will have a set of people who understand the original language and read the original, another set of people who can read only the translation, and a smaller set who can read both. Now they all have a different understanding of what the original poster meant (and that’s assuming good faith; the original poster could even intentionally use different text in the original and the translation).

I would say it’s better if everyone works from the same text. If there is a misunderstanding, people will notice it. The original poster can say “I did not mean that, I used the wrong word” (it happens to me occasionally when I write English, even when not using translation tools).

If there should be any rule about this, let’s focus on the quality of the output rather than how it was written. People should make themselves understandable and check what they write. They can use machine translation if they don’t have any better way.

Mis-translations can happen (even when writing text manually). When it is pointed out, people can indeed say “sorry, I didn’t mean to write that”, and either clarify in a new reply, or edit the original post if it was really offensive.

Problems start when people refuse to do that, and double down on what they said. But this can also happen without any translation. The language barrier is not the only thing that can make it difficult, sometimes impossible, to communicate. But, in most cases, I don’t think we will reach that point.

mohammedrattia · May 30, 2026, 2:49pm

Well, you have a point. Maybe the context made me think of a very specific scenario where people just write carelessly and then use a random excuse to get out of the consequences. But generally, yes, excuses are normal. However, in most cases, an excuse without a proper apology (if the reply caused harm) may be useless. That’s why you added “sorry, …”, I think.

waddlesplash · May 30, 2026, 3:28pm

Actually they seem to hallucinate wildly even for very simple, but slightly grammatically incorrect, statements. For example:

sabcha1 · May 30, 2026, 4:00pm

I feel it is necessary to speak up.

Since we’ve previously discussed the fact that there is linguistic inequality within the community (because some people were born in the U.S. or the U.K. and others were not), and since the forum generally requires everyone to speak English, I feel it is necessary to say that we cannot accept this state of affairs.

By what right should Augustin, for example, have an advantage in conversations simply because he was born somewhere that someone else wasn’t?

That’s unfair.

I believe we need to switch to a neutral, non-national language that everyone can speak, but one that is also easy to learn.

How fortunate that such a language already exists, and not just in theory, but in the form of a substantial body of texts, dictionaries, and resources - and this language is called Esperanto.

So yes, I think our forum (and the community as a whole) should become Esperanto-speaking.

And this isn’t just for the sake of communication; it’s also an excellent way to fight against the authorities and corporations that manipulate the public and our lives by exploiting our divisions, including linguistic ones.

Zardshard · May 30, 2026, 4:16pm

Yes, it probably would be better for Esperanto to be the lingua franca over English. It would be fairer for everybody.

But I’m not sure it’s a good idea for us, since I think it would significantly raise the barrier to entry for newcomers to Haiku. Worst case, we get virtually zero. Or maybe the Esperanto-speaking community will hear of us and join en masse.

us3r1d · May 30, 2026, 4:32pm

Nope, not reliably.

Research has shown pretty clearly that humans are not particularly good at identifying LLM-generated text, especially since the various “tells” for it (like lots of em-dashes) got popularized. When researchers put their sample sets online with the papers, I usually run through them, and on the last one I got exactly 50%.

And software that tries to identify LLM-output does even worse.

So I may as well flip a coin.

I really don’t get this argument that rules that can’t be perfectly enforced have no value, but then utilitarianism never makes sense to me.

The point of having rules at all is:

they’re the right thing to do or
when people disagree on what’s right, they establish a way for folks who disagree to coexist.

Being able to perfectly identify violations isn’t even a consideration for me.

nipos · May 30, 2026, 5:20pm

I’ve read the name Esperanto a few times so far (assuming it to be the native language of some small country) but your description made me have a look at it for the first time.
I really like the idea of having a universal language that works for everyone and is easy to learn.
Ideally, the whole world would speak in the same universal language so that everyone can directly communicate with anyone else.

Unfortunately, only very few people speak Esperanto, so switching the community to it would put up a pretty high barrier to entry.
Even if the language is so easy that everyone can learn it, it means learning thousands of new words, which will take a while, and not everyone is willing to invest so much time to participate in an operating system community.
I think that English is a pretty good common ground because millions of people understand it (even if not as a native language, it’s a mandatory subject in school in many countries) and it is rather easy to learn compared to most other languages.

nephele · May 30, 2026, 5:27pm

When going by the “what many people know” category, Spanish and Chinese would be better candidates…

Esperanto is interesting, and easy to learn for Westerners, but it’s not quite as universal as you might expect: it has many Germanic- or Latin-derived root words, but not that many from languages that fall outside those families. It is also easier to learn primarily because of its grammar.

Also, the claim that “very few” people speak it is quite wrong. In the past it was even considered for use in the UN.

For the most part this forum uses English… because we have used English in the past, and the cost of migration towards anything else would be quite large, not for any practical reasons that would make English a good choice.
I would appreciate more people using the International section, and if people want to learn Esperanto to use it there, feel free to.

nipos · May 30, 2026, 5:50pm

Chinese and Spanish are number one and two when it comes to native speakers, but English is by far the most common language when counting both native and non-native speakers.
Also, I think that the language being easy to learn is rather important, in addition to the number of speakers.
English is quite easy to learn, while I’d say that Chinese might be one of the most difficult languages for someone who has never read or written these characters instead of letters.

“Very few” depends on your definition of many and few, but Wikipedia says there may be between 100 000 and a few million Esperanto speakers.
Compared to 1493 million people who speak English (both native and non-native), I’d define that as rather few.

KapiX · May 30, 2026, 6:45pm

Creating laws/rules that cannot be enforced is problematic, because if people can violate one rule with impunity, then other rules will increasingly get ignored too. When you enforce one rule but not the other (because you can’t), you make it easy to question the whole rulebook or your behavior in each of these cases. Essentially, you make your job of upholding the rules harder.

nephele · May 30, 2026, 7:02pm

The thing is, they are totally enforceable. Just not fully, similar to how you can’t fully enforce the “be agreeable” rule: not everything will seem agreeable to everyone.

anon11892322 · May 30, 2026, 7:17pm

@phoudoin Fair point on Godwin, taken. I’ve already withdrawn the “yellow star” analogy and apologised for it earlier in this thread, and I won’t defend it. You’re right that comparisons of that kind don’t belong in an open discussion, regardless of intent.

You’re also right that this thread itself is evidence that the conversation isn’t being silenced. I don’t think it is.

What I’d still want to say, separately from anything about me, is the underlying concern: over the past months several people have left this community, some quietly, some after stating their reasons. That worry isn’t about who is right or wrong in any single thread; it’s about whether the way we disagree is sustainable. I think it’s worth looking at honestly, without it becoming yet another argument about one person.

Summary

@phoudoin Justa punkto pri Godwin, akceptite. Mi jam retiris la analogion pri la “flava stelo” kaj pardonpetis pro tio pli frue en ĉi tiu fadeno, kaj mi ne defendos ĝin. Vi pravas, ke tiaj komparoj ne lokas en malferma diskuto, sendepende de la intenco.

Vi ankaŭ pravas, ke ĉi tiu fadeno mem estas pruvo, ke la konversacio ne estas silentigata. Mi ne pensas, ke ĝi estas.

Kion mi ankoraŭ volas diri, aparte de io ajn pri mi, estas la subesta zorgo: dum la pasintaj monatoj pluraj homoj forlasis ĉi tiun komunumon, iuj kviete, iuj post klarigo de siaj kialoj. Tiu zorgo ne temas pri tio, kiu pravas aŭ malpravas en iu ajn unuopa fadeno; ĝi temas pri tio, ĉu la maniero, kiel ni malkonsentas, estas daŭrigebla. Mi opinias, ke indas rigardi tion honeste, sen ke ĝi fariĝu ankoraŭ unu argumento pri unu persono.

us3r1d · May 30, 2026, 7:17pm

You’re not wrong, but you’re also kinda talking about a different thing.

No one here is actually saying a rule against LLM-generated content cannot be enforced; they’re complaining that violations would be difficult to identify with certainty. And that’s a different thing.

This makes perfect enforcement impossible, but it also negates the argument that violations would erode trust in rules as a concept.

I don’t personally care if people want to use translators, or even LLM-based translators. I think if they do that they should want to post their source text too, because that lets moderators base decisions on what they meant rather than what the translation says. But if they want to YOLO the translation and risk getting banned over the machine making a mistake, then I personally don’t mind.

But that isn’t what the current FAQ says:

Don’t post content generated by large language models (“LLMs”) or similar tools.

You could argue over what counts as “similar”, but LLM-backed systems are explicitly listed as forbidden at the moment.

KapiX · May 30, 2026, 7:29pm

I feel you’re comparing apples to oranges here. “Be agreeable” is subjective (unless you have an objective metric in mind?), whereas using an LLM to translate (which is what I was talking about, not generally about posting LLM content) is a matter of fact. Either someone used it or they didn’t.

If you’re going to miss 80% of occurrences of that fact, then the rule doesn’t make sense, for practical reasons. If you can’t tell the difference, then why do you have a problem with it?