One scraper hurts but could be handled. The real problem is that most questions to an LLM seem to trigger scraping, so there end up being about as many scraper instances as there are people using LLMs. And since those things are quite popular at the moment…
I seriously doubt the DDoS against my OpenGrok instance (which allows browsing and searching the Haiku source code online) is triggered by user actions.
Why would it use my server, rather than the official Haiku cgit or the GitHub mirror?
Is it realistic that there are so many Haiku developers that they spam hundreds of requests every second?
Looks more like automated indexing that’s out of control.
It’s not directly. You ask an LLM something, and it starts indexing whatever it thinks is related to the subject. Most questions lack precision, so it ends up scraping everything. Another question on the same subject may not trigger more scraping, but the damage is done. Now, consider the number of LLMs and the number of versions each has. I doubt that they share their indexes; in fact, I suspect there is an index per user instance.
That’s not how it works. The high-volume scraping is for collecting training data; essentially, they are feeding the entire internet into their training. They could download a copy and work from there, but they don’t, and so they download a lot of things. Especially in the case of exposed git repositories (through whatever web interface), they will download every source file in every version, which is stupid and inefficient (that’s why we use git, which does it in a better way).
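To put that inefficiency in rough numbers, here is a minimal back-of-the-envelope sketch (my own illustration; the repository and history sizes below are made up, not taken from any real server logs):

```python
# Rough illustration of why scraping a git web frontend file-by-file is so
# wasteful compared to a single clone. The numbers are hypothetical.

num_files = 10_000      # hypothetical: files tracked in the repository
num_revisions = 60_000  # hypothetical: commits in the history

# Naive crawler: one HTTP request per file per revision it discovers
# through the web interface (cgit, OpenGrok, GitHub's file view, ...).
crawler_requests = num_files * num_revisions

# Polite approach: a single `git clone --mirror` transfers the whole
# history as one packfile, after which everything can be read locally.
clone_requests = 1

print(f"naive crawler: ~{crawler_requests:,} HTTP requests")
print(f"git clone:      {clone_requests} transfer, then work offline")
```

The exact figures don’t matter; the point is that scraping a history browser file by file scales multiplicatively with files × revisions, while a single clone transfers the whole history once.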
According to this article, the promised productivity gains aren’t even showing up: Are We in an AI Bubble? - The Atlantic
And my non-controversial opinion is that LLMs, AI, whatever you wanna call it, need to be kept far away from Haiku. I would sooner stop using computers altogether if this becomes unavoidable. We still have books.
“Open”Grok. So, this version is the neckbeard version of the racism and genocide spewing Grok AI? Yuck. You shouldn’t use it. Nobody should. And yes, I am for banning every use case, no matter what. “Train” these on my music and you will get sued.
OpenGrok has nothing to do with the Grok AI. It’s been around for much longer, and is a code search engine.
If your music is publicly available, most likely it has already been used for training, like millions of other songs (and books, webpages, etc). Good luck suing and wasting your money.
Can the experience of decompiling System 7 help guide us?
We may not use an LLM to finish the coding part, but we could analyze the original BeOS binaries without copyright issues.
BeOS compatibility has not been an issue for a very, very long time.
Funnily enough, I’ve been reverse engineering a UEFI Thunderbolt DXE lately and thinking about using an LLM to reduce the mental load and speed up the process.
Though I can only think of two cases where this might be useful for Haiku: reverse engineering the BeOS PowerPC kernel to help with porting Haiku to PowerPC, or reverse engineering some 32-bit-only BeOS apps (like CodeWarrior) to port them to x86-64.
How about closed-source device drivers?
Do you have any specific hardware you need support for? Do you happen to have one of the handful of 25 year old devices that BeOS happened to support, and that Haiku didn’t already rewrite a driver for? Do you also happen to have it set up on a machine that has enough RAM to run Haiku?
If not, then you have to look elsewhere for drivers.
I assume it’s about closed-source Windows/macOS drivers, which can be reverse engineered. There’s little value in old BeOS drivers.
Sorry for reactivating this controversial topic, but seeing an increase in AI spam on this forum, I really want to share a documentary that I saw yesterday.
Having never used the bullshit machine myself, I found it absolutely shocking to see how easy it is to generate nonsense, fake news and low-quality content that other people even pay for (how do you know you’re buying a book full of AI slop before having read it?).
The internet is full of AI slop already, and it’s only becoming worse.
I hope that at least some nice technology communities like this one can be protected from AI slop.
And on what basis can you judge? From a gut feeling?
We can see the crap that people post from it all across the internet.
In any case, even official technical documentation or officially approved documents may not match the current reality on the ground. Obviously, how far they diverge depends on many factors.
And you have no way to know which output is hallucinated garbage and which isn’t.
You have posted hallucinated garbage on this forum about virtual filesystems, for instance. You believed it. Same with a table full of made-up stuff about “Maxtor” (the hard drive company) graphics cards, because whatever slop machine you were using didn’t know the difference.
You will eventually realise the error of thinking that “AI” works. Until then, please stop exposing us to its garbage.