To cut down on HTTP scrapers. We get a lot of random bots scraping our repos when we leave directory indexes enabled, which eats up bandwidth that could be better spent syncing to mirrors via rsync.
Hm. If there is a good use case I don’t want to block it from functioning either. We have around 250 GiB of haikuports packages that bad bots were scraping in a loop, driving up our bandwidth usage.
Do you have some details on the use cases @Diver ?
I’d occasionally grab older packages by navigating Haiku Depot Server, since HaikuDepot does not have a facility to list or retrieve older versions. Software ships with bugs, so going back a couple of versions isn’t such rare user behaviour. Also, SoftwareUpdater is all or nothing; we still cannot manually select what to update unless we use the Terminal-based pkgman.
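For reference, the pkgman route looks like this (the package name here is just an example):

```shell
# Update only the named package instead of everything;
# with no arguments, pkgman update behaves like SoftwareUpdater.
pkgman update vim
```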
Nah. I can try adding it, but the user agents at the time were identifying as Internet Explorer. Bad bots don’t care about robots.txt.
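For completeness, the suggestion amounts to something like this at the repo root (well-behaved crawlers honor it; as noted, bad bots ignore it entirely):

```
# robots.txt — ask crawlers to stay out of the package indexes
User-agent: *
Disallow: /
```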
So, we really need better tools for managing repos and packages (something like aptly for Haiku comes to mind).
My server-side wish list includes:

- de-duplication (though the benefit might be limited)
- package version tracking
- supporting cold object storage for old versions of software
- haikuporter buildmaster using S3 for storage
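On the de-duplication point, here is a minimal sketch of what content-addressed storage could look like. `DedupStore` and `content_key` are invented names for illustration; real tooling would hash full package payloads with SHA-256 rather than Rust’s std hasher, which is used here only to keep the sketch dependency-free:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Derive a content key from the raw bytes. A real implementation
// would use a cryptographic hash (e.g. SHA-256) instead.
fn content_key(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

struct DedupStore {
    blobs: HashMap<u64, Vec<u8>>, // content key -> stored bytes
    index: HashMap<String, u64>,  // package name/version -> content key
}

impl DedupStore {
    fn new() -> Self {
        DedupStore { blobs: HashMap::new(), index: HashMap::new() }
    }

    // Store a package; returns true if identical content was already
    // present, i.e. the payload was de-duplicated.
    fn put(&mut self, name: &str, data: &[u8]) -> bool {
        let key = content_key(data);
        let duplicate = self.blobs.contains_key(&key);
        self.blobs.entry(key).or_insert_with(|| data.to_vec());
        self.index.insert(name.to_string(), key);
        duplicate
    }
}
```

Two package entries with identical payloads would then share a single stored blob, which is exactly the win (however limited) de-duplication offers for rebuilt-but-unchanged packages.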
This one almost needs a complete rewrite around haikuporter buildmaster. The code is heavily dependent on local files. I think if we could develop something to manage / groom repositories via REST API calls, then haikuporter buildmaster could interface with it directly.
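To illustrate the kind of interface I mean, here is a rough in-memory sketch. `RepoService` and its methods are invented names, and a real service would expose the same operations as REST endpoints (e.g. something like `PUT /repos/{repo}/packages`) rather than direct calls:

```rust
use std::collections::HashMap;

// Hypothetical repository-management interface that buildmaster could
// talk to instead of manipulating local files directly.
#[derive(Default)]
struct RepoService {
    // repo name -> package name -> published versions, oldest first
    repos: HashMap<String, HashMap<String, Vec<String>>>,
}

impl RepoService {
    // Record a newly built package version in a repository.
    fn publish(&mut self, repo: &str, name: &str, version: &str) {
        self.repos
            .entry(repo.to_string())
            .or_default()
            .entry(name.to_string())
            .or_default()
            .push(version.to_string());
    }

    // List every known version of a package (supports version tracking
    // and deciding what to push to cold storage).
    fn versions(&self, repo: &str, name: &str) -> Vec<String> {
        self.repos
            .get(repo)
            .and_then(|r| r.get(name))
            .cloned()
            .unwrap_or_default()
    }
}
```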
Several of the above depend on native support for our hpkgs and hpkrs. I can write various server-side tooling in Rust to do all this stuff… but I have to get Alexander von Gluck IV / hpkg-rs · GitLab functioning before any of it happens.
If anyone knows Rust and wants to help out parsing hpkg and hpkr files… PRs welcome. Our only native code for managing hpkg / hpkr files is HaikuDepot (Java) and the package kit.
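As a starting point for anyone curious: package files begin with the ASCII magic "hpkg" and repository files with "hpkr" (per the Haiku package format docs). A trivial Rust sketch that only identifies the file type; everything after the magic (header version, heap sections, compression) still needs real parsing, which is the actual work in hpkg-rs:

```rust
// Identify a Haiku package-kit file by its leading magic bytes.
// Returns None for anything that is neither an hpkg nor an hpkr.
fn identify(bytes: &[u8]) -> Option<&'static str> {
    if bytes.starts_with(b"hpkg") {
        Some("package (hpkg)")
    } else if bytes.starts_with(b"hpkr") {
        Some("repository (hpkr)")
    } else {
        None
    }
}
```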