Dpkg-haiku. Debian package management for Haiku (working port)

I was abstaining from this discussion because I expressed my thoughts on this topic in the past. The deb package format's control file can specify much more fine-grained categories of packages that a given package interoperates with:

  • Depends - a hard dependency on a specific package
  • Conflicts - a hard incompatibility with a specific package
  • Provides - declares that this package supplies the functionality of another (possibly virtual) package
  • Suggests - packages that are nice to have together with this package
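
In a debian/control file these relationships look roughly like this (the package and version names are invented for the illustration):

    Package: foo-viewer
    Version: 1.2-1
    Depends: libfoo1 (>= 1.2)
    Conflicts: foo-viewer-legacy
    Provides: foo-handler
    Suggests: fonts-noto-cjk
    Description: Example stanza showing the four relationship fields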

Here are real use cases for all of them:

  1. Package A needs package B to be installed in order to work properly. In this case, when installing A, package B is automatically installed too. But this is only half of the story. In most cases, package B is a library the end user has no idea about. So package A is marked as installed, while package B is marked as installed as needed by A. If the user removes package A, then package B will be removed as well (in case no other package depends on it).
  2. Package X implements the very same functionality (and executables) as package Y, e.g. two different versions of the same library (Python 2.7 and Python 3.2, as an example). If the package maintainer decides the user only needs either X or Y (to not mix up different versions of the same library during linking), then package X may be declared as conflicting with Y, and likewise package Y as conflicting with X. In this case, when X is installed, Y is removed automatically after user confirmation, and when Y is installed, X is removed automatically.
  3. We can use meta-packages, which don't contain any files but only specify the functionality they are going to provide. This kind of virtual package always depends on some implementation package. Suppose we have several desktop environments (KDE, GNOME, Xfce, etc.). To work properly, they all need some display manager. The display manager is independent of the DE, and there are plenty of them (KDM, GDM, XDM, LightDM). So each DE package depends on the virtual package DM, and each *DM provides DM (see the sketch after this list). The default implementation of the DM package may be any of them, say LightDM. So when the user installs the KDE package, the package manager also pre-selects the DM package with its default implementation, LightDM. When the user installs a different DM provider, say KDM, then LightDM is no longer necessary and the package manager suggests removing it as no longer used. When the user removes KDE, it also removes the DM and KDM packages.
  4. Suggested packages are not hard dependencies but packages that are installed alongside a given package in most cases, yet can be removed independently. Font packages are a good illustration of this use case. When, for package A, the suggested package is B, then by selecting A for installation the package manager also suggests package B, but the user is free not to select it, or to select some other package C with similar functionality. For example, the CJK ranges are well covered by the Code2000 font and also by the Noto fonts. If the user doesn't speak any of these languages, he has no idea which font is better. Still, he appreciates that the Asian sites he visits show something meaningful. However, he prefers to have as little space occupied by CJK fonts as possible. So while installing a DE, the package manager also suggests installing the Noto fonts (which are nicer than Code2000); the user can choose Code2000 instead (which takes less space). When the user removes package A, the package manager also informs him about the packages that were installed as suggested (B) and gives the option to uninstall them too.
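
The virtual-package mechanism of use case 3, sketched as control stanzas (the names follow the thread; Debian's real-world equivalent of DM is the x-display-manager virtual package). Note that dm itself never exists as a package on disk; it exists only because other packages declare that they provide it:

    Package: kde-desktop
    Depends: dm

    Package: lightdm
    Provides: dm

    Package: kdm
    Provides: dm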
1 Like

We have all of this as well. Our dependency resolution is based on libsolv, which is also used by openSUSE. They are just as flexible as Debian in that regard, and so are we.

When I said Haiku required a CJK font, I meant Haiku requires a CJK font. Not suggests; requires. Can I make this clearer?

The font is also part of the Haiku look and feel. We want to provide the same polished experience to all users; therefore, a choice of a default font that fits everyone is important. It's not just about rendering CJK characters in some way, it's also about rendering them with a typography that matches the other fonts rendered throughout the system. It's about being able, as an application developer, to know exactly which Unicode symbols are covered, and to be sure that you can display them (in some apps I also use symbols from these fonts, as icon labels, for example).

You are right that we need to implement "automatically installed" packages and the associated automatic cleanup, but that's completely unrelated to the discussion here.

2 Likes

BTW, libsolv has SOLVER_CLEANDEPS:

The solver will try to also erase all packages dragged in through dependencies when erasing the package. This needs SOLVER_USERINSTALLED jobs to maximize user satisfaction.

libsolv/doc/libsolv-bindings.txt at master · openSUSE/libsolv · GitHub. Maybe we could utilize that?
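
For concreteness, here is roughly how that flag would be passed through libsolv's C API; a minimal sketch, assuming the pool and its installed repository are already set up, not a drop-in for any existing pkgman code:

    #include <solv/pool.h>
    #include <solv/queue.h>
    #include <solv/solver.h>

    /* Erase a package by name, also erasing the dependencies it
       dragged in (SOLVER_CLEANDEPS). The pool is assumed to already
       have its installed repository populated. */
    void erase_with_cleandeps(Pool *pool, const char *name)
    {
        Queue job;
        queue_init(&job);

        /* SOLVER_USERINSTALLED jobs marking the user-chosen packages
           would be pushed here as well, per the documentation above */
        queue_push2(&job,
                    SOLVER_ERASE | SOLVER_SOLVABLE_NAME | SOLVER_CLEANDEPS,
                    pool_str2id(pool, name, 1));

        Solver *solver = solver_create(pool);
        if (solver_solve(solver, &job) == 0) {
            /* no problems reported: fetch and apply the transaction here */
        }
        solver_free(solver);
        queue_free(&job);
    }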

1 Like

It is indeed very clear. My point is that a CJK font should be a virtual package (taking no space) with different implementing packages: Noto, Code2000, etc.

Because Unicode is the standard, the same Unicode character (as a Unicode number) will represent the same glyph (as a graphic representation of a letter / hieroglyph) regardless of the font covering the given range (in our case the CJK ranges), as long as the font is Unicode-compliant (which nowadays is the case). So I as an application developer am only interested in providing the Unicode texts as UTF-8 (or UTF-16BE / UTF-16LE / etc.) bytes. The font engine, together with the font file, creates the glyph representation from the characters, and the layout manager computes how to place these glyph images so they look sane. I as an application developer should not rely on a specific font (face and size) if I actually write something pretending to be portable.

The apt-* utilities also work:

[Screenshot: apt_haiku_1]

2 Likes

Except when it doesn't: Han unification - Wikipedia

1 Like

You could much more effectively argue from Chinese usage of non-Unicode encodings, such as Big5. Unicode as a standard is changing and extending, mostly with respect to the CJK range(s). Still, the standard covers all these gaps.

"In the formulation of Unicode, an attempt was made to unify these variants by considering them different glyphs representing the same "grapheme", or orthographic unit, hence, "Han unification", with the resulting character repertoire sometimes contracted to Unihan."

This means that:

  • Present-day software / fonts do not make use of Han unification, and the Noto fonts with the HarfBuzz font engine are no exception;
  • When / if Han unification becomes part of the Unicode standard, all fonts and font engines that are Unicode-compliant will support it. Most probably, the Noto fonts and the HarfBuzz font engine will do so.

You can already do this. We need to streamline the process and not force users to start by copying the whole deskbar symlinks directory, but it is already more than possible; I think BeSly has an app for it.

Adding something to a blacklist is as fast as adding one path to a file. Deleting or renaming buggy drivers is as fast as typing rm with that path. Is the difference really non-negligible?
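
For reference, a blacklist entry is literally just a few lines in /system/settings/packages; the syntax, as documented at the time, looks like this (the driver path is only an example):

    Package haiku {
        EntryBlacklist {
            add-ons/kernel/drivers/bin/usb_davicom
        }
    }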

The non-packaged/lib directory already has precedence over /system/lib. So you can already replace libbe.so by throwing one into /system/non-packaged/lib and then starting applications; or rebooting to get everything using your new copy. No write access to the package area is needed for this.

Nope, not on Haiku it can't, because / does not exist as a real path. Any files placed there will disappear on the next reboot, if I recall correctly.

And as PulkoMandy already mentioned, HPKG files can be, and are, created and extracted on Linux; we do this as part of the cross-compilation process. And I don't know what you are referring to by "a lot of utilities that work with it" - what are we missing for HPKG here?

  • You cannot (trivially, i.e. from the bootloader) revert to previous states with deb
  • Manipulating the installed packages from a non-running install or an entirely separate OS is not possible with deb
  • Update transactions are non-atomic in deb, i.e. your system will be unbootable in the case of a power failure / kernel crash during an update
  • If the "installed files" database gets corrupted, you will never be able to install anything again; it is not "un-corruptible"
  • It is integrated with the system as a whole. Haiku is designed as a unit; removing part of that unit removes a large part of the cohesiveness.

Why don't you do it? (Or does BeSly already have a utility here?)

The truth is that blacklisting should be (and is) a rare occurrence. The number of drivers that actually crash the system is reaching all-time lows. If you still have some, please report them, and we will fix them so you don't need to blacklist them anymore. And as I mentioned above, you don't need blacklisting just to replace libbe.so or other such things.

This is an issue completely unrelated to the package manager, so let's not mix it up with this discussion.

The problem is that changing how people do things is usually a non-starter... MS fails at every attempt they make at doing this, and only succeeds when they find a non-invasive way to do what they want. Technical details of why doing something a new way is better than the old way are almost never enough to persuade users to transition/update.

The current implementation of packagefs also uses negative logic ("non-packaged"), which is also typically a non-starter for a good UI.

The way packagefs is currently designed leaves the user feeling less in control of their system and more at the behest of the packager, even though it's merely a UI distinction.

macOS and Linux do things very differently than Windows, and Linux is in the process of changing from ifconfig to ip, among other things. So, I think Haiku is "allowed" to do the same.

Yeah, Linux changes every 3 years, and it sucks.

Windows has like 5 different ways to do the same thing, and none of them work in all situations or provide all features, and it sucks; the latest de facto interface is absolutely, completely broken... set a static IP, it does nothing; set it back to DHCP, nothing... it's just completely broken. You have to dig through the Control Panel, get the "old" Network and Sharing Center back up, and then access the network adapter settings dialog from there... which gets you back to a dialog that has been around since probably Windows 95...

macOS honestly just mostly follows FreeBSD's lead on the low-level stuff, but they have a UI like Haiku does, and as long as there is one complete UI that you follow the same path to get to forever, most people won't complain, as this is a minor change, an extension of what they already know.

Unicode uses Han unification. These characters use the same codepoint, yet depending on the language context, one has to render them differently. The Noto fonts provide all the variants, and we have to integrate something to make the font rendering engine aware of the language so that it can pick the correct one.

Han unification has been there since the early days of Unicode (when it was thought a 16-bit encoding would be enough); it's not a new thing, it's a problem that dates back to the 90s. There is no when/if here.

This means we can't just say "here is a UTF-8 string, here is a font, please do the rendering" and expect the result to be correct. We need some metadata. And if we can't be sure there is one single font that covers all the languages (as Noto does), we can't properly mix text in different languages without it looking super ugly, because the glyphs will be picked from different fonts with some mix and match. We don't want this, so we provide and require one font that can cover this need.
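
To make the "we need some metadata" point concrete: with a shaping engine like HarfBuzz, the same UTF-8 bytes can shape to different glyphs depending on the language attached to the buffer (via the OpenType 'locl' feature). A minimal sketch, with error handling omitted and the hb_font_t assumed to be created from a CJK-capable font elsewhere:

    #include <hb.h>

    /* Shape the same bytes with a caller-chosen language tag; with a
       font like Noto Sans CJK, the selected glyphs can differ for
       unified Han codepoints depending on whether bcp47 is e.g.
       "ja" or "zh-Hant". */
    static void shape_with_language(hb_font_t *font, const char *utf8,
                                    const char *bcp47)
    {
        hb_buffer_t *buf = hb_buffer_create();
        hb_buffer_add_utf8(buf, utf8, -1, 0, -1);
        hb_buffer_set_language(buf, hb_language_from_string(bcp47, -1));
        hb_buffer_guess_segment_properties(buf); /* fills script/direction */
        hb_shape(font, buf, NULL, 0);
        /* inspect the result via hb_buffer_get_glyph_infos(buf, ...) */
        hb_buffer_destroy(buf);
    }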

1 Like

Maybe we could get rid of packages, directories, everything... and just have all of Haiku's components and applications compiled as one giant static binary. It would solve all the problems, reduce storage requirements, and boost performance by 119.3%. :smiley:

1 Like

I realize you are trying to make a joke here, but, uh, I actually don't see what the joke is, considering none of these things are true.

That was the joke, my attempt at comic relief & nerd humour.

1 Like

Yeah, it creates as many problems as it solves, but sometimes you can't have one solution to rule them all...

My point is, we can just say "here is a UTF-8 string, here is a font, please do the rendering" when the following conditions are met:

  • The font is of one of these types: TrueType, OpenType, Type 1;
  • The font has coverage for requested Unicode ranges;
  • The font engine knows how to work with these font formats.

More than that, even if the text is provided in some non-Unicode but still standard encoding (KOI8-*, CP125x, ISO-8859-x, Big5, etc.), there is the libiconv library, which knows how to interpret the encoded text and translate from one encoding to another.
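
And indeed, the conversion side is a solved problem; a minimal POSIX/libiconv sketch converting Big5 bytes to UTF-8 (buffer handling simplified, the function name is just for illustration):

    #include <iconv.h>
    #include <string.h>

    /* Convert a Big5-encoded string to UTF-8.
       Returns the number of bytes written, or -1 on error. */
    static long big5_to_utf8(const char *in, char *out, size_t outsize)
    {
        iconv_t cd = iconv_open("UTF-8", "BIG5");
        if (cd == (iconv_t)-1)
            return -1;

        char *inp = (char *)in;
        size_t inleft = strlen(in);
        char *outp = out;
        size_t outleft = outsize;

        size_t r = iconv(cd, &inp, &inleft, &outp, &outleft);
        iconv_close(cd);
        return r == (size_t)-1 ? -1 : (long)(outsize - outleft);
    }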

As for metadata, there are several points:

  • The encoding used for some text is usually not part of the text, but rather of the locale settings. This always leads to problems when some text is encoded differently from the user's locale. An example is MP3 tags using non-ASCII characters, prepared under Windows and used under any other OS, including Haiku.
  • The language of the text is also provided by the locale settings. But having recognized the right encoding, the language is completely deducible from the character codes (I mean for non-Latin scripts).
  • CJK fonts usually cover glyphs for all the Asian languages (Chinese, Japanese and Korean, including traditional and simplified for Chinese; Kanji, Hiragana and Katakana for Japanese; and Hangeul with Hanja for Korean) and usually use the same glyph for the same ideogram even if it has different codes in different ranges (this is often true also for European scripts, say Latin 'A', Cyrillic 'А' and Greek Alpha). All the aforementioned font formats allow multiple glyph mapping.

This means that all the metadata necessary for rendering is either in the text or in the font file.

Noto is not a single font; it is split into many font files which don't even share the font face name. Usually the creators of the glyphs for different Unicode ranges are different people, and it is not so much a matter of style as a matter of taste (i.e. a super ugly look to somebody). I often prepare texts using many different non-Latin alphabets and usually use different fonts for different languages, even if some font covers them all.

And finally, everything you said and can say about the Noto fonts is also true of the Code2000 font, which covers almost each and every writing system in just 8.5 MB.

Now, returning to packaging the right font. I fully understand that the Haiku philosophy is to provide by default fonts for as many writing systems on Earth as possible, and to make it look polished. Still, Haiku allows the user to choose custom fonts for UI elements. If somebody chooses a non-default font for CJK texts (I would install and use the SimSun font for this purpose, despite the fact that it is not free), there is no possibility to free up the space occupied by the Noto fonts.

What I have been talking about until now is:

  • Make the base Haiku system depend on a CJK font virtual package, which itself has zero size.
  • Make the Noto fonts the default provider of the CJK font meta-package, which ensures they will be installed with Haiku by default, but also allows uninstalling them given an alternative provider of the CJK font meta-package.

Somebody may want to create a Code2000 font package which also provides CJK font (see the sketch below). So by installing this small package, the user will be able to uninstall the Noto font package and still align with the Haiku UI guidelines regarding CJK fonts. Somebody else (I think @s40in will) may prepare for himself a dummy CJK font package which declares itself as a provider of the CJK font meta-package, but actually doesn't care about CJK languages and texts.
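
In HPKG terms, such a provider could be declared in the package's .PackageInfo roughly like this; the cjk_font name is hypothetical, the version is illustrative, and the mandatory descriptive fields (summary, description, packager, ...) are omitted, since only the provides list matters for the proposal:

    name            code2000
    version         1.171-1
    architecture    any

    provides {
        code2000 = 1.171
        cjk_font
    }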

But this is exactly the point: you cannot, from a given text string, determine with 100% accuracy which glyph mapping to use. There is no "Chinese Traditional Control Character" in Unicode yet; you must know something besides the string and the font to pick the right glyphs.

1 Like

We probably speak different languages, because I fail to understand how the second quote conflicts with the first one. When I say "multiple glyph mapping", I mean the same image for different Unicode codes, and not vice versa. For example, in the following 3 strings: "Alexander", "Александр" and "Αληξανδρος", the first letter is 3 different characters: Latin Capital A (U+0041), Cyrillic Capital A (U+0410) and Greek Capital Alpha (U+0391). Still, many fonts use the same image for all 3 codes, and historically these letters have the same origin. That is, the same glyph is at least triple-mapped: U+0041, U+0391 and U+0410. When a human reads the rendered text, he receives the glyph information, which is identical. However, when one copies these texts to the clipboard, he copies in most cases the UTF-8 encoded Unicode bytes, and this works because the same font contains information about both the character code and the glyph image. And all Unicode fonts work the same in this regard.
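
This mapping claim can at least be probed directly with FreeType: if a font really maps several codepoints to one glyph, FT_Get_Char_Index returns the same glyph index for all of them. A small sketch (error handling omitted; the font path is just an example, and many fonts will in fact return three different indices here):

    #include <ft2build.h>
    #include FT_FREETYPE_H
    #include <stdio.h>

    int main(void)
    {
        FT_Library lib;
        FT_Face face;
        FT_Init_FreeType(&lib);
        /* example path; any Unicode TTF/OTF will do */
        FT_New_Face(lib, "/system/data/fonts/ttfonts/NotoSans-Regular.ttf",
                    0, &face);

        /* Latin A, Greek Alpha, Cyrillic A */
        printf("U+0041 -> glyph %u\n", FT_Get_Char_Index(face, 0x0041));
        printf("U+0391 -> glyph %u\n", FT_Get_Char_Index(face, 0x0391));
        printf("U+0410 -> glyph %u\n", FT_Get_Char_Index(face, 0x0410));

        FT_Done_Face(face);
        FT_Done_FreeType(lib);
        return 0;
    }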

This is never necessary. When the same glyph is multiply mapped, it is so because all its mappings look identical, and even if your eyes look at the wrong code, your brain receives the right glyph image. Still, this kind of problem does not arise when somebody prepares a localization file for some application, because he prepares certain character codes, not certain glyph images. The problem may arise in OCR software, which tries to recognize typed text (i.e. it receives a glyph image and tries to determine its character code, which may not be unique).

In this case, what is the Haiku way to provide these characters? Are they in the Private Use Area? What happens when the Noto font creators decide to change the glyphs in that area (e.g. when a Chinese Traditional Control Character becomes part of a future Unicode standard)? Should the application translations also be rewritten? When some application relies on user-defined glyphs in some font, it usually ships this font with the application and does not use the system font.

Here is the kind of text I often prepare:


It uses the Latin, Greek, Hebrew and Syriac scripts, and each language uses its own font: LinLibertine for Latin, KadmosU for Greek, SILEOT for Hebrew and SyrCOMNisibin for Syriac - all serif fonts (2 TTF and 2 OTF). This kind of font is expected in professional-looking books, even multilingual ones, because these fonts are easy to read. Now I tried to use the unified Noto fonts for all of them and immediately ran into a problem: the full Noto package does not have any Syriac serif font. So I changed to a sans font. After some playing I ended up with Noto Sans for Latin and Greek, Noto Sans Hebrew for Hebrew (Noto Sans does not cover the Hebrew area) and Noto Sans Syriac Estrangela for Syriac (Noto Sans does not cover the Syriac area). And here is the result:

This may be uniform and good-looking on a smartphone screen, but the polytonic Greek and the Syriac look extremely ugly, while the vocalized Hebrew looks newspaperish, not as it usually appears in books. Same for the Latin.