Running LLMs on Haiku

I get this error when building llama.cpp

/Dati/workspace/llama.cpp/src/llama-mmap.cpp:476:34: error: 'RLIMIT_MEMLOCK' was not declared in this scope; did you mean 'RLIMIT_STACK'?
  476 |         if (suggest && getrlimit(RLIMIT_MEMLOCK, &lock_limit)) {
      |                                  ^~~~~~~~~~~~~~
      |                                  RLIMIT_STACK

Did you need to patch llama.cpp to build?

Here is my patch for the current version of llama.cpp. Also, llama.cpp requires a different allocator to work: with the system allocator it cannot allocate enough memory for the model. mimalloc has worked very well here. To build with mimalloc, pass the path to the mimalloc.o object module like this:
-DCMAKE_EXE_LINKER_FLAGS="path to the mimalloc.o file"

4 Likes

To some extent, it would help if the mods took the liberty of renaming the thread to something like “Running Llama 3 and other generative software”, so we know in advance what topic we are discussing.

Done.
Title changed from AI on haiku? to Running LLMs on Haiku

3 Likes

It is still about AI on Haiku,
with the program Llama 3 specifically, not about LLMs in general, or am I wrong?

Llama is an LLM. I think LLMs are the kind of ‘AI’ being talked about here.

1 Like

Model: DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf

Smooth as a sushi roll,
Built for the user, not the screen,
No crashes, just smooth.

7 Likes

Now it is a totally misleading title!
It is still about AI, so name it “Running AI on Haiku”!

Geez, who cares what the topic is called? It makes no difference to me whether it’s AI, LLMs, llama.cpp or whatever. The substance doesn’t change, and for those who understand what we are talking about there is no difference.

7 Likes

I’m in the process of experimenting with LLMs (in particular, running llama.cpp locally). There is a vast ecosystem of Python libraries and frameworks, but I’m facing some trouble installing them on Haiku.

  1. I’ve managed to install llama-cpp-python with some tricks, so I now can interface with llama-cpp from Python
  2. I’m not able to install ChromaDB, as it requires C++ packages that need to be compiled from source. Much like NumPy, these probably require some patching
  3. OpenAI: this package is huge and requires some Rust artefacts to be compiled from source.
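For anyone curious, interfacing with llama.cpp from Python via llama-cpp-python looks roughly like this. A sketch only: the model path is hypothetical (point it at any GGUF you have downloaded) and the parameters are just reasonable defaults.

```python
# Requires llama-cpp-python and a local GGUF model file;
# the path below is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="/boot/home/models/some-model-Q4_K_M.gguf",
    n_ctx=2048,  # context window size
)
out = llm("Q: What is Haiku? A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])
```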

Speaking of OpenAI, the long build fails while compiling platform-info v2.0.3.
I’m not familiar with Rust; I hope someone more knowledgeable could come to the rescue.
The error is as follows:

Compiling platform-info v2.0.3
                 Running `rustc --crate-name platform_info --edition=2018 /boot/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/platform-info-2.0.3/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C embed-bitcode=no --check-cfg 'cfg(docsrs)' --check-cfg 'cfg(feature, values())' -C metadata=327ab1740f951bbc -C extra-filename=-327ab1740f951bbc --out-dir /boot/system/cache/tmp/pip-install-qmxoluzn/maturin_03aae7d8f90947c7a50cc31a553e8e6a/target/release/deps -C strip=debuginfo -L dependency=/boot/system/cache/tmp/pip-install-qmxoluzn/maturin_03aae7d8f90947c7a50cc31a553e8e6a/target/release/deps --extern libc=/boot/system/cache/tmp/pip-install-qmxoluzn/maturin_03aae7d8f90947c7a50cc31a553e8e6a/target/release/deps/liblibc-edca2e4dcba8f96d.rmeta --cap-lints allow`
            error[E0609]: no field `domainname` on type `utsname`
               --> /boot/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/platform-info-2.0.3/src/platform/unix.rs:130:84
                |
            130 |             debug_struct = debug_struct.field("domainname", &oss_from_cstr(&self.0.domainname));
                |                                                                                    ^^^^^^^^^^ unknown field
                |
                = note: available fields are: `sysname`, `nodename`, `release`, `version`, `machine`
      
            error[E0609]: no field `domainname` on type `utsname`
               --> /boot/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/platform-info-2.0.3/src/platform/unix.rs:165:38
                |
            165 |             equal = equal && (self.0.domainname == other.0.domainname);
                |                                      ^^^^^^^^^^ unknown field
                |
                = note: available fields are: `sysname`, `nodename`, `release`, `version`, `machine`
      
            error[E0609]: no field `domainname` on type `utsname`
               --> /boot/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/platform-info-2.0.3/src/platform/unix.rs:165:60
                |
            165 |             equal = equal && (self.0.domainname == other.0.domainname);
                |                                                            ^^^^^^^^^^ unknown field
                |
                = note: available fields are: `sysname`, `nodename`, `release`, `version`, `machine`
      
            For more information about this error, try `rustc --explain E0609`.
            error: could not compile `platform-info` (lib) due to 3 previous errors

domainname does not seem to be a valid field of the utsname structure.
Digging into the platform-info source code, it seems this field is not defined for Haiku.
Does anyone have advice on how to solve this issue?

PS: OpenAI package is installed with pip install openai

I’m also attempting to run Qdrant instead of Chroma, but again the compilation halts with an error here:

Compiling socket2 v0.4.9
error[E0425]: cannot find value `IP_RECVTOS` in module `sys`
    --> /boot/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/socket2-0.4.9/src/socket.rs:1450:22
     |
1450 |                 sys::IP_RECVTOS,
     |                      ^^^^^^^^^^ not found in `sys`
     |
note: found an item that was configured out
    --> /boot/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/socket2-0.4.9/src/sys/unix.rs:93:22
     |
93   | pub(crate) use libc::IP_RECVTOS;
     |                      ^^^^^^^^^^
note: the item is gated here
    --> /boot/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/socket2-0.4.9/src/sys/unix.rs:82:1
     |
82   | / #[cfg(not(any(
83   | |     target_os = "dragonfly",
84   | |     target_os = "fuchsia",
85   | |     target_os = "illumos",
...    |
91   | |     target_os = "nto",
92   | | )))]
     | |____^

error[E0425]: cannot find value `IP_RECVTOS` in module `sys`
    --> /boot/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/socket2-0.4.9/src/socket.rs:1474:70
     |
1474 |             getsockopt::<c_int>(self.as_raw(), sys::IPPROTO_IP, sys::IP_RECVTOS)
     |                                                                      ^^^^^^^^^^ not found in `sys`
     |
note: found an item that was configured out
    --> /boot/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/socket2-0.4.9/src/sys/unix.rs:93:22
     |
93   | pub(crate) use libc::IP_RECVTOS;
     |                      ^^^^^^^^^^
note: the item is gated here
    --> /boot/home/.cargo/registry/src/index.crates.io-6f17d22bba15001f/socket2-0.4.9/src/sys/unix.rs:82:1
     |
82   | / #[cfg(not(any(
83   | |     target_os = "dragonfly",
84   | |     target_os = "fuchsia",
85   | |     target_os = "illumos",
...    |
91   | |     target_os = "nto",
92   | | )))]
     | |____^

If I understand it correctly, IP_RECVTOS does not seem to be available on Haiku. Do I need to configure the build in some way, or to apply a patch (which I don’t know what it could be, of course)?

Try adding where required

target_os = "haiku"
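Concretely, for the socket2 error above, that means adding Haiku to the cfg gate shown in the compiler note. A fragment, not a full patch; the matching cfg lists at the socket.rs call sites would need the same addition:

```rust
// src/sys/unix.rs: configure IP_RECVTOS out on Haiku too.
#[cfg(not(any(
    target_os = "dragonfly",
    target_os = "fuchsia",
    target_os = "haiku",   // <- added
    target_os = "illumos",
    // ...the other targets from the original list...
    target_os = "nto",
)))]
pub(crate) use libc::IP_RECVTOS;
```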

Good luck on that one (although I think we managed that at one point) :slight_smile:

This one is specifically a pain; we’ve been experimenting with pip/numpy for a long time without success. :frowning:

The problem with numpy is that every package that depends on it tries to download and build it. I’ve managed to work around this with llama-cpp-python but not with FAISS, for example. I have instructed pip to skip the dependencies with --no-deps.
Am I missing something?
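For reference, the workaround looks like this (the package name is just an example; the dependencies must already be installed some other way, e.g. from HaikuPorts):

```shell
# Skip dependency resolution so pip does not try to rebuild numpy.
pip install --no-deps llama-cpp-python

# Afterwards, report any dependencies that are actually missing.
pip check
```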

1 Like

Don’t think I did it with --no-deps, but I could be mistaken; if it works, then +1 (and we should try this more often) :slight_smile:

Ah I missed that post! This is interesting, I was collecting all my findings and the dependencies and actually maturin and pydantic-core are among the ones that blocked me.
Could you explain how you did it? My attempts failed building pydantic and other components written in Rust…
With a collective effort, I wish we could get a minimal AI stack for Haiku.

My personal TODO list:

  • a vector DB/store (Chroma) or a similarity search library (FAISS :white_check_mark:)
  • An agentic framework (llama-cpp-agent :white_check_mark: or even LangChain)
  • StreamLit (optional)

LangChain is a bit overengineered, to be honest, but a great tool nonetheless. With llama-cpp :white_check_mark:, llama-cpp-python :white_check_mark: and OpenAI :white_check_mark: in place and the three tools above we could have a minimal but complete stack to build assistants and agents that run either locally or remotely (with the OpenAI API).
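Since llama.cpp’s bundled server exposes an OpenAI-compatible API, the same client code can target either a local model or the hosted API just by changing the endpoint. A sketch only; the URL, port and model name are placeholders for however you run llama-server:

```python
# Sketch: assumes a llama.cpp server running locally, e.g.
#   llama-server -m model.gguf --port 8080
# Swap base_url/api_key to use the real OpenAI API instead.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="unused")
resp = client.chat.completions.create(
    model="local-model",  # placeholder; llama-server serves one model
    messages=[{"role": "user", "content": "Hello from Haiku!"}],
)
print(resp.choices[0].message.content)
```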

My goal is to try to build an assistant for Genio which is optimized for coding tasks, tooling, function calling and RAG.

EDIT: using a remote LLM may require some kind of data masking but it looks like the least of the problems, here.

The patch to numpy looks pretty simple, did you guys try to get it merged upstream? We may be able to get it with pip and build the wheel locally, I presume.

I’ll poke @BiPolar (OscarL) when he’s around, if he can’t, I could try to see, or maybe you could upstream it if needed? OscarL is our python man :smiley: (me grins)

I’ve successfully built FAISS and its Python bindings!

The runtime (approx. 20 s), using OpenBLAS, is in line with the documentation. If we ever get the Intel Math Kernel Library, FAISS would get much faster, or so they say.

I’ve also check-marked the components of the stack that build/install, at least.

2 Likes