
@Thireus (Contributor) commented Nov 17, 2025

First, thank you for maintaining this project — it has been very useful, and I appreciate the work that has gone into it.

I initially created a fork to add automated Windows builds for my own use, since I needed ready-to-use binaries. Since this is functionality that could benefit other users as well, I’m submitting this pull request so the Windows build workflow can live directly in the main repository instead of in my fork.

This PR includes:

  1. A GitHub Actions workflow for building the project on Windows.
  2. Automatic artifact uploads so users can download ready-to-use Windows builds.
  3. Other small tweaks to ensure the code compiles on Windows.

Builds must be manually triggered (I believe they could be automated to run after each commit, but I have not had the chance to dig into that). There is also some cleanup to do, in particular around other automated jobs that run checks which are not meaningful for this repository. The original code was taken from mainline llama.cpp and tweaked to adapt it to ik_llama.cpp.

My goal was to make the project more accessible to Windows users, specifically those who do not have the time to set up a development environment or the knowledge to do so on Windows, myself included. If you’d prefer changes to structure, naming, workflow triggers, or anything else, I’m happy to adjust the PR accordingly.

Thanks again for the project and for taking the time to review this!

inline __m512i operator^(const __m512i& a, const __m512i& b) { return _mm512_xor_si512(a, b); }

// AVX2 integer bitwise operators
inline __m256i operator|(const __m256i& a, const __m256i& b) { return _mm256_or_si256(a, b); }
@ikawrakow (Owner)

Not necessary.

inline __m256i operator^(const __m256i& a, const __m256i& b) { return _mm256_xor_si256(a, b); }

// NEON
#ifdef __ARM_NEON
@ikawrakow (Owner)

Not necessary. The |, &, ^ operators ended up by mistake in the AVX512 version, but are not used in any of the other implementations.
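
For context, here is a minimal, self-contained sketch (illustration only, not the actual ik_llama.cpp code) of what these overloads enable: with the operator defined, SIMD code can write a ^ b; without it, the call site invokes the intrinsic directly, which is what the AVX2 and NEON implementations already do, making the extra copies dead code.

#include <immintrin.h>

// Overload as kept in the AVX512 path: lets vector code write a ^ b
// instead of spelling out the intrinsic at every call site.
inline __m512i operator^(const __m512i& a, const __m512i& b) { return _mm512_xor_si512(a, b); }

// With the overload in scope:
inline __m512i mix512(__m512i a, __m512i b) { return a ^ b; }

// Without an overload, the intrinsic is called directly, the style the
// AVX2 and NEON implementations already use; duplicating the operators
// there adds nothing.
inline __m256i mix256(__m256i a, __m256i b) { return _mm256_xor_si256(a, b); }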

@ikawrakow (Owner)

So, I specifically threw out all of llama.cpp's CI and GitHub Actions. And now we will have them all back, almost all failing?

@Thireus (Contributor, Author) commented Nov 17, 2025

Thank you for the comments.

If the goal is wider adoption, then removing CI and builds works against it. The builds I produce have a few dozen users (and other projects, e.g. Jan: janhq/jan#6917, are actively asking for ready-to-download ik_llama.cpp artifacts).

The demand is there. But without CI and artifact generation, only power users can realistically adopt ik_llama.cpp, which is unfortunate: users with lower-end consumer hardware, the ones who benefit most from ik_llama.cpp’s optimisations, are in my opinion the least likely to build from source. I suppose the same reasoning explains why users stick with Windows even though it is clearly not the best OS for running LLMs.

From my perspective, facilitating integration with other frameworks and lowering the barrier to use will naturally boost adoption.

@ikawrakow (Owner)

I can see the benefit of producing ready builds. But if so, then only for the platforms that are actually supported. To me it looks like this has been copied over from llama.cpp, so there is SYCL, RISC-V, and the kitchen sink.

llama.cpp has put far more effort into making it work on as many platforms as possible. Their back-ends are loaded dynamically, which makes it easier to provide build artifacts. None of this is true here. I would have thought that you would do a few Windows configurations (CPU-only, CPU+CUDA, etc.), and that was that. A lot of the llama.cpp stuff simply does not apply here anymore.
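
The dynamic-loading point can be illustrated with a short, hypothetical sketch; the DLL name and the backend_init symbol below are made up for illustration and are not llama.cpp's actual ggml API. A single generic executable probes for an optional backend library at runtime, so one artifact can serve both CUDA and CPU-only machines:

#include <cstdio>
#ifdef _WIN32
#include <windows.h>

// Hypothetical entry point exported by an optional backend DLL.
typedef int (*backend_init_fn)(void);

int main() {
    // Probe for a CUDA backend shipped alongside the executable.
    HMODULE h = LoadLibraryA("backend_cuda.dll"); // made-up file name
    if (h == nullptr) {
        std::printf("CUDA backend not found, using CPU backend\n");
        return 0;
    }
    backend_init_fn init =
        reinterpret_cast<backend_init_fn>(GetProcAddress(h, "backend_init")); // made-up symbol
    if (init != nullptr) init();
    FreeLibrary(h);
    return 0;
}
#else
int main() { std::printf("This sketch targets Windows (LoadLibrary).\n"); return 0; }
#endif

Because a missing backend fails gracefully at load time rather than at link time, one prebuilt binary covers several hardware configurations; with statically linked backends, each configuration (CPU-only, CPU+CUDA, etc.) needs its own artifact, which is the situation described above.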

@Thireus marked this pull request as draft on November 17, 2025, 16:41
@Thireus (Contributor, Author) commented Nov 17, 2025

I did what I could with the knowledge and resources I had. I had to start from something that worked and hack my way into making it function for ik_llama.cpp. I acknowledge that this CI is definitely “dirty” code and much of it could be thrown away.

I can’t afford to spend the amount of time required for full DevOps and cross-platform support right now. That’s why everything except Windows CUDA is disabled. I tried multiple times to build without CUDA and for other OSs, but I couldn’t get it to work.

This setup certainly requires cleaning and isn’t ready to merge if the goal is fully clean, production-quality code. It was intended more as a starting point.

@Kreatifchk commented, quoting the previous reply:
> I did what I could with the knowledge and resources I had. […] It was intended more as a starting point.

Do you have a CPU-only version? I didn’t find one.

@Thireus (Contributor, Author) commented Nov 24, 2025

Unfortunately not. I tried that a while ago and spent countless hours on it, but it fails to build without CUDA.
