PHP and AI: What Actually Exists at the Language Level
PHP powers roughly three-quarters of the websites whose server-side language can be detected, and yet whenever AI comes up in conversation, PHP is the friend who wasn’t invited to the party. The received wisdom is that AI belongs to Python and that PHP developers are limited to politely calling someone else’s API. That story is out of date. Over the past two years a quiet but surprisingly substantial ecosystem has grown up around PHP, made of native extensions, Composer packages, model formats and data tooling that bring AI capabilities straight into the runtime. This post is a tour of what genuinely exists today, starting with a relic and ending somewhere close to the cutting edge.
A Starting Point: FANN
The tale of PHP and neural networks begins earlier than most people expect. The fann PECL extension, a PHP wrapper around the Fast Artificial Neural Network C library, has sat in the official PHP manual for years. It gives you a full API for creating, training and running multilayer neural networks without ever leaving PHP, and its functions map almost one to one onto the FANN C API.
These days FANN is best understood as a museum piece with the lights still on. The C library underneath it has barely moved in a decade, and the extension wears that age openly. Still, it deserves a mention, because it makes a quietly important point: PHP has had a compiled, in-process neural network interface for longer than most of us remember. Everything that follows rests on the same basic bargain. Native code does the hard maths, PHP holds the steering wheel. The difference now is simply what we’re driving towards.
Training in Pure PHP: Rubix ML
If you want to build and train machine learning models entirely in PHP, without summoning a C compiler from the depths, Rubix ML is the grown-up choice. It has passed 1.15 million Composer installs and ships more than forty supervised and unsupervised algorithms, covering the whole life cycle from loading data to preprocessing, training, cross-validation and production inference.
Rubix ML happily handles classification, regression, clustering, anomaly detection and a respectable amount of natural language work, with its rubix/tensor dependency doing the matrix mathematics underneath. It needs no native extensions and runs wherever PHP runs, which is most places.
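To give a feel for what classical machine learning in plain PHP looks like, here is a minimal sketch of a k-nearest-neighbours classifier. It is emphatically not Rubix ML's API, just an illustration of the kind of algorithm the library packages up properly. The function name knnPredict and the sample data are invented for the example.

```php
<?php
declare(strict_types=1);

/**
 * Toy k-nearest-neighbours classifier in plain PHP.
 * Illustration only: this is NOT Rubix ML's API.
 *
 * @param list<list<float>> $samples training feature vectors
 * @param list<string>      $labels  one label per training sample
 * @param list<float>       $query   feature vector to classify
 */
function knnPredict(array $samples, array $labels, array $query, int $k = 3): string
{
    $distances = [];
    foreach ($samples as $i => $sample) {
        $sum = 0.0;
        foreach ($sample as $j => $value) {
            $sum += ($value - $query[$j]) ** 2;
        }
        $distances[$i] = sqrt($sum); // Euclidean distance to the query
    }

    asort($distances); // nearest first, sample indices preserved as keys
    $nearest = array_slice(array_keys($distances), 0, $k);

    // Majority vote among the labels of the k nearest samples.
    $votes = array_count_values(array_map(fn (int $i): string => $labels[$i], $nearest));
    arsort($votes);

    return (string) array_key_first($votes);
}

// Hypothetical data: [height_cm, weight_kg] per animal.
$samples = [[20.0, 4.0], [22.0, 5.0], [25.0, 6.0], [60.0, 30.0], [65.0, 32.0], [70.0, 35.0]];
$labels  = ['cat', 'cat', 'cat', 'dog', 'dog', 'dog'];

echo knnPredict($samples, $labels, [23.0, 5.5]), PHP_EOL;
```

A real library adds everything this sketch leaves out: dataset objects, feature scaling, validation and persistence.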
The catch, stated honestly, is scope. Rubix ML is excellent at classical machine learning such as decision trees, support vector machines and k-means, but it is not a deep learning framework and was never trying to be. It will not run a transformer or anything resembling a modern large language model. For that, the ecosystem heads off in a different direction entirely.
Running Pre-Trained Models: ONNX in PHP
ONNX, the Open Neural Network Exchange, is the open standard for describing a model independently of whatever framework gave birth to it. Train something in PyTorch or TensorFlow, export it to ONNX, and it should run anywhere ONNX Runtime is installed, provided the runtime supports the operators the model uses. As of March 2026 the format sits at version 1.21 and is hosted under the Linux Foundation, through its LF AI & Data arm, which is the software equivalent of having a sensible adult cosign your loan.
PHP gets to ONNX by two roads.
The first is FFI, the Foreign Function Interface that has shipped with PHP since 7.4. The package ankane/onnxruntime, developed in the onnxruntime-php repository, offers a low-level wrapper that hews closely to the Python ONNX Runtime API, complete with an inference session class and fine control over execution. A smaller package, veka-server/onnx-php, takes the same FFI route with a gentler API. Both want the native ONNX Runtime binary present on the system and the FFI extension switched on.
The second road is the more ambitious one, and it leads to TransformersPHP, published as codewithkyrian/transformers. The goal here is bold: to be the PHP counterpart of Hugging Face’s Python Transformers library. It downloads ONNX-format model weights from the Hugging Face Hub the first time you use them, hands you a pipeline API that feels almost identical to the Python original, and quietly takes care of tokenization, preprocessing and post-processing. Sentiment analysis, text classification, named entity recognition, embeddings, image classification and more all sit behind a single pipeline() call. ONNX Runtime does the work via FFI, and most PyTorch or TensorFlow models on the Hub can be converted to ONNX for the occasion using Hugging Face Optimum.
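The post-processing step is worth a moment, because it is the part a pipeline hides. A classification model returns raw scores (logits), and something has to turn them into the tidy label and probability you see at the end. Here is a minimal sketch of that step, a numerically stable softmax in plain PHP. The labels and scores are made up, and this is not TransformersPHP's internal code.

```php
<?php
declare(strict_types=1);

/**
 * Numerically stable softmax: turns raw scores into probabilities.
 *
 * @param array<string, float> $logits label => raw model score
 * @return array<string, float> label => probability, most likely first
 */
function softmax(array $logits): array
{
    $max = max($logits); // subtracting the max keeps exp() from overflowing
    $exp = array_map(fn (float $x): float => exp($x - $max), $logits);
    $sum = array_sum($exp);

    $probs = array_map(fn (float $e): float => $e / $sum, $exp);
    arsort($probs); // most likely label first, keys preserved

    return $probs;
}

// Hypothetical logits for a two-class sentiment model.
$scores = softmax(['NEGATIVE' => -1.2, 'POSITIVE' => 3.4]);
$label  = array_key_first($scores);

printf("%s (%.3f)\n", $label, $scores[$label]);
```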
For those who like to live closer to the metal, there is krakjoe/ort, known to its friends as ext-ort. This is a compiled C extension that provides tensor mathematics directly inside the PHP process, with SIMD-accelerated operations, multi-core parallelism, nine data types and zero-copy tensor slicing, plus optional ONNX Runtime integration bolted on top. The clever part is that the maths and the inference are independent, so you can use it purely as a fast tensor library and ignore ONNX altogether. It also introduces persistent tensors that survive a PHP request cycle and can be shared between threads, which matters a great deal in long-running runtimes such as FrankenPHP.
Running LLMs Directly in PHP: GGUF and Native Extensions
Here is the development that genuinely raises an eyebrow. PHP now has extensions that can load and run a large language model inside the PHP process itself. No Python sidecar, no HTTP call to a local server humming away in another window. Just PHP and a model file.
What makes this possible is the GGUF format, short for GGML Universal File. The llama.cpp project introduced it in August 2023 to keep both the model weights and all the metadata together in a single binary file. It supports several quantization levels, which is the polite way of saying it shrinks enormous models down until they fit on hardware you can actually afford. By 2026 there are tens of thousands of GGUF checkpoints on Hugging Face covering Llama, Mistral, Qwen, DeepSeek, Gemma, Phi and most of the rest of the alphabet. GGUF is the inference-friendly cousin of SafeTensors, the format most models are published in on the Hub and from which they are converted before they run locally.
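Quantization sounds more mysterious than it is. The core idea is to store each block of weights as small integers plus one shared scale factor, and to accept a little rounding error in exchange for a much smaller file. Here is a deliberately simplified sketch in plain PHP. The real GGUF quantization types use their own packed binary layouts and block sizes, so treat the function names and the eight sample weights as illustrative assumptions only.

```php
<?php
declare(strict_types=1);

/**
 * Quantize one block of float weights to signed 8-bit integers plus a scale.
 * Simplified illustration; real GGUF quantization types pack data differently.
 *
 * @param list<float> $block
 * @return array{scale: float, values: list<int>}
 */
function quantizeBlock(array $block): array
{
    $maxAbs = max(array_map('abs', $block));
    $scale  = $maxAbs > 0.0 ? $maxAbs / 127.0 : 1.0; // largest weight maps to +/-127

    $values = array_map(fn (float $w): int => (int) round($w / $scale), $block);

    return ['scale' => $scale, 'values' => $values];
}

/**
 * @param array{scale: float, values: list<int>} $quantized
 * @return list<float> approximate reconstruction of the original weights
 */
function dequantizeBlock(array $quantized): array
{
    return array_map(fn (int $q): float => $q * $quantized['scale'], $quantized['values']);
}

$weights  = [0.12, -0.5, 0.33, 0.0, -0.07, 0.49, 0.25, -0.31]; // hypothetical weights
$restored = dequantizeBlock(quantizeBlock($weights));

foreach ($weights as $i => $w) {
    printf("%+.4f -> %+.4f\n", $w, $restored[$i]);
}
```

Each restored weight lands within half a scale step of the original, which is the whole bargain: a quarter of the storage of 32-bit floats for a small, bounded error.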
The standout here is rlerdorf/ext-llama, written by Rasmus Lerdorf, the man who created PHP in the first place and has apparently decided to keep an eye on where it’s going. It is a compiled C extension wrapping llama.cpp, and its API is refreshingly tidy. A model class loads a .gguf file, a context class runs the inference. It offloads layers to the GPU on request, memory-maps the model so that PHP-FPM workers share the weights instead of each hoarding their own copy, loads and hot-swaps LoRA adapters, tokenizes and detokenizes, and reads GGUF metadata. A companion FFI-based package, helgesverre/local-llm-php, reaches the same destination without compilation, trading a little speed for a much easier install.
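Because everything lives in one file, even plain PHP can peek at a GGUF model before any extension gets involved. The sketch below reads only the fixed-size header: the four-byte magic string, the format version, and the counts of tensors and metadata entries. It assumes GGUF version 2 or later, a little-endian file and 64-bit PHP. The function name readGgufHeader is invented, and this is not how ext-llama exposes metadata. The demo writes a tiny synthetic header rather than assuming you have a model file to hand.

```php
<?php
declare(strict_types=1);

/**
 * Read the fixed-size header at the start of a GGUF file.
 * Assumes GGUF v2+ (64-bit counts), little-endian data and 64-bit PHP.
 *
 * @return array{version: int, tensors: int, metadata: int}
 */
function readGgufHeader(string $path): array
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path}");
    }
    $header = fread($handle, 24); // 4 magic + 4 version + 8 tensor count + 8 metadata count
    fclose($handle);

    if ($header === false || strlen($header) < 24 || substr($header, 0, 4) !== 'GGUF') {
        throw new RuntimeException("{$path} does not look like a GGUF file");
    }

    // V = unsigned 32-bit little-endian, P = unsigned 64-bit little-endian.
    $fields = unpack('Vversion/Ptensors/Pmetadata', substr($header, 4));
    if ($fields === false) {
        throw new RuntimeException("Could not decode the header of {$path}");
    }

    return $fields;
}

// Demo with a synthetic header (made-up counts), not a real model.
$demo = tempnam(sys_get_temp_dir(), 'gguf');
file_put_contents($demo, 'GGUF' . pack('VPP', 3, 291, 24));
print_r(readGgufHeader($demo));
unlink($demo);
```

The metadata key-value pairs and tensor descriptions follow the header, and parsing those is where a proper extension earns its keep.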
Then there is displace/ext-infer, which arrives at the same problem from the Rust direction. It is built on ext-php-rs and the llama-cpp-2 bindings, targets PHP 8.3 and up, and is thread-safe by design so that it behaves itself under ZTS PHP and the parallel extension. Its prompt builder is fluent and reads like English rather than a soup of special tokens. It also knows about reasoning models, so it will neatly separate a model’s visible answer from the thinking it muttered to itself on the way there.
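To make the reasoning-model point concrete, here is a rough sketch of the separation step in plain PHP. It assumes the model wraps its reasoning in <think> and </think> tags, which is a common convention but not a universal one, and it says nothing about how ext-infer actually implements this. The function name splitReasoning is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Split raw model output into hidden reasoning and the visible answer.
 * Assumes the reasoning is wrapped in <think>...</think> tags.
 *
 * @return array{thinking: string, answer: string}
 */
function splitReasoning(string $raw): array
{
    // No reasoning block found: the whole output is the answer.
    if (preg_match('/<think>(.*?)<\/think>/s', $raw, $match) !== 1) {
        return ['thinking' => '', 'answer' => trim($raw)];
    }

    // Remove the first reasoning block; whatever remains is the answer.
    $answer = preg_replace('/<think>.*?<\/think>/s', '', $raw, 1) ?? '';

    return ['thinking' => trim($match[1]), 'answer' => trim($answer)];
}

// Hypothetical raw output from a reasoning model.
$raw = "<think>The user wants 2 + 2. That is 4.</think>\nThe answer is 4.";
$out = splitReasoning($raw);

echo 'Thinking: ', $out['thinking'], PHP_EOL;
echo 'Answer:   ', $out['answer'], PHP_EOL;
```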
Both of these mean the same remarkable thing. The model lives in the process. For command-line workers, background jobs and persistent runtimes, that translates to inference with no network round-trip and no extra process to babysit.
Dataset Formats: Parquet and Apache Arrow
AI workflows are hungry for structured data, and the default format for large datasets, including most of what lives on Hugging Face, is Apache Parquet. Parquet stores data in columns rather than rows: values of the same type sit next to one another, so they compress well, and a reader can pull out only the columns it needs instead of scanning every row.
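Why columns compress well is easier to see than to describe. The sketch below pivots a few rows into columns and then run-length encodes one of them, which is a greatly simplified cousin of the encodings Parquet applies. The row data and function names are invented for the example, and none of this reflects the API of any PHP Parquet package.

```php
<?php
declare(strict_types=1);

/** Pivot a list of associative rows into one array per column. */
function toColumns(array $rows): array
{
    $columns = [];
    foreach ($rows as $row) {
        foreach ($row as $name => $value) {
            $columns[$name][] = $value;
        }
    }

    return $columns;
}

/** Run-length encode a column as [value, count] pairs. */
function runLengthEncode(array $column): array
{
    $runs = [];
    foreach ($column as $value) {
        $last = count($runs) - 1;
        if ($last >= 0 && $runs[$last][0] === $value) {
            $runs[$last][1]++;      // same value as the previous run: extend it
        } else {
            $runs[] = [$value, 1];  // new value: start a new run
        }
    }

    return $runs;
}

// Hypothetical rows, as a row-oriented format would store them.
$rows = [
    ['lang' => 'php',  'stars' => 10],
    ['lang' => 'php',  'stars' => 12],
    ['lang' => 'php',  'stars' => 7],
    ['lang' => 'rust', 'stars' => 3],
];

$columns = toColumns($rows);
echo json_encode(runLengthEncode($columns['lang'])), PHP_EOL; // [["php",3],["rust",1]]
```

Stored row by row, the repeated "php" values are interleaved with star counts and never line up. Stored as a column, four values collapse into two runs.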
PHP has two serious Parquet implementations. The package codename/parquet is a pure PHP, Thrift-based reader with a memory-efficient iterator that walks through data one page at a time rather than swallowing the whole file at once. The package flow-php/parquet, part of the larger Flow PHP data framework, is more ambitious. It uses a pluggable engine system: a pure PHP engine that works everywhere, and an Arrow engine powered by ext-arrow, a native Rust-based PHP extension that brings Apache Arrow support with every compression codec built in. The default engine is adaptive and politely picks Arrow if it is installed, falling back to PHP if it isn’t.
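The adaptive fallback is a pattern worth borrowing whenever a native extension is optional. A minimal sketch, under loudly stated assumptions: the extension name "arrow" and the engine labels are guesses made for illustration, and this is not Flow PHP's actual selection code. The probe is injectable so the logic can be exercised without the extension installed.

```php
<?php
declare(strict_types=1);

/**
 * Choose a Parquet engine: prefer the native one, fall back to pure PHP.
 * The extension name "arrow" and the engine labels are assumptions.
 *
 * @param callable|null $isLoaded probe taking an extension name, injectable for testing
 */
function pickParquetEngine(?callable $isLoaded = null): string
{
    $isLoaded ??= 'extension_loaded'; // default to PHP's built-in check

    return $isLoaded('arrow') ? 'arrow' : 'php';
}

echo 'Selected engine: ', pickParquetEngine(), PHP_EOL;
```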
Apache Arrow itself, the language-agnostic columnar memory format that sits beneath fast Parquet reading, has no official PHP library of its own, but ext-arrow fills that gap for the Flow PHP world. Arrow earns its place in an AI article because it is the lingua franca between data tools such as DuckDB, Pandas, Polars and Spark and the pipelines that feed models.
Tokenizers in PHP
Every interaction with a language model begins with tokenization, the unglamorous step where raw text is chopped into subword tokens, turned into numbers and handed to the model. Get it wrong and you miscount your context window, chunk documents in the wrong places and discover your prompt has quietly overflowed without telling anyone.
The package codewithkyrian/tokenizers handles this in pure PHP, with no FFI and no extensions to compile. It implements the same tokenizers that drive Hugging Face models, covering Byte-Pair Encoding as used by GPT-2 and Llama, WordPiece as used by BERT, and the Unigram approach behind SentencePiece models. It has been checked against BERT, GPT-2, Llama, Gemma, Qwen and others. If you need an accurate token count in PHP before firing a prompt at an API or a local model, this is the real tool rather than the back-of-an-envelope estimate most code reaches for.
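For the curious, the heart of Byte-Pair Encoding fits in a few dozen lines. The sketch below applies an ordered list of merge rules to a single word, always merging the highest-priority adjacent pair first. It is a toy: the three merge rules are invented, real tokenizers load tens of thousands of them from a model's files and add pre-tokenization and byte-level handling, and none of this is the API of codewithkyrian/tokenizers.

```php
<?php
declare(strict_types=1);

/**
 * Apply byte-pair-encoding merges to a single word.
 *
 * @param list<array{string, string}> $merges merge rules, highest priority first
 * @return list<string> resulting subword tokens
 */
function bpeEncode(string $word, array $merges): array
{
    // Rank each rule by position: a lower index means it is applied earlier.
    $ranks = [];
    foreach ($merges as $rank => [$left, $right]) {
        $ranks[$left . ' ' . $right] = $rank;
    }

    // Start from individual characters (UTF-8 aware split).
    $tokens = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    while (count($tokens) > 1) {
        // Find the adjacent pair with the best (lowest) rank.
        $bestRank  = null;
        $bestIndex = null;
        for ($i = 0; $i < count($tokens) - 1; $i++) {
            $rank = $ranks[$tokens[$i] . ' ' . $tokens[$i + 1]] ?? null;
            if ($rank !== null && ($bestRank === null || $rank < $bestRank)) {
                $bestRank  = $rank;
                $bestIndex = $i;
            }
        }
        if ($bestIndex === null) {
            break; // no applicable merge left
        }

        // Replace the two tokens with their concatenation.
        array_splice($tokens, $bestIndex, 2, [$tokens[$bestIndex] . $tokens[$bestIndex + 1]]);
    }

    return $tokens;
}

// Hypothetical merge table, highest priority first.
$merges = [['l', 'o'], ['lo', 'w'], ['e', 'r']];

echo implode(' | ', bpeEncode('lower', $merges)), PHP_EOL; // low | er
```

The token count for "lower" is two here, not five characters and not one word, which is exactly the kind of difference that makes estimates go wrong.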
A Format for LLM Prompts: TOON
There is also a format aimed at a very specific and very modern annoyance: sending structured data to a language model costs money by the token, and JSON is gloriously wasteful with its endless quotes and braces. TOON, the Token-Oriented Object Notation, was introduced in 2024 and reached version 3.0 in November 2025. It is a compact, lossless way of encoding the JSON data model, borrowing YAML-style indentation for nested objects and a CSV-style tabular layout for uniform arrays. At its best, on uniform arrays of objects, it trims token counts by between 30 and 60 percent while staying readable to humans.
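The tabular layout is where the savings come from: field names are written once in a header instead of once per row. Here is a minimal sketch of that one case in plain PHP. It assumes every row has the same keys and that values are numbers or simple strings needing no quoting or escaping, so it is nowhere near a spec-complete encoder, and the function name toonTable is invented. Byte counts are printed as a rough stand-in for token counts, which depend on the tokenizer.

```php
<?php
declare(strict_types=1);

/**
 * Encode a uniform list of flat rows in a TOON-style tabular layout.
 * Sketch only: assumes identical keys in every row and values that
 * need no quoting or escaping.
 *
 * @param list<array<string, int|float|string>> $rows
 */
function toonTable(string $key, array $rows): string
{
    $fields = array_keys($rows[0]);

    // Header: key, row count, then the field names written once.
    $lines = [sprintf('%s[%d]{%s}:', $key, count($rows), implode(',', $fields))];

    foreach ($rows as $row) {
        $lines[] = '  ' . implode(',', $row); // one indented, comma-separated line per row
    }

    return implode("\n", $lines);
}

// Hypothetical data.
$users = [
    ['id' => 1, 'name' => 'Alice', 'role' => 'admin'],
    ['id' => 2, 'name' => 'Bob',   'role' => 'user'],
];

$toon = toonTable('users', $users);
$json = json_encode(['users' => $users]);

echo $toon, PHP_EOL;
printf("TOON: %d bytes, JSON: %d bytes\n", strlen($toon), strlen($json));
```

For anything beyond a demo, reach for one of the maintained packages, since quoting rules, nesting and non-uniform arrays are where the specification does its real work.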
PHP, never one to do something once when it could do it four times, has produced several independent TOON packages, including helgesverre/toon-php, toonphp/toon, abdelhamiderrahmouni/toon-php and a Laravel-flavoured laravelplus/toon. The abundance is its own warning sign. The TOON specification is still evolving, so implementations don’t always agree with one another across versions, and you’ll want to check which spec a library targets before trusting your token savings to it.
PHP Code as AI Training Data
There is one more angle, less visible but quietly significant: PHP as the subject of AI rather than merely a consumer of it. Several datasets of PHP source code now live on Hugging Face specifically for training and fine-tuning models. One, ajibawa-2023/PHP-Code-Large, gathers more than twelve million lines of PHP for pre-training and code intelligence research. Another, Nan-Do/code-search-net-php, is the PHP slice of CodeSearchNet with function summaries attached, useful for turning code into descriptions and back again. The big multi-language corpus behind StarCoder, bigcode/the-stack, includes PHP among its hundreds of languages. These matter because they are the raw material for PHP-aware code models, and because you could fine-tune one of those models and then run it locally using the very extensions described earlier in this post, which is a pleasing sort of loop.
The Shape of the Ecosystem
Step back and a pattern comes into focus. PHP’s AI capabilities are not arriving as language design. There is no AI-specific syntax, no official extension blessed into the standard distribution. Instead the ecosystem is assembling itself in three layers that stack rather neatly.
At the top sit userland Composer packages, the accessible surface where TransformersPHP, Rubix ML, the tokenizers, the TOON libraries and the Parquet readers all install with a single command, and where the pure PHP ones run wherever PHP runs. Beneath them are the FFI bridges, built on the FFI extension available since PHP 7.4, which make ONNX Runtime and llama.cpp reachable without compilation, asking only that you keep a native binary around. At the foundation are the compiled C and Rust extensions, the frontier, where ext-ort, ext-llama, ext-infer and ext-arrow bring hardware-level performance directly into the process for anyone willing to compile them.
The honest summary is that PHP’s AI story is mostly a story of wrapping other people’s hard work (llama.cpp, ONNX Runtime, Apache Arrow) rather than building inference engines from scratch. But the wrapping has grown confident enough that a PHP developer can now embed a quantized language model, run transformer inference, count tokens accurately for any Hugging Face model and read enormous Parquet datasets, all from PHP, without a single Python process anywhere in sight. Not bad for the friend who wasn’t invited to the party.

