AhanaZoo is a model repository where every model is stored in losslessly compressed .aarm format. Browse, download, and run LLMs at 51% smaller size, with full bit-perfect fidelity.
Every model in AhanaZoo is pre-compressed using a WeightTransformer trained on the distribution of neural network parameters across hundreds of architectures, not generic data compression.
Our WeightTransformer is trained specifically on the statistical patterns of neural network weights, combining DPCM delta coding, per-block quantization, and zigzag encoding optimized for float16 tensors.
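To make the pipeline concrete, here is a minimal sketch of DPCM delta coding plus zigzag mapping on a float16 tensor. This illustrates the general technique only: the actual .aarm codec, its trained WeightTransformer, and its per-block quantization layout are not shown here, and the function names are ours.

```python
# Illustrative only -- NOT the .aarm codec. Shows why delta + zigzag helps:
# neighboring weights have similar bit patterns, so deltas are small, and
# zigzag maps signed deltas to small unsigned codes an entropy coder likes.
import numpy as np

def dpcm_zigzag(weights: np.ndarray) -> np.ndarray:
    bits = weights.ravel().view(np.uint16).astype(np.int32)    # raw float16 bit patterns
    deltas = np.diff(bits, prepend=0)                          # DPCM: store differences
    return ((deltas << 1) ^ (deltas >> 31)).astype(np.uint32)  # zigzag: signed -> unsigned

def inverse_dpcm_zigzag(codes: np.ndarray, shape) -> np.ndarray:
    z = codes.astype(np.int64)
    deltas = (z >> 1) ^ -(z & 1)                  # undo zigzag
    bits = np.cumsum(deltas).astype(np.uint16)    # integrate deltas back to raw bits
    return bits.view(np.float16).reshape(shape)   # reinterpret as float16

w = np.random.randn(4, 8).astype(np.float16)
assert np.array_equal(w, inverse_dpcm_zigzag(dpcm_zigzag(w), w.shape))  # bit-perfect
```

Because every step operates on raw bit patterns rather than float values, the round trip is exact, which is what lossless means here.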
Load any single layer from a compressed 70B model in milliseconds. The .aarm JSON table of contents maps every tensor to its exact byte offset, so no sequential scan is required.
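As a sketch of why random access is cheap, the reader below parses a JSON table of contents and seeks directly to one tensor. The actual .aarm header layout is not documented here, so the length prefix and the "offset"/"size" field names are assumptions.

```python
# Hypothetical .aarm reader sketch -- header layout and field names assumed.
import json
import struct

def read_tensor_bytes(path: str, tensor_name: str) -> bytes:
    with open(path, "rb") as f:
        (toc_len,) = struct.unpack("<Q", f.read(8))  # assumed u64 TOC-length prefix
        toc = json.loads(f.read(toc_len))            # {"tensor.name": {"offset": ..., "size": ...}}
        entry = toc[tensor_name]
        f.seek(8 + toc_len + entry["offset"])        # jump straight to the tensor...
        return f.read(entry["size"])                 # ...and read only its bytes

q_proj = read_tensor_bytes("llama-70b.aarm", "model.layers.0.self_attn.q_proj.weight")
```

One seek and one read per tensor, regardless of model size, is what makes single-layer loads take milliseconds even on a 70B archive.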
AhanaZoo models load transparently with the standard HuggingFace API via our ACP5LazyStateDict adapter. Your existing code works; you just download 51% less data.
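A usage sketch of what that might look like in practice. Only ACP5LazyStateDict is named above; the ahanazoo package name, import path, and exact call shape are our assumptions.

```python
# Assumed usage -- package name and API shape are illustrative.
from transformers import AutoModelForCausalLM
from ahanazoo import ACP5LazyStateDict  # hypothetical import path

# Tensors decompress lazily as HuggingFace requests them.
state_dict = ACP5LazyStateDict("ahanazoo/Llama-3-8B.aarm")  # hypothetical repo path
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",  # unchanged model ID and config
    state_dict=state_dict,         # weights come from the compressed archive
)
```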
A 32B-parameter model (60 GB in SafeTensors) compresses to ~29 GB in .aarm. Multiply across a rack of model servers and the savings accumulate fast.
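A quick back-of-the-envelope check of those numbers (the rack size below is illustrative):

```python
safetensors_gb, aarm_gb = 60, 29
savings = 1 - aarm_gb / safetensors_gb   # ~0.517, i.e. the ~51% headline figure
servers = 16                             # illustrative rack of model servers
print(f"{savings:.1%} smaller; ~{(safetensors_gb - aarm_gb) * servers} GB saved per rack")
# -> 51.7% smaller; ~496 GB saved per rack
```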
AhanaZoo hosts a curated set of state-of-the-art open models: Llama, Qwen, Mistral, Phi, and others, all pre-compressed and verified for exact weight fidelity.
Models from AhanaZoo are pre-indexed for CAB (Compressed Activation Buffer) inference, enabling 70B models to run on consumer GPUs via layer streaming, without quantization.
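The CAB machinery itself is not shown here, but the layer-streaming pattern it relies on is simple: keep full weights off the GPU and hold only one layer (plus activations) in device memory at a time. A runnable toy version, with plain Linear layers standing in for decompressed transformer blocks:

```python
# Toy layer-streaming loop -- stand-in layers, not the CAB implementation.
import torch

layers = [torch.nn.Linear(4096, 4096) for _ in range(8)]  # pretend: decompressed blocks
device = "cuda" if torch.cuda.is_available() else "cpu"
hidden = torch.randn(1, 4096).to(device)

with torch.no_grad():
    for layer in layers:   # in .aarm terms: decompress layer i on demand via the TOC
        layer.to(device)   # only this layer's weights occupy GPU memory
        hidden = layer(hidden)
        layer.to("cpu")    # evict before touching the next layer
print(hidden.shape)        # torch.Size([1, 4096])
```

Peak weight memory is one layer instead of the whole model, which is the trade that lets a 70B model run on a consumer GPU at full precision.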
AhanaZoo is in beta. Join early access to be among the first to browse and download compressed models.
Get Early Access →