Mac Mini M4 Pro — Hardware Specification
HIIE Phase 1 runs on a single Mac Mini M4 Pro node, hosted at Registered Agentics facilities. Apple Silicon's unified memory architecture eliminates the PCIe bandwidth bottleneck present in traditional CPU/GPU/RAM split configurations.
Target Configuration
| Component | Specification |
|---|---|
| Model | Apple Mac Mini M4 Pro |
| CPU | 12-core M4 Pro (8 Performance + 4 Efficiency) |
| GPU | 20-core M4 Pro GPU |
| Neural Engine | 16-core Apple Neural Engine — 38 TOPS (dedicated fine-tuning substrate) |
| Unified RAM | 64 GB (recommended) |
| Primary SSD | 256 GB — OS, models, hot cache, active sessions, LoRA adapter store |
| Secondary HDD | 5 TB external — vector DB, archive, datasets, versioned adapter archive |
| Network | 10 GbE Ethernet |
| Power Draw | ~30–80 W (idle to full inference load) |
Storage Architecture — SSD-Constrained Design
Given the 256 GB SSD versus 5 TB HDD constraint, HIIE implements a streaming-first data architecture. Internet content, patent databases, research papers, and GitHub repositories are analyzed entirely in RAM and never written to disk. Only extracted structured embeddings — typically < 0.1% of the original corpus size — are persisted to ChromaDB on HDD.
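The streaming-first pattern can be sketched as follows. This is a minimal illustration, not the HIIE implementation: the hash-based `toy_embedding` stands in for a real embedding model, and the in-memory dict stands in for the ChromaDB collection on the HDD tier. The point is structural — raw text is analyzed in RAM and dropped; only fixed-size embeddings are persisted.

```python
import hashlib
import struct

EMBED_DIM = 64  # toy dimension; a real model would use e.g. 768 or 1536


def toy_embedding(chunk: str) -> list[float]:
    """Stand-in for a real embedding model: derive EMBED_DIM floats
    from SHA-256 digests of the chunk. Illustrative only."""
    vec: list[float] = []
    data = chunk.encode()
    counter = 0
    while len(vec) < EMBED_DIM:
        digest = hashlib.sha256(data + counter.to_bytes(4, "big")).digest()
        for i in range(0, len(digest), 4):
            vec.append(struct.unpack(">I", digest[i:i + 4])[0] / 2**32)
        counter += 1
    return vec[:EMBED_DIM]


def stream_and_persist(chunks):
    """Analyze each chunk in RAM; persist only its embedding, never the text."""
    store = {}  # stands in for a ChromaDB collection on the HDD tier
    raw_bytes = 0
    for i, chunk in enumerate(chunks):
        raw_bytes += len(chunk.encode())
        store[f"chunk-{i}"] = toy_embedding(chunk)  # raw text is dropped here
    persisted_bytes = sum(len(v) * 4 for v in store.values())  # float32 on disk
    return store, raw_bytes, persisted_bytes
```

With realistic chunk sizes the persisted embeddings come to well under 0.1% of the raw corpus bytes, which is the compression ratio the design relies on.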
- SSD — Working Tier: Model weights, hot vector shards, active project outputs, agent state checkpoints, and current LoRA adapters. The inference pipeline reads exclusively from SSD during active operation.
- HDD — Archive Tier: Completed project directories, the full ChromaDB collection suite, versioned adapter history, and curated datasets. HDD is not read during active inference — only at project initialization and on-demand archive retrieval.
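The two-tier split above amounts to a simple routing rule. A minimal sketch, assuming hypothetical mount points and artifact-kind names (the actual paths depend on the deployment):

```python
from pathlib import Path

# Hypothetical mount points; actual paths depend on the deployment.
SSD_ROOT = Path("/opt/hiie/ssd")       # working tier (internal 256 GB SSD)
HDD_ROOT = Path("/Volumes/hiie-hdd")   # archive tier (external 5 TB HDD)

# Artifact kinds per tier, following the list above.
SSD_KINDS = {"model_weights", "hot_vector_shard", "active_output",
             "agent_checkpoint", "lora_adapter_current"}
HDD_KINDS = {"completed_project", "chromadb_collection",
             "lora_adapter_versioned", "curated_dataset"}


def tier_path(kind: str, name: str) -> Path:
    """Route an artifact to the SSD working tier or HDD archive tier by kind."""
    if kind in SSD_KINDS:
        return SSD_ROOT / kind / name
    if kind in HDD_KINDS:
        return HDD_ROOT / kind / name
    raise ValueError(f"unknown artifact kind: {kind}")
```

Keeping the routing table explicit makes the invariant checkable: nothing in the inference pipeline should ever resolve to an HDD path during active operation.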
Per-project SSD budget is approximately 30 GB. With roughly 60 GB of SSD allocated to hot cache and active outputs, the practical maximum is two concurrent active projects.
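The concurrency limit follows directly from the budget arithmetic. A minimal admission check, with constant names of our choosing:

```python
HOT_CACHE_BUDGET_GB = 60      # SSD allocation for hot cache + active outputs
PER_PROJECT_BUDGET_GB = 30    # approximate working set per active project


def can_activate(active_projects: int) -> bool:
    """Admit another project only if its ~30 GB working set still
    fits inside the 60 GB hot-cache allocation."""
    return (active_projects + 1) * PER_PROJECT_BUDGET_GB <= HOT_CACHE_BUDGET_GB
```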
Memory Allocation Strategy
| Resource | Allocated To | Notes |
|---|---|---|
| RAM 38 GB | Active model inference | Quantized 14B–32B weights in unified pool |
| RAM 12 GB | Multi-agent session state | All active project agents held in memory |
| RAM 8 GB | Live internet streaming buffer | Analyzed in RAM, never written to disk |
| RAM 4 GB | System + Coolify overhead | macOS + Docker + containerization |
| RAM <2 GB | ANE fine-tuning state | LoRA adapter + gradient buffer (background) |
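Because every consumer draws from the same 64 GB pool, the allocations must sum to the pool size or less. A small budget guard makes that invariant explicit (allocation names and values are illustrative, mirroring the table):

```python
UNIFIED_RAM_GB = 64

# Example allocation mirroring the table above (values are illustrative).
ALLOCATIONS_GB = {
    "model_inference": 38,
    "agent_session_state": 12,
    "streaming_buffer": 8,
    "system_overhead": 4,
    "ane_finetuning": 2,
}


def check_budget(allocations: dict, pool_gb: float = UNIFIED_RAM_GB) -> float:
    """Return remaining headroom in GB; raise if allocations overflow the pool."""
    used = sum(allocations.values())
    if used > pool_gb:
        raise MemoryError(f"allocations ({used} GB) exceed unified pool ({pool_gb} GB)")
    return pool_gb - used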
CPU / GPU Task Assignment
| Component | Assigned Tasks |
|---|---|
| GPU (20-core) | LLM inference, embeddings, parallel domain processing |
| ANE (16-core) | LoRA fine-tuning on specialist models (background) |
| CPU Performance Cores (8) | Orchestration, agent delegation, simulation runners |
| CPU Efficiency Cores (4) | I/O, HDD read/write, web retrieval, ethics scoring |
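The assignment table is effectively a static dispatch map. A sketch of how an orchestrator might encode it (the task identifiers are ours; the compute blocks are those listed above):

```python
from enum import Enum


class Compute(Enum):
    GPU = "20-core GPU"
    ANE = "16-core Neural Engine"
    P_CORES = "8 performance cores"
    E_CORES = "4 efficiency cores"


# Static routing following the assignment table; task names are illustrative.
TASK_ROUTES = {
    "llm_inference": Compute.GPU,
    "embedding": Compute.GPU,
    "lora_finetune": Compute.ANE,
    "orchestration": Compute.P_CORES,
    "simulation": Compute.P_CORES,
    "hdd_io": Compute.E_CORES,
    "web_retrieval": Compute.E_CORES,
    "ethics_scoring": Compute.E_CORES,
}


def route(task: str) -> Compute:
    """Dispatch a task name to its assigned compute block."""
    return TASK_ROUTES[task]
```

Keeping background fine-tuning on the ANE and I/O on the efficiency cores leaves the GPU and performance cores free for latency-sensitive inference and orchestration.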
Apple Silicon Advantage
The M4 Pro's unified memory pool is shared between CPU and GPU with no PCIe copy overhead — a significant advantage for large context windows, multi-agent state management, and continuous inference on a single-node system.
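Rough arithmetic illustrates the scale of the saving. On a discrete-GPU system, model state must be staged into VRAM over the PCIe link; assuming roughly 32 GB/s of usable PCIe 4.0 x16 bandwidth (an assumption, not a measured figure), moving tens of GB of weights costs on the order of a second per transfer — a cost unified memory avoids entirely:

```python
def pcie_copy_seconds(payload_gb: float, link_gb_per_s: float = 32.0) -> float:
    """Time to stage a payload over a discrete-GPU link, assuming ~32 GB/s
    usable PCIe 4.0 x16 bandwidth. On unified memory this copy never happens."""
    return payload_gb / link_gb_per_s
```

For a ~40 GB quantized-weight working set, `pcie_copy_seconds(40)` comes to about 1.25 s per full transfer, which compounds quickly under continuous multi-agent inference.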