Show HN: Multi-agent autoresearch for ANE inference beats Apple's CoreML by 6× https://ift.tt/zskwVKy
We ran an experiment over the weekend to explore whether multiple autonomous agents could collaboratively optimize inference on Apple’s Neural Engine (ANE).

Each agent ran locally on a different Mac (M1–M4), repeatedly modifying how a DistilBERT model is executed on the ANE, benchmarking latency, and sharing results and insights with other agents in real time.

Instead of exploring independently, agents could:
- see what others had tried
- reuse working strategies
- avoid known failure modes

Across all tested chips, the agents ended up outperforming Apple’s CoreML baseline, with up to 6.31× lower median inference latency on the same hardware.

An interesting pattern we observed: an agent stuck at ~2.1ms latency on M4 was able to break through after incorporating strategies discovered by agents on different chips (M2, M4 Max), eventually reaching ~1.5ms and surpassing CoreML.

Full write-up: https://ift.tt/2E9PWb0
Detailed results: https://ift.tt/61kQyx9 https://ift.tt/pjo1RyT

Curious what other optimization problems this kind of setup could be applied to, especially in systems, compilers, or ML infra. Would be interested in exploring similar experiments.

https://ift.tt/R8x7Ud4
April 1, 2026 at 01:01AM
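
For readers who want to reproduce the "N× lower median inference latency" style of comparison, here is a minimal sketch of how such a ratio can be computed. It is not the harness used in the experiment. The function names (`measure_latencies_ms`, `median_speedup`), the warmup and iteration counts, and the `run_once` callable standing in for a single model inference are all assumptions made for illustration.

```python
import statistics
import time
from typing import Callable, List


def measure_latencies_ms(
    run_once: Callable[[], None],
    warmup: int = 10,
    iterations: int = 100,
) -> List[float]:
    """Time repeated calls to run_once and return per-call latencies in ms.

    run_once is a hypothetical stand-in for one inference call
    (for example, one forward pass of the model being benchmarked).
    """
    # Warm-up calls are executed but not recorded.
    for _ in range(warmup):
        run_once()

    samples: List[float] = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def median_speedup(baseline_ms: List[float], candidate_ms: List[float]) -> float:
    """Return how many times lower the candidate's median latency is."""
    return statistics.median(baseline_ms) / statistics.median(candidate_ms)
```

Given latency samples from a baseline and from a candidate configuration on the same machine, `median_speedup(baseline, candidate)` returns the ratio of their medians. The median is used here because it is less sensitive to occasional slow outlier runs than the mean.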