Creativity, Algorithms, DARPA Corey Hubbard

Mobile AI Chaos Solved? M4's Firmware-Like LLM Brain Takes Control!

Tired of fragmented mobile AI? Meet M4, the pioneering OS-hardware co-managed foundation model acting like firmware for your smartphone! This LLM-powered solution tackles 38+ diverse tasks across vision, language, audio, and multimodal inputs with accuracy parity in 85% of cases. M4 dramatically simplifies NPU design and boosts on-device intelligence and data privacy, overcoming the "ad-hoc" mess with lightweight "adapters" and significantly fewer operators. Get ready for mobile AI that finally makes sense!

Creativity, Digital Age, Computer Science Corey Hubbard

The 20TB Multilingual LLM Data Revolution | Scale to 1000+ Languages with One Pipeline

Unlock the full potential of state-of-the-art multilingual LLMs with FineWeb2, a groundbreaking 20-terabyte (5 billion document) dataset. This new pre-training data is generated by a revolutionary curation pipeline that automatically adapts to support any language. Tailoring filtering and deduplication to a large number of languages is inherently difficult, yet FineWeb2 has been scaled to over 1000 languages using Common Crawl snapshots. Models trained on it outperform those trained on prior datasets for non-English corpora, and the release includes a principled approach to rebalancing datasets for an additional performance uplift. Access the released dataset, pipeline, training, and evaluation codebases today!
