This breakthrough, centered on an innovative use of flash memory, could revolutionize the way AI is integrated into mobile technology.
Large language models, which power AI-driven chatbots, are known for their extensive data and memory requirements. These demands present a significant challenge for devices with limited memory, such as iPhones. Apple researchers have tackled this by developing a unique technique that leverages flash memory, typically used for storing apps and photos, to house the AI model’s data. Their research paper, “LLM in a flash: Efficient Large Language Model Inference with Limited Memory”, details this approach.
The method involves two key techniques:
- Windowing: Rather than fetching fresh parameters from flash for every token, the model reuses data already loaded for recently processed tokens, sharply reducing repeated memory fetches.
- Row-Column Bundling: Related rows and columns of the model’s weights are stored together in flash, so each read pulls a larger contiguous chunk of data, playing to flash memory’s strength at sequential reads and speeding up AI processing. (Both ideas are illustrated in the sketch below.)
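To make these two ideas concrete, here is a minimal, self-contained Python sketch. It is an illustration of the general technique, not Apple’s implementation: the tiny dimensions, the memory-mapped .npy file standing in for flash, and helper names such as `ffn_step` are all assumptions made for the example.

```python
import numpy as np

# Toy illustration of the two ideas (all sizes and names are assumptions):
#  - windowing: keep the FFN neurons used by the last few tokens resident in
#    DRAM and load from "flash" only the neurons that are newly needed;
#  - row-column bundling: store the i-th up-projection column and the i-th
#    down-projection row contiguously, so each neuron is one sequential read.

D_MODEL, D_FF, WINDOW = 64, 256, 4                   # tiny dimensions for the sketch

rng = np.random.default_rng(0)
bundled = rng.standard_normal((D_FF, 2 * D_MODEL)).astype(np.float32)
np.save("ffn_bundled.npy", bundled)                  # one bundled record per neuron
flash = np.load("ffn_bundled.npy", mmap_mode="r")    # memory-mapped file stands in for flash

resident = {}        # neuron id -> bundled weights currently held in DRAM
recent_tokens = []   # sliding window of per-token active-neuron sets

def ffn_step(x, active_ids):
    """Apply a sparse FFN for one token, loading only the missing neurons."""
    global recent_tokens

    # Windowing: neurons used inside the window stay cached; fetch only the rest.
    missing = [i for i in active_ids if i not in resident]
    for i in missing:
        resident[i] = np.array(flash[i])             # one contiguous bundled read

    # Row-column bundling: split each cached record back into column + row.
    y = np.zeros(D_MODEL, dtype=np.float32)
    for i in active_ids:
        up_col, down_row = resident[i][:D_MODEL], resident[i][D_MODEL:]
        h = max(0.0, float(x @ up_col))              # ReLU keeps the computation sparse
        y += h * down_row

    # Evict neurons not referenced by any token in the sliding window.
    recent_tokens = (recent_tokens + [set(active_ids)])[-WINDOW:]
    keep = set().union(*recent_tokens)
    for i in [j for j in resident if j not in keep]:
        del resident[i]
    return y, len(missing)

# Consecutive tokens tend to activate overlapping neurons, so the number of
# flash reads drops sharply after the first token.
x = rng.standard_normal(D_MODEL).astype(np.float32)
base = list(rng.choice(D_FF, size=32, replace=False))
for t in range(3):
    active = list(dict.fromkeys(base[:28] + list(rng.choice(D_FF, size=4, replace=False))))
    _, loads = ffn_step(x, active)
    print(f"token {t}: loaded {loads} neuron bundles from flash")
```

The pattern is the same one the paper describes at far larger scale: most of a token’s working set is already in DRAM thanks to the sliding window, and whatever must still be fetched arrives in large, contiguous reads that flash handles well.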
These innovations allow models up to twice the size of the iPhone’s available memory to run, delivering a reported four-to-five-times increase in inference speed on standard processors (CPUs) and a 20-to-25-times acceleration on graphics processors (GPUs) compared with naive loading.
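For a rough sense of what that headroom means in parameter counts, here is a back-of-envelope calculation; the 8 GB of usable DRAM and 16-bit weights are illustrative assumptions, not figures from Apple.

```python
# Back-of-envelope sketch: how many 16-bit parameters fit in DRAM alone
# versus with the roughly 2x headroom flash-backed inference provides.
# (8 GB usable DRAM and fp16 weights are illustrative assumptions.)
DRAM_GB = 8
BYTES_PER_PARAM = 2

in_dram = DRAM_GB * 1e9 / BYTES_PER_PARAM   # ~4 billion parameters
with_flash = 2 * in_dram                    # ~8 billion parameters
print(f"DRAM only : ~{in_dram / 1e9:.0f}B parameters")
print(f"With flash: ~{with_flash / 1e9:.0f}B parameters")
```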
This advancement is pivotal for deploying advanced LLMs in devices with limited resources, broadening their use and accessibility. It opens up exciting prospects for future iPhones, including enhanced Siri capabilities, real-time language translation, and sophisticated AI-driven features in photography and augmented reality. Additionally, this could enable iPhones to host complex AI assistants and chatbots directly on the device.
Apple’s AI advancements don’t stop here. The company is also developing its own generative AI model, “Ajax”, which powers an internal chatbot dubbed “Apple GPT” and is positioned as a rival to OpenAI’s GPT-3 and GPT-4. With roughly 200 billion parameters, Ajax represents a substantial leap in language understanding and generation, and it is part of Apple’s broader strategy to integrate AI deeply across its ecosystem.
Despite Ajax’s capabilities, which are reportedly more advanced than GPT-3.5, some reports suggest that OpenAI’s newer models had already surpassed it as of September 2023. Nevertheless, Apple’s commitment to AI innovation is clear, with plans to incorporate AI features into the iPhone and iPad around late 2024, coinciding with the release of iOS 18.
According to The Information and analyst Jeff Pu, Apple is building AI servers and will offer a mix of cloud-based and on-device AI processing, marking a new era in mobile AI technology.